
Thursday, April 7, 2011


CERN Battling Severe Case of Data Indigestion


As a miles-wide research facility recording 40 million sub-atomic events per second, CERN's Large Hadron Collider must deal with massive amounts of data, and it's still struggling with system failures. The need for many disparate systems to communicate with each other creates a high degree of complexity, "and because it's complicated, it fails," said Tony Cass, leader of CERN's database services group.

Tony Cass, the leader of the European Organization for Nuclear Research's (CERN's) database services group, outlined some of the challenges the organization's computer system faces during his keynote speech Wednesday at LISA, the 24th Large Installation System Administration Conference, being held in San Jose, Calif., through Friday.

Smashing beams of protons and ions together at high speeds in CERN's Large Hadron Collider generates a staggering amount of data, which requires a sophisticated computer system to handle.

The CERN computing system has to winnow out a few hundred good events from the 40 million events generated every second by the particle collisions, store the data and analyze it, manage and control the high-energy beams used and send and receive gigabytes of data every day.

Numbers, Numbers, Numbers

The accelerator generates 40 million particle collisions, or events, every second. CERN's computers pick out a "few hundred" good events per second, then begin processing the data, Cass said.
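The selection step described above -- keeping a few hundred events per second out of 40 million -- can be sketched as a simple filter. This is a minimal illustration only: the accept test below is a random stand-in, whereas CERN's real trigger systems apply hardware and software physics criteria not described in the article.

```python
import random

EVENTS_PER_SECOND = 40_000_000
TARGET_KEPT_PER_SECOND = 300  # "a few hundred" good events per second
ACCEPT_RATE = TARGET_KEPT_PER_SECOND / EVENTS_PER_SECOND

def is_good_event(event_id: int) -> bool:
    """Stand-in for the physics selection; accepts ~300 in 40 million."""
    return random.random() < ACCEPT_RATE

def filter_events(n: int) -> list:
    """Return the IDs of the 'good' events among n candidates."""
    return [e for e in range(n) if is_good_event(e)]

if __name__ == "__main__":
    random.seed(42)
    kept = filter_events(1_000_000)  # sample 1M events, not a full second
    print(f"kept {len(kept)} of 1,000,000 candidate events")
```

At these rates the interesting point is the ratio, not the mechanism: roughly one event in 130,000 survives, and everything else must be discarded before it ever reaches storage.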

These good events are recorded on disks and magnetic tapes at 100 to 150 Mbps (megabits per second). That adds up to 15 petabytes of data a year across all four CERN detectors -- Alice, Atlas, CMS and LHCb. The data is transferred at 2 Gbps (gigabits per second), and CERN requires three full Oracle (Nasdaq: ORCL) SL8500 tape robots a year.

CERN forecasts it will store 23 to 25 petabytes of data per year, which is 100 million to 120 million files. That requires 20,000 to 25,000 1-terabyte tapes a year. The archives will need to store 0.1 exabytes, or 1 billion files, in 2015.
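The tape figures above are internally consistent, as a quick back-of-the-envelope check shows. The volume and tape-capacity numbers below come from the article itself; the calculation is just arithmetic.

```python
PB = 10**15  # bytes in a petabyte (decimal)
TB = 10**12  # bytes in a terabyte (decimal)

annual_volume_pb = 24  # midpoint of the 23-25 PB/year forecast
tape_capacity_tb = 1   # 1-terabyte tapes, per the article

tapes_per_year = annual_volume_pb * PB // (tape_capacity_tb * TB)
print(tapes_per_year)  # 24000 -- consistent with the quoted 20,000-25,000
```

The same arithmetic scales the 2015 projection: 0.1 exabytes is 100 petabytes, or roughly four years of archives at the forecast rate.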

"IBM and StorageTek and Oracle have good roadmaps for their tape technology, but still managing the tapes and data is a problem," Cass said. "We have to reread all past data between runs. That's 60 petabytes in four months at 6 Gbps."

A "run" refers to when the accelerator is put into action. StorageTek is now part of Oracle, whose databases CERN uses.

CERN has to run 75 drives flat out at a sustained 80 Mbps just to handle controlled access, Cass said.
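The drive figures above line up with the reread rate Cass quoted, which is worth checking explicitly. This is a sanity check, not an authoritative correction; note that the "60 petabytes in four months" claim implies a rate closer to 6 gigabytes per second, so the article's bits-per-second units may be a transcription of bytes per second.

```python
# 75 drives at a sustained 80 Mbps each:
drives = 75
per_drive_mbps = 80
aggregate_gbps = drives * per_drive_mbps / 1000
print(aggregate_gbps)  # 6.0 -- matches the "6 Gbps" reread rate quoted earlier

# What rereading 60 PB in four months actually implies:
seconds = 4 * 30 * 86400  # ~four months
implied_bytes_per_sec = 60e15 / seconds
print(implied_bytes_per_sec)  # ~5.8e9 B/s, i.e. roughly 6 GBytes/s
```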

Dealing With the Data

CERN uses three Oracle accelerator database applications.

One is a short-term settings and control configuration database that retains data for about a week. "As you ramp up the energy (for the beams) you need to know how it should behave and to have control systems to see how it's behaving and, if there's a problem, where does it come from," Cass explained.

The second is a real-time measurement log database that retains data for a week.

The third is a long-term archive of logs that retains data for about 20 years. There are 2 trillion records in the archives, which are growing by 4 billion records a day. Managing that is complicated. "They want to do searches across the full 2 trillion rows every now and then," Cass remarked.
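The three databases above differ chiefly in how long they keep data. A minimal sketch of that tiered retention policy follows; the tier names and the expiry helper are hypothetical, since the article does not describe CERN's actual Oracle schemas.

```python
from datetime import date, timedelta

# Retention windows per the article: two one-week tiers and a ~20-year archive.
RETENTION = {
    "settings": timedelta(days=7),        # short-term settings/control config
    "measurements": timedelta(days=7),    # real-time measurement log
    "archive": timedelta(days=365 * 20),  # long-term log archive
}

def is_expired(tier: str, written: date, today: date) -> bool:
    """True if a record written on `written` is outside its tier's window."""
    return today - written > RETENTION[tier]
```

At 4 billion new archive rows a day, a scheme like this only works if expiry and queries avoid scanning the full table -- which is exactly why occasional searches "across the full 2 trillion rows" are a management headache.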

There are 98 PCs in all in CERN's control system, which consists of 150 federated Supervisory Control and Data Acquisition (SCADA) systems called "PVSS" from ETM, a company now owned by Siemens (NYSE: SI). The PCs monitor 934,000 parameters.

Overall, CERN has about 5,000 PCs, Cass stated.

CERN's processing power is distributed worldwide over a grid. "There are not many computing grids used on the scale of the LHC computing grid, which federates the EG, EGI and ARC science grids in Europe and the Open Science Grid in the United States," Cass said. "The Grid is enabling distributed computing resources to be brought together to run 1 million jobs a day. Grid usage is really good."

CERN has a Tier Zero center, 11 Tier One centers at different labs, and 150 Tier Two centers at various universities. Tier Zero performs data recording, initial data reconstruction and data redistribution, Cass said. Tier One is for permanent storage, reprocessing and analysis, while Tier Two is for simulation and end user analysis.
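The division of labor among the tiers can be summarized as a simple lookup. The role strings and the routing helper below are illustrative only; the article describes responsibilities, not an API.

```python
# Tier responsibilities and center counts, per the article.
TIER_ROLES = {
    0: ["data recording", "initial reconstruction", "data redistribution"],
    1: ["permanent storage", "reprocessing", "analysis"],
    2: ["simulation", "end-user analysis"],
}
TIER_CENTERS = {0: 1, 1: 11, 2: 150}

def tier_for_task(task: str) -> int:
    """Return the tier responsible for a task (illustrative routing only)."""
    for tier, roles in TIER_ROLES.items():
        if task in roles:
            return tier
    raise ValueError(f"unknown task: {task}")
```

The shape of the hierarchy mirrors the data flow: one Tier Zero at CERN fans out to 11 lab-hosted Tier Ones, which in turn serve 150 university Tier Twos.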

CERN has also developed a Google (Nasdaq: GOOG) Earth-based monitoring system that runs about 11,400 jobs worldwide at a data transfer rate of 6.55 Gbps.


