Coming by 2023, an exascale supercomputer in the U.S.

Wide-angle view of the ALMA correlator, one of the most powerful supercomputers in the world. — The ALMA correlator, one of the most powerful supercomputers in the world, has now been fully installed and tested at its remote, high altitude site in the Andes of northern Chile. This wide-angle view shows some of the racks of the correlator in the ALMA Array Operations Site Technical Building. This photograph shows one of four quadrants of the correlator. The full system has four identical quadrants, with over 134 million processors, performing up to 17 quadrillion operations per second. [December 2012]
Credit: European Southern Observatory (ESO)

That's assuming Congress allocates the money for it

Computerworld | Nov 19, 2014 3:55 AM PT

NEW ORLEANS -- The U.S. has set 2023 as the target date for producing the next great leap in supercomputing, if its plans aren't thwarted by two presidential and four Congressional elections between now and then.

It may seem odd to note the role of politics in a story about supercomputing. But as these systems get more complex -- and expensive -- they compete for science dollars from a Congress unafraid of cutting science funding.

That political reality has frustrated the supercomputing community, and prompted an effort here at this year's big supercomputing conference, SC14, to educate researchers on the need to sell the benefits of supercomputing to a broader audience.

The theme of this year's conference: "HPC matters."

Supercomputing funding efforts in the U.S. are getting a boost from rising global competition from Europe, Japan and China, which now has the world's fastest supercomputer. The U.S. Department of Energy last week announced $325 million for two 150-petaflop systems from IBM, with an option on one system to build it out to 300 petaflops.

Dave Turek, vice president of technical computing at IBM, said these systems have the architectural capability to support 500 petaflops, or a half of an exaflop.

One exaflop equals one quintillion (a quintillion is 1 followed by 18 zeros) calculations per second. It is the next great goal in supercomputing that followed the U.S. achievement in 2008 of reaching one petaflop, or 1,000 teraflops, on a system built by IBM. A petaflop equals one quadrillion (1 followed by 15 zeros) calculations per second.
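The unit relationships above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the constant and function names are made up for this example, and the teraflop value is inferred from the statement that a petaflop is 1,000 teraflops.

```python
# Units as described in the article (calculations per second).
PETAFLOP = 10**15           # one quadrillion: 1 followed by 15 zeros
EXAFLOP = 10**18            # one quintillion: 1 followed by 18 zeros
TERAFLOP = PETAFLOP // 1000 # inferred: a petaflop is 1,000 teraflops


def petaflops_to_exaflops(petaflops):
    """Convert a rating in petaflops to exaflops (hypothetical helper)."""
    return petaflops * PETAFLOP / EXAFLOP


# The figures quoted in the article, expressed as a fraction of an exaflop.
for rating in (150, 300, 500):
    print(rating, "petaflops =", petaflops_to_exaflops(rating), "exaflops")
```

On these definitions, the 500-petaflop ceiling Turek describes works out to half an exaflop, and an exaflop is 1,000 petaflops.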

The 2023 date "is when we are going to have an exascale system," William Harrod, Research Division Director for DOE's Advanced Scientific Computing Research program, said in an interview. While the U.S. has spent about $300 million so far on the next generation of systems, that's a "low level," said Harrod.

Congress will have to approve more funding to advance research to meet the development timelines, he said. And while congressional support "looks good today," Harrod isn't predicting the future.

The technical challenges to building an exascale system are many. They include solving software problems to enable parallelism across what may be hundreds of thousands of compute cores; dealing with reliability and resiliency needs in an environment that will see ongoing core failures; and improving energy efficiency.

That last issue, energy efficiency, gets a lot of attention. For every megawatt of power, the annual cost is roughly $1 million. The 150-petaflop systems DOE has planned for 2017 will operate at about 10 MW.
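To see why power draws so much attention, the rough figures above can be turned into a back-of-the-envelope calculation. The sketch below uses only the article's numbers ($1 million per megawatt per year, 150 petaflops at about 10 MW). The exascale extrapolation at the end is a hypothetical that assumes efficiency stays fixed at the 2017 level, which is not something the DOE or IBM has projected; the function names are invented for illustration.

```python
# Rough figure quoted in the article: $1 million per megawatt per year.
COST_PER_MW_YEAR_USD = 1_000_000


def annual_power_cost(megawatts):
    """Approximate yearly electricity bill in dollars (hypothetical helper)."""
    return megawatts * COST_PER_MW_YEAR_USD


def petaflops_per_megawatt(petaflops, megawatts):
    """Energy efficiency expressed as petaflops delivered per megawatt."""
    return petaflops / megawatts


# The planned 2017 systems: 150 petaflops at about 10 MW.
cost_2017 = annual_power_cost(10)
efficiency_2017 = petaflops_per_megawatt(150, 10)

# ASSUMPTION for illustration only: an exaflop (1,000 petaflops) machine
# built at exactly the same efficiency as the 2017 systems.
exascale_megawatts = 1000 / efficiency_2017
exascale_cost = annual_power_cost(exascale_megawatts)

print(f"2017 system: ${cost_2017:,.0f} per year")
print(f"Exascale at 2017 efficiency: {exascale_megawatts:.1f} MW, "
      f"${exascale_cost:,.0f} per year")
```

Under that fixed-efficiency assumption, an exascale machine would need several times the power of the 2017 systems, which is the gap that efficiency research is meant to close.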

The top researchers internationally acknowledge that there is competition to reach exascale, but there's also an understanding that software stack development is so complex that international cooperation is needed.

Although the Europeans are operating on a time frame that may be similar to that of the U.S., Japan had earlier announced a goal to reach exascale by 2020. But Akinori Yonezawa, deputy director at the Riken Advanced Institute for Computational Science, said in an interview Tuesday that the goal now is to build a 200- to 600-petaflop system by 2020, not an exascale system.

Last month, Riken selected Fujitsu to develop the basic design for this system.

In 2008, the first U.S. petascale system came from IBM. If Moore's Law still applied to high performance computing, the U.S. should reach exascale by 2018. But it became clear early on that the technical issues were too great to meet that date.
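The dates above imply a growth rate that is easy to work out. The sketch below assumes steady exponential growth and simply asks how often performance would have to double to gain a factor of 1,000 (petaflop to exaflop) in a given number of years. It describes the pace implied by the article's dates, not a formal definition of Moore's Law, and the function name is made up for this example.

```python
import math


def implied_doubling_time(years, growth_factor):
    """Years per doubling needed to grow by growth_factor in the given span.

    Assumes smooth exponential growth (an idealization).
    """
    return years / math.log2(growth_factor)


# Petaflop (2008) to exaflop is a 1,000-fold increase.
on_2018_schedule = implied_doubling_time(2018 - 2008, 1000)
on_2023_schedule = implied_doubling_time(2023 - 2008, 1000)

print(f"Exascale by 2018: doubling every {on_2018_schedule:.2f} years")
print(f"Exascale by 2023: doubling every {on_2023_schedule:.2f} years")
```

In other words, the 2018 date would have required performance to double about once a year, while the 2023 target stretches that to roughly a year and a half.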

Exascale won't necessarily be an easy thing to agree on.

An exascale system can be built today by just connecting "a gazillion" GPUs, said IBM's Turek. "The question is what will work on it? What will it support?" he said.

Today, the Linpack benchmark, which measures a system's floating point rate of execution, is widely used to determine capability and ranking on the Top 500 supercomputer list. But for an exascale system, Turek said, a more useful metric may be application performance: how much improvement the system delivers for a real-world use.

Turek said the DOE systems IBM is building are a stepping stone to exascale. "It's a vehicle to mitigate risk, because we know there is a tremendous amount of learning and innovation that needs to take place," he said.

Patrick Thibodeau — Reporter