CISL hardware cyberinfrastructure services

Yellowstone
[Photo caption: Yellowstone’s end panels highlight one of the geological wonders of the national park after which it was named.]

CISL deploys and operates NCAR’s high performance computing (HPC) environment on behalf of the atmospheric and related sciences community. The integrated petascale computing, analysis, visualization, networking, data storage, and archival systems constitute a world-class HPC resource for about 2,000 researchers from institutions throughout the U.S. and abroad.

While more details about the cyberinfrastructure managed by CISL appear in subsequent sections of this report, the primary resources include:

  • The Yellowstone compute cluster, based on IBM’s iDataPlex architecture: a 1.5 PFLOPS system composed of 4,536 two-socket Intel Sandy Bridge nodes connected by a full fat-tree FDR InfiniBand interconnect (see the arithmetic sketch after this list).

  • The Globally Accessible Data Environment (GLADE) high-performance parallel file system capable of over 90 GB/second of sustained bandwidth with a data storage capacity of 16.4 PB.

  • A data-sharing environment based on Globus Plus software with a capacity of 1.5 PB.

  • A high-performance data archival system based on IBM’s High Performance Storage System (HPSS). Current data holdings exceed 50 PB, with a capacity of 160 PB.

  • Data Analysis and Visualization resources:

    • The Caldera computation and visualization cluster, composed of 16 nodes based on the same two-socket Sandy Bridge architecture as Yellowstone but augmented with two NVIDIA K20X GPGPU accelerators per node, and the 16-node Pronghorn system, which uses the same node-level architecture as Yellowstone.

    • The Geyser data analysis and visualization cluster, composed of 16 nodes based on the Intel Westmere processor and featuring 1 TB of DRAM per node and NVIDIA K5000 GPUs.

  • The Erebus cluster, composed of 84 nodes based on the same node-level architecture as Yellowstone. Erebus is operated by CISL on behalf of the U.S. Antarctic Program’s Antarctic Mesoscale Prediction System (AMPS) project.

  • CISL’s HPC Futures Lab, a collection of smaller systems used for pre-production testing and for evaluating future and emerging HPC hardware and software technologies.
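Yellowstone’s quoted 1.5 PFLOPS peak rate follows directly from its node count and processor characteristics. The short Python sketch below reproduces that arithmetic; it is a back-of-envelope check only, and the 8 cores per socket, 2.6 GHz clock rate, and 8 double-precision floating-point operations per core per cycle are assumed values typical of Sandy Bridge EP processors rather than figures stated in this report.

    # Back-of-envelope check of Yellowstone's quoted ~1.5 PFLOPS peak rate.
    # Node and socket counts come from the list above; the per-core figures
    # are assumptions about the Sandy Bridge EP parts, not stated specifications.
    nodes = 4536
    sockets_per_node = 2
    cores_per_socket = 8        # assumed 8-core Sandy Bridge EP processors
    clock_hz = 2.6e9            # assumed 2.6 GHz nominal clock rate
    flops_per_core_cycle = 8    # assumed 256-bit AVX: 4 adds + 4 multiplies per cycle

    peak_flops = (nodes * sockets_per_node * cores_per_socket
                  * clock_hz * flops_per_core_cycle)
    print(f"{peak_flops / 1e15:.2f} PFLOPS")   # prints 1.51 PFLOPS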

Yellowstone and its associated complex of HPC systems and storage resources operated in full production status throughout FY2015. CISL currently plans to continue operating Yellowstone through calendar year 2017.

NWSC-2 procurement

In concert with UCAR during FY2015, CISL conducted the NWSC-2 procurement, which was designed to obtain a high-performance computing system to replace Yellowstone and to augment the GLADE parallel file system. Draft technical specifications were released for vendor comment early in the fiscal year, followed by an early release of the NCAR benchmarks, which will continue to be maintained and enhanced outside of CISL’s procurement efforts. The NWSC-2 RFP was publicly released in April, and proposals were evaluated during the summer. At the end of FY2015, final negotiations were conducted to provide the following new NWSC-2 resources for production use by January 2017:

  • A new high-performance computing system with a peak computation rate exceeding 5 PFLOPS.

  • Augmentation of the GLADE storage system with over 20 PB of additional storage capacity and 200 GB/second bandwidth to the new HPC system, with the capability to expand GLADE storage capacity beyond 56 PB and incrementally enhance bandwidth.
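For a sense of the scale of the GLADE augmentation, the sketch below works out how long the new 200 GB/s path would take to write the 20 PB of additional capacity if the full rate were sustained. This is an idealized calculation that ignores real-world overheads, not a projected workload.

    # Idealized time to fill the 20 PB GLADE augmentation over the 200 GB/s
    # path to the new HPC system, assuming the full rate is sustained.
    capacity_bytes = 20e15           # 20 PB of additional capacity
    bandwidth_bytes_per_s = 200e9    # 200 GB/s to the new HPC system

    seconds = capacity_bytes / bandwidth_bytes_per_s
    print(f"{seconds:,.0f} s (~{seconds / 3600:.1f} hours)")   # 100,000 s, ~27.8 hours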

Funding

NCAR’s supercomputers are managed by CISL under the UCAR/NSF Cooperative Agreement and are supported by NSF Core funds.