Provide a capacious and reliable data archive

The High Performance Storage System (HPSS) is an advanced, highly scalable, and flexible mass storage and archival resource that supports both NCAR’s supercomputing environment and divisional servers run by other NCAR laboratories and UCAR programs. CISL also uses the HPSS to support disaster recovery for irreplaceable data by maintaining HPSS infrastructure at both the NCAR-Wyoming Supercomputing Center (NWSC) in Cheyenne, Wyoming, and the NCAR Mesa Lab in Boulder, Colorado. The tape libraries in Boulder are used to store critical data specified in the CISL Business Continuity Plan.

HPSS data growth
This chart shows recent growth of the HPSS archive located at the NCAR-Wyoming Supercomputing Center. The vertical blue lines mark the dates on which the 5.34-petaflops Cheyenne supercomputer began production alongside the 1.5-petaflops Yellowstone supercomputer.

Tailored to the needs of the Earth System science community, this capacious and reliable data archive is a strategic component of the cyberinfrastructure that CISL provides and maintains to increase the productivity of the research community. CISL’s ongoing leadership in providing discipline-focused computing and data services is critical to NCAR’s role as a national center.

The HPSS is maintaining a steady growth rate: data holdings in FY2017 grew by nearly 16 PB and 30 million files. Unique holdings as of October 2017 stand at around 79 PB and 271 million files, with growth since Yellowstone began production averaging around 1.25 PB per month. After the recent acquisition of two additional tape libraries at NWSC, the maximum data capacity of the HPSS has doubled from 160 to 320 PB. Further augmentation of the archival system has included the purchase of a newer, faster metadata server and additional data movers, both to support the increased load from the new Cheyenne system and its overlap period with Yellowstone and to increase disk cache residency within the archive, which reduces tape mounts. In addition to these improvements, a major HPSS upgrade was completed in the first quarter of FY2017.
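
As a rough illustration of how these figures relate, the following Python sketch estimates how long the 320 PB capacity would last at the reported average growth rate. It is a back-of-the-envelope calculation, not CISL's sizing method. It assumes constant linear growth and that unique holdings are the only data counted against capacity; the function name is hypothetical. Duplicate copies and the higher output expected from Cheyenne would shorten the real figure.

```python
def months_until_full(holdings_pb, capacity_pb, growth_pb_per_month):
    """Months of headroom left, assuming constant (linear) growth.

    Illustrative helper only; the linear-growth model is an assumption.
    """
    if growth_pb_per_month <= 0:
        raise ValueError("growth rate must be positive")
    remaining_pb = max(capacity_pb - holdings_pb, 0.0)
    return remaining_pb / growth_pb_per_month


# Figures quoted in the text (October 2017).
HOLDINGS_PB = 79.0        # unique holdings
CAPACITY_PB = 320.0       # maximum capacity after the new tape libraries
AVG_GROWTH_PB_MONTH = 1.25  # average since Yellowstone began production

# FY2017 growth of "nearly 16 PB" expressed as a monthly rate (upper bound).
FY2017_GROWTH_PB_MONTH = 16.0 / 12.0

for label, rate in [("long-term average", AVG_GROWTH_PB_MONTH),
                    ("FY2017 rate", FY2017_GROWTH_PB_MONTH)]:
    months = months_until_full(HOLDINGS_PB, CAPACITY_PB, rate)
    print(f"{label}: about {months:.0f} months ({months / 12:.1f} years)")
```

Under these simplifying assumptions the headroom is on the order of 15 to 16 years. The estimate is very sensitive to the growth rate, which is why the rate is a parameter rather than a constant.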

In FY2012, the HPSS was expanded to support the new supercomputing environment at the NWSC, which included the workload from NCAR’s new supercomputer, Yellowstone. HPSS services were relocated to the NWSC on newly installed hardware, and the Mesa Lab Computing Facility (MLCF) in Boulder became a satellite archive site for disaster recovery purposes. This disaster recovery service currently supports NCAR’s Research Data Archive (RDA), NCAR’s Earth Observing Laboratory (EOL), and UCAR’s COSMIC program.

To meet the data load from the new Cheyenne system, CISL conducted analysis and projection exercises to size an augmentation of the current HPSS equipment. Based on this analysis and on projected technology advances, CISL is planning to extend the current archival system subcontract and procure next-generation tapes and drives to meet the challenges of our data-centric environment.

The NCAR HPSS is managed by CISL under the UCAR/NSF Cooperative Agreement and is supported by NSF Core funds and CSL funding.