Provide the community with Big Data services

CISL provides our research community with Big Data tools and services for locating, accessing, and analyzing a variety of observational and model research data collections. These data are served through data gateways over high-speed wide-area networks and are also accessible from disk and tape storage on the Yellowstone and Cheyenne computing complexes. These tools and services combine to support our communities’ efforts to extract scientific knowledge from the many petabytes of data available on NCAR’s cyberinfrastructure.

Research Data Archive (RDA): The climate and weather research communities’ data needs continue to grow, so CISL adds new content and access features to the RDA. More than 13,000 unique users acquire 2 petabytes of data yearly through the RDA web portal. In addition, hundreds of internal users access substantial amounts of data directly from NCAR’s Globally Accessible Data Environment (GLADE).

Digital Asset Service Hub (DASH): DASH is maintained by NCAR’s cross-organization Data Stewardship Engineering Team (DSET) initiative. DASH currently provides digital asset management support, engagement, and training resources for NCAR and for UCAR Community Programs. Covered digital assets include datasets, publications, software, and models. Additionally, DASH will provide the information resource for searching and discovering digital assets held by groups throughout NCAR, increasing Big Data services’ scientific value and impact on the research community.

Data Assimilation Research Testbed (DART): Data assimilation (DA) is a key tool for Earth System science that allows models to be confronted with observations. DA is essential for making forecasts for all components of the Earth System at all space and time scales. DART is a software facility for ensemble data assimilation that allows uncertainty quantification, which is essential to many prediction and scientific goals.

DART software: DART software supports community researchers, helping them improve both their prediction skill and their understanding of the Earth System. It also helps researchers collaboratively develop and apply data assimilation methods across a wide range of geophysical problems.

Data gateways: Data gateways provide diverse scientific communities with access to data-sharing infrastructure. CISL gateways span climate science, regional climate change, solar science, digital preservation, and international efforts to develop metadata and knowledge infrastructure.

CMIP Analysis Platform: The CMIP Analysis Platform gives researchers convenient access to climate data from the Coupled Model Intercomparison Project (CMIP) on GLADE. By hosting the data on GLADE, the platform enables researchers to use the HPC analysis and visualization systems to work with CMIP data rather than transfer large data sets from Earth System Grid Federation (ESGF) sites to their local machines.

Advanced visualization services: CISL staff work closely with individual scientists to develop engaging and informative visualizations that are used for research, scientific briefings, presentations at conferences, publication, and outreach to NCAR visitors. CISL also explores new technologies and visualization techniques to examine how they can be applied to advance geoscience research.

Data analysis and visualization software: CISL’s portfolio of data analysis software provides an ever-growing community of scientists with unique capabilities tailored to the disciplines we serve. The scalability and performance of these tools are increasingly important in the era of Big Data. The Visualization and Analysis Platform for Ocean, Atmosphere, and Solar Research (VAPOR) lets researchers efficiently explore enormous or complex 3D data sets. The NCAR Command Language (NCL) is an open-source scripting language for geoscientific data analysis and visualization. NCL reads and writes several geoscientific data formats and creates publication-quality graphics.

Reliable, long-term, customized support for scientific advancement defines the overarching merit of CISL’s integrated computing and data services. CISL provides a portfolio of advanced data services specifically tailored for the atmospheric, geospace, and related sciences communities. Stewardship of valuable reference data collections and operation of petascale computing environments are fundamental underpinnings for NCAR’s pursuit of the Grand Challenges identified in its strategic plan, as are the activities to create the coordinated, next-generation portfolio of Big Data services. The importance of these services is reflected in CISL’s strategic imperative to “Develop and sustain advanced computing and data system services” and its imperative to “Provide the community with Big Data Services.” As with all CISL services, data services evolve in response to changes in the underlying technologies and the scientific demands of the community, informed by CISL’s strategic research and development activities.

The FY2017 accomplishments for each of these efforts are specified in the sections below.

Funding for each of these efforts is specified in the sections below.