Imperative IV

Provide comprehensive data services, open access, and long-term stewardship of data

NSF policy requires data set preservation and availability to users pursuing research questions apart from those that justified the original project, and NSF requires that grant proposals include comprehensive data management plans. Additionally, one of NSF’s core expectations in the NSF-UCAR Cooperative Agreement focuses on data issues, specifically calling for NCAR to “serve as stewards of high quality scientific data on behalf of the community through maintenance, enhancement and curation.” For EOL, this charge is shared by the three EOL research platform Facilities (ISF, RAF, and RSF) and the Computing, Data and Software Facility (CDS). Whereas the first three Facilities are responsible for data acquisition from our sensors, instruments and facilities, CDS is responsible for developing and maintaining EOL’s data and metadata services, collaborative tools, and software engineering, all of which are integral to EOL Imperative IV.

Data sets collected and preserved by EOL have value that extends far beyond immediate project-team use. Modern data-access mechanisms increase the importance and utility of data set preservation and improve data access for the scientific community. Leveraging these mechanisms, EOL has enlarged the scope of its data services to cover the full project life cycle, from mission planning through data collection and quality control to long-term archiving. EOL also provides stewardship for a select set of data not collected by its observational facilities. These efforts feed directly into NCAR’s Strategic Plan Imperative to develop and provide state-of-the-art data services that meet the needs of NSF, NCAR, and the science community.

 

Data and Publications Stewardship Project (DPS)

The Data and Publications Stewardship Project (DPS) began in late 2015 with the goal of identifying, locating, managing, and then assigning Digital Object Identifiers (DOIs) to all datasets collected from EOL instrumentation from 2005 to 2015. Managing these datasets within the EMDAC data system allows easy access to the data and metadata and makes it possible to track usage metrics. Significant progress has been made on metadata cleanup, field project web page development, dataset addition to EMDAC, and DOI assignment. As of the end of FY 2016, 90% of all datasets from that timeframe have been assigned DOIs.

The DPS project also included work on metrics collection and on identifying publications that resulted from EOL-maintained instrumentation and datasets. Over 6,000 publications are now included in the EMDAC database, including the 1,200 added for the years 2005 through 2015. In 2016, EOL began work to verify that all included publications are indeed associated with EOL-supported field campaigns.

 

Mission Coordinator and Catalog Maps for Aircraft

A new software tool aimed at scientists has been installed on EOL aircraft. This tool merges the Mission Coordinator Display with the Catalog Maps GIS Tool and allows airborne and ground-based scientists to overlay relevant information (e.g., satellite imagery, radar data, lightning, vertical profiles) and flight imagery with the locations, tracks, and plans of project aircraft in real time. Previously, the Catalog Maps GIS Tool (shown above) only worked on the ground. With this 2016 upgrade, its capabilities are now available in the air. Merging these tools provides a more coherent set of displays to scientists and flight personnel and improves the flow of supporting products between the ground and the plane. The new software will also provide a streamlined pathway for upgrading features and services both on the ground and in the air, and will offer the ability to play back products during a mission, a feature that was not available with the Mission Coordinator Display. The new Catalog Maps tool was demonstrated onboard the NSF/NCAR GV during the ORCAS campaign (see here).

 

Backup and Online Access of EOL’s HPSS Holdings
EOL currently has about 350 terabytes (TB) of data, composed of 4.3 million files spanning the past 35 years, on NCAR/CISL’s High Performance Storage System (HPSS). In FY 2016, EOL undertook work to create a consolidated version of EOL’s HPSS holdings that would fit on roughly 45 cartridges, as opposed to the 6,600 tape cartridges on which the data currently reside. This consolidated dataset would be held at the NWSC and would facilitate migration of EOL observational data during any future HPSS upgrades. We also began work to create an offsite disaster recovery file set, which would be stored at the NCAR Mesa Lab, and to build an online copy of EOL data that can be integrated into EMDAC. This implementation will give the user community very quick access to EOL data, allow more advanced data services to be built on top of the archive, and address issues of speed and disaster recovery.
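
To put the scale of this consolidation in perspective, the figures quoted above can be turned into a few back-of-the-envelope numbers. The sketch below is only illustrative: it assumes decimal units (1 TB = 10^12 bytes) and assumes the consolidated copy holds the same roughly 350 TB as the current holdings, neither of which is stated in the report. All variable names are chosen for this example.

```python
# Rough arithmetic on the HPSS figures quoted in the text.
# Assumptions (not from the report): decimal units, and a consolidated
# copy that holds the same ~350 TB as the current holdings.

TB = 10**12  # bytes per terabyte (decimal convention, assumed)

total_bytes = 350 * TB           # "about 350 terabytes"
file_count = 4_300_000           # "4.3 million files"
current_cartridges = 6_600       # cartridges the data reside on today
consolidated_cartridges = 45     # "roughly 45 cartridges"

# Average size of one archived file, in megabytes.
avg_file_mb = total_bytes / file_count / 10**6

# Average amount of EOL data per cartridge, before and after consolidation.
tb_per_cartridge_now = total_bytes / current_cartridges / TB
tb_per_cartridge_new = total_bytes / consolidated_cartridges / TB

# How many times fewer cartridges the consolidated copy would need.
reduction_factor = current_cartridges / consolidated_cartridges

print(f"Average file size:            {avg_file_mb:.0f} MB")
print(f"EOL data per cartridge today: {tb_per_cartridge_now * 1000:.0f} GB")
print(f"EOL data per cartridge after: {tb_per_cartridge_new:.1f} TB")
print(f"Cartridge count reduction:    {reduction_factor:.0f}x")
```

Under these assumptions, the average file is on the order of 80 MB, each current cartridge holds only about 53 GB of EOL data on average, and the consolidated set would hold close to 8 TB per cartridge, a reduction of roughly 150-fold in cartridge count. Because the source figures are themselves approximate, these values should be read as orders of magnitude only.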

 

CHORDS
In FY 2016, EOL’s Cloud-Hosted Real-time Data Services (CHORDS) proved vital on a few projects. Funded by the EarthCube Initiative, CHORDS allows scientists to easily provide Internet access to real-time streaming data, with the goal of lowering the barrier for instrument teams to put their real-time data online in standard formats. Scientists and engineers from the Joint Numerical Testbed Program (JNTP) in NCAR/RAL and the UCAR/JOSS program developed a way to three-dimensionally print automated weather stations (3D PAWS). CHORDS was in turn integrated into the 3D PAWS project to retrieve the data from the weather stations so that they can be viewed and distributed online.

CHORDS was also tested in FY 2016 as a way to provide real-time access to EOL field data during ORCAS, with the aim of improving data quality for that experiment. It is also being used with atmospheric, hydrological, and solid-earth sensors. Please see the EarthCube website for more information.