Data

Provide Comprehensive Data Services and Long-Term Data Stewardship

Data has been and will continue to be the ultimate product of the field campaigns EOL supports. As such, the data we provide must be of high quality, well managed, and preserved. Furthermore, in view of the President’s 2013 Open Data Executive Order for public access to data and NSF’s increasing emphasis on multi-disciplinary science, EOL must ensure that its current and historical data is well documented, is discoverable, and feeds seamlessly into scientific analysis workflows. Imperative 3 describes our activities to meet these challenges, which are divided into three key areas: 1) acquisition, quality control, and data management; 2) standardization of data formats and distribution; and 3) data citation and metrics.

 

Sustain Efficient Acquisition, Quality Control, and Data Management 

Non-NSF Field Campaign support

Data management support was provided to several field campaigns funded by agencies other than NSF (see Deployment for NSF-funded field campaigns). In the spring of FY 2018, the National Oceanic and Atmospheric Administration (NOAA) began another season of severe weather studies in the Southeast US under the Verification of the Origins of Rotation in Tornadoes Experiment Southeast (VORTEX-SE 2018). EOL provided its Field Catalog to assist PIs with decision making and with documenting project operations, and it archived the datasets collected during the project. EOL also provided data archive services to the High Ice Water Content project (HIWC 2018), funded by the Federal Aviation Administration (FAA), which continued to collect data on supercooled liquid water in clouds off the coast of Florida in August.

Legacy Data Rescue

Examples of legacy field data

A review of EOL’s physical archives, which consist of reports, photos, publications, and data and metadata stored on various media, is currently underway to preserve legacy data for future reference and research. This is an important task because NCAR has been involved in over 450 field campaigns over the past 50 years. While most of the data collected over the last 25+ years are online, a large number of older datasets are stored on media such as film, paper, and tape and need to be converted to digital form while they can still be read. This effort was only partially completed in FY18 and will continue in FY19.

This effort resulted in the creation of a Document Library and a Data Media Room. The Document Library has been inventoried and cross-checked against EOL’s existing web pages. New Data Archive web pages have been added for projects that were missing, while others have been updated or corrected with more complete information. An inventory of all of the assets is being created so they can be made available to the research community.

 

Standardize Data Formats and Distribution 

EOL Data Management Committee / Lab-wide coordination of data processing

Data management coordination across the laboratory is an important focus of EOL. A new committee, the EOL Data Management Committee (DMC), was formed to share best practices and to promote consistent data archival procedures across the laboratory. The DMC is composed of Data Managers from each of EOL's science facilities and is led by the head of the Data Management and Services Facility. The group meets approximately monthly.

GeoDaRRS

With funding from the National Science Foundation, NCAR hosted the Geoscience Digital Data Resource and Repository Services (GeoDaRRS) workshop in August of 2018. The workshop was organized by members of EOL, CISL and the NCAR Library and brought together over 60 individuals from multiple stakeholder groups to discuss data management and archiving challenges and opportunities within the geosciences. The workshop resulted in a report including recommendations intended to provide concrete steps on how stakeholders can move forward and work to address the many data management challenges faced by the geoscience research community.

GOES-R GRB Data Collection and Archival

In collaboration with staff in Unidata and RAL, EOL staff operated a ground station to collect GRB direct broadcast data from GOES-16, the first NOAA next-generation geostationary weather satellite. EOL provided imagery and data from this satellite in support of active U.S. field campaigns after it became operational on December 18, 2017. EOL is also continually archiving these data to the NCAR High Performance Storage System (HPSS).

 

Develop Data Workflows and Citation Metrics

Improving Search Capabilities for users of EOL Data Archives

EOL continues to work on improving metadata for project archives. This work includes the implementation of Global Change Master Directory (GCMD) keywords and the cleanup of non-standard vocabulary terms in our archive metadata. EOL has collected data in its archive over the past 28 years, and over that time standards and archive practices have evolved, leading to inconsistencies in the metadata. These inconsistencies impair the ability to search for data, particularly for users who weren’t involved in the original field campaigns. The implementation of GCMD keywords and the metadata cleanup, which starts with recent projects and works backward in time, are addressing these issues. EOL has also implemented a free text search capability to improve data archive services for users.
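
As an illustration only, the kind of vocabulary cleanup described above can be sketched in a few lines of Python. This is not EOL's actual tooling: the mapping table, the function names, and the sample terms are all hypothetical, and the keyword strings merely follow the hierarchical GCMD style rather than being drawn from the published keyword list.

```python
# Hypothetical lookup from free-form legacy terms to controlled keywords.
# The values imitate the GCMD "CATEGORY > TOPIC > TERM" style; a real
# cleanup would take them from the published GCMD keyword list.
LEGACY_TO_CONTROLLED = {
    "temp": "EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC TEMPERATURE",
    "air temperature": "EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC TEMPERATURE",
    "wind": "EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC WINDS",
    "winds": "EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC WINDS",
    "precip": "EARTH SCIENCE > ATMOSPHERE > PRECIPITATION",
}


def normalize_term(term):
    """Lowercase a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def clean_keywords(raw_terms):
    """Map legacy terms to controlled keywords.

    Returns a tuple (controlled, unmapped): a sorted list of distinct
    controlled keywords, and the original terms that had no mapping
    and therefore need manual review.
    """
    controlled = set()
    unmapped = []
    for term in raw_terms:
        key = normalize_term(term)
        if key in LEGACY_TO_CONTROLLED:
            controlled.add(LEGACY_TO_CONTROLLED[key])
        elif key:  # skip empty strings, keep everything else for review
            unmapped.append(term)
    return sorted(controlled), unmapped


if __name__ == "__main__":
    keywords, needs_review = clean_keywords(
        ["Temp", " air  temperature ", "Winds", "radar stuff"]
    )
    print(keywords)
    print(needs_review)
```

Returning the unmapped terms separately, rather than dropping them, reflects the assumption that a cleanup of this kind would need a person to review terms that the lookup table does not yet cover.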