Advance data-centric research

CISL has a large portfolio of data-centric research activities. This section presents some highlights from FY2018 that demonstrate the diversity of these activities. They include advances in developing data assimilation tools that combine observations and model forecasts to produce large ensemble analysis and forecast data sets; an ecosystem for developing, fostering, and distributing tools for analyzing geoscience data; and methods and science for translating the influence of global processes that affect our climate into specific regional and local impacts.

Data assimilation

The Data Assimilation Research Testbed (DART) is a community framework for ensemble data assimilation research and applications. In addition to developing advanced methods for data assimilation, the DART team collaborates with modelers and observationalists to develop data assimilation capabilities for new models and observations.

Depiction of streamflow data
Figure 1. A depiction of the stream order on a test domain for WRF-Hydro and DART. The domain is approximately 100 km on each side. Green triangles indicate locations where stream gauge observations are available.

CISL made significant progress in FY2018 on building a DART capability for the NCAR-based community Weather Research and Forecasting Hydrologic (WRF-Hydro) modeling system, which it coupled with DART to perform hourly streamflow forecasts over the continental United States. The coupled system is run in multiple configurations, including a vector-based representation of surface water. The current framework runs over the “Matthew” domain on the East Coast of the U.S. The domain is approximately 100 km on each side, and the system uses an ensemble of 80 members to assimilate streamflow data from 115 gauges (see Figure 1). The system also estimates physical model parameters and hyperparameters, such as factors that regulate the efficiency of the streams’ response to external fluxes (precipitation, for example). For data assimilation purposes, localization is implemented along the stream so that each observation affects only a restricted length of the upstream and downstream sections. A proposed pattern-based localization ensures that streams from different watersheds are not updated with potentially unrelated streamflow data. Different adaptive localization formulations are being explored, and methods for tackling sampling and model errors are being investigated.
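The along-stream localization idea described above can be illustrated with a minimal sketch. It is not the DART or WRF-Hydro implementation: the toy network, the reach names, the half-width value, and the choice of a Gaspari-Cohn taper are all assumptions made for illustration. The sketch gives a gauge full weight on its own reach, a decaying weight on reaches directly upstream or downstream, and zero weight on reaches that are not connected to it along the flow path, such as reaches in another watershed.

```python
def gaspari_cohn(distance, half_width):
    """Fifth-order compactly supported taper (assumed here for illustration).

    Returns 1 at zero distance and exactly 0 at or beyond 2 * half_width.
    """
    r = abs(distance) / half_width
    if r >= 2.0:
        return 0.0
    if r <= 1.0:
        return -0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3 - (5.0 / 3.0) * r**2 + 1.0
    return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3 + (5.0 / 3.0) * r**2
            - 5.0 * r + 4.0 - 2.0 / (3.0 * r))


def downstream_path(network, reach):
    """Map `reach` and every reach downstream of it to its distance in km."""
    distances = {}
    travelled = 0.0
    current = reach
    while current is not None:
        distances[current] = travelled
        downstream, length_km = network[current]
        travelled += length_km
        current = downstream
    return distances


def along_stream_distance(network, gauge, target):
    """Distance along the stream, or None if neither reach drains into the other."""
    from_gauge = downstream_path(network, gauge)
    if target in from_gauge:          # target lies downstream of the gauge
        return from_gauge[target]
    from_target = downstream_path(network, target)
    if gauge in from_target:          # target lies upstream of the gauge
        return from_target[gauge]
    return None                       # sibling tributary or separate watershed


def localization_weight(network, gauge, target, half_width_km):
    """Weight applied to the update of `target` by an observation at `gauge`."""
    distance = along_stream_distance(network, gauge, target)
    if distance is None:
        return 0.0
    return gaspari_cohn(distance, half_width_km)


# Hypothetical network: reach -> (downstream reach, reach length in km).
NETWORK = {
    "A1": ("A2", 10.0),
    "A2": ("A3", 10.0),
    "A3": (None, 10.0),
    "T1": ("A2", 10.0),   # tributary joining the main stem at A2
    "B1": (None, 10.0),   # reach in a separate watershed
}

if __name__ == "__main__":
    for reach in NETWORK:
        print(reach, round(localization_weight(NETWORK, "A2", reach, 10.0), 3))
```

In this simplified version, sibling tributaries that merely share a confluence are treated as unconnected. A real scheme might handle them differently, and the half-width would be tuned or made adaptive.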

Open-source data analysis tools

CISL has been involved in the Pangeo Data open-source development community since its founding in 2016 with the mission “to cultivate an ecosystem in which the next generation of open-source analysis tools for ocean, atmosphere and climate science can be developed, distributed, and sustained.” Funded by an NSF EarthCube grant awarded in September 2017, CISL and Columbia University’s Lamont-Doherty Earth Observatory collaborated over the past year to develop and better integrate an ecosystem of independent open-source packages for geoscience data analysis, including Dask, xarray, and Jupyter Notebooks. CISL’s focus in the Pangeo Data project has been to assist with deployment and use of the Pangeo Data Environment on NCAR systems, including Cheyenne, Geyser, Caldera, and the new data analysis and visualization cluster, Casper.

Most notably, in FY2018 the new environment was standardized, and CISL simplified it and deployed it on NCAR systems. Simplification of the Pangeo Data Environment used on the Cheyenne system was aided by development of an add-on package, Dask-Jobqueue, which enables users to launch new jobs directly from within a Jupyter Notebook interface. CISL also made standing up a JupyterHub server a priority. This will allow “push-button” access to Cheyenne via Jupyter Notebook sessions, akin to JupyterHub deployments at the National Energy Research Scientific Computing Center and the University of Colorado, Boulder.
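As a rough illustration of how Dask-Jobqueue lets a notebook user request batch jobs, the sketch below collects settings for a PBS-scheduled system and passes them to `PBSCluster`. All values (cores, memory, queue name, walltime, and the account string) are placeholders rather than recommended Cheyenne settings, and the helper names `pbs_settings` and `launch_cluster` are hypothetical. The `account` keyword is assumed from recent Dask-Jobqueue releases; older releases used `project` instead.

```python
def pbs_settings(account, jobs=2):
    """Collect placeholder settings for a PBS-backed Dask cluster.

    Every value below is an assumption for illustration, not a site default.
    """
    return {
        "cluster_kwargs": {
            "cores": 36,              # cores requested per batch job
            "memory": "100GB",        # memory requested per batch job
            "queue": "regular",       # hypothetical queue name
            "walltime": "01:00:00",   # lifetime of each batch job
            "account": account,       # allocation to charge (placeholder)
        },
        "jobs": jobs,                 # number of batch jobs to submit
    }


def launch_cluster(settings):
    """Submit the batch jobs and return a Dask client connected to them."""
    # Imported lazily so the settings helper works without Dask installed.
    from dask.distributed import Client
    from dask_jobqueue import PBSCluster

    cluster = PBSCluster(**settings["cluster_kwargs"])
    cluster.scale(jobs=settings["jobs"])   # ask the scheduler for worker jobs
    return Client(cluster)


if __name__ == "__main__":
    settings = pbs_settings(account="PROJECT_CODE")  # placeholder account
    print(settings["cluster_kwargs"])
    # On a system with PBS and Dask-Jobqueue installed, one would then call:
    # client = launch_cluster(settings)
```

Calling `launch_cluster` from a notebook cell on a suitably configured system would submit the worker jobs without the user leaving the notebook, which is the workflow the paragraph above describes.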

Regional and local climate impacts

CISL made progress in FY2018 toward several major scientific advances in simulated climate model data production, data archiving, bias correction, and regional process-level analysis through grants from NSF, the Department of Energy (DOE), and the Department of Defense. One highlight was completion of a set of simulations for the North America CORDEX (NA-CORDEX) program by CISL’s Regional Integrated Science Collective (RISC). RISC ran the simulations with the Weather Research and Forecasting (WRF) model and the Regional Climate Model Version 4 (RegCM4) in collaboration with Iowa State University. Driven by CMIP5 global climate models using the RCP 8.5 “business as usual” greenhouse gas scenario, the simulations spanned 150 years at 50-km and 25-km resolutions over most of North America. For the DOE project, 12-km WRF simulations were driven by three of the same models, with the same greenhouse gas scenario, for two 30-year time periods representing the current climate and the late 21st century.

Graph comparing simulation outputs
Figure 2. Equilibrium climate sensitivities (°C) versus annual mean change in temperature from NA-CORDEX simulations and their parent Global Climate Models (GCMs) for 1951-2000 versus 2050-2099 over the NA-CORDEX domain (most of North America).

RISC populated the NA-CORDEX website with output from those simulations and others, with recommended guidance for use, and with documentation of model characteristics. A first analysis of the RegCM4 and WRF simulations, now submitted for publication, demonstrates that the equilibrium climate sensitivities of the parent global models strongly influence temperature changes seen in the regional climate models down to the grid point level (Figure 2). These and other efforts illustrate CISL’s approach to a grand challenge for Earth system science: translating the influence of global processes that affect our climate into specific regional and local impacts. CISL’s research combines knowledge of Earth system models, downscaling methods, scientific workflows for large data sets, statistics, and the needs and constraints of regional and local stakeholders. This effort integrates CISL expertise in data science and impact assessment with the goal of turning climate science into useful products for decision making in adaptation research and risk analysis.