Advance data assimilation science

Data assimilation is driving rapid advances in geophysical studies. The Data Assimilation Research Section (DAReS) of IMAGe performs fundamental research on ensemble data assimilation methodologies for application across a wide range of geophysical problems. DAReS develops and maintains a software facility for ensemble data assimilation called the Data Assimilation Research Testbed (DART). DAReS also provides support to a growing community of NCAR, university, and government laboratory partners who are applying ensemble data assimilation methods.

DART provides ensemble data assimilation (DA) tools that use state-of-the-art statistical methods to combine model forecasts with observations, producing initial conditions for forecasts along with estimates of their uncertainty. DART tools can also diagnose and improve both models and observing systems. Because they run ensembles of forecasts, DART applications are among the largest and most computationally intensive in the geosciences, so effective use of supercomputing facilities through advanced scalable algorithms is essential. All of these aspects of DART are key to meeting CISL’s strategic goal to “Enhance the effective use of current and future computational systems by improving mathematical and computational methods for Earth System models and related observations” and in particular the imperative to “advance data-centric research.”
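
As a rough illustration of what combining an ensemble forecast with an observation can involve, the Python sketch below applies one common ensemble filter formulation (a scalar ensemble-adjustment-style update) to a single observed quantity, then regresses the resulting increments onto one model state variable. It is not DART code: the function names and toy numbers are hypothetical, and the sketch assumes Gaussian errors and a prior ensemble with nonzero spread.

```python
from statistics import mean, variance


def obs_increments(prior_obs, y, obs_var):
    """Return per-member increments for one observed quantity.

    prior_obs : ensemble of forecast values of the observed quantity
    y         : the observation
    obs_var   : observation error variance (assumed known)
    Assumes the prior ensemble has nonzero sample variance.
    """
    prior_mean = mean(prior_obs)
    prior_var = variance(prior_obs)  # sample variance, n - 1 denominator
    # Product of two Gaussians: precisions add, means are precision-weighted.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + y / obs_var)
    # Shift the ensemble to the new mean and contract it to the new spread.
    scale = (post_var / prior_var) ** 0.5
    return [post_mean + scale * (x - prior_mean) - x for x in prior_obs]


def update_state(state, prior_obs, increments):
    """Regress observation increments onto one state variable."""
    s_mean, o_mean = mean(state), mean(prior_obs)
    cov = sum((s - s_mean) * (o - o_mean)
              for s, o in zip(state, prior_obs)) / (len(state) - 1)
    gain = cov / variance(prior_obs)
    return [s + gain * d for s, d in zip(state, increments)]


# Toy usage: a 3-member ensemble, one observation, one state variable.
forecast_obs = [1.0, 2.0, 3.0]       # hypothetical forecast values
state_var = [10.0, 12.0, 14.0]       # hypothetical correlated state variable
incs = obs_increments(forecast_obs, y=4.0, obs_var=1.0)
analysis_state = update_state(state_var, forecast_obs, incs)
```

The spread of the updated ensemble is what supplies the uncertainty estimate mentioned above; a production system would also need localization and inflation, which this sketch omits.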

DART memory requirements
This chart shows the total amount of memory required on a Yellowstone node to run DART with a 50-member ensemble for a WRF model that has 184 million model-state variables. The default version of DART can only be run with 8 processes per node because of its large memory footprint. The new version scales well in memory: the amount required decreases as the number of processes increases, so 16 processes can be run on each compute node and Yellowstone is used more efficiently. This allows DART to work with much larger models while using computing resources more efficiently.
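
The scale of the problem in the caption can be checked with back-of-the-envelope arithmetic. The Python sketch below is a toy model, not a description of DART's actual memory layout: it assumes 8-byte (double-precision) values, an ensemble spread evenly over all processes, and an optional hypothetical parameter `replicated_per_proc_bytes` for any data that every process must hold in full.

```python
BYTES_PER_VALUE = 8  # assumption: double-precision reals


def ensemble_bytes(n_state, n_members):
    """Total bytes needed to hold every ensemble member's state."""
    return n_state * n_members * BYTES_PER_VALUE


def per_node_bytes(n_state, n_members, total_procs, procs_per_node,
                   replicated_per_proc_bytes=0):
    """Toy estimate of memory used on one node.

    The evenly distributed part shrinks as total_procs grows; the
    replicated part (hypothetical) grows with procs_per_node and does not.
    """
    distributed = ensemble_bytes(n_state, n_members) * procs_per_node / total_procs
    replicated = replicated_per_proc_bytes * procs_per_node
    return distributed + replicated


# Figures from the caption: 184 million state variables, 50 members.
total = ensemble_bytes(184_000_000, 50)
print(f"ensemble state: {total / 1e9:.1f} GB")

# Hypothetical job sizes, 16 processes per node, nothing replicated.
for procs in (128, 256, 512):
    gb = per_node_bytes(184_000_000, 50, procs, 16) / 1e9
    print(f"{procs} processes: {gb:.1f} GB per node")
```

In this toy model, any per-process replicated data sets a floor on node memory that adding processes cannot lower, which is one way a large footprint could limit how many processes fit on a node.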

Work to develop a memory-scalable version of DART has continued, including exploration of the impact of distributing model metadata across processors. Nine large models, including CESM components, WRF, and MPAS, have now been converted to work with the new DART version. Improved efficiency for DART with CESM models is being developed in collaboration with CGD by avoiding redundant initialization for ensemble forecasts using the CESM coupler. In addition, a more efficient version of DART for coupled CAM/POP data assimilation has been completed and is being tested by CGD scientists. New forward operators and improved methods of assimilating satellite retrieval profiles of chemical constituents have been developed and tested in collaboration with ACOM. With help from MMM, a version of DART that replicates the capabilities of the NOAA operational ensemble Kalman filter with WRF has been completed and used to understand differences between the capabilities of the two systems for operational prediction. New algorithms for explicitly estimating correlated observational errors have been developed in collaboration with RAL, and these led to significant improvement when assimilating surface observations and satellite radiances in idealized tests. A novel DA methodology, pseudo-orbit DA, was implemented in DART in collaboration with the University of Chicago and Oxford University and is being tested with idealized models.

Data assimilation research in IMAGe is supported by NSF Core funding plus Grant 16-013 from the University of New Hampshire's Open Geospace General Circulation Model program, Grant N0014-15-1-2300 (subaward A15-0093-S001-P0567931) from the DOD Office of Naval Research's National Oceanographic Partnership Program, and Grants OCE149559 and OCE1243015 from the National Science Foundation program Decadal and Regional Climate Prediction using Earth System Models.