CISL Science

Ensemble size reduces failure percentage

The Ensemble Consistency Test was developed jointly by IMAGe and TDD and provides an objective and fast way to determine whether new CESM runs, such as those produced on a different hardware system or with a different set of compiler options, are consistent with an ensemble of runs obtained on Yellowstone using different initial conditions and three different compilers. Although the test has already been implemented successfully in the latest CESM release, we are still working to optimize the size and composition of the benchmark ensemble used for testing CESM consistency under other conditions. A practical question is how large the benchmark ensemble should be. The figure shows results from a Monte Carlo study of 10,000 hypothetical samples, all of which should pass the Ensemble Consistency Test (ECT) for CESM, with the benchmark ensemble size varied. An ensemble size of 250 runs from each of the three compilers is required to achieve the theoretical (and desired) false positive rate of 0.5%, shown as a green horizontal dashed line. That is, a version of CESM that is completely consistent with the benchmark distribution will be rejected by this test only 1 time in 200. Such a large ensemble (3 x 250 runs) is required because the consistency test depends on estimating principal components (i.e., EOFs) from the global annual means of 114 variables, and principal component analysis requires a large sample size to be robust.
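
The Python sketch below illustrates the general idea: a PCA-based consistency check applied to synthetic "global annual means," with a Monte Carlo loop to estimate how often runs drawn from the benchmark distribution are wrongly rejected. This is a schematic illustration only, not the CESM-ECT code; the ensemble size, number of retained components, and rejection thresholds are illustrative assumptions and have not been calibrated.

# Schematic sketch (not the actual CESM-ECT implementation) of a PCA-based
# consistency test and a Monte Carlo estimate of its false positive rate.
# All sizes, thresholds, and cutoffs below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_vars = 114           # global annual means of 114 variables (as in the text)
n_ensemble = 750       # e.g., 250 runs from each of three compilers
n_components = 50      # leading principal components retained (illustrative)
sigma_threshold = 3.0  # flag a PC score more than 3 sigma from the mean (illustrative)
max_flags = 3          # reject a run if more than this many scores are flagged (illustrative)

# Synthetic benchmark ensemble: correlated stand-ins for global annual means.
mixing = rng.normal(size=(n_vars, n_vars))
benchmark = rng.normal(size=(n_ensemble, n_vars)) @ mixing

# Standardize each variable and compute EOFs (principal components) via SVD.
mean = benchmark.mean(axis=0)
std = benchmark.std(axis=0)
z = (benchmark - mean) / std
_, _, vt = np.linalg.svd(z, full_matrices=False)
eofs = vt[:n_components]        # rows are EOF loading patterns
scores = z @ eofs.T             # PC scores of the benchmark runs
score_std = scores.std(axis=0)  # spread of each PC score across the ensemble

def fails_consistency(run):
    # A run fails if too many of its PC scores are far outside the benchmark spread.
    run_scores = ((run - mean) / std) @ eofs.T
    flagged = np.abs(run_scores / score_std) > sigma_threshold
    return flagged.sum() > max_flags

# Monte Carlo: draw hypothetical runs from the same distribution as the benchmark
# (so they *should* pass) and count how often the test rejects them anyway.
n_trials = 10_000
false_positives = sum(
    fails_consistency(rng.normal(size=n_vars) @ mixing) for _ in range(n_trials)
)
print(f"Estimated false positive rate: {false_positives / n_trials:.4f}")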

CISL research activities support scientific computation, numerical methods, geophysical modeling, and the analysis of geophysical data and model experiments. These activities are chosen to lead the geophysics community in adopting new computational methods and mathematical tools to improve research.

Diverse scientific disciplines often share common tools and numerical methods. The mathematical, computational, and physical sciences housed in CISL focus on areas that have broad application across scientific computation in the geosciences. Hallmarks of this research are innovative, standout contributions that are not only relevant to the overall NCAR scientific program but also significant in their specific areas of mathematical, computational, or data science.

The figures shown here illustrate some of the diversity of scientific research in CISL. One theme that unifies much of this research is the interplay between seemingly deterministic physical models and algorithms and the use of statistics and random processes to manage uncertainty. The first example illustrates probabilistic forecasts of severe weather that convey the uncertainty in the forecast. Although probabilistic forecasts are acknowledged to be more useful for supporting decisions, they are also challenging to calibrate, and this calibration is part of CISL’s research in data assimilation. The second figure provides an example in the context of software engineering: the consistency of CESM run under new hardware or compiler conditions can be tested against the distribution of results obtained under known, correct conditions. Statistics and probability are used to calibrate this test in a way that addresses the particular issues of Earth System modeling and the practical constraints of software development.

Some notable highlights in CISL research during FY2015 include:

  • Data-centric research that extends tools for data assimilation and data analysis of complex observational data, processing of large simulations, and visualization of model output.

  • Algorithm developments that accelerate the simulation of geophysical processes and make better use of computational and storage resources.

  • Development and evaluation of computational strategies for new architectures to anticipate how codes and workflows may have to adapt to future systems.

Storm forecasts
Examples of forecast probabilities of daily hazardous local storms during June 2015 from NCAR’s real-time ensemble prediction system. These results are based on NCAR’s Data Assimilation Research Testbed (DART) and a high-resolution version of the Weather Research and Forecasting (WRF) model to produce forecasts of severe storm events. In this study, a 10-member ensemble is used to identify representations of individual storm hazards, which are then combined to generate probabilistic severe storm guidance for a 24-hour forecast period. Higher probabilities (warmer fill colors) indicate a higher forecast probability of hazardous storms occurring within a particular region. The two plots in this figure illustrate different distributions of severe storm events for the U.S., highlighting the range of skill that these ensemble forecasts have in predicting high-impact weather events. Overlain black polygons show areas where local National Weather Service (NWS) forecast offices issued either tornado or severe thunderstorm warnings. Probabilistic forecasting of weather hazards based on ensembles is an active area of research, and this study evaluates how some new measures of severe weather vary over the full area of the conterminous United States. In particular, this work is novel in using a regional-scale ensemble method to provide the initial conditions for the forecasts and in using new diagnostics to assess forecast skill. Future activities planned for the ensemble system include comparing this work to the operational GSI system and considering forecasts of winter weather hazards.
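
The following Python sketch shows, in schematic form, how a small ensemble can be turned into probabilistic hazard guidance: each member's 24-hour maximum hazard field is compared against a threshold within a neighborhood of each grid point, and the forecast probability is the fraction of members that exceed it. This is not the NCAR system's code; the synthetic field, threshold, and neighborhood radius are illustrative assumptions.

# Minimal sketch of ensemble-based probabilistic hazard guidance.
# Grid size, hazard field, threshold, and neighborhood radius are illustrative.
import numpy as np

rng = np.random.default_rng(1)

n_members, ny, nx = 10, 120, 180  # 10-member ensemble on a small grid
hazard_threshold = 40.0           # hazard-diagnostic threshold (illustrative)
radius = 2                        # neighborhood half-width in grid points (illustrative)

# Synthetic stand-in for each member's 24-hour maximum hazard field.
member_max = rng.gamma(shape=2.0, scale=10.0, size=(n_members, ny, nx))

def neighborhood_max(field, r):
    # Maximum of a 2-D field over a (2r+1) x (2r+1) neighborhood of each point.
    padded = np.pad(field, r, mode="edge")
    shifted = [
        padded[dy:dy + field.shape[0], dx:dx + field.shape[1]]
        for dy in range(2 * r + 1)
        for dx in range(2 * r + 1)
    ]
    return np.max(shifted, axis=0)

# A member "predicts a hazard" at a point if its neighborhood max exceeds the
# threshold; the forecast probability is the fraction of members that do so.
exceed = np.stack([neighborhood_max(m, radius) > hazard_threshold for m in member_max])
probability = exceed.mean(axis=0)  # values in {0.0, 0.1, ..., 1.0} for 10 members

print("Maximum forecast probability on the grid:", probability.max())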

The funding sources for these many projects are specified in the following individual reports and subsections.