Improving Prediction of High-Impact Weather

Severe thunderstorms pose a considerable forecast challenge and have socially disruptive impacts. A number of efforts are underway within MMM to improve prediction of high-impact weather, including using finer horizontal grid spacing to better resolve processes within storms, refining the quality of initial conditions for weather forecasts, and applying machine learning methods to increase the usability of convection-permitting forecasts for stakeholders.

Despite expectations that finer-resolution forecasts should yield more skillful predictions, past research has not demonstrated a clear benefit of finer horizontal grid spacing for next-day, high-impact weather forecasts. The present study simulates selected severe weather events that occurred between 2010 and 2017 at convection-permitting resolutions of 3-km and 1-km horizontal grid spacing. The coarser-resolution (3-km) forecasts adequately resolved individual thunderstorms, while the finer resolution (1-km) enabled more accurate simulation of processes within thunderstorms. By using a much larger set of events than previous studies, this investigation demonstrates that finer grid spacing improves forecast skill through better placement of storm events (Fig. Aa) and enables greater discrimination of specific severe storm hazards, such as tornadoes (not shown). However, the finer grid spacing forecasts require nearly 30 times more computational power per simulation and yield only modest forecast improvement for most events. This result motivates consideration of alternative approaches to improve the usability of coarser-resolution convection-permitting forecasts.
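
The placement skill referred to above is measured in Fig. Aa with the fractions skill score (FSS). As a minimal sketch of how such a score can be computed, the pure-Python example below uses the commonly cited formulation FSS = 1 - MSE / MSE_ref on neighborhood event fractions. The function names, the square window with zero padding, and the toy 9x9 grids are illustrative assumptions, not the verification code used in the study.

```python
def neighborhood_fractions(field, threshold, half_width):
    """Fraction of points at or above `threshold` in a square window around each point.

    Points outside the domain are treated as non-events (zero padding).
    """
    ny, nx = len(field), len(field[0])
    window_size = (2 * half_width + 1) ** 2
    fractions = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            hits = sum(
                1
                for jj in range(max(0, j - half_width), min(ny, j + half_width + 1))
                for ii in range(max(0, i - half_width), min(nx, i + half_width + 1))
                if field[jj][ii] >= threshold
            )
            fractions[j][i] = hits / window_size
    return fractions


def fractions_skill_score(forecast, observed, threshold, half_width):
    """FSS = 1 - MSE / MSE_ref, computed on neighborhood fractions."""
    fcst = neighborhood_fractions(forecast, threshold, half_width)
    obs = neighborhood_fractions(observed, threshold, half_width)
    pairs = [(f, o) for f_row, o_row in zip(fcst, obs) for f, o in zip(f_row, o_row)]
    mse = sum((f - o) ** 2 for f, o in pairs) / len(pairs)
    mse_ref = sum(f ** 2 + o ** 2 for f, o in pairs) / len(pairs)
    if mse_ref == 0.0:
        return float("nan")  # no events in either field at this threshold
    return 1.0 - mse / mse_ref


# Toy 9x9 precipitation grids (mm): one observed cell, forecast displaced two points east.
observed = [[0.0] * 9 for _ in range(9)]
forecast = [[0.0] * 9 for _ in range(9)]
observed[4][4] = 10.0
forecast[4][6] = 10.0

fss_point = fractions_skill_score(forecast, observed, threshold=5.0, half_width=0)
fss_neighborhood = fractions_skill_score(forecast, observed, threshold=5.0, half_width=2)
print(fss_point, fss_neighborhood)
```

In this toy case the displaced storm earns no credit when scored point by point, but earns partial credit once a 5x5 neighborhood is used, which is why FSS is suited to judging storm placement.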

Among these efforts, prior MMM-led research demonstrated that 10-member ensemble predictions at 3-km horizontal grid spacing provide greater forecast value at lower computational cost than deterministic 1-km forecasts. Methods to initialize ensemble predictions for high-impact weather forecasting are an active area of research, leading to recent activities by MMM to generate ensemble analyses on the same model grid as the convection-permitting ensemble forecasts. This approach reduces prediction errors in the early part of the forecast and allows the data assimilation system to use more observations, such as those from Doppler radars, to improve the initial placement of storms. The use of higher-resolution ensemble analyses not only improves early forecast skill but also leads to ensemble forecasts that more accurately represent the associated forecast uncertainty (Fig. Ab), enabling a better assessment of the likelihood of high-impact weather events.
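
How well an ensemble represents forecast uncertainty is typically judged with the quantities behind an attributes diagram such as Fig. Ab: forecast probabilities, binned observed frequencies, and the Brier skill score relative to climatology. The sketch below is a simplified illustration under stated assumptions. The function names, the ten equal-width bins, the `min_count` parameter (loosely mirroring the figure's minimum-sample rule), and the toy data are all hypothetical, and the neighborhood smoothing used in the figure is omitted.

```python
def ensemble_probability(member_values, threshold):
    """Fraction of ensemble members at or above the threshold at one grid point."""
    return sum(1 for v in member_values if v >= threshold) / len(member_values)


def brier_skill_score(probs, outcomes):
    """Brier skill score relative to a constant forecast of the sample climatology."""
    n = len(probs)
    climatology = sum(outcomes) / n
    brier = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n
    brier_ref = sum((climatology - o) ** 2 for o in outcomes) / n
    if brier_ref == 0.0:
        return float("nan")  # the event always or never occurred
    return 1.0 - brier / brier_ref


def reliability_curve(probs, outcomes, n_bins=10, min_count=1):
    """Return (mean forecast probability, observed frequency, count) per populated bin."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[b][0] += p
        sums[b][1] += o
        sums[b][2] += 1
    return [(sp / c, so / c, c) for sp, so, c in sums if c >= max(min_count, 1)]


# Hypothetical 10-member hourly precipitation values (mm) at one grid point.
members = [0.0, 1.0, 2.0, 6.0, 7.0, 8.0, 9.0, 3.0, 4.0, 5.0]
point_probability = ensemble_probability(members, threshold=5.0)

# Toy verification sample: 10% forecasts verify 1 time in 10, 90% forecasts 9 times in 10.
probs = [0.1] * 10 + [0.9] * 10
outcomes = [1] + [0] * 9 + [1] * 9 + [0]

bss = brier_skill_score(probs, outcomes)
curve = reliability_curve(probs, outcomes)
print(point_probability, bss, curve)
```

Points of `curve` that fall on the diagonal (mean forecast probability equal to observed frequency) indicate a reliable ensemble, and a positive `bss` indicates skill relative to a climatological forecast.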

In an effort to bolster the value of coarser-resolution convection-permitting forecasts, machine learning methods are employed, specifically to improve the prediction of individual severe storm hazards. While finer-resolution simulations better resolve and explicitly predict some storm hazards, machine learning methods can be trained to capitalize on diagnosed near-storm environment information, enabling significantly improved forecast guidance for specific storm hazards. A proof of concept was developed using the above-mentioned set of nearly 500 high-resolution forecasts of high-impact weather events. Diagnostics from these forecasts, including environmental and explicit storm attributes, were input into a neural network (NN), which learns the relationships between the diagnostic fields and the occurrence of hazards. After training, the NN can provide probabilities of convective hazards for real-time forecasts, which can be used as forecast guidance (Fig. B). Verification of the NN predictions reveals the ability to discriminate between hazard types (e.g., severe wind versus tornado) more effectively than existing forecast diagnostics, such as updraft helicity. Ultimately, this transformation of forecast guidance will enable stakeholders to make more informed decisions.
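
To make the idea of mapping forecast diagnostics to hazard probabilities concrete, the sketch below trains a very small neural network with one sigmoid output per hazard. It is not the network used in this work: the architecture (one tanh hidden layer), the three input diagnostics, the synthetic labeling rules, and all names such as `TinyHazardNet` are assumptions made purely for illustration, and the data are randomly generated rather than drawn from model forecasts or storm reports.

```python
import math
import random


class TinyHazardNet:
    """One tanh hidden layer with an independent sigmoid output per hazard."""

    def __init__(self, n_inputs, n_hidden, n_hazards, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hazards)]
        self.b2 = [0.0] * n_hazards

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        probs = [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
                 for row, b in zip(self.w2, self.b2)]
        return hidden, probs

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the summed binary cross-entropy."""
        hidden, probs = self.forward(x)
        d_out = [p - t for p, t in zip(probs, y)]  # gradient at each output pre-activation
        d_hidden = [(1.0 - h * h) * sum(d * self.w2[k][j] for k, d in enumerate(d_out))
                    for j, h in enumerate(hidden)]  # computed before w2 is updated
        for k, d in enumerate(d_out):
            for j, h in enumerate(hidden):
                self.w2[k][j] -= lr * d * h
            self.b2[k] -= lr * d
        for j, d in enumerate(d_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d * xi
            self.b1[j] -= lr * d


def make_synthetic_cases(n_cases, seed=1):
    """Random diagnostics in [0, 1] with made-up (non-physical) hazard labels."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n_cases):
        instability, shear, helicity = rng.random(), rng.random(), rng.random()
        labels = [
            1.0 if instability > 0.5 else 0.0,               # stand-in for severe wind
            1.0 if instability + shear > 1.0 else 0.0,       # stand-in for hail
            1.0 if shear > 0.5 and helicity > 0.5 else 0.0,  # stand-in for tornado
        ]
        cases.append(([instability, shear, helicity], labels))
    return cases


def mean_cross_entropy(net, cases, eps=1e-12):
    total = 0.0
    for x, y in cases:
        for p, t in zip(net.predict(x), y):
            p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(cases)


cases = make_synthetic_cases(200)
net = TinyHazardNet(n_inputs=3, n_hidden=8, n_hazards=3)
loss_before = mean_cross_entropy(net, cases)
for _ in range(50):
    for x, y in cases:
        net.train_step(x, y, lr=0.1)
loss_after = mean_cross_entropy(net, cases)
hazard_probs = net.predict([0.9, 0.9, 0.9])  # wind, hail, tornado stand-ins
print(loss_before, loss_after, hazard_probs)
```

Using a separate sigmoid per hazard, rather than a single softmax, lets several hazards be probable at the same time, which matches the per-hazard probability bars described for Fig. B.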

MMM is performing this collaborative work with operational forecast system developers at NOAA’s Earth System Research Laboratory Global Systems Division and National Severe Storms Laboratory. Financial support is provided by NOAA’s Office of Oceanic and Atmospheric Research through NOAA grants NA17OAR4590182, NA17OAR4590114, and NA19OAR4590128, with additional support provided through NCAR’s Short Term Explicit Prediction program, enabled by NSF.

Figure A: (a) Fractions skill score (FSS) for 3-km (dashed) and 1-km (solid) forecasts over the eastern two-thirds of the conterminous U.S. (CONUS) for 279 springtime forecasts, across a range of precipitation intensity thresholds, for hourly accumulated precipitation from forecast hours 18–36. Larger values of FSS indicate greater forecast skill. Circles overlain on the curves indicate hours where differences in skill between the two forecast sets were statistically significant. (b) Attributes diagram computed over the CONUS east of 105°W using a 100-km neighborhood length scale, aggregated over 26 1–12-h forecasts of 1-h precipitation for a precipitation exceedance threshold of 5.0 mm h⁻¹. The horizontal line near the x-axis represents the observed frequency of the event, and the diagonal line represents perfect reliability. Points lying in grey-shaded regions had skill compared to climatological forecasts, as measured by the Brier skill score. Values were not plotted for a particular bin if fewer than 500 grid points had forecast probabilities in that bin over all 26 forecasts.
Figure B: An example of forecast output from the neural network (NN) machine learning approach for a high-impact derecho event in June 2012 that caused extensive damage in Washington, D.C. Bars represent hourly forecast probabilities for each hazard; the hazards shown are severe wind gusts, hail, and tornadoes, from top to bottom. Dots above the bars represent verifying observations of that hazard at each threshold indicated for both wind and hail. The inset at upper right shows preliminary storm report locations, where markers indicate the reported hazard types.