RAL has been a leader in prediction systems at the intersection of data science and atmospheric science for many years. A recent research project and key development area has been the integration of machine learning into the parameterizations of numerical weather prediction models.

Surface layer parameterizations in numerical weather prediction models provide an interface between the land surface model and the lowest levels of the atmospheric model by calculating the momentum, sensible heat, and latent heat fluxes. Current surface layer parameterizations are based on Monin-Obukhov (MO) similarity theory, which links the near-surface vertical profiles of wind, temperature, and moisture to their corresponding fluxes through empirical functions conditioned on the stability of the surface layer. While these empirical functions agree closely with observations under homogeneous conditions, there are many situations in which observed fluxes do not match the estimates from similarity theory. The goal of this project is therefore to train a diverse set of machine learning approaches on multi-year time series of surface layer and flux observations.
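For reference, the textbook form of the MO relationships mentioned above can be written as follows. This is the standard formulation rather than anything specific to this project; the symbols ($\kappa$ for the von Kármán constant, $\phi_m$ and $\phi_h$ for the empirical stability functions, $L$ for the Obukhov length, $\zeta$ for the stability parameter) are notation introduced here, not taken from the text.

$$
\frac{\kappa z}{u_*}\frac{\partial U}{\partial z} = \phi_m(\zeta), \qquad
\frac{\kappa z}{\theta_*}\frac{\partial \theta}{\partial z} = \phi_h(\zeta), \qquad
\zeta = \frac{z}{L}, \qquad
L = -\frac{u_*^{3}\,\overline{\theta_v}}{\kappa\, g\, \overline{w'\theta_v'}}
$$

Here $u_*$ is the friction velocity and $\theta_*$ the temperature scale; an analogous relation is commonly assumed for the moisture scale $q_*$. The empirical functions $\phi_m$ and $\phi_h$ are the pieces that a learned model would stand in for.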

We have acquired and quality controlled the necessary surface layer observations from meteorological towers in Cabauw, Netherlands, and Idaho, United States. We have trained random forests and artificial neural networks to predict friction velocity and the temperature and moisture turbulent scale terms. These terms can be used to derive the surface momentum, sensible heat, and latent heat fluxes, and to calculate stability diagnostics. We have evaluated each machine learning model and identified which approaches perform best under different stability regimes and weather conditions. We found that the machine learning approaches generally have lower errors and higher correlation coefficients than MO similarity theory (Figure 1 shows the results for the temperature scale on the Cabauw dataset, and Figure 2 shows the results for the friction velocity, also on the Cabauw dataset).
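As an illustration of how the three scale terms map to fluxes and a stability diagnostic, here is a minimal sketch using the standard bulk relations. The function names, the constant values, and the sign convention (a negative temperature scale means an upward heat flux) are assumptions made for this example and do not come from the project code. The Obukhov length below also ignores the moisture contribution to buoyancy for simplicity.

```python
import math

# Assumed typical constants (not taken from the source).
RHO_AIR = 1.2    # air density, kg m^-3
CP_AIR = 1004.0  # specific heat of dry air at constant pressure, J kg^-1 K^-1
LV = 2.5e6       # latent heat of vaporization, J kg^-1
KAPPA = 0.4      # von Karman constant
GRAVITY = 9.81   # gravitational acceleration, m s^-2


def surface_fluxes(u_star, theta_star, q_star, rho=RHO_AIR):
    """Return (momentum flux, sensible heat flux, latent heat flux).

    Assumed convention: theta_star = -w'theta'/u_star, so a negative
    theta_star corresponds to an upward (positive) sensible heat flux.
    """
    tau = rho * u_star ** 2                          # N m^-2
    sensible = -rho * CP_AIR * u_star * theta_star   # W m^-2
    latent = -rho * LV * u_star * q_star             # W m^-2
    return tau, sensible, latent


def obukhov_length(u_star, theta_star, theta_mean):
    """Obukhov length in metres from the scale terms (dry approximation).

    Negative values indicate unstable conditions, positive values stable
    conditions; a zero temperature scale is treated as neutral (infinite).
    """
    if theta_star == 0.0:
        return math.inf
    return u_star ** 2 * theta_mean / (KAPPA * GRAVITY * theta_star)
```

With these assumed constants, `surface_fluxes(0.3, -0.1, -1e-4)` gives an upward sensible heat flux of roughly 36 W m^-2, and `obukhov_length(0.3, -0.1, 290.0)` is negative, which flags an unstable surface layer.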

To verify the robustness of the models, we trained them at one site and applied them at the other. For all predictand variables (friction velocity, temperature scale, and moisture scale), the machine learning models generally outperformed MO similarity theory.
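The cross-site test described above can be sketched roughly as follows, using scikit-learn. Everything here is illustrative: the helper names, the site labels `site_a` and `site_b`, and the synthetic data generator are hypothetical placeholders and stand in for the real tower observations, which are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error


def cross_site_scores(sites, make_model):
    """Train on each site and score on every other site.

    sites: dict mapping site name -> (X, y) arrays.
    make_model: zero-argument callable returning an unfitted regressor.
    Returns a dict mapping (train_site, test_site) -> (MAE, Pearson r).
    """
    scores = {}
    for train_name, (x_train, y_train) in sites.items():
        model = make_model().fit(x_train, y_train)
        for test_name, (x_test, y_test) in sites.items():
            if test_name == train_name:
                continue  # only score on sites the model has not seen
            pred = model.predict(x_test)
            mae = mean_absolute_error(y_test, pred)
            r = np.corrcoef(y_test, pred)[0, 1]
            scores[(train_name, test_name)] = (mae, r)
    return scores


def synthetic_site(rng, n, shift):
    """Hypothetical stand-in for one site's predictors and target."""
    x = rng.normal(size=(n, 3)) + shift
    y = 0.3 + 0.1 * x[:, 0] - 0.05 * x[:, 1] ** 2 + rng.normal(scale=0.02, size=n)
    return x, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sites = {
        "site_a": synthetic_site(rng, 300, 0.0),
        "site_b": synthetic_site(rng, 300, 0.5),
    }
    make_rf = lambda: RandomForestRegressor(n_estimators=20, random_state=0)
    for pair, (mae, r) in cross_site_scores(sites, make_rf).items():
        print(pair, round(mae, 4), round(r, 3))
```

Scoring only on the held-out site is the point of the design: it measures how well a model transfers to a location with a different climate, which a random split within one site would not show.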

The best-performing models are then evaluated within the WRF single column model to check for any biases introduced during the numerical model integration. These machine learning based methods are compared with the empirical surface layer parameterization in the WRF model. The first step was to save the scikit-learn decision trees from each random forest to CSV files and read them into Fortran as an array of decision tree derived types. The random forest surface layer parameterization calculates the derived input variables for the ML models, feeds the input vectors to the random forests for friction velocity, temperature scale, and moisture scale, and then calculates the fluxes, exchange coefficients, and surface variables. Current work tests this implementation in the WRF Single Column Model on an idealized case study using GABLS II constant forcing, the YSU boundary layer scheme, and the slab land surface model. The initial results are promising, with the random forest implementation generally capturing the daily patterns, as shown in Figure 3.
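The tree export step can be sketched as below. This is a minimal illustration, not the project's actual export code: the CSV column layout and all function names are assumptions. It relies only on the documented `tree_` attributes of fitted scikit-learn trees (`feature`, `threshold`, `children_left`, `children_right`, `value`), and includes a plain-Python traversal that mirrors what a Fortran reader would need to do.

```python
import csv
import io

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Assumed column layout for the exported files (hypothetical).
HEADER = ["node", "feature", "threshold", "left", "right", "value"]


def tree_to_rows(estimator):
    """Flatten one fitted single-output regression tree into node rows."""
    t = estimator.tree_
    return [
        [node,
         int(t.feature[node]),         # negative at leaf nodes
         float(t.threshold[node]),
         int(t.children_left[node]),   # -1 means the node is a leaf
         int(t.children_right[node]),
         float(t.value[node, 0, 0])]   # mean target of samples in the node
        for node in range(t.node_count)
    ]


def write_tree_csv(estimator, handle):
    writer = csv.writer(handle)
    writer.writerow(HEADER)
    writer.writerows(tree_to_rows(estimator))


def read_tree_csv(handle):
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return [[int(r[0]), int(r[1]), float(r[2]), int(r[3]), int(r[4]), float(r[5])]
            for r in reader]


def predict_tree(rows, x):
    """Walk from the root to a leaf; go left when x[feature] <= threshold."""
    node = 0
    while rows[node][3] != -1:
        _, feature, threshold, left, right, _ = rows[node]
        node = left if float(x[feature]) <= threshold else right
    return rows[node][5]


def predict_forest(all_rows, x):
    """A regression forest prediction is the mean over its trees."""
    return sum(predict_tree(rows, x) for rows in all_rows) / len(all_rows)
```

A round trip through an in-memory file for each entry of `forest.estimators_` (write with `write_tree_csv`, read back with `read_tree_csv`) should reproduce `forest.predict` to within floating point tolerance, which is a useful check before porting the traversal to another language. Inputs are best stored as 32-bit floats for that comparison, since scikit-learn casts features to `float32` internally.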

Areas of development for the next fiscal year include:

- Obtain additional datasets for a more robust verification of the machine learning models.
- Finalize the verification of the random forest surface layer parameterization in Single Column WRF.
- Implement the neural network surface layer parameterization in WRF.
- Begin testing the parameterizations over water to enhance offshore wind energy predictions.
- Hold a workshop and build an online tutorial to advance the use of machine learning based parameterizations in numerical weather prediction.