GPU-Accelerated Microscale Modeling: FastEddy

BACKGROUND

Figure 1: These two animations are taken from different time periods within the same simulation, in which the surface skin temperature was prescribed to evolve from a higher temperature (convective cell regime, top) to a lower temperature with weaker thermal forcing (convective roll regime, bottom).

Figure 2: FastEddy™ limited-area domain simulation with the cell perturbation method for resolved turbulence instigation (top) versus a periodic domain reference simulation (bottom). This feature allows FastEddy™ to be applied to real-world locations for specific times and dates. The longer-term goal is to synchronize FastEddy™ simulations as nested domains driven by WRF mesoscale forecast simulations (e.g., High Resolution Rapid Refresh, HRRR, forecasts).

 

Figure 3: FastEddy™ has been extended to run on multiple GPUs, relaxing the memory constraint that limits domain size in single-GPU simulations. Here, the foreground image shows a single domain of size 22.5 km x 54 km consisting of ~10 million gridpoints run on a single GPU. The background image shows the result of using 16 GPUs under horizontal domain decomposition via MPI to model a domain 16 times larger (90 km x 216 km, 160 million gridpoints) at the same sub-100 m resolution.
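
The arithmetic behind this caption can be checked with a minimal sketch of horizontal domain decomposition. It assumes a 4 x 4 arrangement of equally sized tiles, which matches the quoted extents (90 / 22.5 = 4 and 216 / 54 = 4) but is not stated in the source. The function name `decompose` and the rank ordering are illustrative and are not taken from FastEddy™.

```python
def decompose(extent_x_km, extent_y_km, px, py):
    """Split a rectangular domain into px * py equal tiles (assumed layout).

    Ranks are numbered row by row, with x varying fastest; this ordering is
    an illustrative choice, not FastEddy's actual MPI rank mapping.
    """
    dx = extent_x_km / px
    dy = extent_y_km / py
    tiles = []
    for rank in range(px * py):
        ix, iy = rank % px, rank // px
        tiles.append({
            "rank": rank,
            "x_km": (ix * dx, (ix + 1) * dx),
            "y_km": (iy * dy, (iy + 1) * dy),
        })
    return tiles


# Multi-GPU domain from the caption, under the assumed 4 x 4 layout.
tiles = decompose(90.0, 216.0, 4, 4)

# Each tile has the same footprint as the single-GPU domain (22.5 km x 54 km).
tile_area = (tiles[0]["x_km"][1] - tiles[0]["x_km"][0]) * \
            (tiles[0]["y_km"][1] - tiles[0]["y_km"][0])
area_ratio = (90.0 * 216.0) / tile_area
```

Under this assumption every GPU holds a tile the size of the single-GPU domain, so the per-GPU memory footprint stays roughly constant apart from any halo cells exchanged with neighboring tiles.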

 

Figure 4: Contours of velocity components (v = streamwise, u = spanwise, w = vertical) on a horizontal plane at z = 8 m. The flow direction is southerly, and the southern boundary uses the cell perturbation method to generate resolved turbulence. The simulation covers downtown Oklahoma City and targets one of the intensive observational periods of the Joint Urban 2003 field campaign.

The overarching goal of this effort is to design, develop, implement, validate, and promulgate a disruptive capability in the numerical modeling of complex microscale flows utilizing advanced computing architectures. To date, the application of the large-eddy simulation (LES) technique has been restricted to fundamental research due to the substantial computational expense of the method. Nonetheless, the demonstrated efficacy of this method in capturing the influence of turbulence across a wide range of application scenarios continues to grow. Our mission is to develop an LES modeling system targeting general-purpose graphics processing unit (GPGPU) architectures in order to achieve performance gains of at least an order of magnitude. Such gains are the crucial requirement for making the LES method a viable tool for microscale operational, educational, and more comprehensive research applications.

FastEddy™ is a new hybrid CPU/GPU-accelerated LES model developed within RAL-NSAP beginning in FY2017. Applications of this model target turbulence-resolving simulation of microscale atmospheric boundary layer flow, including atmospheric transport and dispersion of hazardous species and greenhouse gases. FastEddy™ is a resident-GPU model, meaning that all prognostic calculations are carried out on the GPU, with CPU utilization strictly limited to model configuration and input/output of modeling results. This resident-GPU approach shows tremendous early potential for achieving faster-than-real-time microscale simulations across domains of order 100-1000 km² at a resolution of O(10 m).
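
To give a sense of the problem sizes these figures imply, the following back-of-the-envelope sketch counts horizontal grid columns for the quoted areas at 10 m spacing. It also defines a real-time factor as simulated time divided by wall-clock time. The function names are illustrative and not part of FastEddy™, vertical levels are ignored, and the timing values are hypothetical.

```python
def horizontal_columns(area_km2, dx_m):
    """Number of horizontal grid columns covering area_km2 at spacing dx_m."""
    return area_km2 * 1.0e6 / (dx_m * dx_m)


def real_time_factor(simulated_s, wall_s):
    """Values above 1 mean the simulation runs faster than real time."""
    return simulated_s / wall_s


cols_small = horizontal_columns(100.0, 10.0)    # 100 km^2 at 10 m spacing
cols_large = horizontal_columns(1000.0, 10.0)   # 1000 km^2 at 10 m spacing

# Hypothetical timing: one simulated hour computed in half an hour of wall time.
rtf = real_time_factor(3600.0, 1800.0)
```

The quoted range therefore corresponds to roughly one to ten million horizontal columns, before multiplying by the number of vertical levels.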

FY2019 Accomplishments

  • Implemented momentum stress, turbulence closure, and surface layer parameterization (Monin-Obukhov).
  • Demonstrated ~3x faster than real-time execution in fully compressible mode for domain extents of 80 km² at a resolution of 30 m for canonical stability regimes on a single NVIDIA GP100 GPU.
  • Implemented building-resolving capabilities and carried out the corresponding verification/validation against a wind tunnel experiment and, at full atmospheric scale, against the Joint Urban 2003 field campaign in Oklahoma City.
  • Further optimization of the multi-GPU implementation.
  • Implementation of the cell perturbation method to generate resolved turbulence from smooth mesoscale lateral boundary condition forcing.
  • Implementation of two weighted essentially non-oscillatory (WENO) advection schemes of third- and fifth-order accuracy.
  • Code restructuring to provide a modular framework for the hydrodynamics solver.
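
As an illustration of the fifth-order WENO scheme listed above, the sketch below reconstructs a cell-interface value from five neighboring cell averages. The source does not say which WENO formulation FastEddy™ uses; this sketch assumes the classic Jiang-Shu smoothness indicators and linear weights, and the function name `weno5_left` and the `eps` default are illustrative choices.

```python
def weno5_left(vm2, vm1, v0, vp1, vp2, eps=1.0e-6):
    """Left-biased fifth-order WENO value at interface i+1/2.

    Arguments are the cell averages at i-2, i-1, i, i+1, i+2.
    Assumes the Jiang-Shu formulation (an assumption, not from the source).
    """
    # Third-order candidate reconstructions on the three sub-stencils.
    q0 = (2.0 * vm2 - 7.0 * vm1 + 11.0 * v0) / 6.0
    q1 = (-vm1 + 5.0 * v0 + 2.0 * vp1) / 6.0
    q2 = (2.0 * v0 + 5.0 * vp1 - vp2) / 6.0

    # Smoothness indicators: large where a sub-stencil crosses a sharp gradient.
    b0 = 13.0 / 12.0 * (vm2 - 2.0 * vm1 + v0) ** 2 \
        + 0.25 * (vm2 - 4.0 * vm1 + 3.0 * v0) ** 2
    b1 = 13.0 / 12.0 * (vm1 - 2.0 * v0 + vp1) ** 2 \
        + 0.25 * (vm1 - vp1) ** 2
    b2 = 13.0 / 12.0 * (v0 - 2.0 * vp1 + vp2) ** 2 \
        + 0.25 * (3.0 * v0 - 4.0 * vp1 + vp2) ** 2

    # Nonlinear weights built from the linear weights (0.1, 0.6, 0.3).
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    total = a0 + a1 + a2

    return (a0 * q0 + a1 * q1 + a2 * q2) / total


# Smooth (linear) data: all candidates agree on the interface value.
smooth = weno5_left(-2.0, -1.0, 0.0, 1.0, 2.0)

# Step just downstream of cell i: the weights favor the smooth upwind
# stencil, so the result stays close to the upwind value of 0.
step = weno5_left(0.0, 0.0, 0.0, 1.0, 1.0)
```

The weighting is what makes the scheme essentially non-oscillatory: stencils that straddle a discontinuity receive almost no weight, while in smooth regions the weights revert to the linear values that give fifth-order accuracy.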

FY2020 Plans

  • WRF-to-FastEddy™ coupling for combined mesoscale and microscale modeling in one system, utilizing the cell perturbation method for resolved turbulence instigation at the nested boundaries of LES domains.
  • Implementation of turbulence closure based on a subgrid-scale turbulence-kinetic-energy transport equation.
  • Implementation of a canopy model.
  • Enhance the FastEddy™ dynamical core to incorporate moist dynamics and microphysics.
  • Incorporate capability to model chemistry processes.
  • Publish several papers on the initial dynamical core formulation and subsequent capabilities.
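
The general idea of the cell perturbation method referred to above can be sketched as follows: random potential temperature perturbations, constant within square blocks of grid points, are added in a strip next to an inflow boundary so that turbulence develops from otherwise smooth inflow. This is a minimal sketch and not FastEddy™ code. The block size, strip width, amplitude, the choice of row 0 as the southern boundary, and the function name are all hypothetical values chosen for illustration.

```python
import random


def cell_perturbation(theta, cell=8, n_cells=3, amplitude=0.5, seed=None):
    """Return a copy of theta[j][i] with block-constant random perturbations.

    Perturbations are drawn uniformly from [-amplitude, amplitude] and held
    constant over cell x cell blocks in the first cell * n_cells rows, taken
    here to lie along the southern inflow boundary (assumed convention).
    All default parameter values are hypothetical.
    """
    rng = random.Random(seed)
    ny, nx = len(theta), len(theta[0])
    strip = min(cell * n_cells, ny)
    out = [row[:] for row in theta]  # leave the input field untouched
    for j0 in range(0, strip, cell):
        for i0 in range(0, nx, cell):
            delta = rng.uniform(-amplitude, amplitude)
            for j in range(j0, min(j0 + cell, strip)):
                for i in range(i0, min(i0 + cell, nx)):
                    out[j][i] += delta
    return out


# Uniform 300 K field on a small hypothetical 64 x 40 horizontal grid.
theta = [[300.0] * 40 for _ in range(64)]
perturbed = cell_perturbation(theta, seed=0)
```

In a coupled setup the perturbations would be refreshed periodically and applied at every inflow boundary of the nested LES domain; those details are omitted here.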