Explore meshless numerical methods for modeling

Radial basis functions (RBFs) offer a novel numerical approach for solving the equations of atmospheric dynamics to high accuracy. Being mesh-free, RBF methods excel at problems that require geometric flexibility (e.g., coastlines) or local refinement around features (e.g., hurricanes), and they extend to higher-dimensional spaces with little increase in programming complexity. In particular, the RBF-generated finite differences (RBF-FD) approach has made the RBF method computationally cost-effective in terms of scalability, memory, and runtime for solving systems of PDEs. The localized and accurate nature of the RBF-FD method:

  • Leads to sparse matrices in which over 99% of the entries are zero.
  • Allows it to scale as O(N) per time step, where N is the total number of nodes.
  • Makes it highly suitable for parallelization on accelerator-based computer architectures.
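The sparsity and O(N) cost noted above can be illustrated with a minimal one-dimensional sketch. It is not the solver described here: the cubic polyharmonic spline basis with quadratic polynomial augmentation, the uniform node set on [0, 1], the 5-node stencil, and all function names (`solve`, `rbf_fd_weights`, `build_operator`, `apply_operator`) are assumptions chosen only to keep the example short and self-contained.

```python
# Minimal 1D RBF-FD sketch (illustrative assumptions throughout; not the NCAR solver).
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def rbf_fd_weights(t):
    """Weights approximating d/dt at t = 0 from values at local offsets t.

    Assumed basis: phi(r) = r**3 plus the polynomials 1, t, t**2.
    """
    n = len(t)
    polys = [lambda s: 1.0, lambda s: s, lambda s: s * s]
    m = len(polys)
    A = [[0.0] * (n + m) for _ in range(n + m)]
    for j in range(n):
        for k in range(n):
            A[j][k] = abs(t[j] - t[k]) ** 3
        for q, p in enumerate(polys):
            A[j][n + q] = p(t[j])
            A[n + q][j] = p(t[j])
    # Right-hand side: d/dt of each basis function evaluated at t = 0.
    # d/dt |t - t_j|**3 at t = 0 is -3 * t_j * |t_j|; for 1, t, t**2 it is 0, 1, 0.
    rhs = [-3.0 * tj * abs(tj) for tj in t] + [0.0, 1.0, 0.0]
    return solve(A, rhs)[:n]

def build_operator(x, n_stencil):
    """Sparse differentiation matrix: rows[i] is a list of (column, weight)."""
    N = len(x)
    h = x[1] - x[0]  # uniform spacing is assumed in this sketch
    rows = []
    for i in range(N):
        # Contiguous window of nearest nodes (valid for sorted, uniform nodes).
        start = min(max(i - n_stencil // 2, 0), N - n_stencil)
        cols = list(range(start, start + n_stencil))
        t = [(x[c] - x[i]) / h for c in cols]  # shift and scale to local offsets
        w = rbf_fd_weights(t)
        rows.append([(c, wj / h) for c, wj in zip(cols, w)])
    return rows

def apply_operator(rows, f):
    """Sparse matrix-vector product: n_stencil multiply-adds per node."""
    return [sum(w * f[c] for c, w in row) for row in rows]

N, n_stencil = 1000, 5
x = [i / (N - 1) for i in range(N)]
D = build_operator(x, n_stencil)
f = [math.sin(2 * math.pi * xi) for xi in x]
df = apply_operator(D, f)

exact = [2 * math.pi * math.cos(2 * math.pi * xi) for xi in x]
max_err = max(abs(a - b) for a, b in zip(df, exact))
nnz = sum(len(row) for row in D)
empty_fraction = 1 - nnz / (N * N)
print(f"{100 * empty_fraction:.1f}% of matrix entries are zero")
print(f"max derivative error: {max_err:.2e}")
```

Each row holds only `n_stencil` weights regardless of N, so applying the operator costs a fixed number of operations per node. With 1000 nodes and 5-node stencils, 5 of the 1000 entries in each row are nonzero, which is the 99.5% printed by the first line of output.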

When developing a high-performance-computing (HPC) algorithm, scalability, performance, and portability are essential. Many of the current standard atmospheric models are not particularly portable between high-performance systems. The RBF-FD solver for the shallow water equations (SWE) on the sphere was developed to target the three dominant HPC system architectures: Manycore, Intel Multicore, and NVIDIA General Purpose Graphics Processing Units (GPGPUs). Portability across all three architectures was achieved with the directive-based OpenACC and OpenMP programming models, which provide simple shared-memory parallelization. MPI was used for distributed-memory parallelization to address the scalability of the solver. The result is a single-source implementation that requires only a recompilation to run on practically any HPC system today.

Excellent performance was demonstrated on both NVIDIA and Intel systems, as shown in the figures. Approximately 4.5 TFLOPS was achieved on the GPGPU system, and 6 TFLOPS on NCAR’s Xeon-based Laramie system. Both results represent more than a 100X speedup over the highest performance achieved by the previous single-device GPU implementation.

RBF performance speedups
Strong-scaling performance results for an NVIDIA GPU system (left) and an Intel Broadwell CPU system (right). The Intel system demonstrated super-linear scaling.
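For readers unfamiliar with the term, strong scaling is conventionally measured with the quantities below. The notation is introduced here as an assumption and is not taken from the original figures: p is the number of devices or nodes, T(p) is the time to solve a problem of fixed total size on p of them, and the baseline may in practice be the smallest configuration that fits the problem rather than p = 1.

```latex
S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p}
```

Ideal (linear) scaling corresponds to S(p) = p, or E(p) = 1. Super-linear scaling means S(p) > p, or equivalently E(p) > 1.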

This work advances CISL’s scientific efforts to develop scalable algorithms for atmospheric modeling on massively parallel and accelerator-based computer architectures. Development of numerical algorithms based on meshless methods for atmospheric modeling at NCAR is supported by NSF Core funds.