Climatological Dispersion Patterns with Self-Organizing Maps

Background

Figure 2.  Conceptual SAHARA workflow, with input of HPAC project files and CFSR atmospheric reanalysis output (top row), into the SOM (middle row), to produce hazardous material dosages for typical days (bottom row).

We have developed a software tool, the SOM-Assisted Hazard Area Risk Analysis (SAHARA), that reduces large climate datasets to smaller, statistically similar subsets, which are then used to produce ensembles of potential hazard outcomes.

Figure 1.  Conceptual SOM workflow for the production of typical days.

The Self-Organizing Map (SOM) is a machine learning and data clustering algorithm that is well suited to data with strong topological properties. By using the SOM algorithm to analyze topological patterns of climatological fields over a regional domain for a 30-year span, we can find a close statistical equivalent made up of fewer, non-contiguous input days. When SOMs are used to cluster monthly climate data in this way, we find that sampling only 150 days reduces computational time by more than a factor of 6 compared to using the entire climate dataset.
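The clustering step described above can be illustrated with a minimal sketch. This is not the SAHARA SOM Engine: it is a small NumPy-only SOM written for illustration, the function names `train_som` and `typical_days` are hypothetical, and random numbers stand in for the CFSR fields. The sizes are also assumptions: 930 days (one 31-day month over 30 years), each flattened to 20 values, mapped onto a 10 x 15 grid so that at most 150 typical days are returned.

```python
import numpy as np

def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a small SOM and return node weights, shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Grid coordinates of every node, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize node weights from randomly chosen input days.
    weights = data[rng.choice(n_samples, rows * cols, replace=False)].copy()
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # neighborhood radius shrinks to 0.5
        x = data[rng.integers(n_samples)]           # pick one random day
        # Best-matching unit: the node whose weights are closest to this day.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighborhood on the grid pulls nearby nodes toward the day.
        d2_grid = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2_grid / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def typical_days(data, weights):
    """Return, for each non-empty node, the closest real day and the node's frequency."""
    # Squared distance from every day to every node, shape (n_days, n_nodes).
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmu = d2.argmin(axis=1)
    counts = np.bincount(bmu, minlength=weights.shape[0])
    days, freqs = [], []
    for node in range(weights.shape[0]):
        members = np.flatnonzero(bmu == node)
        if members.size == 0:
            continue  # no day mapped to this node
        days.append(int(members[d2[members, node].argmin()]))
        freqs.append(counts[node] / len(data))
    return np.array(days), np.array(freqs)

# Synthetic stand-in for one month of daily fields over 30 years (assumed sizes).
rng = np.random.default_rng(1)
fields = rng.normal(size=(930, 20))
som_weights = train_som(fields, rows=10, cols=15)
day_idx, day_freq = typical_days(fields, som_weights)
print(f"{len(day_idx)} typical days selected from {len(fields)} input days")
```

Each selected day is an actual day from the input record, which is why the result is non-contiguous, and the node frequencies can serve as weights when the downstream dispersion runs are combined. Under the assumed sizes, running 150 days instead of 930 is a ratio of about 6.2, in line with the reduction quoted above, although the real speedup depends on the full workflow.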

The SAHARA software can scale from a laptop to workstations to many-core, many-node clusters. It uses a modern microservice architecture to distribute the Climate Database (currently CFSR), the SOM Engine, atmospheric model ensembles (such as the SCIPUFF Transport and Dispersion model), and pre- and post-processing across available computing resources, either locally or remotely.

Accomplishments in FY2018

  • Transition of the software to DoD HPC systems for client use and improved efficiency.
  • Fulfillment of 21 client requests for SOM typical days.

Plans for FY2019

  • Parameter sensitivity study for chemical and biological (chem/bio) hazards, as the current SOM parameters are optimized for nuclear hazards.
  • Use of WRF and WRF-Chem, in place of CFSR, for chem/bio applications.
  • CCMI compliance.