Enhance data analysis and visualization software

John Clyne presenting award
Presentation of the award to the winner of the VAPOR visualization contest organized by the Korea Institute of Science and Technology Information (KISTI) at the Korean Supercomputing Conference. Nearly 20 students in the atmospheric sciences participated in the contest.

The Visualization and Analysis Platform for Ocean, Atmosphere, and Solar Researchers (VAPOR) project is an open source software development effort aimed at improving the ability of researchers in the Earth System sciences to interactively analyze and interpret results arising from numerical modeling. VAPOR’s unique features include its use of a wavelet-based, progressive-access data model that permits exploration of some of the largest simulation outputs using only desktop computing resources; a feature set and user interface that is focused on the needs of the Earth System sciences community; and a strong emphasis on supporting both qualitative and quantitative data analysis. The VAPOR package has a community of over 8,000 registered users worldwide, and to date has been cited over 280 times.
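The idea behind a wavelet-based, progressive-access data model can be illustrated with a minimal sketch. The example below assumes a one-dimensional signal and a simple (unnormalized) Haar transform; the function names `haar_forward` and `haar_reconstruct` are hypothetical, and the code does not reproduce VAPOR's actual wavelet family, file layout, or API. It shows only the general principle: a coarse approximation is stored first, and each additional band of detail coefficients doubles the resolution, so a reader can stop at whatever level its memory and display allow.

```python
def haar_forward(signal):
    """Decompose a power-of-two-length sequence with an unnormalized Haar transform.

    Returns (coarse, details): `coarse` is a one-element list holding the
    overall mean, and `details` lists the detail bands from coarsest to finest.
    """
    data = [float(x) for x in signal]
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    details = []
    while len(data) > 1:
        pairs = range(0, len(data), 2)
        averages = [(data[i] + data[i + 1]) / 2 for i in pairs]
        differences = [(data[i] - data[i + 1]) / 2 for i in pairs]
        details.insert(0, differences)  # keep the coarsest band first
        data = averages
    return data, details


def haar_reconstruct(coarse, details, levels):
    """Rebuild an approximation using only the first `levels` detail bands."""
    data = list(coarse)
    for band in details[:levels]:
        refined = []
        for average, difference in zip(data, band):
            refined.extend((average + difference, average - difference))
        data = refined
    return data


if __name__ == "__main__":
    coarse, details = haar_forward([4, 6, 10, 12, 8, 6, 5, 5])
    # Each extra level doubles the number of samples that are reconstructed.
    for level in range(len(details) + 1):
        print(level, haar_reconstruct(coarse, details, level))
```

Using every detail band reproduces the original samples exactly; using fewer yields a lower-resolution preview that requires reading only a fraction of the stored coefficients, which is the property that makes interactive exploration of very large outputs feasible on modest hardware.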

The NCAR Command Language (NCL) is an open source data analysis and visualization environment developed in close collaboration between CISL, NCAR climate modeling and weather research groups, and the university and broader geoscience communities. PyNIO and PyNGL are Python modules based on NCL’s file input/output and visualization capabilities. WRF-Python is a new package that provides a comprehensive Python interface to the WRF-ARW computational and visualization capabilities in NCL. These science-driven, well-supported, free tools enable scientists to effectively read, analyze, and visualize a wide variety of complex geoscientific data formats on a variety of computing platforms. NCL has been embraced broadly across the international geoscience community, spanning research, education, operational, military, government, and commercial organizations. It is used for creating publication-quality visualizations, analyzing climate model data, and displaying data in real time at operational centers. The Python tools are aimed at exposing NCL’s unique capabilities in a mainstream language that has widespread adoption in the scientific community. From October 2015 through September 2016, NCL was downloaded 20,398 times, and its website received a monthly average of 35,220 unique visits. Beta versions of PyNGL and PyNIO were downloaded about 2,350 times.

A CISL strategic goal is expanding the productivity of researchers in the atmospheric, geospace, and related sciences through advanced computing and data services. VAPOR’s Big Data capabilities make it unique among advanced visualization tools and satisfy a critical need for computational geoscience, particularly in the areas of weather prediction, climate, and ocean modeling. NCL and related Python tools address disparate and evolving research issues in the climate, atmospheric, and oceanographic community, for example, the effects of heat waves, droughts, and evapotranspiration on humans and agriculture. Scientists can produce results more quickly and effectively because these tools facilitate decision making, improve understanding, and stimulate insights from data sets that are being produced in varieties and volumes far exceeding anything they’ve ever had to manage before.

VAPOR’s FY2016 efforts were again focused primarily on meeting the contractual obligations of grants from the Korea Institute of Science and Technology Information (KISTI) and the NSF. The KISTI award funded the development of numerous enhancements to VAPOR’s suite of visualization tools, such as support for calculation and display of basic data statistics; closer integration with Python in the form of plotting capabilities enabled by the widely used Matplotlib module; and animation encoding. These new features will be made publicly available in VAPOR Release 2.6, planned for November 2016. Additionally, the KISTI grant funded ongoing work to support coupled model analysis. A two-year NSF SSI2 grant was completed with all milestones successfully met, the most notable of which was the refactoring of VAPOR’s wavelet-based, progressive-access data format, the VAPOR Data Collection (VDC), to improve its generality and hasten its adoption by other data analysis packages. Toward that end, the new data format has been integrated into partner UCSD’s bioimaging package, QUEST, and work has begun to add support for the VDC in CISL’s NCL package. In addition to meeting contractual obligations, the VAPOR team completed and released an evaluation version of VAPOR3. VAPOR3 represents a complete refactoring of the VAPOR code base aimed at addressing many of the limitations of the original design.

A major new version of NCL is well underway. It includes over 80 new functions for extreme value statistics, heat stress, crop and evapotranspiration, bootstrap estimates, and meteorology; new graphical capabilities based on high user demand; critical updates to the internal map database; major overhauls to the file I/O library; and significant speed-ups of popular computational routines. A release is expected in November 2016. At the beginning of FY2016, CISL hired two software engineers to further its Python integration efforts with NCL. This resulted in the completion of two long-standing milestones: the creation of a WRF processing Python package that uses “xarray” as its core data model, and the integration of the Python tools and NCL under the Conda package manager, making these tools significantly easier to install. WRF-Python is in use by friendly testers, with an expected release date of December 2016. Both PyNGL and PyNIO had beta releases in FY2016, with major releases expected in early FY2017.

Climate Reanalyzer visualization
Climate Reanalyzer, developed by Sean Birkel, a Research Assistant Professor at the Climate Change Institute at the University of Maine, is a platform for visualizing a variety of weather and climate datasets and models. The site includes maps, time series, and correlation interfaces for several monthly reanalysis models; maps for CFSR daily reanalysis; maps for monthly 4 km PRISM U.S. data; and an interface for plotting data from the Global Historical Climatology Network (GHCN). Visitors to Climate Reanalyzer include researchers, teachers, students, and climate and weather enthusiasts. Images from Climate Reanalyzer regularly appear on blogs and in weather- and climate-related news media. Data processing and graphics on Climate Reanalyzer are done almost entirely with scripts written in NCL.

The VAPOR project is supported by NSF Core funds, a subaward from the University of California at San Diego, NSF 54067252, and a grant from the Korea Institute of Science and Technology Information. The NCL and related Python tools project is supported by NSF Core funds.