Jeremiah Johnson receives NASA grant to develop massive machine learning-ready auroral image database

Kassidy Taylor
Composite of 49 all-sky auroral images taken from various sites across North America

Jeremiah Johnson ’10G, assistant professor of data science at UNH Manchester, recently had a research proposal selected for funding by the National Aeronautics and Space Administration (NASA). The approximately $170,000 award will support Johnson’s work with colleagues from the Geophysical Institute at the University of Alaska Fairbanks on classifying a large dataset of white-light all-sky auroral images. These images have been collected every three seconds during nighttime hours at various sites in North America since approximately 2006 as part of the THEMIS mission, which seeks to better understand auroral substorms, visible in the Northern Hemisphere as a sudden brightening of the northern lights. Johnson’s efforts will produce the largest publicly available, labeled, homogeneous, machine learning-ready auroral image database created to date.

Johnson’s proposal, “Producing Homogeneous Machine Learning-Ready Auroral Image Databases Using Unsupervised Learning,” is one of 12 selected for NASA’s Living With A Star Tools and Methods Program, which solicits tools, techniques and methods that enable critically needed science advances in the area of heliophysics research.

The mechanisms that cause the aurora are known, but the smaller-scale auroral forms observed in all-sky images and their connection to the dynamics of Earth’s magnetosphere are not well understood. Machine learning offers the possibility of surfacing new knowledge in this area, but Johnson notes that existing auroral image databases are not yet machine learning-ready. With the NASA funding, Johnson is developing a state-of-the-art unsupervised machine learning algorithm capable of identifying distinct categories of auroral images. This algorithm will be used to automatically label the white-light auroral image data from the THEMIS mission, enabling the space science community to conduct previously impossible statistical studies of the relationship between different categories of auroral images, near-Earth solar wind conditions and geomagnetic disturbances at Earth’s surface.

Johnson plans to deliver the machine learning-ready dataset produced by this research, along with the models and software necessary to reproduce the results, to NASA’s Space Physics Data Facility by October 2023.
