Meet the winners of the NASA Airathon: Predict Air Quality Challenge, a public competition that invited participants to use NASA satellite data, model outputs, and ground measurements to develop algorithms for estimating daily levels of surface-level air pollutants at high spatial resolution.
The goal of this competition is to advance the science of estimating surface-level air pollutant concentrations to address air pollution, one of the greatest environmental threats to human health. The results will contribute to the development of more accurate air quality data products from future NASA satellite missions, including MAIA (Multi-Angle Imager for Aerosols), TEMPO (Tropospheric Emissions: Monitoring of Pollution), and AOS (Atmosphere Observing System).
Applications Coordinators of the upcoming NASA MAIA and NASA TEMPO missions led this challenge, with collaboration from the U.S. Department of State, the U.S. Environmental Protection Agency, and crowdsourcing platforms DrivenData and HeroX.
The NASA Airathon challenge was created to engage a global data community, bring together a wide diversity of backgrounds and skills, test hundreds to thousands of models quickly and cost-effectively, and elevate the best-performing solutions to help provide high-resolution air quality information for public safety.
The challenge focused on two critical air quality measures: particulate matter less than 2.5 micrometers in size (PM2.5) and nitrogen dioxide (NO2). When inhaled, these pollutants can lead to negative health impacts including heart and chronic respiratory illness, cancer, and premature death according to the U.S. Centers for Disease Control and Prevention (CDC). According to Airathon collaborator DrivenData, millions of people currently cannot take daily action to avoid pollutants and protect their health because no single satellite instrument provides ready-to-use, high-resolution information on surface-level air pollutants.
Challenge participants were tasked with using NASA remote sensing data and other Earth data to develop models for estimating daily levels of PM2.5 and NO2 (in separate competition tracks) at high spatial resolution across three urban areas: those surrounding Los Angeles in the U.S., Delhi in India, and Taipei in Taiwan.
Particulate Track (PM2.5): PM2.5 refers to particulate matter less than 2.5 micrometers in size. It can last days to weeks in the atmosphere and penetrate deep into human lungs, increasing the risk of heart disease, lower respiratory infections, and poor pregnancy outcomes.
Trace Gas Track (NO2): NO2 forms in the atmosphere from the burning of fossil fuels such as coal, oil, or gas, and has a short lifetime, on the order of hours, near the surface. It can cause respiratory issues and also contributes to the production of nitrate aerosols, a component of PM2.5, and of ozone, a pollutant harmful to both human and plant health.
This challenge generated over 1,250 submissions and drew more than 1,000 participants from 123 countries and winners from four continents. The total prize for this competition was $50,000. Congratulations to the challenge winners from each track!
Particulate Track (PM2.5) Winners:
1st Place - Vishwas Chepuri, Nalgonda, India
2nd Place - Raphael Kiminya, Meru, Kenya
3rd Place - Kudaibergen Abutalip, Nur-Sultan, Kazakhstan
Trace Gas Track (NO2) Winners:
1st Place - A. David Lander, United States
2nd Place - Raphael Kiminya, Meru, Kenya
3rd Place - Sukanta Basu, Delft, The Netherlands
Challenge winners in both tracks used tree-based ensemble machine learning models to accurately estimate daily surface-level PM2.5 and NO2 concentrations. These results highlight the added value of combining information from multiple models instead of relying on a single model. The low computational cost and memory requirements of the winning solutions are also promising: these tree-based ensemble models could run in most computing environments and provide air quality information to the public in a timely manner. The competition also revealed that missing data in current satellite products made it difficult for participants to develop accurate algorithms, a gap that may be addressed as higher-resolution data from the TEMPO and MAIA satellite missions become available.
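For readers unfamiliar with the term, the idea behind a tree-based ensemble can be sketched in a few lines of Python. This is only an illustration, not the winners' code: it swaps in the simplest possible variant (bagging of one-split regression trees, or "stumps"), all function names are hypothetical, and the aerosol optical depth and PM2.5 values are invented for the example.

```python
import random
import statistics

def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single feature."""
    best = None
    for threshold in sorted(set(xs))[1:]:  # every split leaves both sides non-empty
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = statistics.fmean(left), statistics.fmean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # only one distinct x value: always predict the mean
        mean = statistics.fmean(ys)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_ensemble(models, x):
    """The ensemble estimate is the average of the individual tree estimates."""
    return statistics.fmean([predict_stump(m, x) for m in models])

# Invented toy data: a satellite-style feature and a ground PM2.5 reading.
aod = [0.05, 0.10, 0.15, 0.20, 0.25, 0.60, 0.65, 0.70, 0.75, 0.80]
pm25 = [8.0, 10.0, 9.0, 12.0, 11.0, 38.0, 42.0, 40.0, 45.0, 41.0]

models = fit_bagged_stumps(aod, pm25)
low = predict_ensemble(models, 0.10)
high = predict_ensemble(models, 0.80)
print(f"estimated PM2.5 at AOD 0.10: {low:.1f}")
print(f"estimated PM2.5 at AOD 0.80: {high:.1f}")
```

Each tree is cheap to train and evaluate, which is consistent with the low computational cost noted above. The winning solutions themselves are available in DrivenData's GitHub repository and are considerably more sophisticated than this sketch.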
Open-source code and documentation for all winning models can be found in DrivenData’s repository on GitHub. The successful models could aid in the dissemination of more accurate surface-level PM2.5 and NO2 data products to the public, providing improved public health and safety information so people can reduce their air pollution exposure. In particular, the U.S. Department of State is interested in evaluating and using the methods from the winning models, through an ongoing NASA Health and Air Quality project, to provide useful air quality information to State Department employees and the general public at various embassy locations.
Learn more about the winners and see their submission scores.
Next, the challenge team will analyze, engineer, and test the winning solutions. Based on feedback from challenge participants, the team is also considering developing technical resources for working with MAIA and TEMPO data, once they are ready, to help make them more usable for data communities. Results from this challenge will also be relayed to the TEMPO and MAIA algorithm development teams to enable more accurate value-added air quality products from the upcoming satellite missions.
The NASA Tournament Lab, part of the Prizes, Challenges, and Crowdsourcing program in the Space Technology Mission Directorate, managed the challenge.
Get involved! Explore other prize and challenge opportunities to help make a difference here on Earth.