Project 05 · Machine Learning · Remote Sensing · Air Quality

Ground-Level NO₂ Predictions Using
Satellite Imagery-Driven Convolutional Neural Networks

Training a CNN on both TROPOMI and TEMPO satellite data to predict surface NO₂ concentrations across Wisconsin — filling the gap left by sparse ground monitors and testing whether TEMPO's hourly imagery can improve spatial generalization.

Supervisor: Prof. Tracey Holloway, UW–Madison
Timeline: January 2026 – Present
Status: Ongoing

Plain-English Summary

Wisconsin has only three NO₂ air quality monitors, all located in Milwaukee. For everyone else in the state — from Green Bay to rural farm communities — there is no direct measurement of this harmful pollutant. Satellites can see the whole state, but satellite readings don't directly translate to ground-level concentrations.

This project is building a convolutional neural network (CNN), the same type of AI used in image recognition, to read satellite imagery and predict ground-level NO₂ at any location in Wisconsin. The key novelty is training the model on data from NASA's new TEMPO instrument, which captures hourly NO₂ data at finer spatial resolution than earlier NO₂-observing satellites, alongside the established TROPOMI instrument. No published model has been trained on TEMPO data yet.

Three problems this project addresses

Why this work is needed
1. No model has been trained on TEMPO yet. TEMPO launched in April 2023 and provides hourly NO₂ imagery over North America at finer spatial resolution than earlier NO₂-observing satellites. It is an entirely new source of data, and no published model has used it.

2. Existing AI models fail at new locations. The best current approach predicts NO₂ accurately at sites it trained on, but accuracy drops sharply at sites it has never seen, which is the real-world test that matters. TEMPO's richer hourly data may help the model learn patterns that generalize more broadly.

3. Wisconsin has only three NO₂ monitors, all in Milwaukee. The rest of the state (farms, mid-size cities, industrial zones) has no direct air quality measurement for this pollutant. That leaves communities potentially exposed to elevated NO₂ without any data.

Why this is hard to measure

NO₂ is a traffic and combustion pollutant linked to asthma, cardiovascular disease, and the formation of ozone and PM2.5. The EPA sets a legal limit of 53 ppb annual average.

Satellites can observe NO₂ across the entire state at once, but they measure the total amount in the whole air column above, not just what is at the surface where people breathe. The relationship between the satellite reading and ground level varies by location, season, weather, and nearby land use, so translating satellite data into accurate surface readings requires a model fitted against ground monitors, whether statistical or machine learning. And because NO₂ can vary tenfold over just a few kilometers, the model needs to learn fine-scale spatial patterns, which is exactly what convolutional neural networks (CNNs) are designed to do.

The main satellite instrument used in prior work is TROPOMI (TROPOspheric Monitoring Instrument), launched by the European Space Agency in 2017. From its polar orbit it observes each location about once a day and images NO₂ across the globe at roughly 3.5 km × 5.5 km resolution. This project also introduces TEMPO (Tropospheric Emissions: Monitoring of Pollution), NASA's geostationary instrument launched in April 2023. Unlike TROPOMI, TEMPO stays fixed above North America and captures imagery every hour during daylight: up to 12 snapshots where TROPOMI gives one.

Two key studies that frame this project

Two prior studies define the starting point. Both use TROPOMI; neither uses TEMPO.

| Aspect | Kim et al. 2024 (JGR Atmospheres) | Cao 2023 (Frontiers Env. Sci.) |
| --- | --- | --- |
| Method | Multivariate linear regression (MLR) | Convolutional neural network (CNN) |
| Satellite input | TROPOMI (single value per site) | TROPOMI (2D pixel grids) |
| Best annual R² | 0.78 (anscMLR) | 0.952 |
| Best daily R² | N/A (annual only) | 0.892 |
| Spatial CV R² | 0.65–0.89 (regional) | 0.593 (poor) |
| Key strength | Interpretable, minimal compute | Spatial pattern learning |
| Key limitation | Cannot capture spatial structure | Fails at unseen locations |
| Supervisor connection | Prof. Holloway's lab (UW-Madison) | Fairview High School student |

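Cao's CNN has the higher overall accuracy (annual R² of 0.952 versus 0.78) but collapses at new locations, where its spatial cross-validation R² falls to 0.593. This project tests whether TEMPO's hourly imagery can close that gap.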

Following Cao's CNN protocol, extended with TEMPO

The core methodology follows Cao (2023): a convolutional neural network that receives stacked 2D satellite imagery grids around each EPA monitoring site and predicts surface NO₂ concentration. The key extension is training on both TROPOMI and TEMPO inputs, plus adapting the pipeline specifically for Wisconsin's monitoring context.
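As a rough illustration of what "stacked 2D satellite imagery grids around each site" could look like in code, the sketch below cuts a fixed-size window out of each input raster and stacks the windows as channels. The function names, the toy 6×6 rasters, and the NaN padding at the raster edge are assumptions made for this example; they are not taken from Cao (2023) or from this project's pipeline.

```python
from typing import List

Grid = List[List[float]]


def extract_patch(grid: Grid, row: int, col: int, half: int) -> Grid:
    """Return the (2*half+1) x (2*half+1) window centered on (row, col).

    Cells outside the raster are filled with NaN so every patch has the
    same shape, which a CNN input layer requires.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    patch = []
    for r in range(row - half, row + half + 1):
        line = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            line.append(grid[r][c] if inside else float("nan"))
        patch.append(line)
    return patch


def stack_channels(grids: List[Grid], row: int, col: int, half: int) -> List[Grid]:
    """Stack one patch per input layer into a (channels, height, width) sample."""
    return [extract_patch(g, row, col, half) for g in grids]


# Toy 6x6 rasters standing in for one TROPOMI overpass and two TEMPO scans
# (hypothetical values, chosen only so the channels are easy to tell apart).
tropomi = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
tempo_10h = [[v + 100.0 for v in line] for line in tropomi]
tempo_11h = [[v + 200.0 for v in line] for line in tropomi]

# The monitor is assumed to sit in pixel (row=2, col=3) of each raster.
sample = stack_channels([tropomi, tempo_10h, tempo_11h], row=2, col=3, half=1)
```

Each extra TEMPO scan simply becomes another channel in the sample, which is one plausible way hourly imagery could enter the same architecture that previously saw a single TROPOMI layer.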

Cao (2023) Baseline
Satellite: TROPOMI only
Sites: 500 CONUS monitors
Period: 2018–2022
Resolution: 3.5 km × 5.5 km
Temporal: Daily + annual
Grid size: 4,000 m (5×6 pixels)

This Project (Extension)
Satellite: TROPOMI + TEMPO
Sites: WI monitors + CONUS
Period: 2023–present (TEMPO era)
Resolution: 2.1 km × 4.4 km (TEMPO)
Temporal: Hourly (TEMPO) + daily
WI focus: Statewide coverage map

TEMPO scans a given location up to 12 times per day; TROPOMI observes it about once. That richer daily signal may teach the model patterns that hold up at new locations, addressing the field's central limitation.

Five phases from data to Wisconsin maps

01
Problem Definition and Data Inventory

Define what the model predicts (ground-level NO₂ in ppb), identify all input data sources (satellite imagery, meteorology, land use, roads), and inventory Wisconsin's 3 EPA monitors alongside nationwide training sites.

02
Data Acquisition and Pipeline

Pull TROPOMI and TEMPO imagery from Google Earth Engine as pixel grids. Download weather data (temperature, wind, pressure). Collect land-use layers (roads, vegetation, population density). Align everything to EPA monitor locations in space and time.
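The space-and-time alignment in this phase can be sketched with two small helpers: one finds the grid cell that contains a monitor, the other pairs each monitor reading with the nearest satellite scan. Everything here is illustrative. The function names, the 0.05° cell size, the 30-minute tolerance, the coordinates, and the readings are assumptions, not values from the project.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def pixel_index(lat: float, lon: float, lat_min: float, lon_min: float,
                cell_deg: float) -> Tuple[int, int]:
    """Row/column of the regular-grid cell containing (lat, lon).

    The grid origin is assumed to be its south-west corner.
    """
    return int((lat - lat_min) // cell_deg), int((lon - lon_min) // cell_deg)


def nearest_scan(obs_time: datetime, scan_times: List[datetime],
                 tolerance: timedelta) -> Optional[datetime]:
    """Return the scan closest in time to obs_time, or None if none is within tolerance."""
    if not scan_times:
        return None
    best = min(scan_times, key=lambda t: abs(t - obs_time))
    return best if abs(best - obs_time) <= tolerance else None


def align(monitor_obs: List[Tuple[datetime, float]], scan_times: List[datetime],
          tolerance: timedelta = timedelta(minutes=30)):
    """Pair each monitor reading with its nearest scan; drop readings with no match."""
    pairs = []
    for obs_time, ppb in monitor_obs:
        scan = nearest_scan(obs_time, scan_times, tolerance)
        if scan is not None:
            pairs.append((obs_time, scan, ppb))
    return pairs


# Hypothetical monitor near Milwaukee on a 0.05-degree grid.
row, col = pixel_index(43.06, -87.91, lat_min=42.0, lon_min=-93.0, cell_deg=0.05)

# Hypothetical scan times and hourly monitor readings (ppb).
scans = [datetime(2024, 7, 1, 14, 10), datetime(2024, 7, 1, 15, 5)]
obs = [
    (datetime(2024, 7, 1, 14, 0), 12.4),
    (datetime(2024, 7, 1, 15, 0), 9.8),
    (datetime(2024, 7, 1, 22, 0), 6.1),  # no scan nearby, so it is dropped
]
pairs = align(obs, scans)
```

Dropping unmatched readings rather than interpolating is a deliberately conservative choice for the sketch; the actual pipeline may handle them differently.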

03
Preprocessing and Feature Engineering

Fill satellite data gaps caused by cloud cover. Normalize inputs so the model trains cleanly. Build feature grids at multiple resolutions and add time-of-day and seasonal encodings.
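A minimal sketch of three of these preprocessing steps follows. It assumes cloud-masked pixels arrive as NaN and fills them with the mean of the valid pixels, which is only a stand-in: the text does not say which gap-filling method the project uses. The sine/cosine encoding is a common way to represent time of day and season so that hour 23 sits next to hour 0. All function names are hypothetical.

```python
import math
from typing import List, Tuple


def cyclic_encode(value: float, period: float) -> Tuple[float, float]:
    """Map a periodic quantity (hour of day, day of year) onto the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def fill_gaps(values: List[float]) -> List[float]:
    """Replace NaN entries with the mean of the valid entries (placeholder method)."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return list(values)  # nothing to fill from; leave the gaps in place
    mean = sum(valid) / len(valid)
    return [mean if math.isnan(v) else v for v in values]


def zscore(values: List[float]) -> List[float]:
    """Standardize to zero mean and unit variance; constant inputs map to zeros."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance) or 1.0  # avoid dividing by zero
    return [(v - mean) / std for v in values]


hour_sin, hour_cos = cyclic_encode(14.0, 24.0)     # 14:00 local time
season_sin, season_cos = cyclic_encode(182.0, 365.0)  # roughly July 1

pixels = [1.0, float("nan"), 3.0]  # hypothetical column values with one cloudy pixel
normalized = zscore(fill_gaps(pixels))
```

In practice the normalization statistics would be computed on the training set only and reused for validation and mapping, so that no information leaks from held-out sites.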

04
Model Training and Evaluation

Train the CNN and compare it against simpler baseline models using the same inputs. The critical test: how well does it predict at locations it has never seen during training? That spatial generalization score is the key metric.
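The spatial generalization test can be sketched as a site-held-out cross-validation: every sample from a given monitor goes into the same fold, so the model is always scored on sites it never trained on. The helper names and the toy site IDs below are hypothetical, and the deterministic fold assignment is a simplification; a real run would likely shuffle sites with a fixed seed.

```python
from typing import List, Tuple


def spatial_folds(site_ids: List[str], k: int) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k folds so each site is held out exactly once."""
    unique_sites = sorted(set(site_ids))
    folds = []
    for fold in range(k):
        held_out = set(unique_sites[fold::k])  # every k-th site goes to this fold
        test_idx = [i for i, s in enumerate(site_ids) if s in held_out]
        train_idx = [i for i, s in enumerate(site_ids) if s not in held_out]
        folds.append((train_idx, test_idx))
    return folds


def r_squared(observed: List[float], predicted: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Six samples from three hypothetical monitors, two samples each.
site_ids = ["A", "A", "B", "B", "C", "C"]
folds = spatial_folds(site_ids, k=3)

# Sanity check: no site appears on both sides of any split.
for train_idx, test_idx in folds:
    train_sites = {site_ids[i] for i in train_idx}
    test_sites = {site_ids[i] for i in test_idx}
    assert not (train_sites & test_sites)
```

A random split over samples would let the same monitor appear in both training and test data, which is the setting where reported accuracy looks far better than it is at unseen locations.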

05
Wisconsin NO₂ Mapping

Apply the model across all of Wisconsin to generate a statewide NO₂ map at roughly 1 km resolution — identifying high-exposure areas that have no monitors today, and comparing lakeshore vs. inland and urban vs. rural patterns.
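The mapping step amounts to laying a regular grid over the state and asking the model for a prediction at each cell center. The sketch below assumes an approximate bounding box for Wisconsin, uses about 111 km per degree of latitude, and substitutes a placeholder function for the trained CNN; all three are assumptions made for illustration. The demo uses 50 km spacing to keep the output small, where the project targets roughly 1 km.

```python
import math
from typing import Callable, List, Tuple

KM_PER_DEG_LAT = 111.0  # approximate length of one degree of latitude

# Approximate bounding box for Wisconsin (assumed values; a real map would
# clip to the state boundary instead of using a rectangle).
WI_LAT_MIN, WI_LAT_MAX = 42.5, 47.1
WI_LON_MIN, WI_LON_MAX = -92.9, -86.8


def make_grid(lat_min: float, lat_max: float, lon_min: float, lon_max: float,
              step_km: float) -> List[Tuple[float, float]]:
    """Return cell-center (lat, lon) points spaced roughly step_km apart."""
    dlat = step_km / KM_PER_DEG_LAT
    n_rows = int((lat_max - lat_min) / dlat)
    points = []
    for i in range(n_rows):
        lat = lat_min + (i + 0.5) * dlat
        # Degrees of longitude shrink with latitude, so widen the step accordingly.
        dlon = step_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
        n_cols = int((lon_max - lon_min) / dlon)
        for j in range(n_cols):
            points.append((lat, lon_min + (j + 0.5) * dlon))
    return points


def map_predictions(points: List[Tuple[float, float]],
                    predict: Callable[[float, float], float]):
    """Apply a predictor to every grid point and return (lat, lon, ppb) rows."""
    return [(lat, lon, predict(lat, lon)) for lat, lon in points]


def placeholder_model(lat: float, lon: float) -> float:
    """Stands in for the trained CNN; returns a constant so the sketch runs."""
    return 0.0


grid = make_grid(WI_LAT_MIN, WI_LAT_MAX, WI_LON_MIN, WI_LON_MAX, step_km=50.0)
rows = map_predictions(grid, placeholder_model)
```

In the real pipeline the predictor would first build the stacked input patch for each cell, so the cost of the map is dominated by feature extraction, not by the loop shown here.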

What this project is trying to answer

  • Does adding TEMPO's hourly data make the model work better at places it hasn't seen before — solving the field's core generalization problem?
  • Can the model produce reliable NO₂ estimates for Wisconsin locations that currently have no monitor at all?
  • Are there communities across Wisconsin with elevated NO₂ that nobody is currently measuring — and if so, where?