Artificial intelligence and machine learning have grown in popularity in recent decades as a result of advances in high-performance computing and open-source software. At its core, machine learning is a form of statistical inference from user-provided inputs, in which algorithms learn relationships between input data and output results. These algorithms can be complex enough to uncover patterns and trends that a human analyst would not see, so it is important to build analysis-appropriate inputs that ensure the models answer the questions being asked. This training gives attendees an overview of machine learning as it applies to Earth science and shows how to apply these algorithms and techniques to remote sensing data in a meaningful way. Attendees will also work through end-to-end case study examples that build a simple random forest model for land cover classification from optical remote sensing data. Additional case studies apply the same workflows to other NASA data.


Session time: 10:00-11:30 EDT (UTC-4)

This training is also available in Spanish.

(2023). ARSET - Fundamentals of Machine Learning for Earth Science. NASA Applied Remote Sensing Training Program (ARSET).

By the end of this training, attendees will be able to:

  • Recognize the most common machine learning methods used for processing Earth Science data
  • Describe the benefits and limitations of machine learning for Earth Science analysis 
  • Explain how to apply basic machine learning algorithms and techniques in a meaningful manner to remote sensing data
  • Use an analysis-appropriate training dataset to evaluate conditions and solutions for a given case study
  • Complete basic procedures to interpret, refine and evaluate the accuracy of the results of machine learning analysis

This training is intended for research and applied scientists interested in learning how to apply basic machine learning techniques to Earth science data (e.g., satellite data from MODIS, Landsat, or Sentinel-2).

Course Format
  • Three 1.5-hour sessions
Part 1: Overview of Machine Learning
10:00 am to 11:30 am EDT (UTC-4:00)

Trainers: Jordan A. Caraballo-Vega, Caleb Spradlin, Jian Li, Jules Kouatchou

  • Overview of Machine Learning
  • Importance of Machine Learning for Earth Science
  • Usability of Machine Learning 
  • Software to Support Machine Learning 
  • Machine Learning Applications 
  • Hands-on Jupyter Notebook Exercise: Load and Visualize Data
  • Post-session assignment  
  • Q&A Session
Part 2: Training Data and Land Cover Classification Example
10:00 am to 11:30 am EDT (UTC-4:00)

Trainers: Jordan A. Caraballo-Vega, Caleb Spradlin, Jian Li, Jules Kouatchou

  • Download the training data
  • Exploratory data analysis 
  • Extracting training data from a tabular dataset
  • Extracting training data from a raster dataset
  • Training and inference on tabular and raster datasets
  • Metrics and model evaluation
  • Hands-on Jupyter Notebook Exercise: MODIS Water Classification Case Study
  • Post-session assignment  
  • Q&A Session
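
As a rough illustration of the "Metrics and model evaluation" topic above, the sketch below computes confusion-matrix counts and the usual accuracy, precision, recall, and F1 scores for a binary water / not-water classification in plain Python. The labels are made-up values rather than data from the training, and the function names are hypothetical; the session notebooks may well rely on a library for this instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # water correctly labeled as water
        elif pred == positive:
            fp += 1  # not-water labeled as water
        elif truth == positive:
            fn += 1  # water that the model missed
        else:
            tn += 1  # not-water correctly labeled as not-water
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up reference labels and predictions (1 = water, 0 = not water).
reference = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

print(confusion_counts(reference, predicted))
print(classification_metrics(reference, predicted))
```

Accuracy alone can be misleading when one class dominates a scene, which is why precision and recall are reported alongside it here.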
Part 3: Model Tuning, Parameter Optimization, and Additional Machine Learning Algorithms
10:00 am to 11:30 am EDT (UTC-4:00)

Trainers: Jordan A. Caraballo-Vega, Caleb Spradlin, Jian Li, Jules Kouatchou 

  • Overview of model tuning
  • Overview of parameter optimization 
  • Exercise to optimize existing model 
  • Overview of model explainability and interpretability 
  • Overview of additional machine learning algorithms 
  • Hands-on Jupyter Notebook Exercise: Improvements to MODIS Water Classification Model
  • Post-session assignment  
  • Q&A Session
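
To give a feel for the parameter optimization topic above, here is a minimal sketch of an exhaustive grid search, one common approach among several. It is not taken from the training materials: the `grid_search` helper, the toy threshold classifier, and the reflectance-like validation values are all hypothetical and made up for illustration.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every parameter combination and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:  # ties keep the first combination found
            best_params, best_score = params, score
    return best_params, best_score


# Made-up validation samples: (feature values, label), 1 = water.
# Feature 0 stands in for a band where water is dark; feature 1 is
# a band that carries no useful signal. Both are hypothetical.
validation = [
    ((0.05, 0.30), 1),
    ((0.08, 0.28), 1),
    ((0.10, 0.35), 1),
    ((0.30, 0.32), 0),
    ((0.35, 0.29), 0),
    ((0.40, 0.31), 0),
]


def validation_accuracy(feature, threshold):
    """Score a toy rule: predict water when the feature is below threshold."""
    correct = 0
    for values, label in validation:
        prediction = 1 if values[feature] < threshold else 0
        correct += prediction == label
    return correct / len(validation)


param_grid = {"feature": [0, 1], "threshold": [0.1, 0.2, 0.3, 0.4]}
best_params, best_score = grid_search(param_grid, validation_accuracy)
print(best_params, best_score)
```

Scoring on held-out validation samples rather than on the training samples is what keeps the search from simply rewarding overfitting. The cost grows with the product of the grid sizes, so larger searches often switch to random or more guided strategies.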