Spatio-Temporal Attention-based Unet for Field Boundary Detection

This is the first place solution of team WeMoveMountains in the NASA Harvest Field Boundary Detection Challenge. This solution is a single 10-fold modified Regnetv-Unet developed in Pytorch.

${{model_nasa_rwanda_field_boundary_competition_gold_v1}}$

MLHub model id: model_nasa_rwanda_field_boundary_competition_gold_v1. Browse on Radiant MLHub.

Training Data

Training Data Source
Training Data Labels

Related MLHub Dataset

The dataset description is available on RadiantMLHub. See here.

Citation

Muhamed T., Azer K. (2023) “Spatio-Temporal Attention-based Unet for Field Boundary Detection”, Version 1.0, Radiant MLHub. [Date Accessed] Radiant MLHub. https://doi.org/10.34911/rdnt.h28fju

License

CC-BY-4.0

Creators

This solution was developped by:

Muhamed Tuo
Azer Ksouri

Learning Approach

Supervised Learning

Prediction Type

Segmentation

Model Architecture

Our solution is a modified Unet++ with an attention mechanism between every interconnection of the encoder to the decoder (Meaning after each output layer of the encoder model). The encoder is a Regnet (more specifically regnetv_040 available in the Timm library).

Training Operating System

The training was done on a Linux system with an Nvidia GPU (A100 80GB). An A600 24GB should be enough to train the model.

Model Inferencing

Review the GitHub repository README to get started running this model for new inferencing.

Training

Augmentation

The only augmentation we did was static. It is done before the training and saved into a new folder. We noticed that the model learned better with little (only Flip augmentation) to no augmentation during the training.

Training procedure

For each tile, all the time-series images are loading as Timestamps x C x H x W and passed to the model. The input now is Batch_size x TimeStamps x C x H x W. It is reshaped to Batch_size*TimeStamps x C x H x W before being fed into the encoder, and then reshaped to Batch_size x TimeStamps x D x H' x W' for the attention pooling mechanism. Finally, the input to the decoder becomes Batch_size x D" x H" x W".

Structure of Output Data

The output file is a csv named output.csv and should be available in the data/output folder. Each row of the csv file correspond to a pixel of the flattened 256x256 image.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

index.md

index.md

Spatio-Temporal Attention-based Unet for Field Boundary Detection

Training Data

Related MLHub Dataset

Citation

License

Creators

Learning Approach

Prediction Type

Model Architecture

Training Operating System

Model Inferencing

Training

Structure of Output Data

Files

index.md

Latest commit

History

index.md

File metadata and controls

Spatio-Temporal Attention-based Unet for Field Boundary Detection

Training Data

Related MLHub Dataset

Citation

License

Creators

Learning Approach

Prediction Type

Model Architecture

Training Operating System

Model Inferencing

Training

Structure of Output Data