Handle undersampling due to lots of images without burned areas #12

Open
5 tasks
weiji14 opened this issue May 29, 2023 · 0 comments
Labels
help wanted Extra attention is needed

weiji14 commented May 29, 2023

The extra Sentinel-2 imagery dataset at https://huggingface.co/datasets/chabud-team/chabud-extra does not contain any burned areas, according to https://huggingface.co/datasets/chabud-team/chabud-extra/discussions/1. If we include this dataset in training, burned-area pixels will be severely outnumbered by unburned pixels.

Some potential ways to handle the extra data to improve model performance:

  • Loss functions that handle foreground/background class imbalance
    • Focal Loss
    • Dice Loss
  • Self-supervised pre-training
    • Develop pretext tasks that use the extra data to learn useful embeddings from all the given imagery, then fine-tune only on images with burned areas
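
To make the two loss options above concrete, here is a minimal sketch of binary focal loss and soft Dice loss in plain Python. It is an illustration only, not this project's implementation: the function names, the flat-list pixel representation, and the default values `gamma=2.0`, `alpha=0.25` and `smooth=1.0` are all assumptions, and a real training loop would most likely use a tensor library instead.

```python
import math


def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a flat list of pixels.

    probs  -- predicted probability of "burned" per pixel (hypothetical input)
    labels -- ground truth per pixel, 1 = burned, 0 = unburned
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # per-class weight
        # (1 - p_t) ** gamma shrinks the loss of well-classified pixels
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss over a flat list of pixels.

    The smooth term keeps the ratio defined when a tile has no burned
    pixels at all, which is the case for every tile in the extra dataset.
    """
    intersection = sum(p * y for p, y in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


# Hypothetical tile of four pixels, only the first one burned.
probs = [0.9, 0.1, 0.2, 0.05]
labels = [1, 0, 0, 0]
print(focal_loss(probs, labels), dice_loss(probs, labels))
```

Focal loss down-weights pixels the model already classifies confidently, so the many easy unburned pixels contribute less to the gradient. Dice loss scores the overlap between prediction and mask rather than averaging over pixels, which makes it less sensitive to how many background pixels there are.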