Add WildFire Dataset object and split strategies #47

Merged: 18 commits, Mar 28, 2020

Contributor

@x0s x0s commented Dec 15, 2019

Hello,

This PR aims to:

  • Add a PyTorch-compatible WildFireDataset class
  • Add a strategy to split the dataset into train/val/test without mixing fire_ids across subsets
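
To illustrate the second point, here is a minimal sketch of a split that keeps every frame of a given fire in the same subset. It is not the implementation from this PR: the function name `split_by_fire_id`, its input format, and the choice to apply the ratios to the number of fires (rather than the number of frames) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_by_fire_id(fire_ids, ratios, seed=42):
    """Assign sample indices to subsets so that a fire_id never spans two subsets.

    fire_ids: one fire identifier per sample (hypothetical input format).
    ratios: mapping of subset name to the target fraction of *fires*.
    """
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    # Group sample indices by the fire they belong to.
    samples_per_fire = defaultdict(list)
    for index, fire_id in enumerate(fire_ids):
        samples_per_fire[fire_id].append(index)

    # Shuffle the unique fires, not the individual samples.
    unique_fires = sorted(samples_per_fire)
    random.Random(seed).shuffle(unique_fires)

    # Cut the shuffled list of fires at the cumulative ratio boundaries.
    subsets, start, cumulative = {}, 0, 0.0
    names = list(ratios)
    for position, name in enumerate(names):
        cumulative += ratios[name]
        # The last subset takes whatever is left, which absorbs rounding.
        is_last = position == len(names) - 1
        end = len(unique_fires) if is_last else round(cumulative * len(unique_fires))
        fires = unique_fires[start:end]
        subsets[name] = sorted(i for fire in fires for i in samples_per_fire[fire])
        start = end
    return subsets


# Toy data: 20 fires with 3 frames each.
fire_ids = [fire for fire in range(20) for _ in range(3)]
subsets = split_by_fire_id(fire_ids, {'train': 0.7, 'val': 0.15, 'test': 0.15})
```

Shuffling fires instead of frames is what prevents near-identical frames of one fire from landing in both the training and the evaluation subsets.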

Example:

from torch.utils.data import DataLoader
from torchvision.transforms import transforms
from pyronear.datasets.wildfire import (WildFireDataset,
                                        WildFireSplitter)

wildfire = WildFireDataset(metadata='wildfire.csv',
                           target_names=['fire', 'clf_confidence'],
                           path_to_frames=path_to_frames)

ratios = {'train': 0.7, 'val': 0.15, 'test':0.15}
transforms = {'train': transforms.RandomCrop(10), 'val': None, 'test': None}

splitter = WildFireSplitter(ratios, transforms)
splitter.fit(wildfire)

wildfire_loader_train = DataLoader(splitter.train, batch_size=64, shuffle=True)
wildfire_loader_val = DataLoader(splitter.val, batch_size=64, shuffle=True)
wildfire_loader_test = DataLoader(splitter.test, batch_size=64, shuffle=True)

Each dataloader yields the image (transformed if requested) and the two targets (fire and clf_confidence).
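
As a rough illustration of what "image plus targets" means per sample, the sketch below follows the map-style protocol (`__len__` and `__getitem__`) that `torch.utils.data.Dataset` subclasses implement. It is a simplified stand-in, not the WildFireDataset code: the class name `FrameDataset`, the `frame_path` key, and the `load_frame` argument are hypothetical, and a real class would subclass `torch.utils.data.Dataset`.

```python
class FrameDataset:
    """Simplified map-style dataset returning (image, targets) pairs."""

    def __init__(self, records, target_names, load_frame, transform=None):
        self.records = records            # one dict per frame, e.g. rows of a metadata CSV
        self.target_names = target_names  # keys returned as targets
        self.load_frame = load_frame      # callable: frame path -> image
        self.transform = transform        # optional callable applied to the image

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        record = self.records[index]
        image = self.load_frame(record['frame_path'])
        if self.transform is not None:
            image = self.transform(image)
        targets = [record[name] for name in self.target_names]
        return image, targets


records = [
    {'frame_path': 'frame_0.png', 'fire': 1, 'clf_confidence': 0.9},
    {'frame_path': 'frame_1.png', 'fire': 0, 'clf_confidence': 0.4},
]

# Stand-in loader and transform so the sketch runs without image files.
dataset = FrameDataset(records, ['fire', 'clf_confidence'],
                       load_frame=lambda path: [[0, 1], [2, 3]],
                       transform=lambda image: image[::-1])
image, targets = dataset[0]
```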

For more examples, please browse the tests.

Any feedback is welcome

@x0s added the type: enhancement, module: datasets, and ext: tests labels on Dec 16, 2019

codecov bot commented Mar 12, 2020

Codecov Report

Merging #47 into master will increase coverage by 2.41%.
The diff coverage is 98.03%.

@@            Coverage Diff             @@
##           master      #47      +/-   ##
==========================================
+ Coverage   81.39%   83.80%   +2.41%     
==========================================
  Files          16       18       +2     
  Lines         602      704     +102     
==========================================
+ Hits          490      590     +100     
- Misses        112      114       +2     

| Impacted Files | Coverage Δ |
|---|---|
| pyronear/datasets/wildfire/split_strategy.py | 96.77% <96.77%> (ø) |
| pyronear/datasets/wildfire/wildfire.py | 98.36% <98.36%> (ø) |
| pyronear/datasets/utils.py | 91.66% <100.00%> (+1.04%) ⬆️ |
| pyronear/datasets/wildfire/__init__.py | 100.00% <100.00%> (ø) |

Member

@MateoLostanlen MateoLostanlen left a comment

There is a mistake in your example; it should be:

from torch.utils.data import DataLoader
from pyronear.datasets.wildfire import (WildFireDataset,
                                        WildFireSplitter)

wildfire = WildFireDataset(metadata='wildfire.csv',
                           path_to_frames=path_to_frames)

ratios = {'train': 0.7, 'val': 0.15, 'test': 0.15}

splitter = WildFireSplitter(ratios)
splitter.fit(wildfire)

wildfire_loader_train = DataLoader(splitter.train, batch_size=64, shuffle=True)
wildfire_loader_val = DataLoader(splitter.val, batch_size=64, shuffle=True)
wildfire_loader_test = DataLoader(splitter.test, batch_size=64, shuffle=True)

Member

@MateoLostanlen MateoLostanlen left a comment

Maybe my question is dumb, but I don't understand why you use "from skimage import io" rather than PIL to load the images. I am having trouble applying the torchvision transforms afterwards (https://pytorch.org/docs/stable/torchvision/transforms.html). What do you use as a transform, then?
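
For context on this question: `skimage.io.imread` returns a NumPy array, whereas many torchvision transforms at the time of this thread operated on PIL images. One common bridge, not necessarily what this PR ended up doing, is to convert the array to a PIL image first, either with `PIL.Image.fromarray` or by placing `transforms.ToPILImage()` at the head of a `Compose` pipeline. The sketch below shows only the NumPy-to-PIL step; the zero-filled `frame` array is a stand-in for a loaded image.

```python
import numpy as np
from PIL import Image

# Stand-in for the array returned by skimage.io.imread:
# height x width x channels, dtype uint8.
frame = np.zeros((24, 32, 3), dtype=np.uint8)

# PIL-based transforms expect a PIL image, so convert the array first.
pil_frame = Image.fromarray(frame)

# PIL operations now work; crop takes a (left, upper, right, lower) box.
cropped = pil_frame.crop((0, 0, 10, 10))
```

Note that PIL reports size as (width, height), the reverse of the array's leading two dimensions.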

Contributor Author

x0s commented Mar 24, 2020

Thanks for your feedback!
Indeed, I forgot to update the description.
So, if I understand correctly, only these two lines need to be updated:

splitter = WildFireSplitter(ratios)
splitter.fit(wildfire)

@MateoLostanlen MateoLostanlen merged commit 1fab241 into pyronear:master Mar 28, 2020
@x0s x0s deleted the add-wildfire-dataset branch April 1, 2020 14:58