Sample order and ID #2516
The current implementation also discards the original image path for an `ImageDatasetItem`. If you'd like to keep this information for the use case you described, the item struct would need to be modified to carry it. Perhaps the changes to the mapper could preserve the path as a field.
What does "the order is not guaranteed" mean ? Does it mean that the order of elements yielded by |
Ahhh sorry, I totally forgot one implementation detail: we actually sort the paths, so the order of the items will be sorted (as long as you don't shuffle with the dataloader). You could double-check that the data is always the same to validate. But the paths will still not be accessible.
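For reference, here is a minimal std-only sketch of that double-check. It rebuilds the sorted path list outside of burn, so index `i` here should line up with item `i` yielded by the dataset; the `data/cifar10/test` root and the one-sub-directory-per-class layout are assumptions based on this thread, and I believe the global sort mirrors what the loader does, but double-check against burn's source:

```rust
use std::fs;
use std::path::PathBuf;

/// Collect image paths under `root` (one sub-directory per class) and sort
/// them globally, to reproduce the dataset's item order.
fn sorted_image_paths(root: &str) -> std::io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for class_dir in fs::read_dir(root)? {
        let class_dir = class_dir?.path();
        if class_dir.is_dir() {
            for entry in fs::read_dir(&class_dir)? {
                paths.push(entry?.path());
            }
        }
    }
    paths.sort();
    Ok(paths)
}

fn main() -> std::io::Result<()> {
    // Print index -> path; run it twice to confirm the order is stable.
    let paths = sorted_image_paths("data/cifar10/test")?;
    for (i, p) in paths.iter().enumerate() {
        println!("{i}: {}", p.display());
    }
    Ok(())
}
```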
I copied the globbing code and output the image paths. Don't you need the image paths? If not, how do you pair your predictions with the images?
Yeah, the image paths are used to get the corresponding image data and ground-truth label! The list of image/annotation pairs is stored in a vec of `ImageDatasetItemRaw`.
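Roughly, the raw item looks like the sketch below. The field types are my assumption based on how `item.image_path` and `item.annotation` are used later in this thread, not burn's exact definition; check the dataset source for the real one:

```rust
use std::path::PathBuf;

/// Approximate shape of the raw item before mapping (a sketch, not burn's
/// exact definition).
pub struct ImageDatasetItemRaw {
    /// Path to the image on disk.
    pub image_path: PathBuf,
    /// Raw annotation (e.g., the class name), parsed later by the mapper.
    pub annotation: String,
}
```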
I mean, when performing inference, we need to know the image id or path. For example, in burn's guide example, infer.rs will give the prediction result, but how do we get the image id of each prediction?
![image](https://github.com/user-attachments/assets/22e4ede5-71f8-4f60-b164-359fb289cc03)
Especially when we use custom datasets for inference, we need to know the image id corresponding to each prediction result.
![image](https://github.com/user-attachments/assets/f75eeebc-fd18-4426-8479-342aee6c77ef)
As I said earlier, the image id (i.e., the path to the source) is discarded when reading the image data. This happens specifically within the mapper that transforms an `ImageDatasetItemRaw` into an `ImageDatasetItem`.

There is no way with the current implementation to preserve that information because it is not currently kept as a dataset item field. But you could easily adapt the code to simply have something like this in your implementation:

```rust
/// Modified image dataset item that preserves the image source field.
#[derive(Debug, Clone, PartialEq)]
pub struct ImageDatasetItem {
    /// Image as a vector with a valid image type.
    pub image: Vec<PixelDepth>,
    /// Annotation for the image.
    pub annotation: Annotation,
    /// Original image source.
    pub image_path: String,
}

impl Mapper<ImageDatasetItemRaw, ImageDatasetItem> for PathToImageDatasetItem {
    /// Convert a raw image dataset item (path-like) to a 3D image array with a target label.
    fn map(&self, item: &ImageDatasetItemRaw) -> ImageDatasetItem {
        let annotation = parse_image_annotation(&item.annotation, &self.classes);

        // Load image from disk
        let image = image::open(&item.image_path).unwrap();

        // Image as Vec<PixelDepth>
        let img_vec = match image.color() {
            // ...
        };

        ImageDatasetItem {
            image: img_vec,
            annotation,
            // Keep the image source as a field
            image_path: item.image_path.display().to_string(),
        }
    }
}
```
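With that change, pairing predictions with their sources at inference time becomes straightforward. Below is a rough sketch of the idea; `report_predictions` is not part of burn, and the `infer` closure is a placeholder for your own model's forward pass:

```rust
use burn::data::dataset::Dataset;

/// Iterate the dataset and print each prediction next to its source path.
/// `infer` stands in for your model's forward pass (placeholder).
fn report_predictions<D: Dataset<ImageDatasetItem>>(
    dataset: &D,
    infer: impl Fn(&[PixelDepth]) -> usize,
) {
    for index in 0..dataset.len() {
        let item = dataset.get(index).expect("index is in range");
        let predicted_class = infer(&item.image);
        // The preserved field lets us report the source next to the result.
        println!("{} -> class {}", item.image_path, predicted_class);
    }
}
```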
I don't have any opposition to this addition. If you already made the changes you could make a PR from your fork 🙂
Hi,
When training/testing a custom dataset, how does burn determine the sample order? For example, here is the directory tree of cifar10:

[image: cifar10 directory tree]
When testing a model with this custom dataset (PNG format, for teaching & tutorials), I can get the predicted results, but how do I pair the results with the samples? Can burn output both the sample path and the predicted results together?
For a Hugging Face dataset, such as MNIST, we don't even know the sample ID and cannot see the images. It's also necessary to output the samples' names!