
where processor should i put in a training code? #13427

Closed
lycfight opened this issue Sep 5, 2021 · 5 comments

Comments


lycfight commented Sep 5, 2021

Hi @lycfight could you please open an issue with a minimal code snippet so we could take a look. Thanks :)

Originally posted by @patil-suraj in #11445 (comment)


lycfight commented Sep 5, 2021

I ran into a problem while writing training code in PyTorch. I want to create a custom Dataset for the COCO image-caption dataset, as follows:

1. Inherit from torch.utils.data.Dataset:

from torch.utils.data import Dataset

class Image_textDataset(Dataset):
    ...

2. Then override __getitem__(self, idx), where I use the processor on a single (image, text) sample.
But it seems that CLIPProcessor can't process a (image, text) sample into shapes the DataLoader can stack into a batch:

def __getitem__(self, idx):
    img_id = self.img_ids[idx]
    # randomly pick one caption from the image's captions
    text = random.choice(self.img_id_to_captions[img_id])
    img_filename = self.img_id_to_filename[img_id]
    img_path = op.join(self.img_dir, img_filename)
    img = Image.open(img_path)
    inputs = processor(text=text, images=img, return_tensors="pt",
                       padding="max_length", truncation=True)
    return inputs

3. Or have __getitem__ return a raw (image, text) pair, and apply the processor in a custom collate_fn, like:

def collate_fn(examples):
    images = [example[0] for example in examples]
    captions = [example[1] for example in examples]
    inputs = processor(
        text=captions,
        images=images,
        max_length=77,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )

    batch = {
        "pixel_values": inputs["pixel_values"],
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"],
    }

    return batch

Then pass collate_fn as a parameter to the DataLoader.
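Both options above can be made to work; a minimal runnable sketch is below. It uses a hypothetical stub_processor in place of the real CLIPProcessor (so it runs without transformers installed), but the calling convention mimics the real one: the processor always returns tensors with a leading batch dimension, so option 2 needs a squeeze(0) before the default collate_fn can stack examples.

```python
import torch
from torch.utils.data import Dataset, DataLoader

# Hypothetical stub standing in for CLIPProcessor; like the real processor,
# it accepts a single example or a list and returns tensors with a leading
# batch dimension. Shapes below match CLIP's defaults (224x224, 77 tokens).
def stub_processor(text, images, max_length=77, **kwargs):
    n = len(text) if isinstance(text, list) else 1
    return {
        "pixel_values": torch.zeros(n, 3, 224, 224),
        "input_ids": torch.zeros(n, max_length, dtype=torch.long),
        "attention_mask": torch.ones(n, max_length, dtype=torch.long),
    }

# Option 2: process inside __getitem__, then squeeze out the batch dim of 1
# so the default collate_fn can stack examples into a batch.
class ProcessedDataset(Dataset):
    def __init__(self, pairs):
        self.pairs = pairs  # list of (image, caption) tuples

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        img, text = self.pairs[idx]
        inputs = stub_processor(text=text, images=img)
        return {k: v.squeeze(0) for k, v in inputs.items()}

# Option 3: return raw pairs and batch-process in a custom collate_fn.
class RawPairDataset(Dataset):
    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        return self.pairs[idx]

def collate_fn(examples):
    images = [ex[0] for ex in examples]
    captions = [ex[1] for ex in examples]
    inputs = stub_processor(text=captions, images=images)
    return {k: inputs[k] for k in ("pixel_values", "input_ids", "attention_mask")}

pairs = [(None, "a cat sitting on a mat")] * 4

batch_a = next(iter(DataLoader(ProcessedDataset(pairs), batch_size=2)))
batch_b = next(iter(DataLoader(RawPairDataset(pairs), batch_size=2,
                               collate_fn=collate_fn)))

print(batch_a["input_ids"].shape)  # torch.Size([2, 77])
print(batch_b["input_ids"].shape)  # torch.Size([2, 77])
```

Either route yields identically shaped batches; the difference is only where the processor runs (per example vs. per batch).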

knitemblazor commented:
This issue is not relevant to the transformers repository; please post it in the PyTorch forums for quick help.


lycfight commented Sep 6, 2021

This issue is not relevant to the transformers repository; please post it in the PyTorch forums for quick help.

I think the processor is a core transformers component, so its usage should be covered in the transformers tutorials.

patil-suraj (Contributor) commented:

You can put the processor anywhere you want, either in the dataset or in the collate_fn. If processing on the fly, I would put it in the collate_fn, since it then processes the whole batch with a single call, which is usually faster than processing examples one at a time.
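The call-count difference behind this advice can be illustrated with a toy counter standing in for the processor (the process helper below is hypothetical, not the real CLIPProcessor): per-example processing in __getitem__ triggers one call per sample, while batch processing in collate_fn triggers one call per batch.

```python
# Count how many times the "processor" is invoked under each strategy.
calls = {"n": 0}

def process(texts):
    # Stand-in for a processor call; batched tokenizers amortize
    # per-call overhead across all texts passed in one call.
    calls["n"] += 1
    return [t.lower() for t in texts]

samples = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Per-example (as in __getitem__): one processor call per sample.
calls["n"] = 0
for s in samples:
    process([s])
print(calls["n"])  # 8

# Per-batch (as in collate_fn): one processor call per batch of 4.
calls["n"] = 0
for i in range(0, len(samples), 4):
    process(samples[i:i + 4])
print(calls["n"])  # 2
```

With fast (Rust-backed) tokenizers the batched call is also internally parallelized, so the gap in wall-clock time is usually larger than the raw call count suggests.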


github-actions bot commented Oct 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
