Not reproducible example in documentation, typo. #1786

Closed
krammnic opened this issue Oct 9, 2024 · 1 comment

krammnic commented Oct 9, 2024

There is probably a typo in this documentation example:

from torchtune.models.llama3_2_vision import llama3_2_vision_transform
from torchtune.datasets.multimodal import multimodal_chat_dataset

transform = Llama3VisionTransform(
    path="/tmp/Meta-Llama-3-8B-Instruct/original/tokenizer.model",
    prompt_template="torchtune.data.QuestionAnswerTemplate",
    max_seq_len=8192,
    image_size=560,
)
ds = multimodal_chat_dataset(
    model_transform=model_transform,
    source="json",
    data_files="data/my_data.json",
    column_map={
        "dialogue": "conversations",
        "image_path": "image",
    },
    image_dir="/home/user/dataset/",  # /home/user/dataset/images/clock.jpg
    image_tag="<image>",
    split="train",
)
tokenized_dict = ds[0]
print(transform.decode(tokenized_dict["tokens"], skip_special_tokens=False))
# '<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nQuestion:<|image|>What time is it on the clock?Answer:<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nIt is 10:00AM.<|eot_id|>'
print(tokenized_dict["encoder_input"]["images"][0].shape)  # (num_tiles, num_channels, tile_height, tile_width)
# torch.Size([4, 3, 224, 224])

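Shouldn't it be just transform, not model_transform? The example assigns the transform to transform but then passes an undefined model_transform to multimodal_chat_dataset, so it cannot run as written.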

@krammnic krammnic changed the title Not reproducible example from documentation, typo. Not reproducible example in documentation, typo. Oct 9, 2024

RdoubleA commented Oct 9, 2024

Ah yeah, good catch. We should either use transform or model_transform across the example.

If you're able to fix it with a quick PR, happy to stamp it. Otherwise, I can address it.
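
For anyone reading later, here is a minimal sketch of why the example fails and what using one name throughout looks like. It deliberately does not call torchtune: make_transform and build_dataset are hypothetical stand-ins for the transform builder and the dataset builder, so the snippet only illustrates the naming problem, not the library's actual behavior.

```python
# Hypothetical stand-ins, NOT torchtune APIs. They only mimic the shape of
# "build a transform, then pass it to a dataset builder by keyword".
def make_transform(path, max_seq_len):
    return {"path": path, "max_seq_len": max_seq_len}


def build_dataset(*, model_transform, source):
    return {"model_transform": model_transform, "source": source}


# The example binds the transform to the name `transform` ...
transform = make_transform(path="tokenizer.model", max_seq_len=8192)

# ... but then passes `model_transform`, a name that was never assigned.
# Python evaluates the argument first and raises NameError.
error_message = None
try:
    build_dataset(model_transform=model_transform, source="json")
except NameError as err:
    error_message = str(err)

# Using the same variable name consistently avoids the error. The keyword
# stays `model_transform`; only the value passed to it changes.
ds = build_dataset(model_transform=transform, source="json")
```

Either fix works: keep the variable as transform and pass model_transform=transform, or rename the variable to model_transform where it is first assigned.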
