
Run with Video as prompt Error:channels not match #37

Open
Kevin-Lee1299 opened this issue Nov 14, 2024 · 2 comments

Comments

@Kevin-Lee1299

python generate.py --prompt-path sample_data/snippy-chartreuse-mastiff-f79998db196d-20220401-224517.chunk_001.mp4 --actions-path sample_data/snippy-chartreuse-mastiff-f79998db196d-20220401-224517.chunk_001.one_hot_actions.pt

Traceback (most recent call last):
  File "D:\text\open-oasis\generate_origin.py", line 199, in <module>
    main(args)
  File "D:\text\open-oasis\generate_origin.py", line 76, in main
    x = vae.encode(x * 2 - 1).mean * scaling_factor
  File "D:\text\open-oasis\vae.py", line 283, in encode
    x = self.patch_embed(x)
  File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\text\open-oasis\dit.py", line 65, in forward
    x = self.proj(x)
  File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\conv.py", line 554, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\conv.py", line 549, in _conv_forward
    return F.conv2d(
RuntimeError: Given groups=1, weight of size [1024, 3, 20, 20], expected input[1, 360, 360, 640] to have 3 channels, but got 360 channels instead

@CIntellifusion

I got the same issue. Did you solve this? Thanks.

@CIntellifusion

CIntellifusion commented Feb 15, 2025

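Update: I think I have solved this issue by permuting the video axes. read_video returns frames as (T, H, W, C), while the VAE patch embedding expects (T, C, H, W), so the conv layer ends up seeing 360 "channels". Replace load_prompt in utils.py with this version: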

def load_prompt(path, video_offset=None, n_prompt_frames=1):
    if path.lower().split(".")[-1] in IMAGE_EXTENSIONS:
        print("prompt is image; ignoring video_offset and n_prompt_frames")
        prompt = read_image(path)
        # add frame dimension
        prompt = rearrange(prompt, "c h w -> 1 c h w")  # torch.Size([1, 3, 360, 640])
    elif path.lower().split(".")[-1] in VIDEO_EXTENSIONS:
        prompt = read_video(path, pts_unit="sec")[0]
        if video_offset is not None:
            prompt = prompt[video_offset:]
        prompt = prompt[:n_prompt_frames]  # torch.Size([n_frames, 360, 640, 3]), i.e. (T, H, W, C)
        # move channels ahead of height/width so resize and the VAE see (T, C, H, W)
        prompt = prompt.permute(0, 3, 1, 2)  # torch.Size([n_frames, 3, 360, 640])
    else:
        raise ValueError(f"unrecognized prompt file extension; expected one in {IMAGE_EXTENSIONS} or {VIDEO_EXTENSIONS}")
    assert prompt.shape[0] == n_prompt_frames, f"input prompt {path} had less than n_prompt_frames={n_prompt_frames} frames"
    prompt = resize(prompt, (360, 640))
    # add batch dimension
    prompt = rearrange(prompt, "t c h w -> 1 t c h w")
    prompt = prompt.float() / 255.0
    return prompt
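
For anyone who wants to check the axis order without loading the model, here is a minimal sketch of the mismatch. It assumes a stand-in Conv2d with 3 input channels and 20x20 patches (the out_channels value of 8 is arbitrary, not the real 1024) and a dummy zero tensor in place of decoded video frames; the names proj, frames_thwc and frames_tchw are made up for illustration and are not from the repo.

```python
import torch
from torch import nn

# Stand-in for the patch embedding conv: 3 input channels, 20x20 patches.
# out_channels=8 is an arbitrary small value to keep the sketch light.
proj = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=20, stride=20)

# Dummy frames in the layout read_video returns by default: (T, H, W, C).
frames_thwc = torch.zeros(1, 360, 640, 3)

# Feeding channels-last frames makes the conv read dim 1 (360) as channels.
try:
    proj(frames_thwc)
    mismatch = False
except RuntimeError as err:
    mismatch = True
    print("channels-last input fails:", err)

# Moving the channel axis to position 1 gives the (T, C, H, W) layout.
frames_tchw = frames_thwc.permute(0, 3, 1, 2)
out = proj(frames_tchw)
print(tuple(frames_tchw.shape), tuple(out.shape))
```

If the permuted shape comes out as (1, 3, 360, 640), the prompt should pass through the patch embedding without the channel error.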
