Running with a video as prompt fails: channels do not match #37

Kevin-Lee1299 · 2024-11-14T09:08:46Z

python generate.py --prompt-path sample_data/snippy-chartreuse-mastiff-f79998db196d-20220401-224517.chunk_001.mp4 --actions-path sample_data/snippy-chartreuse-mastiff-f79998db196d-20220401-224517.chunk_001.one_hot_actions.pt

Traceback (most recent call last):
File "D:\text\open-oasis\generate_origin.py", line 199, in <module>
main(args)
File "D:\text\open-oasis\generate_origin.py", line 76, in main
x = vae.encode(x * 2 - 1).mean * scaling_factor
File "D:\text\open-oasis\vae.py", line 283, in encode
x = self.patch_embed(x)
File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "D:\text\open-oasis\dit.py", line 65, in forward
x = self.proj(x)
File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\conv.py", line 554, in forward
return self._conv_forward(input, self.weight, self.bias)
File "D:\text\open-oasis\ENV_DIR\lib\site-packages\torch\nn\modules\conv.py", line 549, in _conv_forward
return F.conv2d(
RuntimeError: Given groups=1, weight of size [1024, 3, 20, 20], expected input[1, 360, 360, 640] to have 3 channels, but got 360 channels instead

CIntellifusion · 2025-02-15T07:55:20Z

I got the same issue. Did you solve this? Thanks.

CIntellifusion · 2025-02-15T08:09:47Z

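Update: I think I have solved this issue by permuting the video axes. read_video returns frames as (T, H, W, C), but the rest of the pipeline expects (T, C, H, W), which is why the conv layer sees 360 "channels". Paste the function below over load_prompt in utils.py: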

def load_prompt(path, video_offset=None, n_prompt_frames=1):
    if path.lower().split(".")[-1] in IMAGE_EXTENSIONS:
        print("prompt is image; ignoring video_offset and n_prompt_frames")
        prompt = read_image(path)
        # add frame dimension
        prompt = rearrange(prompt, "c h w -> 1 c h w")  # torch.Size([1, 3, 360, 640])
    elif path.lower().split(".")[-1] in VIDEO_EXTENSIONS:
        prompt = read_video(path, pts_unit="sec")[0]
        if video_offset is not None:
            prompt = prompt[video_offset:]
        prompt = prompt[:n_prompt_frames]  # torch.Size([n_frames, 360, 640, 3])
        # read_video returns (T, H, W, C); move channels ahead of height and width
        prompt = prompt.permute(0, 3, 1, 2)  # torch.Size([n_frames, 3, 360, 640])
    else:
        raise ValueError(f"unrecognized prompt file extension; expected one in {IMAGE_EXTENSIONS} or {VIDEO_EXTENSIONS}")
    assert prompt.shape[0] == n_prompt_frames, f"input prompt {path} had less than n_prompt_frames={n_prompt_frames} frames"
    prompt = resize(prompt, (360, 640))
    # add batch dimension
    prompt = rearrange(prompt, "t c h w -> 1 t c h w")
    prompt = prompt.float() / 255.0
    return prompt
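
To check the axis fix in isolation, here is a minimal sketch that does not need the repo or a video file. It assumes only what the traceback shows: frames arrive as (T, H, W, C) and the first conv layer expects 3 input channels with a 20x20 kernel. The helper name `thwc_to_tchw`, the zero-filled stand-in frames, the stride of 20, and the 8 output channels are all made up for illustration and are not taken from open-oasis.

```python
import torch


def thwc_to_tchw(frames: torch.Tensor) -> torch.Tensor:
    """Reorder video frames from (T, H, W, C) to (T, C, H, W).

    Hypothetical helper for illustration; not part of open-oasis.
    """
    if frames.ndim != 4:
        raise ValueError(f"expected a 4-D tensor, got shape {tuple(frames.shape)}")
    # permute only changes the view, so make the result contiguous in memory
    return frames.permute(0, 3, 1, 2).contiguous()


# Stand-in for decoded video: 2 RGB frames of 360x640, uint8, channels last
frames = torch.zeros(2, 360, 640, 3, dtype=torch.uint8)
tchw = thwc_to_tchw(frames)

# Assumed stand-in for the patch embedding: 3 input channels and a 20x20
# kernel as in the error message; stride and output width are guesses
proj = torch.nn.Conv2d(3, 8, kernel_size=20, stride=20)
out = proj(tchw.float() / 255.0)

print(tchw.shape)  # torch.Size([2, 3, 360, 640])
print(out.shape)   # torch.Size([2, 8, 18, 32])
```

Passing the un-permuted `frames` to `proj` reproduces the same kind of channel-mismatch RuntimeError as in the traceback. As a possible alternative to the manual permute, recent torchvision versions accept `output_format="TCHW"` in `read_video`; I have not tested that with this repo.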
