Troubleshooting

  1. Out of memory during sampling.

    • Possible reason:

      • Too many high-resolution frames are decoded in parallel. The default setting requires approximately 66 GB of peak VRAM.
    • Try this:

      • Reduce the number of jointly decoded frames, en_and_decode_n_samples_a_time, in inference/vista.yaml.
  2. Loading gets stuck at FrozenCLIPEmbedder or FrozenOpenCLIPImageEmbedder.

  3. The shapes of linear layers cannot be multiplied at the cross-attention layers.

    • Possible reason:

      • The dimension of cross-attention is not expanded when the action conditions are injected, resulting in a shape mismatch.
    • Try this:

      • Enable action_control: True in the YAML config file.
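For item 1, the effect of lowering en_and_decode_n_samples_a_time can be pictured as decoding the latent frames in smaller batches, so that fewer frames are held by the decoder at once. The following is a minimal sketch of that idea in plain Python, not the repository's actual decoding code: `decode_in_chunks`, `fake_decode`, and the frame count of 25 are hypothetical names and values chosen for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

L = TypeVar("L")  # a latent frame (stand-in type)
F = TypeVar("F")  # a decoded frame (stand-in type)


def decode_in_chunks(
    latents: Sequence[L],
    decode_fn: Callable[[Sequence[L]], List[F]],
    n_samples_a_time: int,
) -> List[F]:
    """Decode `latents` in batches of at most `n_samples_a_time` frames.

    Illustrative only: smaller batches mean fewer frames are decoded
    jointly, which is what bounds the peak memory of the decoder.
    """
    if n_samples_a_time < 1:
        raise ValueError("n_samples_a_time must be at least 1")
    frames: List[F] = []
    for start in range(0, len(latents), n_samples_a_time):
        chunk = latents[start:start + n_samples_a_time]
        frames.extend(decode_fn(chunk))
    return frames


# Hypothetical stand-in decoder that records the size of each batch it sees.
batch_sizes: List[int] = []


def fake_decode(chunk: Sequence[int]) -> List[int]:
    batch_sizes.append(len(chunk))
    return [x * 2 for x in chunk]


latents = list(range(25))  # hypothetical number of frames
decoded = decode_in_chunks(latents, fake_decode, n_samples_a_time=5)
largest_batch = max(batch_sizes)  # the decoder never sees more than 5 frames
```

The decoded result is the same regardless of the batch size; only the number of frames processed together changes, at the cost of more decoder calls.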

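For item 3, the mismatch can be reproduced with a simple shape check: a linear projection only accepts inputs whose last dimension equals its expected input width. The sketch below assumes, as a simplification, that injecting action conditions widens the cross-attention context and that action_control widens the projection to match. The function names and the sizes 1024, 128, and 320 are hypothetical, not values taken from the repository.

```python
from typing import Tuple


def linear_output_shape(
    input_shape: Tuple[int, ...], in_features: int, out_features: int
) -> Tuple[int, ...]:
    """Return the output shape of a linear layer, or raise on a mismatch."""
    if input_shape[-1] != in_features:
        raise ValueError(
            f"shapes cannot be multiplied: input {input_shape} "
            f"vs. weight ({in_features} x {out_features})"
        )
    return (*input_shape[:-1], out_features)


def projection_in_features(
    context_dim: int, action_dim: int, action_control: bool
) -> int:
    # Assumption: enabling action control expands the expected context width.
    return context_dim + action_dim if action_control else context_dim


# Hypothetical sizes, for illustration only.
CONTEXT_DIM, ACTION_DIM, INNER_DIM = 1024, 128, 320

# Context after action conditions have been injected (batch, tokens, width).
context_shape = (2, 1, CONTEXT_DIM + ACTION_DIM)

# With action_control enabled, the projection matches the widened context.
ok_shape = linear_output_shape(
    context_shape,
    projection_in_features(CONTEXT_DIM, ACTION_DIM, action_control=True),
    INNER_DIM,
)

# With action_control disabled, the same context triggers the mismatch.
try:
    linear_output_shape(
        context_shape,
        projection_in_features(CONTEXT_DIM, ACTION_DIM, action_control=False),
        INNER_DIM,
    )
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```
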
<= Previous: [Sampling]