RuntimeError: u must have shape (batch_size, H, L) #26

Open
kaansancak opened this issue Jun 12, 2023 · 3 comments

@kaansancak

Hello,
I am trying to run the benchmark here (benchmarks/runtime_hyena_flashmha.py) with fused_fft_conv enabled, but I am getting a RuntimeError: u must have shape (batch_size, H, L) error. In this case the shape of u is [1, 1, 768, 1, 2048], i.e. five dimensions instead of the expected three (batch_size, H, L). Normally fftconv handles the last dimension itself, but here the shape check fails.

Log:

Traceback (most recent call last):
  File "/localscratch/safari/benchmarks/runtime_hyena_flashmha.py", line 77, in <module>
    m, t = benchmark_forward(hyena, x, repeats=10, desc='', verbose=False)
  File "/localscratch/safari/benchmarks/runtime_hyena_flashmha.py", line 23, in benchmark_forward
    m = t.timeit(repeats)
  File "/opt/conda/envs/gps/lib/python3.9/site-packages/torch/utils/benchmark/utils/timer.py", line 266, in timeit
    self._timeit(number=max(int(number // 100), 2))
  File "/opt/conda/envs/gps/lib/python3.9/site-packages/torch/utils/benchmark/utils/timer.py", line 256, in _timeit
    return max(self._timer.timeit(number), 1e-9)
  File "/opt/conda/envs/gps/lib/python3.9/timeit.py", line 177, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
  File "/opt/conda/envs/gps/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/localscratch/safari/src/models/sequence/hyena.py", line 361, in forward
    v = self.filter_fn(v, l_filter, k=k[o], bias=bias[o, None, :, None])
  File "/opt/conda/envs/gps/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/localscratch/safari/src/models/sequence/hyena.py", line 218, in forward
    y = fftconv_func(
  File "/localscratch/safari/src/ops/fftconv.py", line 102, in fftconv_func
    return FFTConvFunc.apply(u, k, D, dropout_mask, gelu, force_fp16_output,
  File "/localscratch/safari/src/ops/fftconv.py", line 79, in forward
    out = fftconv_fwd(u, k_f, D, v, head_dim, q, dropout_mask, gelu, False, False, fft_size, force_fp16_output, output_hbl_layout, fftfp16)
RuntimeError: u must have shape (batch_size, H, L)
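
For reference, a minimal sketch of the shape mismatch, using the dimension sizes from the report above (batch=1, heads=1, width=768, blocks=1, seq_len=2048) and the rearrange pattern from src/models/sequence/hyena.py:

import torch
from einops import rearrange

# What the fused fftconv kernel checks for: a 3-D (batch_size, H, L) tensor.
u = torch.randn(1, 768, 2048)

# The multi-head / multi-block rearrange in hyena.py turns it into a 5-D tensor,
# which is what reaches the shape check when fused_fft_conv is enabled.
uc = rearrange(u, 'b (ho v) (z l) -> b ho v z l', ho=1, z=1)
print(uc.shape)  # torch.Size([1, 1, 768, 1, 2048]) -> fails "u must have shape (batch_size, H, L)"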
@janEbert
Contributor

Hey, I think the fftconv code expects the model to have only a single head and a single block, while the model code has already integrated support for multiple heads and blocks (which then breaks fftconv, as you noticed). Also, at some point the code expects a transposed version of the input. You can patch src/models/sequence/hyena.py like this to get it running for now:

@@ -314,13 +314,13 @@ class HyenaOperator(nn.Module):
         
         uc = self.short_filter(u)[...,:l_filter] 
         
-        uc = rearrange(uc, 'b (ho v) (z l) -> b ho v z l', 
-            z=self.num_blocks, 
-            ho=self.num_heads, 
-            v=self.head_dim * (self.order + 1)
-        )
+        # uc = rearrange(uc, 'b (ho v) (z l) -> b ho v z l',
+        #     z=self.num_blocks,
+        #     ho=self.num_heads,
+        #     v=self.head_dim * (self.order + 1)
+        # )
 
-        *x, v = uc.split(self.d_model, dim=2)
+        *x, v = uc.split(self.d_model, dim=1)
         k = self.filter_fn.filter(l_filter)
         
         # `c` is always 1 by default
@@ -339,7 +339,7 @@ class HyenaOperator(nn.Module):
                 v = self.dropout(v * x_i)
 
             # the bias term is broadcasted. Last dimension (l) is handled by fftconv
-            v = self.filter_fn(v, l_filter, k=k[o], bias=bias[o, None, :, None])
+            v = self.filter_fn(v, l_filter, k=k[o], bias=bias[o])
             
             if self.post_order_ffn: 
                 w = self.ord_proj_w[o]
@@ -347,7 +347,10 @@ class HyenaOperator(nn.Module):
                     rearrange(w, 'h1 h2 -> 1 h1 h2 1 1 1'), rearrange(v, 'b h v z l -> b h 1 v z l')
                 )
 
-        y = self.activation(rearrange(v * x[0], 'b h v z l -> b (z l) (h v)', z=self.num_blocks, h=self.num_heads))
+        y = self.activation(
+            (v * x[0]).transpose(-2, -1),
+            # rearrange(v * x[0], 'b h v z l -> b (z l) (h v)', z=self.num_blocks, h=self.num_heads)
+        )
         y = self.out_proj(y)
         
         if self.return_state:
@@ -356,4 +359,4 @@ class HyenaOperator(nn.Module):
 
     @property
     def d_output(self):
-        return self.d_model
\ No newline at end of file
+        return self.d_model
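
If you would rather keep the rearrange, another (untested) workaround for the common num_heads == num_blocks == 1 case is to collapse the singleton head/block dimensions only around the filter call and restore them afterwards. A rough sketch, where call_filter_3d is a hypothetical helper rather than anything in the repo:

from einops import rearrange

def call_filter_3d(filter_fn, v, l_filter, k, bias):
    # Collapse the 5-D (b, ho, d, z, l) layout down to the 3-D (b, d, l) layout
    # the fused fftconv kernel accepts, run the filter, then restore the layout.
    b, ho, d, z, l = v.shape
    assert ho == 1 and z == 1, "sketch only covers the single-head, single-block case"
    v3 = rearrange(v, 'b 1 d 1 l -> b d l')
    y3 = filter_fn(v3, l_filter, k=k, bias=bias)  # bias indexed as in the patch above, i.e. bias[o]
    return rearrange(y3, 'b d l -> b 1 d 1 l')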

@xiaobo-guo

Hi, is there any update on multi-head support for fftconv?

@DanFu09
Contributor

DanFu09 commented Sep 26, 2023

The module already supports multi-head - you can find an example in the H3 code: https://github.com/HazyResearch/safari/blob/main/src/models/sequence/h3.py#L160

In H3, the three branches (what Hyena calls x[0], x[1], and v) are named q, k, and v.

Passing in head_dim > 1 will trigger multi-head support:

y = fftconv_func(k, ssm_kernel, self.D,
                 dropout_mask, False, torch.is_autocast_enabled(), True,
                 v, self.head_dim, q)
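
For a quick sanity check against the fused kernel, note that the single-head (head_dim = 1) path is a standard FFT-based long convolution plus a per-channel skip term. A minimal pure-PyTorch reference sketch (my own, not the fused CUDA kernel, and ignoring the dropout/GELU/fp16 options):

import torch

def fftconv_ref(u, k, D):
    # u: (batch, H, L) input, k: (H, L) long convolution kernel, D: (H,) skip term
    L = u.shape[-1]
    fft_size = 2 * L  # zero-pad so the circular FFT convolution acts like a linear one
    u_f = torch.fft.rfft(u.float(), n=fft_size)
    k_f = torch.fft.rfft(k.float(), n=fft_size)
    y = torch.fft.irfft(u_f * k_f, n=fft_size)[..., :L]
    return (y + u * D.unsqueeze(-1)).to(u.dtype)

# e.g. u = torch.randn(1, 768, 2048); k = torch.randn(768, 2048); D = torch.randn(768)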

guyjacob added a commit to guyjacob/NeMo that referenced this issue Jan 25, 2024
guyjacob added a commit to guyjacob/NeMo that referenced this issue Feb 6, 2024
guyjacob added a commit to guyjacob/NeMo that referenced this issue Feb 21, 2024
guyjacob added a commit to guyjacob/NeMo that referenced this issue May 7, 2024
guyjacob added a commit to guyjacob/NeMo that referenced this issue May 16, 2024
guyjacob added a commit to guyjacob/NeMo that referenced this issue May 16, 2024
ericharper pushed a commit to NVIDIA/NeMo that referenced this issue Jun 13, 2024
* Initial reference code commit, unchanged
* Hyena code changes for NeMo compatibility
* MCore spec override functionality + example config w. hyena
* Additional changes - now working on char-level TinyShakespeare
  * Add missing input LayerNorm to spec (in the default attention spec it's fused with the projection Linear layer, so not explicitly defined)
  * Shape conversion at start and end of Hyena forward
* Add fftconv cuda impl from safari
* Workaround for shape error in fftconv (see: HazyResearch/safari#26 (comment))
* Explicitly convert kernel to FP32 (torch.fft doesn't support bf16)
* Working run configs
* Remove sharded_state_dict from HyenaOperator (made redundant by the default implementation in Megatron)
* Update configs
* Testing TE Linear classes in HyenaOperator
* Revert to FusedDense for in/out projections after merging with 24.01.01
* Fix bug (use fused LNorm+Linear), bring back TE layers
* Configs rename + cleanup
* FlashFFTConv, multi-head, some cleanup
* Bug fix - init FlashFFTConv with 2*seq_len
* ModuleSpec + replace nn.Conv1d with causal_conv1d
* Remove unneeded arguments
* More cleanup, remove fftconv ref functions
* Refactor HyenaFilter + more cleanup, in the spirit of the implementation in the MAD-Lab repo: https://github.com/athms/mad-lab/blob/main/mad/model/layers/hyena.py
* Add missing attributions
* Remove fftconv sources
* Bug fixes
* Remove d_model from external API, take from TransformerConfig
* Cleanup config
* Remove spec override logic (possibly push separately)
* Add tests
* Keep only megatron_gpt_config_hyena (w. 153m parameters)
* Black + isort formatting changes
* Fixes following PR review
  * Clearer names + more documentation for config params
  * Clearer README
  * Check seq len < 8K with safari-fftconv
  * Avoid 0*bias op during forward
* Fix tests following param name changes

Signed-off-by: Guy Jacob <guyj@nvidia.com>
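
(Aside on the "Explicitly convert kernel to FP32" item above: torch.fft does not accept bfloat16 inputs, so the kernel is upcast before the FFT. A minimal illustration with hypothetical tensor names:)

import torch

k = torch.randn(768, 2048, dtype=torch.bfloat16)    # bf16 Hyena filter weights
k_f = torch.fft.rfft(k.float(), n=2 * k.shape[-1])  # upcast to fp32 before torch.fft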
JesusPaz pushed a commit to JesusPaz/NeMo that referenced this issue Jun 18, 2024
rohitrango pushed a commit to rohitrango/NeMo that referenced this issue Jun 25, 2024