
Can't seem to run GPT-J in CPU mode: "LayerNormKernelImpl" not implemented for 'Half' #16378

Closed
monsieurpooh opened this issue Mar 23, 2022 · 4 comments

@monsieurpooh

Environment info

  • transformers version: 4.15.0
  • Platform: Windows-10-10.0.19041-SP0
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.10.2+cu113 (True)
  • Tensorflow version (GPU?): 2.5.1 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help

@patrickvonplaten @Narsil

Information

Model I am using: KoboldAI/GPT-J-6B-Adventure

The problem arises when using:

  • A simple script that loads the model via GPTJForCausalLM.from_pretrained, builds input_ids = tokenizer(prompt, return_tensors="pt").input_ids, and calls model.generate(), without using anything CUDA-related.

The task I am working on is:

  • Running GPT-J in CPU mode for calibration purposes in my game AI Roguelite (I am willing to wait a long time, since this is a calibration preprocessing task rather than a real-time task).

To reproduce

Steps to reproduce the behavior:

  1. Run generate.py for GPT-J in CPU-only mode
  2. Observe the error: "LayerNormKernelImpl" not implemented for 'Half'

Expected behavior

Generation runs on CPU without that error.

@patil-suraj
Contributor

Hey @monsieurpooh, this is because the model was saved in fp16, as you can see here: https://huggingface.co/KoboldAI/GPT-J-6B-Adventure/blob/main/config.json#L34

You can pass the torch_dtype argument to from_pretrained to convert the weights to fp32 for CPU.

model = GPTJForCausalLM.from_pretrained("KoboldAI/GPT-J-6B-Adventure", torch_dtype=torch.float32)
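
A minimal end-to-end sketch of that suggestion for CPU-only generation (the prompt and max_length below are illustrative):

import torch
from transformers import GPTJForCausalLM, GPT2Tokenizer

# torch_dtype=torch.float32 upcasts the fp16 checkpoint at load time, so CPU
# kernels such as LayerNorm see a dtype they actually implement.
model = GPTJForCausalLM.from_pretrained("KoboldAI/GPT-J-6B-Adventure", torch_dtype=torch.float32)
tokenizer = GPT2Tokenizer.from_pretrained("KoboldAI/GPT-J-6B-Adventure")

input_ids = tokenizer("test prompt", return_tensors="pt").input_ids
output = model.generate(input_ids, max_length=40)
print(tokenizer.decode(output[0]))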

@monsieurpooh
Author

Thanks for the quick response; however, I tried your suggestion and got the same error.

Here's the minimal repro code:


from transformers import GPTJForCausalLM, GPT2Tokenizer
import torch

# Load a local copy of KoboldAI/GPT-J-6B-Adventure for CPU-only generation,
# requesting an fp32 upcast of the fp16 checkpoint.
model = GPTJForCausalLM.from_pretrained(
    "..\\gpt-neo-master\\saved_models_dir\\KoboldAI_GPT-J-6B-Adventure",
    low_cpu_mem_usage=True,
    torch_dtype=torch.float32,
)
tokenizer = GPT2Tokenizer.from_pretrained(
    "..\\gpt-neo-master\\saved_models_dir\\KoboldAI_GPT-J-6B-Adventure"
)

input_ids = tokenizer("test prompt", return_tensors="pt").input_ids
generated_outputs = model.generate(input_ids)

The output was:


C:\Max\gpt_calibration>python gpt-j-bug.py
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Traceback (most recent call last):
  File "gpt-j-bug.py", line 16, in <module>
    generated_outputs = model.generate(input_ids)
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\autograd\grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\generation_utils.py", line 1109, in generate
    return self.greedy_search(
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\generation_utils.py", line 1406, in greedy_search
    outputs = self(
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\models\gptj\modeling_gptj.py", line 786, in forward
    transformer_outputs = self.transformer(
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\models\gptj\modeling_gptj.py", line 640, in forward
    outputs = block(
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\models\gptj\modeling_gptj.py", line 279, in forward
    hidden_states = self.ln_1(hidden_states)
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\normalization.py", line 189, in forward
    return F.layer_norm(
  File "C:\Users\jerkm\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\functional.py", line 2347, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

@monsieurpooh
Author

Never mind. I removed the low_cpu_mem_usage arg, and it seems to be working now. Thanks again.
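
For anyone landing here later: the behavior in this thread suggests that, in this transformers version, torch_dtype is not applied when low_cpu_mem_usage=True is also passed, so the weights stay in fp16. A quick sanity check, assuming the model variable from the repro above with low_cpu_mem_usage removed:

print(model.dtype)  # expect torch.float32 once low_cpu_mem_usage is dropped

# Fallback (plain PyTorch): upcast an already-loaded fp16 model in place.
model = model.float()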

@tahercoolguy

@patil-suraj I have the same problem with the GPT-NeoX model. Is there a quick fix? (See the sketch below.)
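
The same dtype fix should carry over to any fp16 checkpoint, GPT-NeoX included. A minimal sketch using the Auto classes (the model id below is illustrative; substitute your own checkpoint):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-neox-20b"  # illustrative; use your fp16 checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# If the weights still report fp16 (e.g. low_cpu_mem_usage got in the way), upcast in place.
if model.dtype == torch.float16:
    model = model.float()

input_ids = tokenizer("test prompt", return_tensors="pt").input_ids
print(tokenizer.decode(model.generate(input_ids, max_length=20)[0]))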
