
Pipeline image-to-text task and Bitsandbytes error #24834

Closed
3 of 4 tasks
mediocreatmybest opened this issue Jul 15, 2023 · 8 comments · Fixed by #24947

@mediocreatmybest

System Info

Python 3.10.6
Transformers 4.30.0
Bitsandbytes 0.39.1

Windows / Linux

Who can help?

@NAR

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Using a 4-bit or 8-bit quantised model, such as:

https://huggingface.co/Mediocreatmybest/blip2-opt-2.7b_8bit

Expected behavior

The pipeline's image processor should detect that the model is loaded as a 4-bit or 8-bit bitsandbytes model and cast its inputs to the matching dtype.

I apologise if this should be a feature request rather than a bug report; I couldn't find any examples of what I was trying to do.

When running through the pipeline examples from the Hugging Face website, if I try using an 8-bit model, the model seems to be detected correctly and cast to 8-bit, but the processor doesn't follow suit and runs at its default precision, throwing an error that the input and the weights should be the same floating-point type.

I've uploaded a few models set at 8-bit to save on size and memory, as BLIP-2 is pretty heavy and using it on consumer devices is obviously challenging.

The models I’ve uploaded to HuggingFace are:

Mediocreatmybest/blip2-opt-2.7b_8bit
Mediocreatmybest/blip2-opt-6.7b_8bit
Mediocreatmybest/blip2-flan-t5-xxl_8bit

I can get them working with the regular, non-pipeline methods (roughly as in the sketch below), but as I'm a beginner it's obviously challenging. Thanks again for all the great work!
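
For reference, a minimal sketch of the non-pipeline path, assuming the standard Salesforce/blip2-opt-2.7b checkpoint as an example (the explicit cast of the processor output to torch.float16 is the manual step the pipeline appears to be missing):

# pip install accelerate bitsandbytes
import requests
import torch
from PIL import Image
from transformers import Blip2ForConditionalGeneration, Blip2Processor

model_id = "Salesforce/blip2-opt-2.7b"
processor = Blip2Processor.from_pretrained(model_id)
model = Blip2ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto", load_in_8bit=True, torch_dtype=torch.float16
)

url = "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"
image = Image.open(requests.get(url, stream=True).raw)

# The processor returns float32 pixel values by default; cast them to the
# model's half-precision compute dtype to avoid the dtype-mismatch error.
inputs = processor(images=image, return_tensors="pt").to(model.device, torch.float16)
generated_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())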

@mediocreatmybest
Author

Based on this document, it should be possible, but maybe this is just an issue with multimodal or image processors with pipeline?

https://huggingface.co/docs/transformers/main/pipeline_tutorial

# pip install accelerate bitsandbytes
import torch
from transformers import pipeline

pipe = pipeline(model="facebook/opt-1.3b", device_map="auto", model_kwargs={"load_in_8bit": True})
output = pipe("This is a cool example!", do_sample=True, top_p=0.95)

@mediocreatmybest
Author

Also, I created a huggingface.co Space using pipeline with the option to try loading in 8-bit (which, as expected, errors):

https://huggingface.co/spaces/Mediocreatmybest/PipelineImageCaption

Thanks.

@mediocreatmybest
Author

Adding the stack trace from Google Colab.


RuntimeError Traceback (most recent call last)
in <cell line: 19>()
17 captioner
18 # caption
---> 19 caption = captioner(image)[0]['generated_text']
20 print(caption)

16 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
457 weight, bias, self.stride,
458 _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
460 self.padding, self.dilation, self.groups)
461

RuntimeError: Input type (float) and bias type (c10::Half) should be the same
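
For what it's worth, the failure is a plain float32-input-into-float16-layer mismatch. A minimal, hypothetical illustration (not taken from the issue; the exact message can vary by device and PyTorch version):

import torch

# A float16 conv layer, standing in for the model's half-precision vision tower.
conv = torch.nn.Conv2d(3, 8, kernel_size=3).half()
# A float32 input, standing in for the image processor's default output.
x = torch.randn(1, 3, 32, 32)

try:
    conv(x)
except RuntimeError as e:
    print(e)  # e.g. "Input type (float) and bias type (c10::Half) should be the same"

# Casting the input to the layer's dtype, conv(x.half()), is what resolves it.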

@mediocreatmybest mediocreatmybest changed the title Pipeline + Bitsandbytes Pipeline task image-to-text task and Bitsandbytes error Jul 18, 2023
@mediocreatmybest mediocreatmybest changed the title Pipeline task image-to-text task and Bitsandbytes error Pipeline image-to-text task and Bitsandbytes error Jul 18, 2023
@sgugger
Collaborator

sgugger commented Jul 18, 2023

cc @younesbelkada

@younesbelkada
Contributor

younesbelkada commented Jul 18, 2023

Hi @mediocreatmybest
Thanks for the issue. It seems the input image needs to be converted to half precision (torch.float16). Can you share a small, handy, reproducible snippet that leads to your bug?

@mediocreatmybest
Author

mediocreatmybest commented Jul 18, 2023

Thanks for the fast response!

The snippet I was using to test on Google Colab and on my personal device was:

from transformers import pipeline
import torch

image = "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"
model = "Salesforce/blip-image-captioning-base"

# load the model in 8-bit via bitsandbytes
model_kwargs = {"load_in_8bit": True, "torch_dtype": torch.float16}
captioner = pipeline(
    task="image-to-text",
    model=model,
    max_new_tokens=30,
    model_kwargs=model_kwargs,
    use_fast=True,
)

# caption
caption = captioner(image)[0]['generated_text']
print(caption)

(Copied and pasted from my mobile device, so hopefully this is formatted correctly.)

Thanks 🙏
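
As a possible interim workaround, something like the following hypothetical, untested sketch might get past the error by wrapping the pipeline's preprocess step so the float32 pixel values are cast to half precision before they reach the model (the "pixel_values" key is assumed from the image-to-text preprocessing):

import torch
from transformers import pipeline

captioner = pipeline(
    task="image-to-text",
    model="Salesforce/blip-image-captioning-base",
    model_kwargs={"load_in_8bit": True, "torch_dtype": torch.float16},
)

# Wrap preprocess so the float32 pixel values it produces are cast to the
# model's half-precision dtype before the forward pass.
_orig_preprocess = captioner.preprocess

def _patched_preprocess(inputs, **kwargs):
    model_inputs = _orig_preprocess(inputs, **kwargs)
    model_inputs["pixel_values"] = model_inputs["pixel_values"].to(torch.float16)
    return model_inputs

captioner.preprocess = _patched_preprocess

image = "https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png"
print(captioner(image)[0]['generated_text'])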

@JimAllanson
Contributor

I encountered similar errors while using Blip/Blip2/Git models in an image_to_text pipeline. In my case, I was working with float16 instead of 8bit precision, as under my setup I was encountering additional issues with 8bit. I think there's a very good chance that the fix I've made in #24947 might also fix your issue (for the three models I've implemented the fix for). If you're able to give it a try I'd be interested in hearing if it fixes your issue too.

@mediocreatmybest
Author

> I encountered similar errors while using Blip/Blip2/Git models in an image_to_text pipeline. In my case, I was working with float16 instead of 8bit precision, as under my setup I was encountering additional issues with 8bit. I think there's a very good chance that the fix I've made in #24947 might also fix your issue (for the three models I've implemented the fix for). If you're able to give it a try I'd be interested in hearing if it fixes your issue too.

Thanks @JimAllanson, happy to help test, but I'm pretty new to Python. What is the best way to test this for you? Editing the site-packages files with the change?
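
(If it helps, one common way to test an open PR without hand-editing site-packages is to install straight from the PR ref in a fresh environment, e.g. `pip install git+https://github.com/huggingface/transformers.git@refs/pull/24947/head`, assuming your pip version can fetch GitHub PR refs.)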
