
What're the modifications in llava/train/transformers_replace? #24

Open
ys-zong opened this issue Apr 4, 2024 · 7 comments

ys-zong commented Apr 4, 2024

Hi, thanks for the nice work! What are the main modifications in llava/train/transformers_replace compared to the original implementation in transformers==4.31.0, as specified in pyproject.toml? Also, environment_setup.sh installs transformers==4.36.2:

pip install git+https://github.com/huggingface/transformers@v4.36.2

Why do we want to install two different versions of transformers?

If I want to use a higher version of transformers, e.g. 4.38, are there changes needed for the files in this folder? Many thanks!

Lyken17 (Collaborator) commented Apr 15, 2024

We have manually changed some of the original implementations to better support the grouping strategy and flash attention, and we recommend that every VILA user apply these changes.

Although our codebase should work with higher versions of transformers, we haven't tested them thoroughly and cannot promise anything. Please use v4.36.2 for reproduction.
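
(For context, and not part of VILA's actual patch: stock transformers 4.36+ already exposes FlashAttention-2 through the attn_implementation argument. A minimal sketch of the vanilla API, with a placeholder checkpoint name:)

import torch
from transformers import AutoModelForCausalLM

# Vanilla transformers >= 4.36 way to request FlashAttention-2; VILA's
# transformers_replace files patch the modeling code itself instead.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",                 # placeholder checkpoint
    torch_dtype=torch.bfloat16,                 # flash attention needs fp16/bf16
    attn_implementation="flash_attention_2",
)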

wusize commented Apr 16, 2024

I found "transformers_version": "4.38.1" in the config.json of VILA-2.7B. Which version of transformers should we use to run VILA-2.7B? By the way, is there any plan to allow loading VILA models with the transformers API alone, without relying on the current repo?

@weichow23

I noticed that you mainly re-implemented class LlamaForCausalLM(LlamaPreTrainedModel) in transformers_replace. However, in the code (either train or infer), your implemented class is not used. Why is that?
Thanks for your reply.

Lyken17 pushed a commit that referenced this issue Jul 3, 2024
remove tf utils to avoid import error

amitbcp commented Jul 22, 2024

@Lyken17: Could you please share some details on this?

Lyken17 (Collaborator) commented Aug 1, 2024

Please use 4.36.2 for now; we will upgrade to 4.37 in the next release :)

JBurtn commented Aug 8, 2024

Also, why manually replace the files? Why not use monkey-patching, or re-register your versions with transformers using exist_ok=True?
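
(A minimal sketch of the registration route suggested above, assuming transformers 4.36.2; PatchedLlamaForCausalLM is a hypothetical stand-in for the modified model, not a class from this repo:)

from transformers import AutoConfig, AutoModelForCausalLM, LlamaConfig, LlamaForCausalLM

class PatchedLlamaForCausalLM(LlamaForCausalLM):
    """Hypothetical subclass carrying the grouping / flash-attn changes."""

# exist_ok=True overrides the mappings that transformers ships with, so a
# "llama" checkpoint loaded via AutoModelForCausalLM resolves to the patched class.
AutoConfig.register("llama", LlamaConfig, exist_ok=True)
AutoModelForCausalLM.register(LlamaConfig, PatchedLlamaForCausalLM, exist_ok=True)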

@collinmccarthy

I'm also interested in understanding this better, as I'm trying to combine a few things from VILA and LLaVA-NeXt, and this makes me concerned that something might break in an unexpected way (e.g., run but give poor results).

In environment_setup.sh, if we use transformers v4.37.2 but skip the following, what are the repercussions?

# What happens if we skip this?
cp -rv ./llava/train/transformers_replace/* $site_pkg_path/transformers/
cp -rv ./llava/train/deepspeed_replace/* $site_pkg_path/deepspeed/

Are there PRs for merging these into the respective libraries, or do they break other things and thus can't be merged?
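
(For reference, a rough Python sketch of what those two cp lines amount to for transformers, assuming the overlay directory simply mirrors the package layout; this is an illustration of the overlay step, not the project's setup script:)

import shutil
from pathlib import Path

import transformers

site_transformers = Path(transformers.__file__).parent   # installed package dir
overlay = Path("llava/train/transformers_replace")        # VILA's patched files

# Copy each patched file over the stock one; skipping this step leaves the
# unmodified transformers modeling code in place.
for src in overlay.rglob("*"):
    if src.is_file():
        dst = site_transformers / src.relative_to(overlay)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)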

gheinrich pushed a commit to gheinrich/VILA that referenced this issue Dec 16, 2024