Commit fa92188: "Clean code"

1 parent 4bf3fb4

12 files changed: +224 −186 lines

INSTALLATION.md (+9 −3)

@@ -25,6 +25,12 @@
 
 - Install `flash-attn==2.3.6`:
 
+```bash
+pip install flash-attn==2.3.6 --no-build-isolation
+```
+
+Alternatively, you can compile it from source:
+
 ```bash
 git clone https://github.com/Dao-AILab/flash-attention.git
 cd flash-attention
@@ -35,9 +41,9 @@
 - Install `timm==0.9.12` and `mmcv-full==1.6.2`:
 
 ```bash
-pip install -U openmim
 pip install timm==0.9.12
-mim install mmcv-full==1.6.2
+pip install -U openmim
+mim install mmcv-full==1.6.2 # (optional, for mmsegmentation)
 ```
 
 - Install `transformers==4.36.2`:
@@ -62,6 +68,6 @@
 
 ```bash
 pip install opencv-python termcolor yacs pyyaml scipy
-pip install deepspeed==0.10.0
+pip install deepspeed==0.13.5
 pip install pycocoevalcap tqdm
 ```
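The pinned versions above (`flash-attn==2.3.6`, `timm==0.9.12`, `transformers==4.36.2`, `deepspeed==0.13.5`) tend to drift silently when an environment is rebuilt. A minimal pre-flight sketch, assuming only the standard library's `importlib.metadata` (this helper is hypothetical, not part of the repo):

```python
# Hypothetical pre-flight check (not repo code): compare installed package
# versions against the pins from INSTALLATION.md before launching training.
from importlib import metadata

PINS = {
    "flash-attn": "2.3.6",
    "timm": "0.9.12",
    "transformers": "4.36.2",
    "deepspeed": "0.13.5",
}

def check_pins(pins, get_version=metadata.version):
    """Return (package, expected, found) triples for every mismatch.

    `found` is None when the package is not installed at all.
    """
    mismatches = []
    for package, expected in pins.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches.append((package, expected, found))
    return mismatches

if __name__ == "__main__":
    for package, expected, found in check_pins(PINS):
        print(f"{package}: expected {expected}, found {found}")
```

Injecting `get_version` keeps the check testable without the packages actually installed; in practice the default `importlib.metadata.version` is used.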

README.md (+50 −53)

@@ -14,7 +14,6 @@
 - `2024/01/24`: InternVL-Chat-V1.1 is released; it supports Chinese and has stronger OCR capability. See [here](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) or try our [demo](https://internvl.opengvlab.com/).
 - `2024/01/16`: We release our [customized mmcv/mmsegmentation/mmdetection code](https://github.com/OpenGVLab/InternVL-MMDetSeg), integrated with DeepSpeed, which can be used to train large-scale object detection and semantic segmentation models.
 
-
 ## Compared with SOTA VLLMs
 
 <img width="1229" alt="image" src="https://github.com/OpenGVLab/InternVL/assets/23737120/e9065a58-86fa-47ef-be9a-eb734532e73f">
@@ -29,26 +28,25 @@ InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
 
 **Vision Large Language Model**
 
 | Model                   | Date       | Download                                                                             | Note |
 | ----------------------- | ---------- | ------------------------------------------------------------------------------------ | ---- |
 | InternVL−Chat−V1.5      | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)                     | supports 4K images; super strong OCR; approaches the performance of GPT-4V and Gemini Pro on benchmarks such as MMMU, DocVQA, ChartQA, and MathVista (🔥 new) |
 | InternVL−Chat−V1.2−Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus)                | more SFT data; stronger overall |
 | InternVL−Chat−V1.2      | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2)                     | scales the LLM up to 34B |
 | InternVL−Chat−V1.1      | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1)                     | supports Chinese; stronger OCR |
 | InternVL−Chat−19B−448px | 2024.02.03 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px)  | 448 resolution |
 | InternVL−Chat−19B       | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B)        | English multimodal dialogue |
 | InternVL−Chat−13B       | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B)         | English multimodal dialogue |
-
 
 **Vision-Language Foundation Model**
 
 | Model                   | Date       | Download                                                                | Note |
 | ----------------------- | ---------- | ----------------------------------------------------------------------- | ---- |
 | InternViT−6B−448px−V1.5 | 2024.04.20 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5)  | supports dynamic resolution; super strong OCR (🔥 new) |
 | InternViT−6B−448px−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2)  | 448 resolution |
 | InternViT−6B−448px−V1.0 | 2024.01.30 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0)  | 448 resolution |
 | InternViT−6B−224px      | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-224px)       | vision foundation model |
 | InternVL−14B−224px      | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-14B-224px)       | vision-language foundation model |
 
 ## What can InternVL do?
 
@@ -578,47 +576,46 @@ response = model.chat(tokenizer, pixel_values, question, generation_config)
 <summary>Launch a local chat demo (click to expand)</summary>
 
 **Launch a controller**
 
 ```shell
 # run the command in the `internvl_chat_llava` folder
 python -m llava.serve.controller --host 0.0.0.0 --port 10000
 ```
 
 **Launch a gradio web server**
 
 ```shell
 # run the command in the `internvl_chat_llava` folder
 python -m llava.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload
 ```
 
 **Launch a model worker**
 
 ```shell
 # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
 # run the command in the `internvl_chat_llava` folder
 python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
 
 # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
 # run the command in the `internvl_chat_llava` folder
 python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40001 --worker http://localhost:40001 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
 
 # OpenGVLab/InternVL-Chat-V1-1
 # run the command in the `internvl_chat` folder
 python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40002 --worker http://localhost:40002 --model-path OpenGVLab/InternVL-Chat-V1-1
 
 # OpenGVLab/InternVL-Chat-V1-2
 # run the command in the `internvl_chat` folder
 python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40003 --worker http://localhost:40003 --model-path OpenGVLab/InternVL-Chat-V1-2
 
 # OpenGVLab/InternVL-Chat-V1-2-Plus
 # run the command in the `internvl_chat` folder
 python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40004 --worker http://localhost:40004 --model-path OpenGVLab/InternVL-Chat-V1-2-Plus
 
 # OpenGVLab/InternVL-Chat-V1-5
 # run the command in the `internvl_chat` folder
 python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40005 --worker http://localhost:40005 --model-path OpenGVLab/InternVL-Chat-V1-5
 ```
 
 </details>
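The six model-worker invocations in this hunk differ only in the serving package, the port, and the model path, so they can be generated from a table instead of copied by hand. A minimal sketch (a hypothetical helper, not part of the repo):

```python
# Hypothetical helper (not repo code): generate the model-worker launch
# commands from a (package, port, model-path) table. Each worker registers
# itself with the controller and listens on its own port.
WORKERS = [
    ("llava", 40000, "OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B"),
    ("llava", 40001, "OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B"),
    ("internvl", 40002, "OpenGVLab/InternVL-Chat-V1-1"),
    ("internvl", 40003, "OpenGVLab/InternVL-Chat-V1-2"),
    ("internvl", 40004, "OpenGVLab/InternVL-Chat-V1-2-Plus"),
    ("internvl", 40005, "OpenGVLab/InternVL-Chat-V1-5"),
]

def worker_cmd(package, port, model_path, controller="http://localhost:10000"):
    """Build the launch command for one model worker."""
    return (
        f"python -m {package}.serve.model_worker --host 0.0.0.0 "
        f"--controller {controller} --port {port} "
        f"--worker http://localhost:{port} --model-path {model_path}"
    )

for package, port, model_path in WORKERS:
    print(worker_cmd(package, port, model_path))
```

Note the package split mirrors the comments above: the two Vicuna-based checkpoints are served from `internvl_chat_llava` via `llava.serve`, while the V1-x checkpoints are served from `internvl_chat` via `internvl.serve`.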