- `2024/01/24`: InternVL-Chat-V1.1 is released, it supports Chinese and has stronger OCR capability, see [here](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) or try our [demo](https://internvl.opengvlab.com/).
- `2024/01/16`: We release our [customized mmcv/mmsegmentation/mmdetection code](https://github.com/OpenGVLab/InternVL-MMDetSeg), integrated with DeepSpeed, which can be used for training large-scale object detection and semantic segmentation models.

-
## Compared with SOTA VLLMs

<img width="1229" alt="image" src="https://github.com/OpenGVLab/InternVL/assets/23737120/e9065a58-86fa-47ef-be9a-eb734532e73f">
@@ -29,26 +28,25 @@ InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.

**Vision Large Language Model**

- | Model | Date | Download | Note |
- | ----------------------- | ---------- | ------------------------------------------------------------------------------------ | ---------------------------------- |
- | InternVL−Chat−V1.5 | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5) | support 4K image; super strong OCR; Approaching the performance of GPT-4V and Gemini Pro on various benchmarks like MMMU, DocVQA, ChartQA, MathVista, etc. (🔥new)|
- | InternVL−Chat−V1.2−Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus) | more SFT data and stronger |
- | InternVL−Chat−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2) | scaling up LLM to 34B |
- | InternVL−Chat−V1.1 | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) | support Chinese and stronger OCR |
- | InternVL−Chat−19B−448px | 2024.02.03 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px) | 448 resolution |
- | InternVL−Chat−19B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B) | English multimodal dialogue |
- | InternVL−Chat−13B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B) | English multimodal dialogue |
-
+ | Model | Date | Download | Note |
+ | ----------------------- | ---------- | ------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+ | InternVL−Chat−V1.5 | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5) | support 4K image; super strong OCR; Approaching the performance of GPT-4V and Gemini Pro on various benchmarks like MMMU, DocVQA, ChartQA, MathVista, etc. (🔥new) |
+ | InternVL−Chat−V1.2−Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus) | more SFT data and stronger |
+ | InternVL−Chat−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2) | scaling up LLM to 34B |
+ | InternVL−Chat−V1.1 | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) | support Chinese and stronger OCR |
+ | InternVL−Chat−19B−448px | 2024.02.03 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px) | 448 resolution |
+ | InternVL−Chat−19B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B) | English multimodal dialogue |
+ | InternVL−Chat−13B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B) | English multimodal dialogue |

**Vision-Language Foundation Model**

- | Model | Date | Download | Note |
- | ----------------------- | ---------- | ---------------------------------------------------------------------- | -------------------------------- |
+ | Model | Date | Download | Note |
+ | ----------------------- | ---------- | ---------------------------------------------------------------------- | ---------------------------------------------------- |
| InternViT−6B−448px−V1.5 | 2024.04.20 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5) | support dynamic resolution, super strong OCR (🔥new) |
- | InternViT−6B−448px−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2) | 448 resolution |
- | InternViT−6B−448px−V1.0 | 2024.01.30 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0) | 448 resolution |
- | InternViT−6B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-224px) | vision foundation model |
- | InternVL−14B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-14B-224px) | vision-language foundation model |
+ | InternViT−6B−448px−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2) | 448 resolution |
+ | InternViT−6B−448px−V1.0 | 2024.01.30 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0) | 448 resolution |
+ | InternViT−6B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-224px) | vision foundation model |
+ | InternVL−14B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-14B-224px) | vision-language foundation model |
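
The Download links in the two tables point at Hugging Face repositories whose IDs do not always match the display names (for example, InternVL−Chat−19B links to `InternVL-Chat-ViT-6B-Vicuna-13B`). A minimal Python sketch of that mapping is given here. The names `HF_REPOS` and `hf_url` are hypothetical, introduced only for illustration, and the keys use ASCII hyphens instead of the typographic ones in the tables.

```python
# Display name (ASCII hyphens) -> Hugging Face repo ID, copied from the
# HF links in the two tables above. HF_REPOS is an illustrative name only.
HF_REPOS = {
    # Vision Large Language Model
    "InternVL-Chat-V1.5": "OpenGVLab/InternVL-Chat-V1-5",
    "InternVL-Chat-V1.2-Plus": "OpenGVLab/InternVL-Chat-V1-2-Plus",
    "InternVL-Chat-V1.2": "OpenGVLab/InternVL-Chat-V1-2",
    "InternVL-Chat-V1.1": "OpenGVLab/InternVL-Chat-V1-1",
    "InternVL-Chat-19B-448px": "OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px",
    "InternVL-Chat-19B": "OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B",
    "InternVL-Chat-13B": "OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B",
    # Vision-Language Foundation Model
    "InternViT-6B-448px-V1.5": "OpenGVLab/InternViT-6B-448px-V1-5",
    "InternViT-6B-448px-V1.2": "OpenGVLab/InternViT-6B-448px-V1-2",
    "InternViT-6B-448px-V1.0": "OpenGVLab/InternViT-6B-448px-V1-0",
    "InternViT-6B-224px": "OpenGVLab/InternViT-6B-224px",
    "InternVL-14B-224px": "OpenGVLab/InternVL-14B-224px",
}


def hf_url(name: str) -> str:
    """Return the Hugging Face page URL for a display name from the tables."""
    return f"https://huggingface.co/{HF_REPOS[name]}"


if __name__ == "__main__":
    for name in HF_REPOS:
        print(f"{name:26s} {hf_url(name)}")
```

Keeping the lookup in one place makes it harder to confuse the 19B and 13B chat models, whose repo IDs are named after their Vicuna-13B and Vicuna-7B language models rather than their total size.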

## What can InternVL do?

@@ -578,47 +576,46 @@ response = model.chat(tokenizer, pixel_values, question, generation_config)
<summary>Launch a local chat demo (click to expand)</summary>

**Launch a controller**
-
- ```shell
- # run the command in the `internvl_chat_llava` folder
- python -m llava.serve.controller --host 0.0.0.0 --port 10000
- ```
+
+ ```shell
+ # run the command in the `internvl_chat_llava` folder
+ python -m llava.serve.controller --host 0.0.0.0 --port 10000
+ ```

**Launch a gradio web server**
-
- ```shell
- # run the command in the `internvl_chat_llava` folder
- python -m llava.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload
- ```

- **Launch a model worker**
-
- ```shell
- # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
- # run the command in the `internvl_chat_llava` folder
- python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
-
- # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
- # run the command in the `internvl_chat_llava` folder
- python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40001 --worker http://localhost:40001 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
-
- # OpenGVLab/InternVL-Chat-V1-1
- # run the command in the `internvl_chat` folder
- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40002 --worker http://localhost:40002 --model-path OpenGVLab/InternVL-Chat-V1-1
-
- # OpenGVLab/InternVL-Chat-V1-2
- # run the command in the `internvl_chat` folder
- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40003 --worker http://localhost:40003 --model-path OpenGVLab/InternVL-Chat-V1-2
-
- # OpenGVLab/InternVL-Chat-V1-2-Plus
- # run the command in the `internvl_chat` folder
- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40004 --worker http://localhost:40004 --model-path OpenGVLab/InternVL-Chat-V1-2-Plus
-
- # OpenGVLab/InternVL-Chat-V1-5
- # run the command in the `internvl_chat` folder
- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40005 --worker http://localhost:40005 --model-path OpenGVLab/InternVL-Chat-V1-5
+ ```shell
+ # run the command in the `internvl_chat_llava` folder
+ python -m llava.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload
```

+ **Launch a model worker**
+
+ ```shell
+ # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
+ # run the command in the `internvl_chat_llava` folder
+ python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
+
+ # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
+ # run the command in the `internvl_chat_llava` folder
+ python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40001 --worker http://localhost:40001 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
+
+ # OpenGVLab/InternVL-Chat-V1-1
+ # run the command in the `internvl_chat` folder
+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40002 --worker http://localhost:40002 --model-path OpenGVLab/InternVL-Chat-V1-1
+
+ # OpenGVLab/InternVL-Chat-V1-2
+ # run the command in the `internvl_chat` folder
+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40003 --worker http://localhost:40003 --model-path OpenGVLab/InternVL-Chat-V1-2
+
+ # OpenGVLab/InternVL-Chat-V1-2-Plus
+ # run the command in the `internvl_chat` folder
+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40004 --worker http://localhost:40004 --model-path OpenGVLab/InternVL-Chat-V1-2-Plus
+
+ # OpenGVLab/InternVL-Chat-V1-5
+ # run the command in the `internvl_chat` folder
+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40005 --worker http://localhost:40005 --model-path OpenGVLab/InternVL-Chat-V1-5
+ ```

</details>
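
The model worker commands in the demo instructions differ only in the serving module, the port, and the model path. As a rough sketch, they could be generated from a small lookup table in Python. `WORKERS`, `worker_folder`, and `worker_command` are hypothetical names that are not part of the InternVL code base, and the ports simply mirror the ones used in the README commands.

```python
import shlex

# Model path -> (serving module, port), copied from the README commands.
# WORKERS is an illustrative name, not something the repository defines.
WORKERS = {
    "OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B": ("llava.serve.model_worker", 40000),
    "OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B": ("llava.serve.model_worker", 40001),
    "OpenGVLab/InternVL-Chat-V1-1": ("internvl.serve.model_worker", 40002),
    "OpenGVLab/InternVL-Chat-V1-2": ("internvl.serve.model_worker", 40003),
    "OpenGVLab/InternVL-Chat-V1-2-Plus": ("internvl.serve.model_worker", 40004),
    "OpenGVLab/InternVL-Chat-V1-5": ("internvl.serve.model_worker", 40005),
}


def worker_folder(model_path: str) -> str:
    """Folder the command is run from, following the README comments."""
    module, _ = WORKERS[model_path]
    return "internvl_chat_llava" if module.startswith("llava.") else "internvl_chat"


def worker_command(model_path: str, controller: str = "http://localhost:10000") -> str:
    """Build the shell command that launches a worker for one model."""
    module, port = WORKERS[model_path]
    args = [
        "python", "-m", module,
        "--host", "0.0.0.0",
        "--controller", controller,
        "--port", str(port),
        "--worker", f"http://localhost:{port}",
        "--model-path", model_path,
    ]
    return shlex.join(args)


if __name__ == "__main__":
    for path in WORKERS:
        print(f"# run in the `{worker_folder(path)}` folder")
        print(worker_command(path))
```

The helper only builds the command strings. Each one still has to be run from the folder that `worker_folder` reports, after the controller has been started.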