add Cogagent #445
Thanks for your contribution!
paddlemix/models/cogagent/README.md
Outdated
## 1. Model Overview

This model is a Paddle implementation of [CogAgent](https://arxiv.org/abs/2312.08914). It is aligned with `THUDM/cogagent-chat-hf` on Hugging Face, and the tokenizer is `lmsys/vicuna-7b-v1.5` from Hugging Face.
Please make this overview more detailed; the qwen-vl one is a good reference.
paddlemix/models/cogagent/README.md
Outdated
### 2.1 Installing dependencies

1) Install the develop version of PaddleNLP
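If there are no special dependencies here, point readers to the environment setup section on the main page instead of spelling the steps out.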
from functools import partial
from typing import Optional, Tuple, Union

from paddlenlp.transformers.bit.modeling import drop_path
Put all imports together at the top of the file.
super().__init__()
# >>>>>> img_size = timm.layers.to_2tuple(img_size)
img_size = (img_size, img_size)
# >>>>>> patch_size = timm.layers.to_2tuple(patch_size)
Delete commented-out leftovers like these.
paddlemix/models/cogagent/visual.py
Outdated
boi = self.boi.expand(shape=[x.shape[0], -1, -1])
eoi = self.eoi.expand(shape=[x.shape[0], -1, -1])
x = paddle.concat(x=(boi, x, eoi), axis=1)
return x
Can this visual script and cross_visual be merged into a single script?
paddlemix/models/cogagent/README.md
Outdated
```bash
python paddlemix/examples/cogagent/chat_demo.py
```
Please document the optional arguments for this command.
I have made the corresponding changes for all of the above.
## 1. Model Introduction

This model is a Paddle implementation of [CogAgent](https://arxiv.org/abs/2312.08914). It is aligned with `THUDM/cogagent-chat-hf` on Hugging Face, and the tokenizer is `lmsys/vicuna-7b-v1.5` from Hugging Face.
Just drop wording like "aligned with ... on Hugging Face".
Fixed.
parser.add_argument("--from_pretrained", type=str, default="THUDM/cogagent-chat-hf", help="pretrained ckpt")
parser.add_argument("--local_tokenizer", type=str, default="lmsys/vicuna-7b-v1.5")
parser.add_argument("--local_tokenizer", type=str, default="lmsys/vicuna-7b-v1.5")
args = parser.parse_args()
There is an extra `local_tokenizer` argument here.
Fixed.
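For reference, a minimal sketch of what the deduplicated parser could look like. The two flags and their defaults are taken from the hunk above; the helper names `build_parser` and `registering_twice_fails`, the parser description, and the help text for `--local_tokenizer` are illustrative assumptions, not code from this PR. The sketch also shows that, with argparse's default conflict handling, registering the same option string twice raises an error rather than being silently ignored.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the demo parser with each flag registered exactly once (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="CogAgent chat demo (sketch)")
    parser.add_argument("--from_pretrained", type=str, default="THUDM/cogagent-chat-hf", help="pretrained ckpt")
    # The help string below is an assumption added for illustration.
    parser.add_argument("--local_tokenizer", type=str, default="lmsys/vicuna-7b-v1.5", help="tokenizer name or local path")
    return parser


def registering_twice_fails() -> bool:
    """Return True if adding --local_tokenizer a second time is rejected by argparse."""
    parser = build_parser()
    try:
        parser.add_argument("--local_tokenizer", type=str, default="lmsys/vicuna-7b-v1.5")
    except argparse.ArgumentError:
        return True
    return False


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.from_pretrained, args.local_tokenizer)
```

Filling in a `help=` string for every flag would also go some way toward the earlier request to document the optional arguments, since `--help` then prints them.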
torch_type = "float32"
print("========Use torch type as:{} with device:{}========\n\n".format(torch_type, DEVICE))
paddle.set_device(DEVICE)
Remove the torch-related naming and the print. DEVICE is optional, since Paddle defaults to GPU.
Fixed.
class EVA2CLIPModel(paddle.nn.Layer):
    def __init__(self, config):
For this eva2clip, compare it with paddlemix/model/eva02 to see whether they are the same and can be reused; reuse the existing one if at all possible.
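The MLP design differs slightly from the implementation in paddlemix/model/eva02. The HF implementation has an extra learnable gate_proj that acts as a gate on the hidden features produced by the first linear layer of the MLP, whereas the Paddle implementation simply chains two linear layers, so it probably cannot be reused.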
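To make the difference concrete, here is a rough sketch of the two MLP variants described above, written in plain Python rather than Paddle so it runs without any framework. The function names, the bias-free layers, and the choice of SiLU as the activation are all assumptions for illustration; the actual layers in either code base may differ in activation, normalization, and layout.

```python
import math


def linear(x, w):
    """Bias-free linear layer: w is a list of output rows, x a list of inputs."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def silu(v):
    """SiLU activation, v * sigmoid(v). Assumed here; the real activation may differ."""
    return v / (1.0 + math.exp(-v))


def plain_mlp(x, w_up, w_down):
    """Two linear layers chained with an activation in between."""
    hidden = [silu(h) for h in linear(x, w_up)]
    return linear(hidden, w_down)


def gated_mlp(x, w_gate, w_up, w_down):
    """Same shape, but a learnable gate projection scales the hidden features
    element-wise before the second linear layer."""
    gate = [silu(g) for g in linear(x, w_gate)]
    hidden = [g * h for g, h in zip(gate, linear(x, w_up))]
    return linear(hidden, w_down)


if __name__ == "__main__":
    x = [1.0, 2.0]
    w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # 2 -> 3
    w_down = [[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]]   # 3 -> 2
    w_gate = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]]  # 2 -> 3, extra parameters
    print(plain_mlp(x, w_up, w_down))
    print(gated_mlp(x, w_gate, w_up, w_down))
```

The gated variant carries an additional weight matrix (`w_gate`) with no counterpart in the plain one, which is consistent with the point that a checkpoint for one layout would not load directly into the other.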
Co-authored-by: LokeZhou <aishenghuoaiqq@163.com>
No description provided.