Releases · shibing624/MedicalGPT

17 Feb 03:18

shibing624

2.4.0

3b55cc5

v2.4.0 Latest

v2.4.0

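Added the GRPO training method. With GRPO, a pure-RL approach, you can observe the "aha moment" during training; see https://github.com/shibing624/MedicalGPT/blob/main/run_grpo.sh
Added support for the DeepSeek-V3 and DeepSeek-R1 models; use template_name=deepseek3.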
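
As a rough illustration of the idea behind GRPO (not the code in run_grpo.sh), the sketch below computes group-relative advantages: several completions are sampled for the same prompt, and each reward is normalized against the group's mean and standard deviation. The function name, the eps value, and the use of the population standard deviation are assumptions made for this example; real implementations may differ on these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other completions of the same prompt.

    Hypothetical helper for illustration; eps avoids division by zero when
    every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions sampled for one prompt, scored by a rule-based reward.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group average get a positive advantage and the rest a negative one, so no separate value model is needed to produce a baseline.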

02 Aug 10:59

shibing624

2.2.0

d475416

v2.2.0

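Added support for training role-play models.
Added role_play_data, a set of scripts for generating doctor-patient dialogue SFT data.

Generating role-play dialogues

This dataset is generated with the OpenAI API. The process is:

Seed feature set and base profiles:
- A hand-written seed set contains the basic character traits.
- An LLM generates each character's base profile from this seed set.
Evolving the character profiles:
- A second seed set contains instruction prompts that guide how a profile evolves.
- These evolution prompts are placed in an instruction pool. Based on them, the LLM evolves the base profiles.
Feedback loop:
- A hybrid evaluation system made up of human evaluators and GPT-4 gives feedback on the evolved profiles.
- The feedback is used to iteratively update the seed sets. Repeating this loop eventually yields a fine-grained dataset of character profiles.
Role-play and dialogue generation:
- The self-instruction framework is used to generate each character's dialogue data from its profile.

Generate the character profiles, one set for nurse characters and one for patient characters: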

cd role_play_data

python role_generate.py

Generate multi-turn doctor-patient dialogues.
LLM choice: dialogues are generated with the gpt-4o API and with Doubao's doubao-character-pro-32k API, respectively:

python roleplay_data_generate_gpt4.py

python roleplay_data_generate_doubao.py

What's Changed

add full_train.py and run_full_train.sh by @ZhuangXialie in #394

Full Changelog: 2.1.0...2.2.0

Contributors

LIE-24

11 Jun 03:25

shibing624

2.1.0

99bdd2f

2.1.0

Version 2.1:

Added support for fine-tuning the Qwen2 model series.

What's Changed

Add a summary of Chinese datasets in the formats this project supports by @ZhuangXialie in #370
Change Llama tokenizer from LlamaTokenizer to AutoTokenizer by @princepride in #380

New Contributors

@princepride made their first contribution in #380

Full Changelog: 2.0.0...2.1.0

Contributors

princepride and LIE-24

27 Apr 05:36

shibing624

2.0.0

2336bfe

2.0.0

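Version 2.0:

Added support for fine-tuning the Meta Llama 3 model series.
Released shibing624/DPO-En-Zh-20k-Preference, a preference dataset suitable for ORPO, DPO, and RM (reward model) training.
Fine-tuned llama-3-8b-instruct-262k with the ORPO method. The resulting model weights are at https://huggingface.co/shibing624/llama-3-8b-instruct-262k-chinese and the corresponding LoRA weights are at https://huggingface.co/shibing624/llama-3-8b-instruct-262k-chinese-lora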

What's Changed

Updates for readme and demo ipynb and a small update for deprecated function by @ker2xu in #360
Typo by @ker2xu in #362
add max_length and max_prompt_length by @ZhuangXialie in #367

New Contributors

@ker2xu made their first contribution in #360
@ZhuangXialie made their first contribution in #367

Full Changelog: 1.9.0...2.0.0

Contributors

ker2xu and LIE-24

17 Apr 09:01

shibing624

1.9.0

9f61e99

1.9.0

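Version 1.9:

Added support for ORPO; see run_orpo.sh for detailed usage. ORPO is an optimization method that needs no reference model. With ORPO, an LLM learns instruction following and human-preference alignment at the same time, so ORPO can be trained directly on a base model. Training is simpler than SFT+DPO, but it needs comparatively more preference data.
Added support for fine-tuning the qwen1.5 and cohere models, with the corresponding templates.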
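
To make the "no reference model" point concrete, here is a minimal sketch of the ORPO objective for a single preference pair, written from the published formulation rather than taken from run_orpo.sh. It assumes the inputs are length-normalized (average per-token) log-probabilities under the model being trained, strictly below zero; the function names and the default weight lam=0.1 are illustrative choices.

```python
import math


def log_odds(avg_logprob):
    """log(p / (1 - p)) for p = exp(avg_logprob); requires avg_logprob < 0."""
    return avg_logprob - math.log1p(-math.exp(avg_logprob))


def orpo_loss(chosen_logprob, rejected_logprob, lam=0.1):
    """SFT loss on the chosen answer plus a weighted odds-ratio penalty.

    Hypothetical helper: both inputs come from the same policy model, so no
    frozen reference model is involved.
    """
    sft_term = -chosen_logprob
    log_odds_ratio = log_odds(chosen_logprob) - log_odds(rejected_logprob)
    # -log(sigmoid(x)) is equal to log(1 + exp(-x))
    or_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + lam * or_term


preferred = orpo_loss(chosen_logprob=-0.5, rejected_logprob=-2.0)
swapped = orpo_loss(chosen_logprob=-2.0, rejected_logprob=-0.5)
```

The loss is lower when the chosen answer is the more likely one, and the SFT term is what lets a base model learn instruction following in the same pass.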

What's Changed

Update transformers in requirements.txt by @dividez in #321

Full Changelog: 1.8.0...1.9.0

Contributors

dividez

26 Jan 10:20

shibing624

1.8.0

14098d4

v1.8.0

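Version 1.8:

Added support for fine-tuning the Mixtral 8x7B mixture-of-experts (MoE) model. When fine-tuning with LoRA in SFT, you can enable 4-bit quantization and QLoRA with --load_in_4bit True --qlora True to save GPU memory. Setting --target_modules q_proj,k_proj,v_proj,o_proj is recommended: it avoids quantizing the MLP layers of the MoE expert networks, which are sparse and lose performance when quantized.
Added support for fine-tuning the deepseek, deepseekcoder, and orion models, with the corresponding templates.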

Full Changelog: 1.7.0...1.8.0

14 Jan 04:09

shibing624

1.7.0

f0c0956

v1.7.0

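Version 1.7:

Added ChatPDF, a file-based question-answering feature built on retrieval-augmented generation (RAG); the code is in chatpdf.py. It combines a fine-tuned LLM with knowledge-base files to improve answer accuracy on domain-specific questions. Run python chatpdf.py to start RAG question answering.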
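
The general RAG flow can be sketched as follows. This is a toy example and does not reproduce chatpdf.py: it ranks text chunks by bag-of-words cosine similarity, whereas a real system would more likely use embedding models, and every function name and the prompt wording are assumptions made for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Lower-cased bag-of-words vector for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query."""
    q = bow(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, chunks, top_k=1):
    """Put the retrieved context in front of the question for the LLM."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


chunks = [
    "aspirin is commonly used to reduce fever and relieve mild pain",
    "insulin is a hormone that regulates blood glucose",
]
prompt = build_prompt("what regulates blood glucose", chunks)
```

The resulting prompt would then be passed to the fine-tuned LLM, which answers from the retrieved text instead of relying only on what it memorized during training.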

Full Changelog: 1.6.0...1.7.0

23 Oct 08:01

shibing624

1.6.0

ed30fe0

v1.6.0

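Version 1.6:

Added RoPE interpolation to extend the context length of GPT models. Position interpolation, combined with training on additional data, gives the model long-context capability; train with the --rope_scaling linear argument.
Added FlashAttention-2 support for LLaMA models. If you are using an RTX4090, A100, or H100 GPU, pass --flash_attn to enable FlashAttention-2.
Added $S^2$-Attn, proposed by LongLoRA, which gives the model long-context capability; pass --shift_attn in SFT to enable it.
Added support for NEFTune, an SFT training method that adds noise to the embeddings (see the NEFTune paper). Pass --neft_alpha to enable NEFTune, for example --neft_alpha 5.
PT (incremental pretraining) now supports QLoRA. On an RTX4090, A100, or H100 GPU, nf4 is supported; pass --qlora True --load_in_kbits 4 to enable QLoRA training.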
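
For the linear RoPE scaling mentioned in the first item of this list, a minimal sketch of the idea is shown below. It is not the project's training code: the function name, the base of 10000, and the toy head dimension are assumptions. Linear position interpolation divides each token position by a scaling factor before the rotary angles are computed, so positions beyond the original context window map back into the range the model saw during pretraining.

```python
def rope_angles(position, head_dim, base=10000.0, scaling_factor=1.0):
    """Rotary angles for one token position.

    Hypothetical helper: with linear scaling, the position is divided by
    scaling_factor before being multiplied by each inverse frequency.
    """
    scaled_pos = position / scaling_factor
    return [scaled_pos * base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


# With a factor of 2, position 4096 gets the same angles that position 2048
# had in the unscaled model.
extended = rope_angles(4096, head_dim=8, scaling_factor=2.0)
original = rope_angles(2048, head_dim=8)
```

Because the interpolated positions fall between the integer positions seen in pretraining, some further training on long text is still needed, which matches the note above about training on additional data.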

What's Changed

About validation_file_dir by @Billccx in #196
fix similar to issue #194 by @kinghuin in #200
fix lm_head type changed bug by @jiangtann in #215

New Contributors

@Billccx made their first contribution in #196
@kinghuin made their first contribution in #200
@jiangtann made their first contribution in #215

Full Changelog: 1.5.0...1.6.0

Contributors

kinghuin, jiangtann, and Billccx

27 Aug 16:05

shibing624

1.5.0

391a2af

v1.5.0

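Version 1.5:

Added the DPO (Direct Preference Optimization) method. DPO controls a language model's behavior precisely by optimizing the model directly, without complex reinforcement learning, and it learns human preferences effectively. Compared with RLHF, DPO is easier to implement and train, and gives better results.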
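
As an illustration of why DPO needs no reinforcement-learning loop, the sketch below computes the DPO loss for a single preference pair from sequence log-probabilities under the policy and under a frozen reference model. It follows the published formulation and is not the project's training script; the function name and beta=0.1 are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one pair, given sequence log-probabilities.

    Hypothetical helper: the implicit reward of an answer is how much more
    likely the policy makes it than the reference model does.
    """
    chosen_reward = policy_chosen - ref_chosen
    rejected_reward = policy_rejected - ref_rejected
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) is equal to log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


# The policy has moved toward the chosen answer and away from the rejected one.
improved = dpo_loss(policy_chosen=-10.0, policy_rejected=-12.0,
                    ref_chosen=-11.0, ref_rejected=-11.0)
# A policy identical to the reference sits at log(2).
unchanged = dpo_loss(-11.0, -11.0, -11.0, -11.0)
```

The whole objective is an ordinary supervised loss over preference pairs, so it can be minimized with the same optimizer setup as SFT.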

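A complete pipeline that chains the PT, SFT, and DPO stages is provided in run_training_dpo_pipeline.ipynb. A corresponding Colab notebook is available, and a full run takes about 15 minutes. A Colab copy of my successful run is also available.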

What's Changed

Update rl_training.py by @dividez in #159
Update pretraining.py by @anwuzhiab in #167
Dpo by @shibing624 in #180
update dpo pynb by @shibing624 in #181

New Contributors

@dividez made their first contribution in #159
@anwuzhiab made their first contribution in #167

Full Changelog: 1.4.0...1.5.0

Contributors

shibing624, anwuzhiab, and dividez

08 Aug 07:41

shibing624

1.4.0

046263b

v1.4.0

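Version 1.4:

Released shibing624/vicuna-baichuan-13b-chat, a Chinese-English Vicuna-13B model fine-tuned on the ShareGPT4 dataset, and the corresponding LoRA model shibing624/vicuna-baichuan-13b-chat-lora. Quality is improved, and multi-turn question answering is supported.

Demo of the shibing624/vicuna-baichuan-13b-chat model: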

What's Changed

update tokenizer for multi round task by @shibing624 in #151
Dev round by @shibing624 in #153

Full Changelog: 1.3.0...1.4.0

Contributors

shibing624
