[Discussion] Can we align GLM-130B to human like chatgpt? #43
Comments
Certainly. Alignment for GLM-130B could be important, and we are conducting a preliminary survey of the topic.
You could use the current glm-10b on Hugging Face with trl/trlx to build a model with RLHF.
What are trl and trlx? I am very interested in this use case. Why must the 10B-parameter model be used for RLHF?
I am actively working on this task and would be very interested in coordinating further development.
@smeyerhot trl (Transformer Reinforcement Learning) is a library built by Hugging Face for training language models with PPO. trlx is an extension of trl built by CarperAI. Both cover the same use case: training models with reinforcement learning from human feedback. You could also build the same functionality with actor-critic PPO in plain PyTorch, although that requires more extensive domain knowledge. You do not have to use glm-10b, but it is publicly available on Hugging Face's model hub, unlike the 130B model, which requires you to apply for access. You can use any encoder-decoder or decoder-only model. Since this issue is about aligning GLM with human feedback, I suggested the 10B-parameter one.
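For concreteness, here is a minimal sketch of the PPO loop that trl implements, based on its quick-start pattern (PPOTrainer, AutoModelForCausalLMWithValueHead). The GPT-2 base model and the constant reward are placeholders, not the actual GLM setup: swapping in a GLM checkpoint such as THUDM/glm-10b may require extra adaptation, and a real RLHF pipeline would use a reward model trained on human preference data.

```python
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer, create_reference_model
from trl.core import respond_to_batch

# Placeholder base model: trl's examples target decoder-only checkpoints like GPT-2.
# A GLM checkpoint (e.g. THUDM/glm-10b) may need extra adaptation, since GLM uses
# a custom blank-infilling architecture.
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ref_model = create_reference_model(model)  # frozen copy used for the KL penalty
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

ppo_config = PPOConfig(batch_size=1, mini_batch_size=1)
ppo_trainer = PPOTrainer(ppo_config, model, ref_model, tokenizer)

# One toy PPO step on a single prompt.
query_txt = "Explain reinforcement learning from human feedback in one sentence."
query_tensor = tokenizer.encode(query_txt, return_tensors="pt")

# Sample a continuation from the current policy.
response_tensor = respond_to_batch(model, query_tensor)

# In real RLHF the reward comes from a reward model trained on human preference
# data; a constant scalar stands in here purely as a placeholder.
reward = [torch.tensor(1.0)]

train_stats = ppo_trainer.step([query_tensor[0]], [response_tensor[0]], reward)
```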
ChatGPT can generate formatted text and images; this requires keeping the pretraining data in its original format.