Roadmap - trl 0.2 #64

Closed
21 of 26 tasks
younesbelkada opened this issue Dec 29, 2022 · 6 comments

Comments

@younesbelkada
Contributor

younesbelkada commented Dec 29, 2022

A list of cool things that we can aim for in trl 0.2:

API:

Documentation

Improvements

Any suggestions are very welcome!

@younesbelkada changed the title from "Roadmap - trl 2.0" to "Roadmap - trl 1.O" on Dec 29, 2022
@younesbelkada changed the title from "Roadmap - trl 1.O" to "Roadmap - trl 1.0" on Dec 29, 2022
@LouisCastricato

BTW, I can confirm that SetFit does make for a really good zero-shot RM. There are some issues with using contrastive models as RMs, though: they often require very careful data cleaning, and identifying what kinds of clusters work as RMs is a dark art, to the point where we decided it wasn't worth seriously pursuing further after CARP CoOp. Rerank models are much better.

@TristanThrush
Contributor

I think that the "coolest" dataset we can use to train a model could be https://huggingface.co/datasets/openai/webgpt_comparisons, but it is hard to evaluate this sort of model after we train it. I might start by adding a summarization example along with some decent ways to evaluate it, and then add the webgpt comparisons example.

@AlexWortega

https://colab.research.google.com/drive/1hkPBFtMP5xBAjNYMjWH7NqYn118kRLOJ?usp=sharing
I am trying to implement my own GPT + trl setup with a QA retrieval reward, but I think something is wrong with the reward or the generation.

@natolambert
Contributor

@AlexWortega can you open a separate issue / PR for this? It looks interesting, but it may get lost in this big 1.0 roadmap thread.

@lvwerra changed the title from "Roadmap - trl 1.0" to "Roadmap - trl 0.2" on Feb 7, 2023
@lvwerra
Member

lvwerra commented Feb 7, 2023

We ended up calling this release 0.2 (not 1.0). I am closing the issue and will move the open tasks to a new issue.

@lvwerra closed this as completed on Feb 7, 2023
@AlexWortega

Hi @lvwerra, I opened PR #149 with this feature(?) idea.

6 participants