vwxyzjn
😃
RLHF @allenai, CS Ph.D. from Drexel University in RL.
- Philadelphia, PA
- UTC -05:00
- https://costa.sh
- @vwxyzjn
Pinned
- huggingface/trl (Public): Train transformer language models with reinforcement learning.
- lm-human-preference-details (Public): RLHF implementation details of OAI's 2019 codebase
- ppo-implementation-details (Public): The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"
- portwarden (Public): Create Encrypted Backups of Your Bitwarden Vault with Attachments