How to implement the baseline EIPO in this Project? #1

Open
Charlie0257 opened this issue Jan 11, 2025 · 2 comments

Comments

@Charlie0257

Hi, @williamd4112

Thanks for your great work on HEPO!
I'd like to know the command for running the EIPO baseline, since I noticed self.use_switch in the project :)

Thanks for any advice!
Best,
Charlie

@williamd4112
Collaborator

Hi Charlie,

We will add EIPO to the current codebase soon. Please let me know if you have other questions, and feel free to remind me at zwhong@mit.edu.

@Charlie0257
Author

@williamd4112 Thanks for your reply!

When I trained on my task with HEPO, I noticed a performance collapse: HEPO did worse than the ref policy, let alone the heuristics.

I don't know if you have encountered this when tuning your hyperparameters.

Can you offer any advice on hyperparameter tuning or anything else?

Best,
Charlie

[Attached plots: HEPO+bottle; PPO+H+Bottle; HEPO and H]
