[WIP] Make adding new Policy Models flexible #327

Open
wants to merge 31 commits into
base: main


engmubarak48
Collaborator

Fixes #293

This PR aims to make adding new function approximators (policy models) as flexible as possible.

  • Modify base.py to dynamically import and instantiate policy models based on the configuration.
  • Allow new model types (e.g., CNN, GNN, Transformer) to be added without altering the core logic.
  • Implement a clean interface in the ModelBase class.
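
As a rough illustration of the first bullet, a configuration-driven loader could look like the sketch below. It is not the code in this PR: the `_target_` key follows the convention used by Hydra's `instantiate`, the helper name `instantiate_from_config` is hypothetical, and `types.SimpleNamespace` stands in for a real policy class so the snippet runs on its own.

```python
import importlib
from typing import Any, Dict


def instantiate_from_config(config: Dict[str, Any]) -> Any:
    """Import the class named by config["_target_"] and call it with the other keys.

    Hypothetical helper: the real base.py may resolve and build models differently.
    """
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    # Split "package.module.ClassName" into the module path and the class name.
    module_path, _, class_name = config["_target_"].rpartition(".")
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name)
    return cls(**kwargs)


# Stand-in target from the standard library so the sketch is self-contained.
# In practice the target would be a policy class; n_layers and n_hid are
# illustrative parameter names, not taken from the PR.
config = {"_target_": "types.SimpleNamespace", "n_layers": 2, "n_hid": 128}
model = instantiate_from_config(config)
print(type(model).__name__, model.n_layers, model.n_hid)
```

Under this pattern, adding a new model type means adding a class and pointing the configuration at it; the loader itself does not change.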

engmubarak48 linked an issue on Jun 18, 2024 that may be closed by this pull request
Collaborator

josephdviviano left a comment

LGTM. One comment, but it is non-blocking.

gflownet/policy/mlp.py (outdated, resolved)
@alexhernandezgarcia
Owner

Great that you're taking a stab at this, which is one of the important things that remain to be done! Looks good so far!

Comment on lines +312 to +314
    if self.flatten:
        return self.states2proxy(states).flatten(start_dim=1).to(self.float)
    return self.states2proxy(states).to(self.float)
Collaborator Author

@alexhernandezgarcia This is a temporary solution to make the CNN policy work on the Tetris env. Normally, the flattening should happen inside the model, not in the environment (see my other comments).

If you are okay with that, I can update the PR accordingly.
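
A minimal sketch of the alternative being proposed, where each policy decides how to shape its own input and the environment always returns the 2D board. The class names are hypothetical and plain nested lists stand in for tensors; with PyTorch tensors the flattening step would be `flatten(start_dim=1)`, as in the lines commented on above.

```python
from typing import List

Board = List[List[float]]  # one state: a height x width grid


def flatten_batch(states: List[Board]) -> List[List[float]]:
    """Flatten each board to a 1D vector while keeping the batch axis.

    Pure-Python analogue of calling flatten(start_dim=1) on a batched tensor.
    """
    return [[cell for row in board for cell in row] for board in states]


class MLPPolicySketch:
    """Hypothetical MLP-style policy: needs flat input, so it flattens itself."""

    def preprocess(self, states: List[Board]) -> List[List[float]]:
        return flatten_batch(states)


class CNNPolicySketch:
    """Hypothetical CNN-style policy: consumes the 2D boards unchanged."""

    def preprocess(self, states: List[Board]) -> List[Board]:
        return states


# Two 2x2 boards, as the environment would return them without any flatten flag.
batch = [[[0.0, 1.0], [1.0, 0.0]], [[1.0, 1.0], [0.0, 0.0]]]
mlp_input = MLPPolicySketch().preprocess(batch)
cnn_input = CNNPolicySketch().preprocess(batch)
print(mlp_input)
print(cnn_input)
```

With this split, the environment no longer needs to know which kind of model consumes its states.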

config/env/tetris.yaml (resolved)
@@ -75,6 +75,7 @@ def __init__(
         height: int = 20,
         pieces: List = ["I", "J", "L", "O", "S", "T", "Z"],
         rotations: List = [0, 90, 180, 270],
+        flatten: bool = True,
Collaborator Author

If we move the flattening from the environment to the policy, then we don't need this.

gflownet/policy/mlp.py (resolved)
@engmubarak48
Collaborator Author

engmubarak48 commented Jun 28, 2024

I think any policy can be added now. I have tested Tetris with the CNN policy and everything seems to work well with minimal changes. As a next step, we could open another PR that documents this better and shows how to swap the policy.

To test, run: python main.py env=tetris proxy=tetris policy=cnn env.flatten=False env.width=4 env.height=4
@alexhernandezgarcia feel free to review.

engmubarak48 marked this pull request as ready for review on June 28, 2024, 04:31
@AlexandraVolokhova
Collaborator

Thank you for the great work! I've added a bunch of suggestions. It seems to me that it would be worth adding a config file for the case where one wants to create a uniform or a random policy, but I don't have a strong opinion about it.

@engmubarak48
Collaborator Author

Hi @AlexandraVolokhova, I just got some time, so let me address your comments.

Regarding:

It seems to me that it would be worth adding a config file for the case where one wants to create a uniform or a random policy, but I don't have a strong opinion about it.

I think I agree with you: it would be nice to have a separate config for the fixed and uniform policies (I assume you are referring to the two functions inside the base policy class). To be honest, I still don't know why we need them. We could also keep them as they are, with the default configuration.
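
For reference, a sketch of what non-learned policies of this kind typically compute. It assumes that a uniform policy returns identical logits for every action and that a fixed policy returns the same preset vector regardless of the state; the function names are hypothetical, and the actual methods in the base policy class may behave differently.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def uniform_policy_output(n_actions: int) -> List[float]:
    """Hypothetical uniform policy: the same logit for every action."""
    return [1.0] * n_actions


def fixed_policy_output(preset: Sequence[float]) -> List[float]:
    """Hypothetical fixed policy: a preset output that ignores the state."""
    return list(preset)


uniform_probs = softmax(uniform_policy_output(4))
fixed_probs = softmax(fixed_policy_output([0.0, 0.0, math.log(2.0)]))
print(uniform_probs)
print(fixed_probs)
```

Neither variant has trainable parameters, which is why a dedicated config entry for them would mostly select the type and the output dimension.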

engmubarak48 removed the request for review from michalkoziarski on July 9, 2024, 23:42
engmubarak48 self-assigned this on Aug 21, 2024
Development

Successfully merging this pull request may close these issues.

Flexible Policy Definition
4 participants