
Add Environment Test Script #52

Open · sambhavnoobcoder wants to merge 7 commits into main

Conversation

sambhavnoobcoder

During my initial setup, I found myself writing a quick test to verify that my environment was properly configured on my local machine. This simple script checks the PyTorch installation, model loading, and basic inference - basically making sure everything's working before diving into actual development. Sharing it in case others find it useful, but feel free to discard the PR if it doesn't fit the project's needs.

Also, kudos on the initiative. This is really awesome!!

@qgallouedec
Member

Thanks for the suggestion. How is it specific to Open-R1? It seems more related to transformers.
Usually we test using unittest and a CI; I'd prefer that option.

@sambhavnoobcoder
Author

Thanks for the feedback! I completely agree with your points about unittest and CI being the proper approach.

However, I had a slightly different use case in mind for this script. While we work towards building a robust test suite, this simple verification tool could help new contributors quickly validate their environment setup when they first clone the repo. I've found it particularly helpful for catching common setup issues before diving into development.

That said, I see this as a temporary solution. Once we have comprehensive unittest coverage and CI in place, we can deprecate this script in favor of proper testing infrastructure.

Would love to hear your thoughts on this perspective. Happy to take a different approach if you feel the focus should be on building out the test suite first.

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"\nModel response:\n{response}")

total_time = time.time() - start_time

Are you 100% sure this does not need a CUDA synchronize before reading the clock?
I am not sure here, since there is a decoding step that takes the response and prints it, so it is probably okay; just an off-the-cuff thought I had.

Author

Yes. Even though tokenizer.decode() does serve as an implicit synchronisation of sorts here, it is better to make the behaviour robust and explicit rather than relying on implicit behaviour, hence this is addressed in commit 005fc7c.
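For illustration, a minimal sketch of explicit synchronization around the timed generation (simplified, not necessarily the exact code in commit 005fc7c; `model`, `tokenizer`, and `inputs` are assumed to already be set up):

```python
import time

import torch


def timed_generate(model, tokenizer, inputs, max_new_tokens=50):
    """Time generation with an explicit CUDA sync instead of relying on
    tokenizer.decode() as an implicit synchronization point."""
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # finish any pending GPU work before starting the clock
    start_time = time.time()

    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for the generation kernels to complete
    elapsed = time.time() - start_time

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"\nModel response:\n{response}")
    print(f"Generation time: {elapsed:.2f}s")
    return response, elapsed
```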


os.environ["TOKENIZERS_PARALLELISM"] = "false"

def test_installation():

This is nice; however, it would be great if we could parametrize this for better maintainability. For example, model_name, device, prompt, and a verbose flag (for detailed timing info) could be passed in the function definition.

This way it will be easier to separate concerns when MPS does not work for some scenario, CUDA fails for some version, etc.

Author

Yes, I agree. I've parametrized the function to make it more flexible and maintainable in commit eeb3a1d.
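As a rough sketch, the parametrized function could look like the following (names and defaults are illustrative, not necessarily what commit eeb3a1d contains):

```python
import time
from typing import Optional

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def test_installation(
    model_name: str = "Qwen/Qwen2.5-0.5B-Instruct",  # illustrative default, not taken from the PR
    device: Optional[str] = None,  # "cuda", "mps", or "cpu"; auto-detected when None
    prompt: str = "What is 2 + 2?",
    verbose: bool = False,  # print extra details such as device choice and load time
) -> bool:
    """Verify the PyTorch install, model loading, and basic inference."""
    if device is None:
        if torch.cuda.is_available():
            device = "cuda"
        elif torch.backends.mps.is_available():
            device = "mps"
        else:
            device = "cpu"
    if verbose:
        print(f"Using device: {device}")

    start = time.time()
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
    if verbose:
        print(f"Model loaded in {time.time() - start:.2f}s")
    print("✓ Model loaded successfully")

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    return True
```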

)
print("✓ Model loaded successfully")

# Test inference

You have the one-liner comments, provided presumably by one of the LLMs; they are a great hint as to which sections could be abstracted into test methods that could be reused in the future for different installation configurations. For example, test_inference could be taken out of this function so we have: test setup (loading the model and setting up the prompt), running inference, and displaying the result.

Author

Okay, I've now refactored the code to separate concerns into distinct test methods (a rough sketch of the resulting structure follows the list):

  1. _test_pytorch_environment(): commit 2c486a8
  2. _load_model_and_tokenizer(): commit fef029b
  3. _run_inference(): commit 20d4f08
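For reference, a schematic of how such a split might look (the method names follow the commits above; the bodies are simplified placeholders rather than the actual PR code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def _test_pytorch_environment(device: str) -> None:
    """Check that PyTorch is importable and the requested device is usable."""
    print(f"PyTorch version: {torch.__version__}")
    if device == "cuda":
        assert torch.cuda.is_available(), "CUDA requested but not available"
    elif device == "mps":
        assert torch.backends.mps.is_available(), "MPS requested but not available"


def _load_model_and_tokenizer(model_name: str, device: str):
    """Load the model and tokenizer onto the target device."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
    return model, tokenizer


def _run_inference(model, tokenizer, prompt: str, device: str) -> str:
    """Run a short generation and return the decoded response."""
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=20)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


def test_installation(model_name: str, device: str = "cpu", prompt: str = "Hello!") -> None:
    """Orchestrate the three steps: environment check, loading, inference."""
    _test_pytorch_environment(device)
    model, tokenizer = _load_model_and_tokenizer(model_name, device)
    print(_run_inference(model, tokenizer, prompt, device))
```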

@sambhavnoobcoder
Author

Resolved all comments. Kindly have a look and let me know of any more required changes.

@qgallouedec
Member

I still don't quite understand how this is specific to Open-R1. It loads a model and runs a generation, if I understand correctly. So it's basically demo code for the transformers library? Maybe it would make more sense as a transformers lib demo script?
I think this script can stay in this branch, so we can refer to it if needed, but I wouldn't merge it into main, to keep the repo as simple as possible.
