
Add Environment Test Script #52

Open · sambhavnoobcoder wants to merge 7 commits into main

Conversation

sambhavnoobcoder

During my initial setup, I found myself writing a quick test to verify that my environment was properly configured on my local machine. This simple script checks the PyTorch installation, model loading, and basic inference - basically making sure everything's working before diving into actual development. Sharing it in case others find it useful, but feel free to discard the PR if it doesn't fit the project's needs.

Also, kudos on the initiative. This is really awesome!!

@qgallouedec
Member

Thanks for the suggestion. How is it specific to Open-R1? It seems more related to transformers.
Usually we test using unittest and a CI; I'd prefer that option.

@sambhavnoobcoder
Author

Thanks for the feedback! I completely agree with your points about unittest and CI being the proper approach.

However, I had a slightly different use case in mind for this script. While we work towards building a robust test suite, this simple verification tool could help new contributors quickly validate their environment setup when they first clone the repo. I've found it particularly helpful for catching common setup issues before diving into development.

That said, I see this as a temporary solution. Once we have comprehensive unittest coverage and CI in place, we can deprecate this script in favor of proper testing infrastructure.

Would love to hear your thoughts on this perspective. Happy to take a different approach if you feel the focus should be on building out the test suite first.

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"\nModel response:\n{response}")

total_time = time.time() - start_time

Are you 100% sure this does not need a CUDA synchronize before reading the clock?
I am not sure here, since there is a decoding step that takes the response and prints it, so it is probably okay; just an off-the-cuff thought I had.

Author

Yes. Even though tokenizer.decode() does serve as an implicit synchronisation of sorts here, it is better to make the behaviour robust and explicit rather than relying on implicit behaviour, hence this is addressed in commit 005fc7c.
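For illustration, a minimal sketch of explicit synchronization around the timed generation (simplified, not necessarily the exact code in commit 005fc7c; `model`, `tokenizer`, and `inputs` are assumed to already be set up):

```python
import time

import torch


def timed_generate(model, tokenizer, inputs, max_new_tokens=50):
    """Time generation with an explicit CUDA sync instead of relying on
    tokenizer.decode() as an implicit synchronization point."""
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # finish any pending GPU work before starting the clock
    start_time = time.time()

    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for the generation kernels to complete
    elapsed = time.time() - start_time

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"\nModel response:\n{response}")
    print(f"Generation time: {elapsed:.2f}s")
    return response, elapsed
```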


os.environ["TOKENIZERS_PARALLELISM"] = "false"

def test_installation():

This is nice; however, it would be great if we could parametrize this for better maintainability. For example, model_name, device, prompt, and a verbose flag (for detailed timing info) could be passed in the function definition.

This way it will be easier to separate concerns when MPS does not work for some scenario, CUDA fails for some version, etc.

Author

Yes, I agree. I've parametrized the function to make it more flexible and maintainable in commit eeb3a1d.
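As a rough sketch, the parametrized function could look like the following (names and defaults are illustrative, not necessarily what commit eeb3a1d contains):

```python
import time
from typing import Optional

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def test_installation(
    model_name: str = "Qwen/Qwen2.5-0.5B-Instruct",  # illustrative default, not taken from the PR
    device: Optional[str] = None,  # "cuda", "mps", or "cpu"; auto-detected when None
    prompt: str = "What is 2 + 2?",
    verbose: bool = False,  # print extra details such as device choice and load time
) -> bool:
    """Verify the PyTorch install, model loading, and basic inference."""
    if device is None:
        if torch.cuda.is_available():
            device = "cuda"
        elif torch.backends.mps.is_available():
            device = "mps"
        else:
            device = "cpu"
    if verbose:
        print(f"Using device: {device}")

    start = time.time()
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
    if verbose:
        print(f"Model loaded in {time.time() - start:.2f}s")
    print("✓ Model loaded successfully")

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    return True
```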

)
print("✓ Model loaded successfully")

# Test inference

You have the one-liner comments, provided presumably by one of the LLMs; they are a great hint as to which sections could be abstracted into test methods that could be reused in the future for different installation configurations. For example, test_inference could be taken out of this function so we have: test setup (loading the model and setting up the prompt), running inference, and displaying the result.

Author

Okay, I've now refactored the code to separate concerns into distinct test methods (a rough sketch of the resulting structure follows the list):

  1. _test_pytorch_environment(): commit 2c486a8
  2. _load_model_and_tokenizer(): commit fef029b
  3. _run_inference(): commit 20d4f08
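For reference, a schematic of how such a split might look (the method names follow the commits above; the bodies are simplified placeholders rather than the actual PR code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def _test_pytorch_environment(device: str) -> None:
    """Check that PyTorch is importable and the requested device is usable."""
    print(f"PyTorch version: {torch.__version__}")
    if device == "cuda":
        assert torch.cuda.is_available(), "CUDA requested but not available"
    elif device == "mps":
        assert torch.backends.mps.is_available(), "MPS requested but not available"


def _load_model_and_tokenizer(model_name: str, device: str):
    """Load the model and tokenizer onto the target device."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
    return model, tokenizer


def _run_inference(model, tokenizer, prompt: str, device: str) -> str:
    """Run a short generation and return the decoded response."""
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=20)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


def test_installation(model_name: str, device: str = "cpu", prompt: str = "Hello!") -> None:
    """Orchestrate the three steps: environment check, loading, inference."""
    _test_pytorch_environment(device)
    model, tokenizer = _load_model_and_tokenizer(model_name, device)
    print(_run_inference(model, tokenizer, prompt, device))
```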

@sambhavnoobcoder
Author

Resolved all comments. Kindly have a look and let me know of any more required changes.

@qgallouedec
Member

I still don't quite understand how this is specific to Open-R1. It loads a model and runs a generation, if I understand correctly. So it's basically demo code for the transformers library? Maybe it would make more sense as a transformers lib demo script?
I think this script can stay in this branch, so we can refer to it if needed, but I wouldn't merge it into main, to keep the repo as simple as possible.
