-
Notifications
You must be signed in to change notification settings - Fork 239
Add Pydantic Configuration System with CLI Override Support #946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add log_to_driver=False to ray.init() to suppress worker/raylet log forwarding - Set RAY_BACKEND_LOG_LEVEL=fatal to suppress C++ metrics exporter errors - Add SKYRL_DEBUG_LOGGING env var to re-enable verbose logging when needed This significantly reduces stdout noise during training runs while still preserving all logs in Ray's log files for debugging. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add log_to_driver=False to ray.init() to suppress worker/raylet log forwarding - Set RAY_BACKEND_LOG_LEVEL=fatal to suppress C++ metrics exporter errors - Use existing LOG_LEVEL=DEBUG to re-enable verbose logging when needed This significantly reduces stdout noise during training runs while still preserving all logs in Ray's log files for debugging. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a comprehensive Pydantic-based configuration system, significantly improving type safety and programmatic configuration, including new support for CLI overrides. A critical security vulnerability has been identified in the implementation of CLI overrides within set_nested_attr and get_nested_attr. The use of unvalidated dot-notation paths with getattr and setattr allows for Attribute Injection and Class Pollution, which could enable an attacker to corrupt the application state or bypass security controls if training scripts are executed with untrusted command-line arguments. Strict validation is recommended to ensure only defined Pydantic fields can be accessed or modified. Additionally, a critical functional issue was found in the CLI override parsing logic that could cause it to fail for certain argument formats, and a high-severity bug in default value handling might lead to user-provided arguments being ignored. There are also several medium-severity suggestions for code quality, such as removing unused imports, adhering to import location best practices, and updating the project documentation.
| # It's a flag or argument with value | ||
| # Check if it's defined in parser | ||
| try: | ||
| # Try to parse just this arg to see if it's known | ||
| test_args = known_args + [arg] | ||
| if i + 1 < len(args) and not args[i + 1].startswith("--"): | ||
| test_args.append(args[i + 1]) | ||
| parser.parse_known_args(test_args) | ||
| # If successful, it's a known arg | ||
| known_args.append(arg) | ||
| if i + 1 < len(args) and not args[i + 1].startswith("--"): | ||
| known_args.append(args[i + 1]) | ||
| i += 2 | ||
| else: | ||
| i += 1 | ||
| except: | ||
| # Unknown flag, might be override without = | ||
| # Treat as override if next item exists and doesn't start with -- | ||
| if i + 1 < len(args) and not args[i + 1].startswith("--"): | ||
| override_str = f"{arg[2:]}={args[i + 1]}" | ||
| overrides.append(override_str) | ||
| i += 2 | ||
| else: | ||
| known_args.append(arg) | ||
| i += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic to distinguish between known argparse arguments and unknown config overrides is flawed. The try-except block on lines 158-171 will not catch unknown arguments as intended.
parser.parse_known_args() does not raise an exception for unknown arguments; instead, it returns them in the second element of the result tuple. Your current implementation will incorrectly classify unknown arguments (like --trainer.policy.model.path ... when not using =) as known arguments, which will cause parser.parse_args() to fail later.
This logic needs to be revised to correctly check the unknown list returned by parse_known_args() to properly identify and handle config overrides.
| def set_nested_attr(obj: Any, path: str, value: Any) -> None: | ||
| """ | ||
| Set a nested attribute on a Pydantic model using dot notation. | ||
|
|
||
| Args: | ||
| obj: The root Pydantic model | ||
| path: Dot-separated path (e.g., "trainer.policy.model.path") | ||
| value: Value to set | ||
|
|
||
| Example: | ||
| >>> cfg = SkyRLConfig(...) | ||
| >>> set_nested_attr(cfg, "trainer.policy.model.path", "Qwen/Qwen2.5-1.5B") | ||
| """ | ||
| parts = path.split(".") | ||
| for part in parts[:-1]: | ||
| obj = getattr(obj, part) | ||
| setattr(obj, parts[-1], value) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The set_nested_attr function is vulnerable to Attribute Injection (and potentially Class Pollution). It uses getattr and setattr to traverse and modify the configuration object based on a dot-separated path provided via CLI overrides. Since the path is not validated, an attacker can provide a path like __class__.__init__ to overwrite the constructor of the class, or access other internal Python attributes.
To remediate this, you should validate that each part of the path corresponds to a valid field defined in the Pydantic model. For Pydantic v2 models, you can check obj.model_fields to ensure only intended configuration fields are modified.
def set_nested_attr(obj: Any, path: str, value: Any) -> None:
parts = path.split(".")
for part in parts[:-1]:
if hasattr(obj, "model_fields") and part not in obj.model_fields:
raise ValueError(f"Invalid config field: {part}")
if part.startswith("__"):
raise ValueError(f"Access to internal attribute {part} is forbidden")
obj = getattr(obj, part)
last_part = parts[-1]
if hasattr(obj, "model_fields") and last_part not in obj.model_fields:
raise ValueError(f"Invalid config field: {last_part}")
if last_part.startswith("__"):
raise ValueError(f"Access to internal attribute {last_part} is forbidden")
setattr(obj, last_part, value)| data_dir = args.data_dir or os.environ.get("DATA_DIR", None) | ||
| num_gpus = args.num_gpus or int(os.environ.get("NUM_GPUS", "4")) | ||
| logger = args.logger or os.environ.get("LOGGER", "wandb") | ||
| inference_backend = args.inference_backend or os.environ.get("INFERENCE_BACKEND", "vllm") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the or operator for setting default values can lead to unexpected behavior. For instance, if a user provides 0 for --num-gpus or an empty string for a path, these falsy values will be ignored, and the default from the environment variable or the hardcoded value will be used instead. This can be surprising and lead to incorrect configurations.
To ensure user-provided values are always respected, it's safer to explicitly check for None before falling back to a default.
| data_dir = args.data_dir or os.environ.get("DATA_DIR", None) | |
| num_gpus = args.num_gpus or int(os.environ.get("NUM_GPUS", "4")) | |
| logger = args.logger or os.environ.get("LOGGER", "wandb") | |
| inference_backend = args.inference_backend or os.environ.get("INFERENCE_BACKEND", "vllm") | |
| data_dir = args.data_dir if args.data_dir is not None else os.environ.get("DATA_DIR", None) | |
| num_gpus = args.num_gpus if args.num_gpus is not None else int(os.environ.get("NUM_GPUS", "4")) | |
| logger = args.logger if args.logger is not None else os.environ.get("LOGGER", "wandb") | |
| inference_backend = args.inference_backend if args.inference_backend is not None else os.environ.get("INFERENCE_BACKEND", "vllm") |
| def get_nested_attr(obj: Any, path: str) -> Any: | ||
| """ | ||
| Get a nested attribute from a Pydantic model using dot notation. | ||
|
|
||
| Args: | ||
| obj: The root Pydantic model | ||
| path: Dot-separated path (e.g., "trainer.policy.model.path") | ||
|
|
||
| Returns: | ||
| The value at the specified path | ||
|
|
||
| Example: | ||
| >>> cfg = SkyRLConfig(...) | ||
| >>> model_path = get_nested_attr(cfg, "trainer.policy.model.path") | ||
| """ | ||
| parts = path.split(".") | ||
| for part in parts: | ||
| obj = getattr(obj, part) | ||
| return obj | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The get_nested_attr function allows arbitrary attribute access because it does not validate the path argument. An attacker could use this to read sensitive internal attributes of the Python objects. Similar to set_nested_attr, you should validate that the path only traverses allowed fields of the Pydantic model.
def get_nested_attr(obj: Any, path: str) -> Any:
parts = path.split(".")
for part in parts:
if hasattr(obj, "model_fields") and part not in obj.model_fields:
raise ValueError(f"Invalid config field: {part}")
if part.startswith("__"):
raise ValueError(f"Access to internal attribute {part} is forbidden")
obj = getattr(obj, part)
return obj| 1. **No CLI overrides for Python configs**: When using pure Python configs, you can't override via CLI like Hydra does | ||
| - **Workaround**: Use environment variables or modify Python code | ||
| - **Planned**: Add argparse-based override system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This documentation appears to be out of sync with the implementation in this pull request. The PR successfully adds CLI override support for Python configurations, but this document still lists it as a current limitation and a future to-do item.
To align the documentation with the code, please consider the following updates:
- Line 37: Mark
[ ] Add CLI override support for Python-based configsas completed ([x]). - Lines 223-225: Remove this section, as the limitation has been addressed.
- Line 236: Mark
[ ] **CLI Override Support**as completed ([x]).
| import os | ||
| from pathlib import Path | ||
|
|
||
| from skyrl_train.config.configs import create_default_config, SkyRLConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| from skyrl_train.config.configs import ( | ||
| DataConfig, TrainerConfig, GeneratorConfig, EnvironmentConfig, | ||
| PlacementConfig, PolicyConfig, RefConfig, CriticConfig, | ||
| ModelConfig, CriticModelConfig, OptimizerConfig, AlgorithmConfig, | ||
| SamplingParamsConfig | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| import os | ||
| from pathlib import Path | ||
|
|
||
| from skyrl_train.config.configs import create_default_config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| from skyrl_train.config.configs import ( | ||
| SkyRLConfig, DataConfig, TrainerConfig, GeneratorConfig, EnvironmentConfig, | ||
| PlacementConfig, PolicyConfig, RefConfig, CriticConfig, | ||
| ModelConfig, CriticModelConfig, OptimizerConfig, AlgorithmConfig, | ||
| SamplingParamsConfig | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Renamed cli_overrides.py to cli.py (cleaner name) - Simplified main() function to remove redundant argparse arguments - All config fields can be overridden via --key.path=value syntax - Environment variables still supported for convenience params Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add _validate_path() to prevent access to private/dunder attributes - Remove unused create_default_config import from run_gsm8k.py - Replace flawed parse_args_with_overrides with simpler collect_overrides Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
thoughts on making these just simple python dataclasses rather than opting for full blown pydantic configs? just feeling that pydantic sometimes comes with heavy weight and annoying validation which can slow down projects (for example, it's sometimes pretty annoying for testing to construct valid pydantic configs that have just a subset of arguments). We do a lot of config validation in skyrl-train already, and adding new fields to a dataclass is easier. We could use pydantic selectively for some specific configs that need extra validation, which seems like a common pattern. For example, vLLM and huggingface EngineArgs and TrainingArgs use python dataclasses for the top level config, but have some specific configs that use pydantic for validation. |
🎯 Overview
This PR introduces a comprehensive Pydantic-based configuration system for SkyRL training while maintaining full backward compatibility with existing YAML/Hydra workflows.
✨ Key Features
1. Type-Safe Pydantic Models
ppo_base_config.yamlstructure2. Unified Entry Point
run_training(cfg: Union[DictConfig, SkyRLConfig])function3. CLI Override Support ✨
--trainer.epochs=30 --trainer.policy.model.path=Qwen/Qwen2.5-7B4. Compositional Config Building
5. Full Backward Compatibility
.shscripts work unchanged📦 What's New
skyrl_train/config/configs.py- Pydantic models (800+ lines)skyrl_train/config/cli_overrides.py- CLI override utilitiesexamples/gsm8k/run_gsm8k.py- Python-based config exampleexamples/gsm8k/test_run_gsm8k.py- Test configPYDANTIC_CONFIG_PROJECT.md- Full documentation🧪 Testing
✅ Successfully ran full training with Pydantic config on 4 GPUs
✅ Verified backward compatibility with YAML configs
✅ Tested CLI override parsing and application
See
PYDANTIC_CONFIG_PROJECT.mdfor complete documentation!