Add Pydantic Configuration System with CLI Override Support #946

tyler-griggs · 2026-01-25T00:57:24Z

🎯 Overview

This PR introduces a comprehensive Pydantic-based configuration system for SkyRL training while maintaining full backward compatibility with existing YAML/Hydra workflows.

✨ Key Features

1. Type-Safe Pydantic Models

30+ Pydantic model classes with full type annotations
1:1 mapping to ppo_base_config.yaml structure
IDE autocomplete and type checking support
Runtime validation of configuration values

2. Unified Entry Point

New run_training(cfg: Union[DictConfig, SkyRLConfig]) function
Accepts both YAML-based DictConfig and Python-based Pydantic configs
Automatic type detection and conversion

3. CLI Override Support ✨

NEW: CLI overrides work with Python configs!
Syntax: --trainer.epochs=30 --trainer.policy.model.path=Qwen/Qwen2.5-7B
Supports all Python types (int, float, bool, strings, lists)
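
For readers unfamiliar with the override syntax, here is a minimal sketch of how a `--key.path=value` argument could be split and coerced to a Python type. This is not the code from this PR: `parse_override` and `coerce_value` are hypothetical names, and using `ast.literal_eval` for type coercion is an assumption made for illustration.

```python
import ast
from typing import Any, Tuple


def coerce_value(raw: str) -> Any:
    """Turn a raw CLI string into a Python value, falling back to the string itself."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered in ("none", "null"):
        return None
    try:
        # Handles ints, floats, quoted strings, lists, tuples, and dicts.
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        # Bare words such as model paths are kept as plain strings.
        return raw


def parse_override(arg: str) -> Tuple[str, Any]:
    """Split '--trainer.epochs=30' into ('trainer.epochs', 30). Hypothetical helper."""
    if not arg.startswith("--") or "=" not in arg:
        raise ValueError(f"Expected --key.path=value, got: {arg!r}")
    key, raw = arg[2:].split("=", 1)
    if not key:
        raise ValueError(f"Missing key in override: {arg!r}")
    return key, coerce_value(raw)
```

Splitting on the first `=` only keeps values that themselves contain `=` intact.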

4. Compositional Config Building

RLlib-style nested configuration construction
More readable than mutation-based style

5. Full Backward Compatibility

All existing .sh scripts work unchanged
YAML + Hydra CLI overrides still supported

📦 What's New

skyrl_train/config/configs.py - Pydantic models (800+ lines)
skyrl_train/config/cli_overrides.py - CLI override utilities
examples/gsm8k/run_gsm8k.py - Python-based config example
examples/gsm8k/test_run_gsm8k.py - Test config
PYDANTIC_CONFIG_PROJECT.md - Full documentation

🧪 Testing

✅ Successfully ran full training with Pydantic config on 4 GPUs
✅ Verified backward compatibility with YAML configs
✅ Tested CLI override parsing and application

See PYDANTIC_CONFIG_PROJECT.md for complete documentation!

- Add log_to_driver=False to ray.init() to suppress worker/raylet log forwarding
- Set RAY_BACKEND_LOG_LEVEL=fatal to suppress C++ metrics exporter errors
- Add SKYRL_DEBUG_LOGGING env var to re-enable verbose logging when needed

This significantly reduces stdout noise during training runs while still preserving all logs in Ray's log files for debugging.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add log_to_driver=False to ray.init() to suppress worker/raylet log forwarding
- Set RAY_BACKEND_LOG_LEVEL=fatal to suppress C++ metrics exporter errors
- Use existing LOG_LEVEL=DEBUG to re-enable verbose logging when needed

This significantly reduces stdout noise during training runs while still preserving all logs in Ray's log files for debugging.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request introduces a comprehensive Pydantic-based configuration system, significantly improving type safety and programmatic configuration, including new support for CLI overrides. A critical security vulnerability has been identified in the implementation of CLI overrides within set_nested_attr and get_nested_attr. The use of unvalidated dot-notation paths with getattr and setattr allows for Attribute Injection and Class Pollution, which could enable an attacker to corrupt the application state or bypass security controls if training scripts are executed with untrusted command-line arguments. Strict validation is recommended to ensure only defined Pydantic fields can be accessed or modified. Additionally, a critical functional issue was found in the CLI override parsing logic that could cause it to fail for certain argument formats, and a high-severity bug in default value handling might lead to user-provided arguments being ignored. There are also several medium-severity suggestions for code quality, such as removing unused imports, adhering to import location best practices, and updating the project documentation.

gemini-code-assist · 2026-01-25T01:00:03Z

skyrl-train/skyrl_train/config/cli.py

+                # It's a flag or argument with value
+                # Check if it's defined in parser
+                try:
+                    # Try to parse just this arg to see if it's known
+                    test_args = known_args + [arg]
+                    if i + 1 < len(args) and not args[i + 1].startswith("--"):
+                        test_args.append(args[i + 1])
+                    parser.parse_known_args(test_args)
+                    # If successful, it's a known arg
+                    known_args.append(arg)
+                    if i + 1 < len(args) and not args[i + 1].startswith("--"):
+                        known_args.append(args[i + 1])
+                        i += 2
+                    else:
+                        i += 1
+                except:
+                    # Unknown flag, might be override without =
+                    # Treat as override if next item exists and doesn't start with --
+                    if i + 1 < len(args) and not args[i + 1].startswith("--"):
+                        override_str = f"{arg[2:]}={args[i + 1]}"
+                        overrides.append(override_str)
+                        i += 2
+                    else:
+                        known_args.append(arg)
+                        i += 1


The logic to distinguish between known argparse arguments and unknown config overrides is flawed. The try-except block on lines 158-171 will not catch unknown arguments as intended.

parser.parse_known_args() does not raise an exception for unknown arguments; instead, it returns them in the second element of the result tuple. Your current implementation will incorrectly classify unknown arguments (like --trainer.policy.model.path ... when not using =) as known arguments, which will cause parser.parse_args() to fail later.

This logic needs to be revised to correctly check the unknown list returned by parse_known_args() to properly identify and handle config overrides.
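
One possible shape for the revised logic, as a rough sketch only: call `parse_known_args()` once and walk the returned list of unknown tokens. The helper name `split_known_and_overrides` and the `--data-dir` option are assumptions for illustration, not code from this PR.

```python
import argparse
from typing import List, Tuple


def split_known_and_overrides(
    parser: argparse.ArgumentParser, argv: List[str]
) -> Tuple[argparse.Namespace, List[str]]:
    """Return parsed known args plus 'key.path=value' override strings (hypothetical helper)."""
    # parse_known_args() never raises for unknown flags; it returns them
    # as the second element of the result tuple.
    namespace, unknown = parser.parse_known_args(argv)
    overrides: List[str] = []
    i = 0
    while i < len(unknown):
        token = unknown[i]
        if not token.startswith("--"):
            raise ValueError(f"Unexpected positional argument: {token!r}")
        body = token[2:]
        if "=" in body:
            # Form: --key.path=value
            overrides.append(body)
            i += 1
        elif i + 1 < len(unknown) and not unknown[i + 1].startswith("--"):
            # Form: --key.path value
            overrides.append(f"{body}={unknown[i + 1]}")
            i += 2
        else:
            raise ValueError(f"Override {token!r} is missing a value")
    return namespace, overrides


# Example parser; allow_abbrev=False avoids an override being matched
# as an abbreviation of a known option.
parser = argparse.ArgumentParser(allow_abbrev=False)
parser.add_argument("--data-dir", default=None)
```

Rejecting a value-less unknown flag, rather than passing it along as a known argument, is a design choice in this sketch; it surfaces typos early instead of failing later in `parse_args()`.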

gemini-code-assist · 2026-01-25T01:00:03Z

skyrl-train/skyrl_train/config/configs.py

+def set_nested_attr(obj: Any, path: str, value: Any) -> None:
+    """
+    Set a nested attribute on a Pydantic model using dot notation.
+
+    Args:
+        obj: The root Pydantic model
+        path: Dot-separated path (e.g., "trainer.policy.model.path")
+        value: Value to set
+
+    Example:
+        >>> cfg = SkyRLConfig(...)
+        >>> set_nested_attr(cfg, "trainer.policy.model.path", "Qwen/Qwen2.5-1.5B")
+    """
+    parts = path.split(".")
+    for part in parts[:-1]:
+        obj = getattr(obj, part)
+    setattr(obj, parts[-1], value)
+


The set_nested_attr function is vulnerable to Attribute Injection (and potentially Class Pollution). It uses getattr and setattr to traverse and modify the configuration object based on a dot-separated path provided via CLI overrides. Since the path is not validated, an attacker can provide a path like __class__.__init__ to overwrite the constructor of the class, or access other internal Python attributes.

To remediate this, you should validate that each part of the path corresponds to a valid field defined in the Pydantic model. For Pydantic v2 models, you can check obj.model_fields to ensure only intended configuration fields are modified.

def set_nested_attr(obj: Any, path: str, value: Any) -> None:
    parts = path.split(".")
    for part in parts[:-1]:
        if hasattr(obj, "model_fields") and part not in obj.model_fields:
            raise ValueError(f"Invalid config field: {part}")
        if part.startswith("__"):
            raise ValueError(f"Access to internal attribute {part} is forbidden")
        obj = getattr(obj, part)
    last_part = parts[-1]
    if hasattr(obj, "model_fields") and last_part not in obj.model_fields:
        raise ValueError(f"Invalid config field: {last_part}")
    if last_part.startswith("__"):
        raise ValueError(f"Access to internal attribute {last_part} is forbidden")
    setattr(obj, last_part, value)

gemini-code-assist · 2026-01-25T01:00:03Z

skyrl-train/examples/gsm8k/run_gsm8k.py

+    data_dir = args.data_dir or os.environ.get("DATA_DIR", None)
+    num_gpus = args.num_gpus or int(os.environ.get("NUM_GPUS", "4"))
+    logger = args.logger or os.environ.get("LOGGER", "wandb")
+    inference_backend = args.inference_backend or os.environ.get("INFERENCE_BACKEND", "vllm")


Using the or operator for setting default values can lead to unexpected behavior. For instance, if a user provides 0 for --num-gpus or an empty string for a path, these falsy values will be ignored, and the default from the environment variable or the hardcoded value will be used instead. This can be surprising and lead to incorrect configurations.

To ensure user-provided values are always respected, it's safer to explicitly check for None before falling back to a default.

Suggested change

-    data_dir = args.data_dir or os.environ.get("DATA_DIR", None)
-    num_gpus = args.num_gpus or int(os.environ.get("NUM_GPUS", "4"))
-    logger = args.logger or os.environ.get("LOGGER", "wandb")
-    inference_backend = args.inference_backend or os.environ.get("INFERENCE_BACKEND", "vllm")
+    data_dir = args.data_dir if args.data_dir is not None else os.environ.get("DATA_DIR", None)
+    num_gpus = args.num_gpus if args.num_gpus is not None else int(os.environ.get("NUM_GPUS", "4"))
+    logger = args.logger if args.logger is not None else os.environ.get("LOGGER", "wandb")
+    inference_backend = args.inference_backend if args.inference_backend is not None else os.environ.get("INFERENCE_BACKEND", "vllm")
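
To make the difference concrete, here is a small self-contained sketch comparing the two fallback styles. The function names are hypothetical, and the environment is passed in as a mapping (rather than read from `os.environ`) purely so the behavior is easy to check in isolation.

```python
from typing import Mapping, Optional


def num_gpus_with_or(cli_value: Optional[int], env: Mapping[str, str]) -> int:
    # `or` treats every falsy value (0, "", None) as "not provided".
    return cli_value or int(env.get("NUM_GPUS", "4"))


def num_gpus_with_none_check(cli_value: Optional[int], env: Mapping[str, str]) -> int:
    # Only None means "not provided"; an explicit 0 is respected.
    return cli_value if cli_value is not None else int(env.get("NUM_GPUS", "4"))


# An explicit --num-gpus 0 is silently replaced by the default with `or` ...
assert num_gpus_with_or(0, {}) == 4
# ... but is kept by the explicit None check.
assert num_gpus_with_none_check(0, {}) == 0
```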

gemini-code-assist · 2026-01-25T01:00:04Z

skyrl-train/skyrl_train/config/configs.py

+def get_nested_attr(obj: Any, path: str) -> Any:
+    """
+    Get a nested attribute from a Pydantic model using dot notation.
+
+    Args:
+        obj: The root Pydantic model
+        path: Dot-separated path (e.g., "trainer.policy.model.path")
+
+    Returns:
+        The value at the specified path
+
+    Example:
+        >>> cfg = SkyRLConfig(...)
+        >>> model_path = get_nested_attr(cfg, "trainer.policy.model.path")
+    """
+    parts = path.split(".")
+    for part in parts:
+        obj = getattr(obj, part)
+    return obj
+


The get_nested_attr function allows arbitrary attribute access because it does not validate the path argument. An attacker could use this to read sensitive internal attributes of the Python objects. Similar to set_nested_attr, you should validate that the path only traverses allowed fields of the Pydantic model.

def get_nested_attr(obj: Any, path: str) -> Any:
    parts = path.split(".")
    for part in parts:
        if hasattr(obj, "model_fields") and part not in obj.model_fields:
            raise ValueError(f"Invalid config field: {part}")
        if part.startswith("__"):
            raise ValueError(f"Access to internal attribute {part} is forbidden")
        obj = getattr(obj, part)
    return obj
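
Since the getter and setter need the same checks, the validation could be factored into one helper. The sketch below is an illustration under stated assumptions, not the fix that landed in this PR: `validate_part`, `get_nested_attr_checked`, and `set_nested_attr_checked` are hypothetical names, and `ModelStub`/`ConfigStub` are plain stand-in classes that only mimic the `model_fields` attribute of a Pydantic v2 model so the example runs without Pydantic installed. It is also slightly stricter than the suggestion above, rejecting any name that starts with a single underscore.

```python
from typing import Any


def validate_part(obj: Any, part: str) -> None:
    """Reject empty, private, or undeclared attribute names (hypothetical helper)."""
    if not part or part.startswith("_"):
        raise ValueError(f"Access to internal attribute {part!r} is forbidden")
    # Look on the class, where Pydantic v2 declares `model_fields`.
    fields = getattr(type(obj), "model_fields", None)
    if fields is not None and part not in fields:
        raise ValueError(f"Invalid config field: {part!r}")


def get_nested_attr_checked(obj: Any, path: str) -> Any:
    for part in path.split("."):
        validate_part(obj, part)
        obj = getattr(obj, part)
    return obj


def set_nested_attr_checked(obj: Any, path: str, value: Any) -> None:
    *parents, leaf = path.split(".")
    for part in parents:
        validate_part(obj, part)
        obj = getattr(obj, part)
    validate_part(obj, leaf)
    setattr(obj, leaf, value)


class ModelStub:
    """Stand-in for a Pydantic model; only mimics `model_fields`."""
    model_fields = {"path": None}

    def __init__(self) -> None:
        self.path = "Qwen/Qwen2.5-1.5B"


class ConfigStub:
    """Stand-in root config with one nested model."""
    model_fields = {"model": None}

    def __init__(self) -> None:
        self.model = ModelStub()
```

With these stubs, `set_nested_attr_checked(ConfigStub(), "__class__.__init__", None)` raises `ValueError` instead of reaching `setattr`.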

gemini-code-assist · 2026-01-25T01:00:04Z

skyrl-train/PYDANTIC_CONFIG_PROJECT.md

+1. **No CLI overrides for Python configs**: When using pure Python configs, you can't override via CLI like Hydra does
+   - **Workaround**: Use environment variables or modify Python code
+   - **Planned**: Add argparse-based override system


This documentation appears to be out of sync with the implementation in this pull request. The PR successfully adds CLI override support for Python configurations, but this document still lists it as a current limitation and a future to-do item.

To align the documentation with the code, please consider the following updates:

Line 37: Mark [ ] Add CLI override support for Python-based configs as completed ([x]).

Lines 223-225: Remove this section, as the limitation has been addressed.

Line 236: Mark [ ] **CLI Override Support** as completed ([x]).

gemini-code-assist · 2026-01-25T01:00:04Z

skyrl-train/examples/gsm8k/run_gsm8k.py

+import os
+from pathlib import Path
+
+from skyrl_train.config.configs import create_default_config, SkyRLConfig


The create_default_config import is not used in this file and can be removed to keep the imports clean.

Suggested change

-from skyrl_train.config.configs import create_default_config, SkyRLConfig
+from skyrl_train.config.configs import SkyRLConfig

gemini-code-assist · 2026-01-25T01:00:04Z

skyrl-train/examples/gsm8k/run_gsm8k.py

+    from skyrl_train.config.configs import (
+        DataConfig, TrainerConfig, GeneratorConfig, EnvironmentConfig,
+        PlacementConfig, PolicyConfig, RefConfig, CriticConfig,
+        ModelConfig, CriticModelConfig, OptimizerConfig, AlgorithmConfig,
+        SamplingParamsConfig
+    )


According to PEP 8, imports should generally be placed at the top of the file. Moving this local import out of the get_gsm8k_config function would improve readability and align with standard Python conventions. There doesn't appear to be a circular dependency that would require a local import here.

gemini-code-assist · 2026-01-25T01:00:04Z

skyrl-train/examples/gsm8k/test_run_gsm8k.py

+import os
+from pathlib import Path
+
+from skyrl_train.config.configs import create_default_config


The create_default_config import is unused in this file and can be removed.

gemini-code-assist · 2026-01-25T01:00:04Z

skyrl-train/examples/gsm8k/test_run_gsm8k.py

+    from skyrl_train.config.configs import (
+        SkyRLConfig, DataConfig, TrainerConfig, GeneratorConfig, EnvironmentConfig,
+        PlacementConfig, PolicyConfig, RefConfig, CriticConfig,
+        ModelConfig, CriticModelConfig, OptimizerConfig, AlgorithmConfig,
+        SamplingParamsConfig
+    )


According to PEP 8, imports should generally be placed at the top of the file. Moving this local import out of the get_test_config function would improve readability and align with standard Python conventions. There doesn't appear to be a circular dependency that would require a local import here.

- Renamed cli_overrides.py to cli.py (cleaner name)
- Simplified main() function to remove redundant argparse arguments
- All config fields can be overridden via --key.path=value syntax
- Environment variables still supported for convenience params

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add _validate_path() to prevent access to private/dunder attributes
- Remove unused create_default_config import from run_gsm8k.py
- Replace flawed parse_args_with_overrides with simpler collect_overrides

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

erictang000 · 2026-01-26T23:55:58Z

Thoughts on making these simple Python dataclasses rather than opting for full-blown Pydantic configs?

My feeling is that Pydantic sometimes comes with heavyweight, annoying validation that can slow projects down (for example, it can be pretty annoying in tests to construct valid Pydantic configs that set only a subset of arguments). We already do a lot of config validation in skyrl-train, and adding new fields to a dataclass is easier. We could use Pydantic selectively for the specific configs that need extra validation, which seems like a common pattern.

For example, vLLM and Hugging Face use Python dataclasses for their top-level configs (EngineArgs and TrainingArgs), but use Pydantic for some specific configs that need validation.
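
As a rough illustration of the dataclass alternative, here is a sketch with a plain top-level config and validation confined to the one nested config that needs it. The field names and defaults are hypothetical and not taken from ppo_base_config.yaml, and the validation uses `__post_init__` as a stdlib stand-in for the selective Pydantic validation mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class OptimizerConfig:
    # Hypothetical field and default, for illustration only.
    lr: float = 1e-6

    def __post_init__(self) -> None:
        # Validation lives only where it is needed.
        if self.lr <= 0:
            raise ValueError(f"lr must be positive, got {self.lr}")


@dataclass
class TrainerConfig:
    # Every field has a default, so tests can override just a subset.
    epochs: int = 20
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)


# Constructing a partial config needs no boilerplate.
cfg = TrainerConfig(epochs=30)
```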

tyler-griggs and others added 2 commits January 25, 2026 00:49

gemini-code-assist bot reviewed Jan 25, 2026

tyler-griggs and others added 2 commits January 25, 2026 01:06
