Custom curves upload, validation and to/from df #68
Conversation
Pull Request Overview
This PR introduces custom curve upload functionality to pyetm, allowing users to upload custom curves from pandas DataFrames to the API. The implementation includes validation for curve data, serialization/deserialization to/from DataFrames, and a new upload runner service.
- Adds custom curve upload via the update_custom_curves() method on scenarios (see the usage sketch below)
- Implements validation for curve length (8760 values) and numeric data types
- Provides DataFrame serialization/deserialization for custom curves
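To make the entry point concrete, here is a minimal usage sketch. Only the `update_custom_curves()` method name and the 8760-value, one-column-per-curve DataFrame shape come from this PR; the `Scenario` import path, the constructor call, and the curve key are illustrative assumptions, and whether the method accepts the DataFrame directly or a model built from it is not shown in the diff hunks below.

```python
import pandas as pd
from pyetm.models.scenario import Scenario  # module path taken from the file table below

# Hypothetical input: one column per custom curve, 8760 hourly values each
df = pd.DataFrame({
    "my_custom_price_curve": pd.read_csv("price_curve.csv", header=None).iloc[:, 0],
})

scenario = Scenario(scenario_id=12345)  # illustrative construction, not defined by this PR
scenario.update_custom_curves(df)       # validates length/dtype, then uploads via the runner
```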
Reviewed Changes
Copilot reviewed 12 out of 15 changed files in this pull request and generated 6 comments.
File | Description |
---|---|
src/pyetm/services/scenario_runners/update_custom_curves.py | New runner service for uploading custom curves to the API |
src/pyetm/models/custom_curves.py | Adds DataFrame serialization, validation, and upload functionality |
src/pyetm/models/scenario.py | Adds update_custom_curves() method for uploading curves |
src/pyetm/models/base.py | Implements from_dataframe() class method infrastructure |
src/pyetm/models/warnings.py | Removes deprecated to_legacy_dict() method |
tests/ | Comprehensive test coverage for new functionality |
examples/excel_to_scenario.ipynb | Example notebook demonstrating curve upload workflow |
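Based on the base.py and custom_curves.py entries above, the DataFrame round-trip presumably looks roughly like the sketch below. Only `from_dataframe()` is named in this PR; the `CustomCurves` class name and a `to_dataframe()` counterpart are assumptions made for illustration.

```python
import pandas as pd
from pyetm.models.custom_curves import CustomCurves  # class name assumed

# One column per curve, 8760 rows, no header row
df = pd.read_csv("curves.csv", header=None)
df.columns = ["my_custom_curve"]

curves = CustomCurves.from_dataframe(df)  # from_dataframe() infrastructure added in base.py
round_trip = curves.to_dataframe()        # serialization back to a DataFrame (name assumed)
assert len(round_trip) == 8760
```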
```python
try:
    if curve.file_path and curve.file_path.exists():
        # Use file
        with open(curve.file_path, "rb") as f:
```
Copilot AI · Aug 7, 2025
Opening the file in binary mode ("rb") but then iterating over curve data as text in the else branch (line 45) creates an inconsistency. The file should be opened in text mode ("r") for CSV files, or the string conversion logic needs to handle bytes properly.
Suggested change:
```diff
-        with open(curve.file_path, "rb") as f:
+        with open(curve.file_path, "r") as f:
```
```python
    pd.to_numeric(raw_data.iloc[:, 0], errors='raise')
except (ValueError, TypeError):
    curve_warnings.add(
        curve.key, "Curve contains non-numeric values"
```
Copilot AI · Aug 7, 2025
The error handling for non-numeric values uses a generic ValueError/TypeError catch, but the user-facing error message 'Curve contains non-numeric values' doesn't provide information about which specific values failed or where they are located in the data.
Suggested change:
```diff
-        curve.key, "Curve contains non-numeric values"
+    # Find non-numeric values and their indices
+    col = raw_data.iloc[:, 0]
+    numeric_col = pd.to_numeric(col, errors='coerce')
+    non_numeric_mask = numeric_col.isna() & ~col.isna()
+    non_numeric_indices = col[non_numeric_mask].index.tolist()
+    non_numeric_values = col[non_numeric_mask].tolist()
+    # Show up to 5 offending values for brevity
+    details = ", ".join(
+        f"row {i+1}: '{v}'" for i, v in zip(non_numeric_indices[:5], non_numeric_values[:5])
+    )
+    if len(non_numeric_indices) > 5:
+        details += f", ... ({len(non_numeric_indices)} total)"
+    curve_warnings.add(
+        curve.key,
+        f"Curve contains non-numeric values at {details}"
```
```python
    raise ScenarioError(f"Could not update custom curves: {error_summary}")

# Upload curves
result = UpdateCustomCurvesRunner.run(BaseClient(), self, custom_curves)
```
Copilot AI · Aug 7, 2025
Creating a new BaseClient() instance instead of using the existing client could lead to authentication or configuration issues. The client should be passed as a parameter or obtained from the scenario context.
Suggested change:
```diff
-result = UpdateCustomCurvesRunner.run(BaseClient(), self, custom_curves)
+result = UpdateCustomCurvesRunner.run(self.client, self, custom_curves)
```
```python
def temp_curve_files():
    """Fixture that creates temporary curve files for testing"""
    temp_dir = Path("/tmp/test_update_curves")
    temp_dir.mkdir(exist_ok=True)
```
Copilot AI · Aug 7, 2025
Using a hardcoded `/tmp` path is not portable across operating systems (Windows doesn't have `/tmp`). Use `tempfile.mkdtemp()` or pytest's `tmp_path` fixture for cross-platform temporary directories.
Suggested change:
```diff
-    temp_dir.mkdir(exist_ok=True)
+def temp_curve_files(tmp_path):
+    """Fixture that creates temporary curve files for testing"""
+    temp_dir = tmp_path
```
```python
    test_data.to_csv(temp_file, index=False, header=False)
    original.file_path = temp_file

    try:
```
Copilot AI · Aug 7, 2025
Using a hardcoded `/tmp` path is not portable across operating systems. Use `tempfile.mkdtemp()` or pytest's `tmp_path` fixture for better cross-platform compatibility and automatic cleanup.
Suggested change:
```diff
-    try:
+    with tempfile.TemporaryDirectory() as temp_dir:
+        temp_dir_path = Path(temp_dir)
+        temp_file = temp_dir_path / "test_curve.csv"
+        test_data.to_csv(temp_file, index=False, header=False)
+        original.file_path = temp_file
```
```python
if len(raw_data) != 8760:
    curve_warnings.add(
        curve.key,
        f"Curve must contain exactly 8,760 values, found {len(raw_data)}",
```
Copilot AI · Aug 7, 2025
The magic number 8760 (hours in a year) is hardcoded in multiple places. Consider defining it as a named constant like `HOURS_PER_YEAR = 8760` to improve maintainability.
f"Curve must contain exactly 8,760 values, found {len(raw_data)}", | |
if len(raw_data) != HOURS_PER_YEAR: | |
curve_warnings.add( | |
curve.key, | |
f"Curve must contain exactly {HOURS_PER_YEAR:,} values, found {len(raw_data)}", |
* Custom curves to and from dataframe
* Upload custom curves runner, scenario methods and validation on the model
* Custom curves: tests for runner, scenario and validation
Reviewed by myself and with Copilot. These changes allow you to upload custom curves from a pandas DataFrame to the API. For now the example reads in CSV files manually and converts them to a pandas DataFrame, in advance of the reverse packer functionality which will handle unpacking Excel/CSV.
Uploaded curves are given the type 'custom' inside pyetm. The actual type (capacity profile, etc.) will be inferred by the API on upload, so 'custom' really just signifies the source of the curve within pyetm.
Validation checks the length and the value type (float); it also tries to coerce values into floats if they are not already.
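For reference, here is a standalone sketch of that validation, reconstructed from the review snippets above (the 8,760-value length check and the `pd.to_numeric` coercion). The function name and return shape are illustrative; the PR itself records problems through a `curve_warnings` collection rather than returning a list.

```python
import pandas as pd

HOURS_PER_YEAR = 8760  # named constant, as suggested in the review

def validate_curve(series: pd.Series) -> list[str]:
    """Return human-readable problems for a single curve column (illustrative helper)."""
    problems = []
    if len(series) != HOURS_PER_YEAR:
        problems.append(
            f"Curve must contain exactly {HOURS_PER_YEAR:,} values, found {len(series)}"
        )
    # Coerce to float; values that cannot be coerced become NaN and are reported
    coerced = pd.to_numeric(series, errors="coerce")
    bad = series[coerced.isna() & series.notna()]
    if not bad.empty:
        problems.append(f"Curve contains {len(bad)} non-numeric values (e.g. {bad.iloc[0]!r})")
    return problems
```

For a clean 8,760-row float column, `validate_curve(df["my_curve"])` returns an empty list.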