306 refactor data generators #331
Pull Request Overview
This pull request refactors the data generator modules for the interactive regression models and updates documentation examples and imports across multiple modules. Key changes include the addition of new data generator functions under the doubleml/irm/datasets package, modifications to docstrings and sample code block formatting in multiple files, and adjustment of import paths in tests and examples.
Reviewed Changes
Copilot reviewed 73 out of 73 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| doubleml/irm/datasets/dgp_irm_data.py | New data generator for interactive regression models. |
| doubleml/irm/datasets/dgp_iivm_data.py | New data generator for interactive IV models. |
| doubleml/irm/datasets/dgp_heterogeneous_data.py | New data generator for heterogeneous treatment effects. |
| doubleml/irm/datasets/dgp_confounded_irm_data.py | New data generator for confounded IRM models with a proposed minor improvement to a logical condition. |
| doubleml/irm/cvar.py, doubleml/irm/apos.py, doubleml/double_ml.py, doubleml/did/* | Updated documentation examples and import paths. |
| doubleml/datasets/, doubleml/data/, .github/ISSUE_TEMPLATE/bug_report.yml | Updated example snippets’ formatting and module import paths. |
Comments suppressed due to low confidence (3)
doubleml/double_ml.py:1170
- The example code block formatting appears to concatenate two import statements on one line; please split them into separate lines to ensure proper formatting in documentation.
>>> import numpy as np >>> import doubleml as dml
doubleml/double_ml.py:1286
- The code snippet delimiter and the following command are merged on one line; adjust the formatting to ensure each command is on a separate line for clarity.
-------- >>> import numpy as np
.github/ISSUE_TEMPLATE/bug_report.yml:26
- The code block opening in the issue template is misformatted due to extra spacing; ensure the markdown syntax for code blocks is correctly placed on its own line.
Please provide a short reproducible code snippet. Example: ```python
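The formatting concerns above can be checked mechanically. What follows is a minimal sketch, not code from this pull request: `scaled_mean` is a hypothetical function, and its docstring follows the layout the comments ask for, with the `--------` underline on its own line and each `>>>` statement on its own line. The standard-library `doctest` module then runs the examples.

```python
import doctest


def scaled_mean(values, scale=1.0):
    """Return the mean of ``values`` multiplied by ``scale``.

    Examples
    --------
    >>> vals = [1.0, 2.0, 3.0]
    >>> scaled_mean(vals)
    2.0
    >>> scaled_mean(vals, scale=2.0)
    4.0
    """
    return scale * sum(values) / len(values)


# Parse the docstring into a DocTest object and run its examples.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    scaled_mean.__doc__,            # text containing the >>> examples
    {"scaled_mean": scaled_mean},   # names visible to the examples
    "scaled_mean",                  # label used in failure reports
    None,                           # no source file
    0,                              # starting line number
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # TestResults(failed=..., attempted=...)
```

If two statements were merged onto one `>>>` line, as in the quoted snippets, the parser would treat the line as a single statement and the example would fail with a syntax error.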
        1 / (m_long * (1 - m_long))
    )
    cf_d_atte = (np.mean(propensity_ratio_long) - np.mean(propensity_ratio_short)) / np.mean(propensity_ratio_long)
    if (beta_a == 0) | (gamma_a == 0):
Copilot AI, Jun 4, 2025
Consider using the logical 'or' operator instead of the bitwise '|' for scalar comparisons to improve code clarity (e.g., 'if beta_a == 0 or gamma_a == 0:').
Suggested change:
    - if (beta_a == 0) | (gamma_a == 0):
    + if (beta_a == 0) or (gamma_a == 0):
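To make the trade-off behind this suggestion concrete, here is a small sketch using made-up values; the numbers and array names are assumptions, only `beta_a` and `gamma_a` come from the diff. For parenthesized scalar comparisons both operators give the same truth value. They differ in operator precedence, and in how they treat NumPy arrays.

```python
import numpy as np

# Hypothetical scalar values standing in for the function arguments.
beta_a, gamma_a = 0.0, 0.5

# With parentheses, both forms agree for scalars.
bitwise = (beta_a == 0) | (gamma_a == 0)
logical = (beta_a == 0) or (gamma_a == 0)

# Without parentheses, `|` binds tighter than `==`, so the expression
# becomes the chained comparison 1 == (0 | 0) == 0, which is False.
unparenthesized = 1 == 0 | 0 == 0
parenthesized = (1 == 0) | (0 == 0)

# For arrays, `|` works elementwise, while `or` needs a single truth value.
a = np.array([0.0, 1.0])
g = np.array([1.0, 0.0])
elementwise = (a == 0) | (g == 0)

try:
    (a == 0) or (g == 0)
    array_or_failed = False
except ValueError:
    # "The truth value of an array with more than one element is ambiguous."
    array_or_failed = True
```

So `or` is the clearer choice as long as both arguments are guaranteed to be scalars. If array inputs were ever allowed, the elementwise `|` combined with an explicit `.any()` or `.all()` would be needed instead.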
All tests passed except for one DID test (RDD was not tested at all, since it was not changed):
    >       agg_weights[selected_unique_e_values >= 0] = 1 / np.sum(selected_unique_e_values >= 0)
    E       numpy.core._exceptions._UFuncInputCastingError: Cannot cast ufunc 'greater_equal' input 0 from dtype('<m8[M]') to dtype('<m8') with casting rule 'same_kind'

    doubleml\did\utils\_aggregation.py:203: UFuncTypeError
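The dtype `<m8[M]` in the message is `timedelta64` with month resolution, so the failure comes from comparing month-valued timedeltas against the plain integer `0`. One possible way to sidestep the cast is sketched below; it is not necessarily how the project resolved it. The sketch assumes the event-time values form a `timedelta64[M]` array, and `e_values` is a hypothetical stand-in for `selected_unique_e_values`. It converts the values to integer month counts before comparing.

```python
import numpy as np

# Hypothetical event-time values, in months relative to treatment.
e_values = np.array([-2, -1, 0, 1, 2], dtype="timedelta64[M]")

# Convert to plain integer month counts so the comparison with 0
# involves no timedelta unit casting.
e_months = e_values.astype(np.int64)
post_treatment = e_months >= 0

# Equal weights over the non-negative event times, zero elsewhere.
agg_weights = np.zeros(e_values.shape[0])
agg_weights[post_treatment] = 1 / np.sum(post_treatment)
```

An alternative under the same assumption is to keep the timedelta dtype and compare against `np.timedelta64(0, "M")`, so that both operands share the same unit.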