Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor lib.py utilities to reduce code duplication and excessive nesting of function calls #190

Closed
5 tasks
tomvothecoder opened this issue Mar 27, 2023 · 0 comments · Fixed by #213
Closed
5 tasks
Assignees
Labels
refactor Clean up or optimize code

Comments

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Mar 27, 2023

Many utilities within lib.py contain duplicated code (conditionals, for loops, etc.). Refactoring these utilities can make it easier to extend functionality and fix bugs as needed. The script is lengthy and not well documented, thus is not easy to modify and maintain by developers.

Modules to refactor

  • handle_variables()
    • Break up logic into smaller, private functions
    • Clean up and optimize logic
    • Move functions into VarHandler class (cmorize method wraps this function right now)
    • Remove call to handle_simple() if simple=True (this should be a separate function call)
  • handle_simple()`
    • Reuse logic from handle_variables() via private functions
@tomvothecoder tomvothecoder added the refactor Clean up or optimize code label Mar 27, 2023
@tomvothecoder tomvothecoder self-assigned this Sep 19, 2023
@tomvothecoder tomvothecoder changed the title Refactor lib.py utilities to reduce code duplication Refactor lib.py utilities to reduce code duplication and excessive nesting of function calls Oct 30, 2023
tomvothecoder added a commit that referenced this issue Nov 8, 2023
1. Refactor `lib.py` (now deleted)
   - Closes #190
   - Replace `run_parallel()` and `run_serial()` with `__main__.py` -> `E3SMtoCMIP._run_parallel()` and
`E3SMtoCMIP._run_serial()`
     - Updated var name `filepaths` to `vars_to_filepaths` for clarity (its a dict mapping var key to list of string paths)
     - Update args passed to handler method, remove unnecessary args
     - Add `try` and `except` statement for submitting `pool` jobs to maintain compatibility with MPAS variable handlers, which use different handler method arguments
   - Replace `handle_variables()`, `get_dimension_data()` and `load_axis()` with `VarHandler.cmorize()`
   - Delete `handle_simple()` -- _will be re-implemented from scratch in #130_
3. Update `handler.py`
   - Update `VarHandler.cmorize()`
      - Refactored significantly -- extracted logic into smaller, maintainable private methods
      - Replaced `data` dictionary storing `xr.DataArrays`  with `ds` (`xr.Dataset` object)
	  - Distinguish CMOR usage with "cmor" string in function names and python variables
	  - Add support for hybrid sigma variables: `pfull`, `phalf`
   - Add `VarHandler._cmor_write()`
     - Add support for CMORizing fixed-time variables
       - Separate PR for #217
     - If time dimension exists, CMORize the variable with all time and time bound values with a single call to `cmor.write()` instead of looping over each time value index and CMORizing each slice -- **this should improve performance and removes the `tqdm` progress bar.**
4. Update `handlers.yaml`
    - Add `phalf` and `pfull` entries
      - Closes #115
      - Delete `phalf.py` and `pfull.py`
    - Add `clcalispo` entry
      - Closes #218
      - Delete `clcalipso.py`
5. Update `_formulas.py`
   - Update all function argument types and return types from `np.ndarray` to `xr.DataArray`
   - Add formula functions for `pfull` and `phalf`
   - Add `convert_units()` function, which handles 1-to-1 unit conversions -- replaces `default.default_handlers.write_data()`
6. Refactor `default.py` (now deleted)
   - Replaced by `VarHandler.cmorize()` and `_formulas.py.convert_units()`

7. Remove `cdms2` and `cdutil` from dependencies in `dev.yml` and `ci.yml`7
8.  Clean up legacy code in `clisccp.py`
    - Separate PR for #218
9. Add `Makefile` for easy access to commonly used commands (e.g., building and installing package)
tomvothecoder added a commit that referenced this issue Nov 10, 2023
…atm handlers (#213)

1. Refactor `lib.py` (now deleted)
   - Closes #190
   - Replace `run_parallel()` and `run_serial()` with `__main__.py` -> `E3SMtoCMIP._run_parallel()` and
`E3SMtoCMIP._run_serial()`
     - Updated var name `filepaths` to `vars_to_filepaths` for clarity (its a dict mapping var key to list of string paths)
     - Update args passed to handler method, remove unnecessary args
     - Add `try` and `except` statement for submitting `pool` jobs to maintain compatibility with MPAS variable handlers, which use different handler method arguments
   - Replace `handle_variables()`, `get_dimension_data()` and `load_axis()` with `VarHandler.cmorize()`
   - Delete `handle_simple()` -- _will be re-implemented from scratch in #130_
3. Update `handler.py`
   - Update `VarHandler.cmorize()`
      - Refactored significantly -- extracted logic into smaller, maintainable private methods
      - Replaced `data` dictionary storing `xr.DataArrays`  with `ds` (`xr.Dataset` object)
	  - Distinguish CMOR usage with "cmor" string in function names and python variables
	  - Add support for hybrid sigma variables: `pfull`, `phalf`
   - Add `VarHandler._cmor_write()`
     - Add support for CMORizing fixed-time variables
       - Separate PR for #217
     - If time dimension exists, CMORize the variable with all time and time bound values with a single call to `cmor.write()` instead of looping over each time value index and CMORizing each slice -- **this should improve performance and removes the `tqdm` progress bar.**
4. Update `handlers.yaml`
    - Add `phalf` and `pfull` entries
      - Closes #115
      - Delete `phalf.py` and `pfull.py`
    - Add `clcalispo` entry
      - Closes #218
      - Delete `clcalipso.py`
5. Update `_formulas.py`
   - Update all function argument types and return types from `np.ndarray` to `xr.DataArray`
   - Add formula functions for `pfull` and `phalf`
   - Add `convert_units()` function, which handles 1-to-1 unit conversions -- replaces `default.default_handlers.write_data()`
6. Refactor `default.py` (now deleted)
   - Replaced by `VarHandler.cmorize()` and `_formulas.py.convert_units()`
7. Remove `cdms2` and `cdutil` from dependencies in `dev.yml` and `ci.yml`7
8.  Clean up legacy code in `clisccp.py`
    - Separate PR for #218
9. Add `Makefile` for easy access to commonly used commands (e.g., building and installing package)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Clean up or optimize code
Projects
1 participant