Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: statistical functions #1420

Merged
merged 9 commits into from
Oct 19, 2024
Merged

Conversation

weibullguy
Copy link
Collaborator

@weibullguy weibullguy commented Oct 19, 2024

Does this pull request introduce a breaking change?

  • Yes
  • No

Purpose of this pull request

This pull request refactors the statistical functions across multiple distribution types (normal, exponential, lognormal, and Weibull) to improve maintainability, abstraction, and testing. The key goals include:

  • Abstracting shared functionality: Common statistical calculations such as hazard rate, MTBF, and survival functions have been abstracted into a central location to reduce code duplication.
  • Refining distribution-specific methods: Refactored the get_* functions in each distribution to call the abstracted methods without breaking the existing API.
  • Enhancing unit test coverage: Added new test cases for edge conditions (e.g., zero, negative, and very large parameters) to ensure robustness and consistency across distributions.

Benefits of the pull request

  • Increased code maintainability: By centralizing shared logic, the overall complexity of distribution-specific files has been reduced, improving readability and ease of future modifications.
  • Consistency across distributions: The refactoring ensures a more consistent approach to how statistical methods are implemented and tested for different distributions.
  • Better test coverage: New test cases have been introduced for boundary values and edge cases, increasing confidence in the reliability of these statistical functions.
  • Prevention of breaking changes: The refactored functions maintain the same public API, ensuring that existing code relying on these functions remains unaffected.

Any particular area(s) reviewers should focus on

  • Abstracted functions in the new statistical utility file: Reviewers should ensure that the logic for shared functionality is sound and correctly handles all distribution types.
  • Tests for edge cases: Special attention should be given to the newly added tests for negative, zero, and large values for distribution parameters to verify their correctness.
  • Distribution-specific files: Make sure the refactored get_* functions correctly call the centralized methods without introducing unintended side effects.

Any other pertinent information

Pull Request Checklist

  • Code Style

    • Code is following code style guidelines.
  • Static Checks

    • Failing static checks are only applicable to code outside the scope of
      this PR.
  • Tests

    • At least one test for all newly created functions/methods?
  • Chores

    • Issue(s) have been raised for problem areas outside the scope of
      this PR. These problem areas have been decorated with an ISSUE: # comment.

Summary by Sourcery

Refactor statistical functions to centralize shared logic for hazard rate, MTBF, and survival calculations, improving maintainability and consistency across distribution types. Enhance unit test coverage by adding new test cases for edge conditions to ensure robustness.

Enhancements:

  • Refactor statistical functions to centralize shared logic for hazard rate, MTBF, and survival calculations across multiple distribution types, improving maintainability and consistency.

Tests:

  • Enhance unit test coverage by adding new test cases for edge conditions, including zero, negative, and very large parameters, to ensure robustness and consistency across distributions.

Copy link
Contributor

sourcery-ai bot commented Oct 19, 2024

Reviewer's Guide by Sourcery

This pull request refactors the statistical functions across multiple distribution types (normal, exponential, lognormal, and Weibull) to improve maintainability, abstraction, and testing. The key changes include centralizing common statistical calculations, refining distribution-specific methods, and enhancing unit test coverage.

Class diagram for refactored statistical functions

classDiagram
    class Distributions {
        +calculate_hazard_rate(time: float, location: float, scale: Optional[float], shape: Optional[float], dist_type: str) float
        +calculate_mtbf(shape: float, location: float, scale: float, dist_type: str) float
        +calculate_survival(shape: float, time: float, location: float, scale: float, dist_type: str) float
    }

    class Normal {
        +get_hazard_rate(location: float, scale: float, time: float) float
        +get_mtbf(location: float, scale: float) float
        +get_survival(location: float, scale: float, time: float) float
        +do_fit(data, kwargs) Tuple[float, float]
    }

    class Exponential {
        +get_hazard_rate(scale: float, location: float) float
        +get_mtbf(rate: float, location: float) float
        +get_survival(scale: float, time: float, location: float) float
        +do_fit(data, kwargs) Tuple[float, float]
    }

    class Lognormal {
        +get_hazard_rate(shape: float, location: float, scale: float, time: float) float
        +get_mtbf(shape: float, location: float, scale: float) float
        +get_survival(shape: float, location: float, scale: float, time: float) float
        +do_fit(data, kwargs) Tuple[float, float, float]
    }

    class Weibull {
        +get_hazard_rate(shape: float, location: float, scale: float, time: float) float
        +get_mtbf(shape: float, location: float, scale: float) float
        +get_survival(shape: float, location: float, scale: float, time: float) float
        +do_fit(data, kwargs) Tuple[float, float, float]
    }

    Distributions <|-- Normal
    Distributions <|-- Exponential
    Distributions <|-- Lognormal
    Distributions <|-- Weibull
Loading

File-Level Changes

Change Details Files
Centralized common statistical calculations into a new distributions.py file
  • Created calculate_hazard_rate(), calculate_mtbf(), and calculate_survival() functions
  • Implemented support for exponential, lognormal, normal, and Weibull distributions
  • Added error handling for unsupported distribution types
src/ramstk/analyses/statistics/distributions.py
Refactored distribution-specific files to use the new centralized functions
  • Updated get_hazard_rate(), get_mtbf(), and get_survival() functions to use the new centralized calculations
  • Simplified do_fit() functions by removing redundant code
  • Improved error handling for empty datasets
src/ramstk/analyses/statistics/exponential.py
src/ramstk/analyses/statistics/lognormal.py
src/ramstk/analyses/statistics/normal.py
src/ramstk/analyses/statistics/weibull.py
Enhanced unit tests for all distribution types
  • Added tests for edge cases (e.g., zero, negative, and very large parameter values)
  • Implemented tests for empty datasets and invalid distribution types
  • Updated existing tests to use pytest.approx() for float comparisons
tests/analyses/statistics/bounds_unit_test.py
tests/analyses/statistics/exponential_unit_test.py
tests/analyses/statistics/lognormal_unit_test.py
tests/analyses/statistics/normal_unit_test.py
tests/analyses/statistics/weibull_unit_test.py
tests/analyses/statistics/distributions_unit_test.py
Improved code style and documentation
  • Updated function docstrings for clarity and consistency
  • Standardized import statements across files
  • Removed redundant comments and improved existing ones
src/ramstk/analyses/statistics/bounds.py
src/ramstk/analyses/statistics/exponential.py
src/ramstk/analyses/statistics/lognormal.py
src/ramstk/analyses/statistics/normal.py
src/ramstk/analyses/statistics/weibull.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@github-actions github-actions bot added the type: refactor Issue or PR dealing with refactoring of RAMSTK code. label Oct 19, 2024
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @weibullguy - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

_z_norm = (
stats.norm.ppf(1.0 - ((1.0 - alpha / 100.0) / 2.0))
if alpha > 1.0
else 1.0 - ((1.0 - alpha) / 2.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Potential bug in _z_norm calculation for alpha <= 1.0

The calculation for _z_norm when alpha <= 1.0 seems incorrect. It should probably use stats.norm.ppf() here as well, with appropriate scaling of alpha.

@weibullguy weibullguy added priority: low Issue or PR is low priority. status: inprogress Issue or PR is open, milestoned, and assigned. labels Oct 19, 2024
@weibullguy weibullguy merged commit 1862211 into master Oct 19, 2024
18 of 19 checks passed
@trafico-bot trafico-bot bot added the endgame: merged Pull Request has been merged successfully label Oct 19, 2024
@weibullguy weibullguy deleted the refactor/statistical_functions branch October 19, 2024 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
endgame: merged Pull Request has been merged successfully priority: low Issue or PR is low priority. status: inprogress Issue or PR is open, milestoned, and assigned. type: refactor Issue or PR dealing with refactoring of RAMSTK code.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant