refactor: statistical functions #1420

weibullguy · 2024-10-19T16:10:07Z

Does this pull request introduce a breaking change?

Yes
No

Purpose of this pull request

This pull request refactors the statistical functions across multiple distribution types (normal, exponential, lognormal, and Weibull) to improve maintainability, abstraction, and testing. The key goals include:

Abstracting shared functionality: Common statistical calculations such as hazard rate, MTBF, and survival functions have been abstracted into a central location to reduce code duplication.
Refining distribution-specific methods: Refactored the get_* functions in each distribution to call the abstracted methods without breaking the existing API.
Enhancing unit test coverage: Added new test cases for edge conditions (e.g., zero, negative, and very large parameters) to ensure robustness and consistency across distributions.
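
As a rough illustration of the first goal, a centralized hazard rate function might dispatch on the distribution type as sketched below. The name and argument list loosely follow calculate_hazard_rate() from this PR, but the body, the default values, the parameterization (scale as the exponential mean life), and the choice of ValueError are assumptions made for illustration, not the merged implementation.

```python
import math


def calculate_hazard_rate(time, location=0.0, scale=1.0, shape=1.0, dist_type="exponential"):
    """Return the hazard rate for two closed-form distributions (illustrative only)."""
    if dist_type == "exponential":
        # Assumption: scale is the mean life, so the hazard rate is constant.
        return 1.0 / scale
    if dist_type == "weibull":
        # Three-parameter Weibull: h(t) = (shape / scale) * ((t - location) / scale)^(shape - 1)
        return (shape / scale) * ((time - location) / scale) ** (shape - 1.0)
    # Assumption: unsupported distribution types raise ValueError.
    raise ValueError(f"Unsupported distribution type: {dist_type}")
```

With shape equal to 1.0 the Weibull branch reduces to the exponential result, which is a convenient consistency check across distributions.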

Benefits of the pull request

Increased code maintainability: By centralizing shared logic, the overall complexity of distribution-specific files has been reduced, improving readability and ease of future modifications.
Consistency across distributions: The refactoring ensures a more consistent approach to how statistical methods are implemented and tested for different distributions.
Better test coverage: New test cases have been introduced for boundary values and edge cases, increasing confidence in the reliability of these statistical functions.
Prevention of breaking changes: The refactored functions maintain the same public API, ensuring that existing code relying on these functions remains unaffected.

Any particular area(s) reviewers should focus on

Abstracted functions in the new statistical utility file: Reviewers should ensure that the logic for shared functionality is sound and correctly handles all distribution types.
Tests for edge cases: Special attention should be given to the newly added tests for negative, zero, and large values for distribution parameters to verify their correctness.
Distribution-specific files: Make sure the refactored get_* functions correctly call the centralized methods without introducing unintended side effects.
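
To make the last two focus areas concrete, here is a minimal sketch of a distribution-specific wrapper that only delegates to a centralized function, followed by edge-case checks of the kind described above. The get_survival() argument order follows the Weibull signature in the class diagram; the body of calculate_survival(), its defaults, and the decision to reject a non-positive scale with ValueError are hypothetical and may differ from the actual code.

```python
import math


def calculate_survival(time, location=0.0, scale=1.0, shape=1.0, dist_type="weibull"):
    """Hypothetical centralized survival function R(t)."""
    if scale <= 0.0:
        # Assumption: a non-positive scale is rejected outright.
        raise ValueError("scale must be positive")
    if time <= location:
        # Before the location (failure-free) time, survival is certain.
        return 1.0
    if dist_type == "exponential":
        return math.exp(-(time - location) / scale)
    if dist_type == "weibull":
        return math.exp(-(((time - location) / scale) ** shape))
    raise ValueError(f"Unsupported distribution type: {dist_type}")


def get_survival(shape, location, scale, time):
    """Weibull wrapper that keeps its public signature and only delegates."""
    return calculate_survival(
        time, location=location, scale=scale, shape=shape, dist_type="weibull"
    )


# Edge-case checks: zero time and a very large time.
assert get_survival(2.0, 0.0, 100.0, 0.0) == 1.0
assert get_survival(2.0, 0.0, 100.0, 1.0e9) == 0.0  # exp() underflows to 0.0
```

Because the wrapper holds no logic of its own, a side effect would have to come from the centralized function, which narrows where reviewers need to look.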

Any other pertinent information

Pull Request Checklist

Code Style
- Code follows the code style guidelines.
Static Checks
- Failing static checks are only applicable to code outside the scope of
  this PR.
Tests
- At least one test for all newly created functions/methods?
Chores
- Issue(s) have been raised for problem areas outside the scope of
  this PR. These problem areas have been decorated with an ISSUE: # comment.

Summary by Sourcery

Refactor statistical functions to centralize shared logic for hazard rate, MTBF, and survival calculations, improving maintainability and consistency across distribution types. Enhance unit test coverage by adding new test cases for edge conditions to ensure robustness.

Enhancements:

Refactor statistical functions to centralize shared logic for hazard rate, MTBF, and survival calculations across multiple distribution types, improving maintainability and consistency.

Tests:

Enhance unit test coverage by adding new test cases for edge conditions, including zero, negative, and very large parameters, to ensure robustness and consistency across distributions.

sourcery-ai · 2024-10-19T16:10:13Z

Reviewer's Guide by Sourcery

This pull request refactors the statistical functions across multiple distribution types (normal, exponential, lognormal, and Weibull) to improve maintainability, abstraction, and testing. The key changes include centralizing common statistical calculations, refining distribution-specific methods, and enhancing unit test coverage.

Class diagram for refactored statistical functions

classDiagram
    class Distributions {
        +calculate_hazard_rate(time: float, location: float, scale: Optional[float], shape: Optional[float], dist_type: str) float
        +calculate_mtbf(shape: float, location: float, scale: float, dist_type: str) float
        +calculate_survival(shape: float, time: float, location: float, scale: float, dist_type: str) float
    }

    class Normal {
        +get_hazard_rate(location: float, scale: float, time: float) float
        +get_mtbf(location: float, scale: float) float
        +get_survival(location: float, scale: float, time: float) float
        +do_fit(data, kwargs) Tuple[float, float]
    }

    class Exponential {
        +get_hazard_rate(scale: float, location: float) float
        +get_mtbf(rate: float, location: float) float
        +get_survival(scale: float, time: float, location: float) float
        +do_fit(data, kwargs) Tuple[float, float]
    }

    class Lognormal {
        +get_hazard_rate(shape: float, location: float, scale: float, time: float) float
        +get_mtbf(shape: float, location: float, scale: float) float
        +get_survival(shape: float, location: float, scale: float, time: float) float
        +do_fit(data, kwargs) Tuple[float, float, float]
    }

    class Weibull {
        +get_hazard_rate(shape: float, location: float, scale: float, time: float) float
        +get_mtbf(shape: float, location: float, scale: float) float
        +get_survival(shape: float, location: float, scale: float, time: float) float
        +do_fit(data, kwargs) Tuple[float, float, float]
    }

    Distributions <|-- Normal
    Distributions <|-- Exponential
    Distributions <|-- Lognormal
    Distributions <|-- Weibull

File-Level Changes

Change	Details	Files
Centralized common statistical calculations into a new distributions.py file	Created calculate_hazard_rate(), calculate_mtbf(), and calculate_survival() functions; implemented support for exponential, lognormal, normal, and Weibull distributions; added error handling for unsupported distribution types	`src/ramstk/analyses/statistics/distributions.py`
Refactored distribution-specific files to use the new centralized functions	Updated get_hazard_rate(), get_mtbf(), and get_survival() functions to use the new centralized calculations; simplified do_fit() functions by removing redundant code; improved error handling for empty datasets	`src/ramstk/analyses/statistics/exponential.py` `src/ramstk/analyses/statistics/lognormal.py` `src/ramstk/analyses/statistics/normal.py` `src/ramstk/analyses/statistics/weibull.py`
Enhanced unit tests for all distribution types	Added tests for edge cases (e.g., zero, negative, and very large parameter values); implemented tests for empty datasets and invalid distribution types; updated existing tests to use pytest.approx() for float comparisons	`tests/analyses/statistics/bounds_unit_test.py` `tests/analyses/statistics/exponential_unit_test.py` `tests/analyses/statistics/lognormal_unit_test.py` `tests/analyses/statistics/normal_unit_test.py` `tests/analyses/statistics/weibull_unit_test.py` `tests/analyses/statistics/distributions_unit_test.py`
Improved code style and documentation	Updated function docstrings for clarity and consistency; standardized import statements across files; removed redundant comments and improved existing ones	`src/ramstk/analyses/statistics/bounds.py` `src/ramstk/analyses/statistics/exponential.py` `src/ramstk/analyses/statistics/lognormal.py` `src/ramstk/analyses/statistics/normal.py` `src/ramstk/analyses/statistics/weibull.py`
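
The move to tolerance-based float comparisons in the tests can be motivated with a short sketch. It uses math.isclose() from the standard library as a dependency-free stand-in; in a pytest suite, pytest.approx() serves the same purpose. The variable names and the tolerance are illustrative assumptions.

```python
import math

computed = 0.1 + 0.2  # stand-in for a computed statistic

# Exact equality is fragile for floats because of binary rounding.
exact_match = computed == 0.3

# A tolerance-based comparison is robust to that rounding.
close_match = math.isclose(computed, 0.3, rel_tol=1e-9)
```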

sourcery-ai

Hey @weibullguy - I've reviewed your changes and they look great!

Here's what I looked at during the review

🟡 General issues: 1 issue found
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

sourcery-ai · 2024-10-19T16:11:56Z

src/ramstk/analyses/statistics/bounds.py

+    _z_norm = (
+        stats.norm.ppf(1.0 - ((1.0 - alpha / 100.0) / 2.0))
+        if alpha > 1.0
+        else 1.0 - ((1.0 - alpha) / 2.0)


issue (bug_risk): Potential bug in _z_norm calculation for alpha <= 1.0

The calculation for _z_norm when alpha <= 1.0 seems incorrect. It should probably use stats.norm.ppf() here as well, with appropriate scaling of alpha.
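
The concern can be illustrated with a small sketch. In the quoted hunk, the alpha <= 1.0 branch evaluates to a probability (0.975 for alpha = 0.95) rather than a quantile (about 1.96). One possible fix, assuming alpha may be given either as a percentage or as a fraction, is to normalize alpha first and apply the inverse CDF on both paths. The helper name z_norm() is hypothetical, and statistics.NormalDist().inv_cdf() from the standard library stands in for stats.norm.ppf() so the example has no third-party dependency.

```python
from statistics import NormalDist


def z_norm(alpha):
    """Two-sided standard normal critical value for a confidence level.

    Accepts alpha as a percentage (e.g. 95.0) or as a fraction (e.g. 0.95).
    """
    # Normalize a percentage to a fraction; values <= 1.0 are taken as fractions.
    confidence = alpha / 100.0 if alpha > 1.0 else alpha
    # Apply the inverse CDF on both paths, not only the percentage path.
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
```

With this shape, z_norm(95.0) and z_norm(0.95) return the same critical value, which is the behavior the review comment appears to expect.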

weibullguy added 7 commits October 15, 2024 14:15

refactor: statistical bound functions

2771d60

test: add and update test for statistical bound functions

6c4cc93

refactor: exponential distribution functions

c898626

test: add and update test for exponential distribution functions

72786cf

refactor: abstract s-distribution functions

ca7ab0a

enhancement: add check for empty data array in do_fit()

6c7b4d1

test: update and add tests for refactored functions

d98e12a

github-actions bot added the type: refactor Issue or PR dealing with refactoring of RAMSTK code. label Oct 19, 2024

fix: fix mypy error

0c9f745

sourcery-ai bot reviewed Oct 19, 2024

weibullguy added priority: low Issue or PR is low priority. status: inprogress Issue or PR is open, milestoned, and assigned. labels Oct 19, 2024

fix: fix pylint errors

69ad04e

weibullguy merged commit 1862211 into master Oct 19, 2024
18 of 19 checks passed

trafico-bot bot added the endgame: merged Pull Request has been merged successfully label Oct 19, 2024

weibullguy deleted the refactor/statistical_functions branch October 19, 2024 20:48
