Creating Shannon Entropy calculation function and editing test datasets #65

willdavidson05 · 2024-07-02T21:39:51Z

Description

This PR introduces an entropy.py module that computes Shannon entropy from the output of calculate_loc_changes in git_parser.py.

The module includes safeguards that ensure probability values cannot be negative. A test suite, test_entropy.py, has also been added to verify that the function never returns negative values.
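
As a rough illustration of the calculation described above, here is a minimal sketch rather than the code in entropy.py. It assumes the line-change data is a plain dict mapping file names to changed-line counts; the function name entropy_contributions and the file names are hypothetical.

```python
import math


def entropy_contributions(loc_changes: dict[str, int]) -> dict[str, float]:
    """Return each file's Shannon entropy term, where p is its share of changed lines.

    Hypothetical helper for illustration; not the function added in this PR.
    """
    # Reject negative counts so that no probability can be negative.
    if any(count < 0 for count in loc_changes.values()):
        raise ValueError("line-change counts must be non-negative")

    total = sum(loc_changes.values())
    contributions = {}
    for file_name, count in loc_changes.items():
        # A file with no changes (or an empty diff overall) contributes nothing.
        if total == 0 or count == 0:
            contributions[file_name] = 0.0
            continue
        p = count / total
        # p * log2(1 / p) equals -p * log2(p) and avoids returning -0.0 when p == 1.
        contributions[file_name] = p * math.log2(1 / p)
    return contributions


# Two files with equal shares of the changes each contribute 0.5 bits.
example = entropy_contributions({"file_1.md": 50, "file_2.md": 50})
```

Summing the values gives the total entropy for the compared commits; every term is non-negative because p lies in (0, 1] whenever it is used.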

The test datasets were edited so that a repository can contain multiple files; this is done in the high_entropy repo. To support repositories with multiple files, calculate_loc_changes in git_parser.py was modified and now accepts an additional parameter, file_names: list[str].

A follow-up PR (#69) has been created to refine the test repositories and expand the test suites.

I appreciate any comments or feedback!

Closes #62, #64

What is the nature of your change?

Content additions or updates (adds or updates content)
Bug fix (fixes an issue).
Enhancement (adds functionality).
Breaking change (these changes would cause existing functionality to not work as expected).

Checklist

Please ensure that all boxes are checked before indicating that this pull request is ready for review.

I have read the CONTRIBUTING.md guidelines.
My code follows the style guidelines of this project.
I have performed a self-review of my own contributions.
I have commented my content, particularly in hard-to-understand areas.
I have made corresponding changes to related documentation (outside of book content).
My changes generate no new warnings.
New and existing tests pass locally with my changes.
I have added tests that prove my additions are effective or that my feature works.
I have deleted all non-relevant text in this pull request template.

d33bs

Nice work! I left a few comments and suggestions throughout this review. Please don't hesitate to let me know if you have any questions.

src/almanack/entropy.py

tests/test_entropy.py

d33bs · 2024-07-04T14:34:01Z

tests/test_entropy.py

+        entropies = calculate_shannon_entropy(
+            repo_path, source_commit, target_commit, file_sets[label]
+        )
+        for _, entropy in entropies.items():


Consider adding an assert below that compares the high-entropy and low-entropy results (should one be higher than the other?).

Referenced in test dataset issue
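
One way the suggested comparison could look, as a sketch only. It assumes the per-file entropies are collected in a dict keyed by repository label; the helper names, the low_entropy label, the file names, and the numbers are all made up for illustration.

```python
def total_entropy(per_file_entropy: dict[str, float]) -> float:
    """Sum per-file entropy values into one score for a repository."""
    return sum(per_file_entropy.values())


def assert_high_exceeds_low(results: dict[str, dict[str, float]]) -> None:
    """Fail unless the high-entropy repository scores above the low-entropy one."""
    high = total_entropy(results["high_entropy"])
    low = total_entropy(results["low_entropy"])
    assert high > low, f"expected high ({high}) > low ({low})"


# Made-up values for illustration only; in the real test they would come from
# calling calculate_shannon_entropy for each test repository.
illustrative_results = {
    "high_entropy": {"file_1.md": 0.5, "file_2.md": 0.5},
    "low_entropy": {"file_1.md": 0.0},
}
assert_high_exceeds_low(illustrative_results)
```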

d33bs · 2024-07-04T14:35:23Z

tests/test_git_parser.py

+            repo_path, source_commit, target_commit, file_sets[label]
+        )
+        results[label] = loc_changes
+


Consider adding another check to make sure that the file sets are different from one another.

Referenced in test dataset issue
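
A sketch of the suggested check, assuming file_sets maps each repository label to a list of file names as in the test above; the helper name and the example data are hypothetical.

```python
from itertools import combinations


def assert_file_sets_differ(file_sets: dict[str, list[str]]) -> None:
    """Fail if any two labels map to the same set of file names."""
    for (label_a, files_a), (label_b, files_b) in combinations(file_sets.items(), 2):
        # Compare as sets so that ordering and duplicates do not matter.
        assert set(files_a) != set(files_b), (
            f"file sets for {label_a!r} and {label_b!r} are identical"
        )


# Illustrative data only; the real file sets are defined by the test fixtures.
assert_file_sets_differ(
    {
        "high_entropy": ["file_1.md", "file_2.md"],
        "low_entropy": ["file_3.md"],
    }
)
```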

tests/data/almanack/entropy/add_data.py

src/almanack/entropy.py

willdavidson05 · 2024-07-15T19:52:57Z

Thank you for the review, @d33bs! Following your feedback on the test datasets, along with my own concerns, I've created a new PR (#69) to address many of the issues. Any comments you made regarding testing suites or the setup of repositories were referenced in issue #66. Your other feedback has been implemented, and I have left a question for you above!

d33bs

Looks great, thanks for addressing all those comments. I left a couple additional thoughts. Looking forward to the changes in #69. Feel free to merge when you feel things are ready.

CITATION.cff

Will Davidson and others added 7 commits June 18, 2024 09:56

Removing .zip files

f551f61

Merge branch 'software-gardening:main' into main

6638e1d

Merge branch 'main' of https://github.com/willdavidson05/almanac

c2e40ec

Creating shannon entropy file

b742361

Saving changes

474dac2

SaVING CHNAGES

31df3f5

Creating entropy function and editing test dataset

8290449

willdavidson05 added this to the Entropy Linter milestone Jul 2, 2024

willdavidson05 requested review from falquaddoomi, d33bs and gwaybio July 2, 2024 21:40

Will Davidson added 3 commits July 2, 2024 15:49

pre-commit edits

28af269

Pre-commit chnages

f52b98f

Editing docstring

f8c0bfa

d33bs reviewed Jul 4, 2024

Changing descriptions and adding comments

4d3771c

willdavidson05 mentioned this pull request Jul 11, 2024

Edit test repositories to have multiple files and trivial titles #66

Closed

2 tasks

Will Davidson and others added 3 commits July 11, 2024 11:21

Adding dictionary comprehension

3f00bcb

Pre-commit chnages

ba8b432

Merge branch 'software-gardening:main' into Entropy_Formula

75eca4e

willdavidson05 closed this Jul 11, 2024

willdavidson05 deleted the Entropy_Formula branch July 11, 2024 17:57

willdavidson05 restored the Entropy_Formula branch July 11, 2024 17:58

willdavidson05 reopened this Jul 11, 2024

Will Davidson added 4 commits July 12, 2024 09:25

Documentation changes

c2dcd44

pre-commit chnages

7ed102a

pre commit

15fc7c1

Adding citations, and changing entropy function name

b15169b

willdavidson05 requested a review from d33bs July 15, 2024 19:53

Adding not to citation

52308a3

d33bs approved these changes Jul 15, 2024

CITATION.cff (outdated, resolved)

Will Davidson added 2 commits July 15, 2024 16:31

Changing: citation order, entropy f'n name

4dafd52

Pre-commit chanegs

e424e91

willdavidson05 merged commit 92fbb12 into software-gardening:main Jul 15, 2024
10 checks passed

willdavidson05 deleted the Entropy_Formula branch July 25, 2024 16:25

d33bs mentioned this pull request Oct 22, 2024

Add book content surrounding early metrics added to Almanack #121

Open
