Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added comments to methods implementation #41

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

goruha
Copy link
Member

@goruha goruha commented Sep 22, 2024

What

  • Added comments to the implementation alphabet method
  • Added comments to the implementation intervals method
  • Added comments to the implementation order method

masha-mtc
masha-mtc previously approved these changes Sep 22, 2024
@goruha goruha requested a review from masha-mtc September 22, 2024 09:51
@coveralls
Copy link

coveralls commented Sep 22, 2024

Pull Request Test Coverage Report for Build 12099465068

Details

  • 1 of 1 (100.0%) changed or added relevant line in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.06%) to 94.779%

Totals Coverage Status
Change from base Build 11987743049: -0.06%
Covered Lines: 236
Relevant Lines: 249

💛 - Coveralls

data = np.asanyarray(X)
if data.ndim > 1: # Checking for d1 array
raise Not1DArrayException(
{"message": f"Incorrect array form. Expected d1 array, exists {data.ndim}"}
)

# Sort data positions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be clearer to write "Sort element indices" or "Get original indices of sorted data array"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perm = data.argsort(kind="mergesort")

mask_shape = data.shape
unique_mask = np.empty(mask_shape, dtype=bool)
# Create tmp mask array to store True on positions where appears new value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Create mask array to store True on positions where new value appears for the first time in the sorted array to distinguish where subarray of one element ends and another begins"

# unique_mask = [True, False, True, False, True, True]
# a a c c d e
unique_mask = np.empty(data.shape, dtype=bool)
# First element is new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"First element is always new"

unique_mask[1:] = data[perm[1:]] != data[perm[:-1]]

# Create tmp array that will store reverse sorted mask array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Create mask array to store True on positions of the data array where new value appears for the first time"

result_mask[perm[unique_mask]] = True

# Return elements that are first appears of unique values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Return array of first occurrences of elements in the data array"

unique_mask[1:] = data[perm[1:]] != data[perm[:-1]]

# Create tmp array that will store reverse sorted mask array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Create mask array to store True on positions of the data array where new value appears for the first time"

# perm = [ 0, 5, 1, 2, 4, 3]
# perm[unique_mask] = [ 0, 1, 4, 3]
# result_mask = [True, True, False, True, True, False]
# a c c e d a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have moved this one line higher and added that this is "data" array

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is that on top in sorted data on bottom original order data

# perm = [ 0, 5, 1, 2, 4, 3]
# perm[unique_mask] = [ 0, 1, 4, 3]
# result_mask = [True, True, False, True, True, False]
# a c c e d a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have moved this one line higher and added that this is "data" array

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is that on top in sorted data on bottom original order data

mask_shape = ar.shape
mask = np.empty(mask_shape[0] + 1, dtype=bool)
mask[:1] = True
mask[1:-1] = ar[perm[1:]] != ar[perm[:-1]]
mask[-1:] = True # or mask[-1] = True

# Save masks first and last appears of elements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Create masks of first and last occurrences of elements by excluding first and last elements from unique_mask accordingly"

inverse_perm[perm] = np.arange(data.shape[0])

# Create result array that count unique values starting from 0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

" Fill tmp result array with cumulative sums of number of unique values (starting from 0) to represent each element as a number in resulting order."

Copy link

Change Before [627f151] After [fbd35c6] Ratio Benchmark (Parameter)
- 4.55±0.05ms 3.72±0.1ms 0.82 bench_alphabet.AlphabetSuite.time_alphabet(500000, 'Best')

data = np.asanyarray(X)
if data.ndim > 1: # Checking for d1 array
raise Not1DArrayException(
{"message": f"Incorrect array form. Expected d1 array, exists {data.ndim}"}
)

# Sort data positions
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Co-authored-by: Igor Rodionov  <goruha@gmail.com>
data = np.asanyarray(X)
if data.ndim > 1: # Checking for d1 array
raise Not1DArrayException(
{"message": f"Incorrect array form. Expected d1 array, exists {data.ndim}"}
)

# Sort data positions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Sort data positions
# Indices that would sort data array

unique_mask[1:] = data[perm[1:]] != data[perm[:-1]]

# Create tmp array that will store reverse sorted mask array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Create tmp array that will store reverse sorted mask array
# Create mask array to store True on positions of the data array where new value appears for the first time

result_mask[perm[unique_mask]] = True

# Return elements that are first appears of unique values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Return elements that are first appears of unique values
# Return array of first occurrences of elements in the data array

ar = ar[::-1]

# Sort data positions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Sort data positions
# Get original indices of sorted data array

perm = ar.argsort(kind="mergesort")

# Create tmp mask array to store True on positions where appears new value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Create tmp mask array to store True on positions where appears new value.
# Create mask array to store True on positions where new value appears for the first time in the sorted array to distinguish where subarray of one element ends and another begins

result = intervals[inverse_perm]
elif mod == mode.cycle:
# For cycle mode we permute intervals array to the original order
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# For cycle mode we permute intervals array to the original order
# For cycle mode we permute intervals array to the original arrangement

# result = [1, 2, 1, 4, 5, 5]

# Create 2-dimensional array size of (2, len(ar))
# Zero row is for intervals the first appearance of the element and intervals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Zero row is for intervals the first appearance of the element and intervals
# Zero row is for the intervals of the first appearance of the element and intervals

# Create 2-dimensional array size of (2, len(ar))
# Zero row is for intervals the first appearance of the element and intervals
# for intermediate appearances
# First row will store intervals for the last appearance of the element
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# First row will store intervals for the last appearance of the element
# First row will store only intervals for the last appearance of the elements

result[last_mask, 1] = len(ar) - perm[last_mask]

# Permute intervals array to the original order
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Permute intervals array to the original order
# Permute intervals array to the original arrangement

result = result[result != 0]

if bind == binding.end:
# For binding to the end, we need to reverse the result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# For binding to the end, we need to reverse the result
# For binding to the end, we need to reverse the result back

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants