Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds documentation for some of the classes and methods #31

Merged
merged 19 commits into from
Jun 20, 2022

Conversation

zaouk
Copy link
Contributor

@zaouk zaouk commented Apr 5, 2022

No description provided.

@juanmc2005 juanmc2005 added documentation Improvements or additions to documentation labels Apr 7, 2022
@juanmc2005 juanmc2005 modified the milestone: Version 0.3 Apr 7, 2022
@juanmc2005 juanmc2005 changed the base branch from main to develop April 7, 2022 13:44
Copy link
Owner

@juanmc2005 juanmc2005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this very needed documentation!
In general I prefer docstrings to follow the same format as in AggregationStrategy.__call__ (see here) so before I merge I'd like to reformat these docstrings.

Apart from that I added some suggestions that in my opinion are a bit clearer.

@@ -288,6 +288,19 @@ def __init__(
metric: Optional[str] = "cosine",
max_speakers: int = 20
):
"""Initializes an object for constrained incremental online clustering of speakers
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to put this docstring at the level of the class instead of the constructor.

Also I prefer this definition:

Suggested change
"""Initializes an object for constrained incremental online clustering of speakers
"""Constrained online speaker clustering for global speaker tracking.

"""Initializes an object for constrained incremental online clustering of speakers

Args:
tau_active (float): Threshold for detecting active speakers. This threshold is applied on the maximum value of each output
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tau_active (float): Threshold for detecting active speakers. This threshold is applied on the maximum value of each output
tau_active (float): Threshold for detecting active speakers. This is applied on the maximum value of per-speaker segmentation


Args:
tau_active (float): Threshold for detecting active speakers. This threshold is applied on the maximum value of each output
activation of the local segmentation model.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
activation of the local segmentation model.
activations from the local segmentation model.

tau_active (float): Threshold for detecting active speakers. This threshold is applied on the maximum value of each output
activation of the local segmentation model.
rho_update (float): Threshold for considering the extracted embedding when updating the centroid of the local speaker.
The centroid to which a local speaker is mapped is only updated if the duration of speech of a
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The centroid to which a local speaker is mapped is only updated if the duration of speech of a
The centroid to which a local speaker is mapped is only updated if its ratio of speech

activation of the local segmentation model.
rho_update (float): Threshold for considering the extracted embedding when updating the centroid of the local speaker.
The centroid to which a local speaker is mapped is only updated if the duration of speech of a
local speaker is greater than this threshold.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
local speaker is greater than this threshold.
is greater than this threshold.

assignments (Iterable[Tuple[int, int]]): An iterable of tuples for assigning each local speaker to one global speaker.

Returns:
SpeakerMap: A SpeakerMap object wrapping a mapping matrix between local speakers and global speakers.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SpeakerMap: A SpeakerMap object wrapping a mapping matrix between local speakers and global speakers.
a `SpeakerMap` with the given assignments

@@ -82,6 +92,16 @@ class SpeakerMapBuilder:
def hard_map(
shape: Tuple[int, int], assignments: Iterable[Tuple[int, int]], maximize: bool
) -> SpeakerMap:
"""Returns a SpeakerMap object based on the given assignments.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Returns a SpeakerMap object based on the given assignments.
"""Builds a `SpeakerMap` that forces the given assignments.


Args:
shape (Tuple[int, int]): shape of the mapping matrix
assignments (Iterable[Tuple[int, int]]): An iterable of tuples for assigning each local speaker to one global speaker.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assignments (Iterable[Tuple[int, int]]): An iterable of tuples for assigning each local speaker to one global speaker.
assignments (Iterable[Tuple[int, int]]): Iterable of tuples (source_speaker, target_speaker) representing an assignment.

Args:
shape (Tuple[int, int]): shape of the mapping matrix
assignments (Iterable[Tuple[int, int]]): An iterable of tuples for assigning each local speaker to one global speaker.
maximize (bool): whether or not to use a MaximizationObjective
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
maximize (bool): whether or not to use a MaximizationObjective
maximize (bool): Whether to use scores where higher is better (true) or where lower is better (false).

maximize (bool): whether or not to use a MaximizationObjective

Returns:
SpeakerMap: A SpeakerMap object.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SpeakerMap: A SpeakerMap object.
a `SpeakerMap` with the given assignments.

@juanmc2005 juanmc2005 removed this from the Version 0.3 milestone May 12, 2022
@juanmc2005 juanmc2005 added this to the Version 0.4 milestone May 20, 2022
Copy link
Owner

@juanmc2005 juanmc2005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, just some style changes that I'll commit before merging.

src/diart/functional.py Outdated Show resolved Hide resolved
src/diart/functional.py Outdated Show resolved Hide resolved
src/diart/functional.py Outdated Show resolved Hide resolved
src/diart/functional.py Outdated Show resolved Hide resolved
src/diart/functional.py Outdated Show resolved Hide resolved
src/diart/mapping.py Outdated Show resolved Hide resolved
src/diart/mapping.py Outdated Show resolved Hide resolved
src/diart/mapping.py Outdated Show resolved Hide resolved
src/diart/mapping.py Outdated Show resolved Hide resolved
src/diart/mapping.py Outdated Show resolved Hide resolved
@juanmc2005 juanmc2005 merged commit a6ddb07 into juanmc2005:develop Jun 20, 2022
@juanmc2005 juanmc2005 mentioned this pull request Jun 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants