feat: Add audacity marker file support and creating annotation from rttm file #92

Open
wants to merge 5 commits into
base: develop

@yojul yojul commented Jun 30, 2023

Problem

When evaluating speaker diarization pipelines, it is useful to create Annotation objects from RTTM (or other formats) and to serialize or write annotations back to RTTM (or other formats).
Audacity's marker track feature is a very convenient (and free) way to create ground-truth segmentations for speaker diarization. The format is a .txt file very similar to the already supported LAB format, but tab-separated.

Solution

Refactor methods for various file format support

  • Create a generic _serialize method to replace the multiple to_{format} methods.
  • Create a generic _write method to replace the multiple write_{format} methods.
  • to_{format} and write_{format} are now partial methods derived from these generic methods.

Currently supported formats are:

  • RTTM: annotation.to_rttm() and annotation.write_rttm(file).
  • Audacity: annotation.to_audacity() and annotation.write_audacity(file).
  • LAB: annotation.to_lab() and annotation.write_lab(file).

Therefore, to add a new format, one only needs to implement an _iter_{format} method, similar to _iter_rttm or _iter_lab.
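
To illustrate the pattern described above, here is a minimal sketch using functools.partialmethod. MiniAnnotation is a hypothetical toy class, not the actual Annotation implementation, and the line layouts it emits (field order, number formatting) are assumptions chosen for illustration only.

```python
import io
from functools import partialmethod
from typing import Iterator, List, TextIO, Tuple


class MiniAnnotation:
    """Hypothetical stand-in for Annotation, used only to show the pattern."""

    def __init__(self, uri: str = "file1") -> None:
        self.uri = uri
        # Each entry is (start, end, label), in seconds.
        self.segments: List[Tuple[float, float, str]] = []

    # One line generator per format; adding a format means adding one of these.
    def _iter_rttm(self) -> Iterator[str]:
        for start, end, label in self.segments:
            duration = end - start
            yield (
                f"SPEAKER {self.uri} 1 {start:.3f} {duration:.3f} "
                f"<NA> <NA> {label} <NA> <NA>\n"
            )

    def _iter_lab(self) -> Iterator[str]:
        for start, end, label in self.segments:
            yield f"{start:.3f} {end:.3f} {label}\n"

    def _iter_audacity(self) -> Iterator[str]:
        # Same fields as LAB, but tab-separated.
        for start, end, label in self.segments:
            yield f"{start:.3f}\t{end:.3f}\t{label}\n"

    # Generic methods that dispatch on the format name.
    def _serialize(self, file_format: str) -> str:
        return "".join(getattr(self, f"_iter_{file_format}")())

    def _write(self, file: TextIO, file_format: str) -> None:
        for line in getattr(self, f"_iter_{file_format}")():
            file.write(line)

    # Public per-format methods are partial applications of the generic ones.
    to_rttm = partialmethod(_serialize, file_format="rttm")
    to_lab = partialmethod(_serialize, file_format="lab")
    to_audacity = partialmethod(_serialize, file_format="audacity")
    write_rttm = partialmethod(_write, file_format="rttm")
    write_lab = partialmethod(_write, file_format="lab")
    write_audacity = partialmethod(_write, file_format="audacity")


ann = MiniAnnotation(uri="file1")
ann.segments.append((0.0, 1.5, "spk1"))
print(ann.to_audacity(), end="")

buffer = io.StringIO()
ann.write_rttm(buffer)  # any text file object works here
```

With this layout, supporting another format only requires a new _iter_{format} generator plus two partialmethod lines.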

Creating annotation from audacity or rttm

Similarly to the from_df class method, I added from_audacity and from_rttm class methods to easily create annotations from those file formats.

Usage:

with open('file.rttm') as f:
    annotation = Annotation.from_rttm(f)
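
For the Audacity side, the parsing step could look roughly like the sketch below. It assumes each line holds a start time, an end time, and a label separated by tabs, as described above. parse_audacity_labels is a hypothetical helper, not the code in this PR, and it returns plain tuples rather than an Annotation.

```python
from typing import Iterable, List, Tuple


def parse_audacity_labels(lines: Iterable[str]) -> List[Tuple[float, float, str]]:
    """Parse tab-separated 'start<TAB>end<TAB>label' lines into tuples."""
    segments = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        # Split on the first two tabs only, so labels may contain tabs.
        start, end, label = line.split("\t", 2)
        segments.append((float(start), float(end), label))
    return segments


example = ["0.000000\t1.500000\tspk1\n", "1.500000\t3.250000\tspk2\n"]
for segment in parse_audacity_labels(example):
    print(segment)
```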

@hbredin
Member

hbredin commented Jul 16, 2023

Thanks for this PR.

Note that RTTM files may contain annotations for multiple audio files (hence the uri in the second field), in which case I am not sure what the Annotation.from_rttm method should do:

  • raise an error?
  • return a {uri: Annotation} mapping?

One okayish solution could be to add an option as_dict: bool = False that forces returning a dict (second option), and to raise an error if it is set to False and the RTTM file contains multiple audio files...
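
A rough sketch of that as_dict behavior, for discussion only: rttm_segments is a hypothetical function working on plain tuples instead of Annotation objects, and it assumes the usual whitespace-separated RTTM layout with the uri in the second field, onset and duration in the fourth and fifth, and the speaker name in the eighth.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple, Union

Segment = Tuple[float, float, str]  # (start, end, speaker)


def rttm_segments(
    lines: Iterable[str], as_dict: bool = False
) -> Union[List[Segment], Dict[str, List[Segment]]]:
    """Group RTTM segments by uri; return a mapping only when as_dict is True."""
    by_uri: Dict[str, List[Segment]] = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        uri, speaker = fields[1], fields[7]
        start, duration = float(fields[3]), float(fields[4])
        by_uri[uri].append((start, start + duration, speaker))

    if as_dict:
        return dict(by_uri)
    if len(by_uri) > 1:
        raise ValueError(
            "RTTM content covers several audio files; pass as_dict=True "
            "to get a {uri: segments} mapping."
        )
    # Zero or one uri: return its segments directly.
    return next(iter(by_uri.values()), [])
```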

@yojul
Author

yojul commented Jul 17, 2023

Thank you for your feedback.

For more consistency with the other "from" methods and with the Annotation object itself, I suggest that from_rttm work as follows:

  • If no uri is specified, the default_uri is taken from the first line of the RTTM file. If the file then contains more than one uri, an exception is raised asking the caller to specify a uri.
  • If uri is passed as a parameter, from_rttm only reads the lines with the specified uri.

This ensures consistency between the Annotation uri and the RTTM uri, and from_rttm still creates a single Annotation object (like the other similar methods).

I also added a condition to only read lines starting with "SPEAKER", which correspond to speech segments.
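
The behavior described in this comment can be sketched as follows. read_rttm is a hypothetical standalone function that returns the resolved uri and a list of tuples, not the from_rttm class method from the PR, and it assumes the same whitespace-separated RTTM field layout (uri second, onset fourth, duration fifth, speaker name eighth).

```python
from typing import Iterable, List, Optional, Tuple

Segment = Tuple[float, float, str]  # (start, end, speaker)


def read_rttm(
    lines: Iterable[str], uri: Optional[str] = None
) -> Tuple[Optional[str], List[Segment]]:
    """Read the SPEAKER lines of a single uri from RTTM content."""
    uri_given = uri is not None
    segments: List[Segment] = []
    for line in lines:
        fields = line.split()
        # Only SPEAKER lines describe speech segments.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        line_uri = fields[1]
        if uri is None:
            uri = line_uri  # default uri comes from the first SPEAKER line
        if line_uri != uri:
            if uri_given:
                continue  # caller asked for one uri: ignore the others
            raise ValueError(
                "RTTM content contains more than one uri; please specify uri."
            )
        start, duration = float(fields[3]), float(fields[4])
        segments.append((start, start + duration, fields[7]))
    return uri, segments
```

Either way, the caller always gets segments for exactly one uri, which keeps the result compatible with a single Annotation object.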


2 participants