Contributing

The framework is designed to be easily extensible to new datasets and models.

Datasets

To add a new dataset you can follow the steps below:

Provide the steps to download the dataset in the data_download/README.md file.
Create a dataset class following the template below:

class DATASET_NAME():

    def __init__(
        self,
        path: str,
        verbose: bool = False,
    ):
        '''
        :param path: path to the root folder of the dataset.
        :param verbose: if True, print some information about the dataset
        '''
        pass

    def _load_data(self):
        '''
        Load the train, validation and test data
        '''
        self.num_classes = ...
        pass

    def evaluate(
        self,
        ...args...
    ):
        '''
        Evaluate the model on the dataset by running the train, validation and test phases.
        '''
        pass

You can find an example of a dataset class in the esc50.py file. Inside the evaluate method you can use the ClassificationModel class for the linear evaluation of the model on the dataset and the ClassificationDataset class to use the standardized evaluation procedure.
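
As a rough illustration of the template, the sketch below fills it in for a hypothetical dataset stored as `<path>/<split>/<label>/<file>.wav`. The class name `ToyFolderDataset`, the folder layout, the `label_names` and `splits` attributes, and the `evaluate(model)` signature are assumptions made for this example and are not part of the framework. The `evaluate` body is left unimplemented on purpose, since it is where `ClassificationDataset` and `ClassificationModel` would be used.

```python
import os
from typing import Dict, List, Tuple


class ToyFolderDataset:
    """Hypothetical dataset laid out as <path>/<split>/<label>/<file>.wav."""

    SPLITS = ("train", "validation", "test")

    def __init__(self, path: str, verbose: bool = False):
        '''
        :param path: path to the root folder of the dataset.
        :param verbose: if True, print some information about the dataset
        '''
        self.path = path
        self.verbose = verbose
        self.label_names: List[str] = []
        self.splits: Dict[str, Tuple[List[str], List[int]]] = {}
        self._load_data()

    def _load_data(self):
        '''
        Load the train, validation and test data
        '''
        # Class names are taken from the sub-folders of the train split.
        train_dir = os.path.join(self.path, "train")
        self.label_names = sorted(
            name for name in os.listdir(train_dir)
            if os.path.isdir(os.path.join(train_dir, name))
        )
        self.num_classes = len(self.label_names)
        self.splits = {split: self._scan(split) for split in self.SPLITS}
        if self.verbose:
            for split, (files, _) in self.splits.items():
                print(f"{split}: {len(files)} files, {self.num_classes} classes")

    def _scan(self, split: str) -> Tuple[List[str], List[int]]:
        # Return parallel lists of audio file paths and integer labels.
        files, labels = [], []
        for index, name in enumerate(self.label_names):
            folder = os.path.join(self.path, split, name)
            if not os.path.isdir(folder):
                continue
            for file_name in sorted(os.listdir(folder)):
                if file_name.lower().endswith(".wav"):
                    files.append(os.path.join(folder, file_name))
                    labels.append(index)
        return files, labels

    def evaluate(self, model):
        '''
        Evaluate the model on the dataset (signature assumed for this sketch).
        '''
        # A real dataset class would build the evaluation here, for example
        # with the framework's ClassificationDataset and ClassificationModel.
        raise NotImplementedError("evaluation is framework-specific")
```

Keeping file discovery in `_load_data` and leaving `evaluate` as the only framework-dependent method keeps the dataset-specific part easy to test without a model.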

Models

Adding a new model is one of the main use cases of the framework. Once you have a model that you want to evaluate on the benchmark, you can follow the steps below:

Create a model wrapper class following the example provided in the w2v2_wrapper.py file. The model wrapper class should inherit from the Model class and implement the following methods:

__init__: initialize the model wrapper class
get_embeddings: given input audio as a NumPy array, return the embeddings of the audio generated by the model.
get_token_embeddings: given input audio as a NumPy array, return one embedding for each frame of the audio generated by the model. Warning: this method is not used in the current version of the benchmark.
get_classification_embedding_size: return the size of the embedding generated by the model.
get_token_embedding_size: return the size of the embedding generated by the model for each frame of the audio. Warning: this method is not used in the current version of the benchmark.
get_sampling_rate: return the sampling rate used by the model.
get_embedding_layer: return the size of the embedding generated by the model. This is used to create a linear classifier on top of the model without altering the size of the embeddings.

Import the model wrapper class in the evaluation script and use it to evaluate the model on the desired dataset.
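
The wrapper interface described above can be sketched with a toy model. In the example below, the `Model` base class is only a stand-in for the framework's own class, and `LogSpectrumWrapper`, its constructor parameters, and the use of log-magnitude spectra in place of a neural encoder are assumptions made for illustration. The real wrappers, such as the one in `w2v2_wrapper.py`, may accept additional arguments or return a different array type.

```python
import numpy as np


class Model:
    """Stand-in for the framework's Model base class (assumed interface)."""

    def get_embeddings(self, audio: np.ndarray) -> np.ndarray:
        raise NotImplementedError


class LogSpectrumWrapper(Model):
    """Toy wrapper: log-magnitude spectra play the role of model embeddings."""

    def __init__(self, frame_size: int = 400, hop_size: int = 160,
                 sampling_rate: int = 16000):
        self.frame_size = frame_size
        self.hop_size = hop_size
        self.sampling_rate = sampling_rate

    def get_token_embeddings(self, audio: np.ndarray) -> np.ndarray:
        # One embedding per frame, shape (n_frames, frame_size // 2 + 1).
        audio = np.asarray(audio, dtype=np.float32)
        if audio.shape[0] < self.frame_size:
            # Zero-pad short clips so that at least one full frame exists.
            audio = np.pad(audio, (0, self.frame_size - audio.shape[0]))
        n_frames = 1 + (audio.shape[0] - self.frame_size) // self.hop_size
        frames = np.stack([
            audio[i * self.hop_size: i * self.hop_size + self.frame_size]
            for i in range(n_frames)
        ])
        window = np.hanning(self.frame_size)
        spectrum = np.abs(np.fft.rfft(frames * window, axis=1))
        return np.log1p(spectrum)

    def get_embeddings(self, audio: np.ndarray) -> np.ndarray:
        # One embedding per clip: the mean of the frame-level embeddings.
        return self.get_token_embeddings(audio).mean(axis=0)

    def get_classification_embedding_size(self) -> int:
        return self.frame_size // 2 + 1

    def get_token_embedding_size(self) -> int:
        return self.frame_size // 2 + 1

    def get_sampling_rate(self) -> int:
        return self.sampling_rate

    def get_embedding_layer(self) -> int:
        # Per the description above, this also returns the embedding size.
        return self.get_classification_embedding_size()
```

In an evaluation script, an instance of such a wrapper would then be passed to a dataset's `evaluate` method; the exact arguments depend on the dataset class.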

If you have any questions or suggestions, or if you find a bug, please open an issue or contact us using the information below.
