4. API Reference

Logging

To manage library logging, you can use the spleeter.utils.logging module, which offers the two following functions:

def enable_logging():
    """ Enable INFO level logging for tensorflow, INFO for spleeter. """
    ...

def enable_tensorflow_logging():
    """ Enable ERROR level logging for tensorflow. """
    ...
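
For instance, enabling logging before running a separation is a one-liner:

from spleeter.utils.logging import enable_logging

# Enable INFO level logging for both spleeter and tensorflow.
enable_logging()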

Additionally, if you don't want verbose TensorFlow warnings to be displayed, you can run the following code before any TensorFlow library import¹:

import warnings
warnings.filterwarnings('ignore')

¹ And thus before any Spleeter library import as well.

Audio

To provide an abstraction over audio-based I/O operations, Spleeter uses an abstract class, spleeter.audio.adapter.AudioAdapter, with the following specification:

from abc import ABC, abstractmethod

import numpy as np


class AudioAdapter(ABC):

    @abstractmethod
    def load(
            self, audio_descriptor, offset, duration,
            sample_rate, dtype=np.float32):
        """ Load audio from the given descriptor as a waveform. """
        pass

    @abstractmethod
    def save(
            self, path, data, sample_rate,
            codec=None, bitrate=None):
        """ Write audio data to the given path. """
        pass

The audio_descriptor parameter is rule-free: it can be anything that describes a way of retrieving audio content, as long as the load() implementation returns the associated waveform (for instance, an audio file path, or an audio track identifier used to perform a storage lookup). The save() method, on the other hand, aims to write audio data to a file.

A default implementation based on ffmpeg is provided; it takes a file path as audio_descriptor.
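
As an illustration, here is a minimal sketch of a custom adapter that resolves audio_descriptor as a key into an in-memory store; the InMemoryAudioAdapter class and its store argument are hypothetical, only the AudioAdapter interface and its default() factory come from Spleeter:

import numpy as np

from spleeter.audio.adapter import AudioAdapter


class InMemoryAudioAdapter(AudioAdapter):
    """ Hypothetical adapter resolving descriptors from memory. """

    def __init__(self, store):
        # store maps a track identifier to a (waveform, sample_rate) pair.
        self._store = store

    def load(
            self, audio_descriptor, offset=None, duration=None,
            sample_rate=None, dtype=np.float32):
        # offset, duration and sample_rate are ignored in this sketch.
        waveform, rate = self._store[audio_descriptor]
        return waveform.astype(dtype), rate

    def save(
            self, path, data, sample_rate,
            codec=None, bitrate=None):
        # Delegate file writing to the default ffmpeg based adapter.
        AudioAdapter.default().save(path, data, sample_rate, codec, bitrate)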

Separator

To use Spleeter in your own development pipeline, a generic class for performing audio source separation is available in the spleeter.separator module:

from spleeter.separator import Separator

# Using embedded configuration.
separator = Separator('spleeter:2stems')

# Using custom configuration file.
separator = Separator('/path/to/config.json')
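
Three embedded configurations are shipped with the library, one per pretrained model:

# Embedded configurations and the stems they separate.
separator = Separator('spleeter:2stems')  # vocals, accompaniment
separator = Separator('spleeter:4stems')  # vocals, drums, bass, other
separator = Separator('spleeter:5stems')  # vocals, drums, bass, piano, other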

Raw waveform based separation

Once you have a Separator instance, you can easily perform separation on audio waveforms (as numpy arrays) using the separate(waveform) method:

# Use the audio loader explicitly to load the audio waveform:
from spleeter.audio.adapter import AudioAdapter

audio_loader = AudioAdapter.default()
sample_rate = 44100
waveform, _ = audio_loader.load('/path/to/audio/file', sample_rate=sample_rate)

# Perform the separation:
prediction = separator.separate(waveform)

The prediction output is a dictionary whose keys are instrument names and whose values are the associated separated waveforms. For instance, using the 2stems model, the returned dictionary would have two keys, vocals and accompaniment, with the corresponding numpy array waveforms as values.
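
Continuing the snippet above, the separated stems can then be accessed by name and written back through the audio adapter (the output path is a placeholder):

# Access separated stems from the prediction dictionary.
vocals = prediction['vocals']
accompaniment = prediction['accompaniment']

# Each stem waveform has the same shape as the input waveform.
print(vocals.shape, accompaniment.shape)

# Save a stem to disk using the adapter save() method.
audio_loader.save('/path/to/output/vocals.wav', vocals, sample_rate)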

File based separation

If you do not want to deal directly with audio waveform loading, you can use the separate_to_file() method, which handles the audio adapter interaction for you:

separator.separate_to_file('/path/to/audio', '/path/to/output/directory')

It will infer a default audio adapter instance. You can also control the separation process using the following parameters:

| Parameter | Description |
| --- | --- |
| audio_descriptor | Audio descriptor that identifies the audio resource to separate |
| destination | Path of the directory to write separated sources into |
| audio_adapter | (Optional) Audio adapter to use for audio I/O |
| offset | (Optional) Loading offset parameter |
| duration | (Optional) Loading duration parameter |
| codec | (Optional) Saving codec parameter |
| bitrate | (Optional) Saving bitrate parameter |
| filename_format | (Optional) Format string for output filenames |
| synchronous | (Optional) Boolean flag to control asynchronous export |
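
As a sketch, here is a call combining several of these parameters; all values below are illustrative:

# Separate 30 seconds starting at a 10 second offset, export each
# stem as a 128k mp3 file named after its instrument.
separator.separate_to_file(
    '/path/to/audio', '/path/to/output/directory',
    offset=10.0, duration=30.0,
    codec='mp3', bitrate='128k',
    filename_format='{filename}/{instrument}.{codec}')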

Asynchronous audio export

As seen previously, separate_to_file() has a synchronous parameter flag. By default, the method performs source separation and then exports the predicted data asynchronously, using the Python multiprocessing module. If the synchronous flag is True, the method waits for the export tasks² to finish before returning; if the flag is False, it returns immediately, allowing you to queue up audio export tasks as follows:

# List of inputs to process.
audio_descriptors = [...]

# Batch separation export.
for i in audio_descriptors:
    separator.separate_to_file(i, '/path/to/output/directory', synchronous=False)

# Wait for batch to finish.
separator.join()

² For a given separation, one export task is created for each separated instrument.

Training

Training and evaluation are performed using the TensorFlow Estimator API. Take a look at this file for details.