audio metrics: SNR, SI_SDR, SI_SNR (#292)

* add snr, si_sdr, si_snr
* format
* add noqa: F401 to __init__.py
* remove types in doc, change estimate to preds, remove EPS
* update functional.rst
* update CHANGELOG.md
* switch preds and target
* switch preds and target in Example
* add SNR, SI_SNR, SI_SDR module implementation
* add test
* add module doc
* use _check_same_shape
* to alphabetical order
* update test
* move Base to the top of Audio
* add soundfile
* gcc
* fix mocking
* image
* doctest
* mypy
* fix requirements
* fix dtype
* something
* update
* adjust
* Apply suggestions from code review
* update test_snr
* update test_si_snr
* new snr: use torch.finfo(preds.dtype).eps
* update test_snr.py
* new si_sdr imp
* update test_si_sdr
* update test_si_snr
* remove pb_bss_eval
* add museval
* update test files
* remove museval
* add funcs update return None annotation
* add 'Setup ffmpeg'
* update "Setup ffmpeg"
* use setup-conda@v1
* multi-OS
* update atol to 1e-5
* Apply suggestions from code review
* change atol to 1e-2
* update
* fix 'Setup Linux' not activated
* add sudo
* reduce Time to 100 to reduce the test time
* increase timeoutInMinutes to 40
* install ffmpeg
* timeout-minutes to 55
* +git
* show-error-codes
* .detach().cpu().numpy() first
* add numpy
* numpy
* ignore_errors torchmetrics.audio.*
* solve mypy no-redef error
* remove --quiet
* pypesq
* apt
* add # type: ignore
* try without test_si_snr & test_si_sdr
* test_import_speechmetrics
* test_speechmetrics_si_sdr
* test_si_sdr_functional
* test audio only
* install libsndfile1
* add sisnr sisdr test
* test all & add quiet & remove test_speechmetrics
* remove sudo & install libsndfile1
* add test
* update
* fix tests
* typing
* fix typing
* fix bus error
* SRMRpy
* pesq
* gcc
* comment -u root cuda 10.2 whoami
* env

Co-authored-by: quancs <quancs@qq.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Lightning-AI · Jun 22, 2021 · fe03f3a · fe03f3a
1 parent a75445b
commit fe03f3a

Showing 22 changed files with 986 additions and 9 deletions.
diff --git a/.github/workflows/ci_test-conda.yml b/.github/workflows/ci_test-conda.yml
@@ -17,7 +17,7 @@ jobs:
  pytorch-version: [1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
 
  # Timeout: https://stackoverflow.com/a/59076067/4521646
- timeout-minutes: 35
+ timeout-minutes: 55
  steps:
  - uses: actions/checkout@v2
 
@@ -54,9 +54,11 @@ jobs:
 
  - name: Update Environment
  run: |
+ sudo apt install libsndfile1
  conda info
  conda install mkl pytorch=${{ matrix.pytorch-version }} cpuonly
  conda install cpuonly $(python ./requirements/adjust-versions.py conda)
+ conda install -c conda-forge ffmpeg
  conda list
  pip --version
  python ./requirements/adjust-versions.py requirements.txt

diff --git a/.github/workflows/ci_test-full.yml b/.github/workflows/ci_test-full.yml
@@ -26,7 +26,7 @@ jobs:
  requires: 'minimal'
 
  # Timeout: https://stackoverflow.com/a/59076067/4521646
- timeout-minutes: 35
+ timeout-minutes: 55
 
  steps:
  - uses: actions/checkout@v2
@@ -43,7 +43,15 @@ jobs:
  - name: Setup macOS
  if: runner.os == 'macOS'
  run: |
- brew install libomp # https://github.com/pytorch/pytorch/issues/20030
+ brew install gcc libomp ffmpeg # https://github.com/pytorch/pytorch/issues/20030
+ - name: Setup Linux
+ if: runner.os == 'Linux'
+ run: |
+ sudo apt install -y ffmpeg
+ - name: Setup Windows
+ if: runner.os == 'windows'
+ run: |
+ choco install ffmpeg
 
  - name: Set min. dependencies
  if: matrix.requires == 'minimal'
@@ -70,7 +78,6 @@ jobs:
 
  - name: Install dependencies
  run: |
- python --version
  pip --version
  pip install --requirement requirements.txt --upgrade --find-links https://download.pytorch.org/whl/cpu/torch_stable.html
  python ./requirements/adjust-versions.py requirements.txt

diff --git a/.github/workflows/code-format.yml b/.github/workflows/code-format.yml
@@ -52,7 +52,7 @@ jobs:
  pip list
  - name: mypy
  run: |
- mypy
+ mypy --show-error-codes
 
 # format-check-yapf:
 # runs-on: ubuntu-20.04

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -36,6 +36,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Added `is_differentiable` property to `ConfusionMatrix`, `F1`, `FBeta`, `Hamming`, `Hinge`, `IOU`, `MatthewsCorrcoef`, `Precision`, `Recall`, `PrecisionRecallCurve`, `ROC`, `StatScores` ([#253](https://github.com/PyTorchLightning/metrics/pull/253))
 
 
+- Added audio metrics: SNR, SI_SDR, SI_SNR ([#292](https://github.com/PyTorchLightning/metrics/pull/292))
+
+
 - Added Inception Score metric to image module ([#299](https://github.com/PyTorchLightning/metrics/pull/299))
 
 

diff --git a/azure-pipelines.yml b/azure-pipelines.yml
@@ -19,22 +19,24 @@ pr:
 jobs:
  - job: pytest
  # how long to run the job before automatically cancelling
- timeoutInMinutes: 35
+ timeoutInMinutes: 45
  # how much time to give 'run always even if cancelled tasks' before stopping them
  cancelTimeoutInMinutes: 2
 
  pool: gridai-spot-pool
 
  container:
- image: "pytorch/pytorch:1.7.1-cuda11.0-cudnn8-runtime"
- options: "--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all"
+ image: "pytorch/pytorch:1.8.1-cuda10.2-cudnn7-runtime"
+ options: "--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all --name ci-container -v /usr/bin/docker:/tmp/docker:ro"
 
  workspace:
  clean: all
 
  steps:
 
  - bash: |
+ whoami
+ id
  lspci | egrep 'VGA|3D'
  whereis nvidia
  nvidia-smi
@@ -43,8 +45,15 @@ jobs:
  pip list
  displayName: 'Image info & NVIDIA'
 
+ - script: |
+ /tmp/docker exec -t -u 0 ci-container \
+ sh -c "apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -o Dpkg::Options::="--force-confold" -y install sudo"
+ displayName: 'Install Sudo in container (thanks Microsoft!)'
+
  - bash: |
- #sudo apt-get install -y cmake
+ set -ex
+ sudo apt-get update
+ sudo apt-get install -y gcc cmake ffmpeg git libsndfile1
  # python -m pip install "pip==20.1"
  pip install --requirement ./requirements/devel.txt --upgrade-strategy only-if-needed
  pip uninstall -y torchmetrics

diff --git a/docs/source/references/functional.rst b/docs/source/references/functional.rst
@@ -5,6 +5,31 @@
 Functional metrics
 ##################
 
+*************
+Audio Metrics
+*************
+
+si_sdr [func]
+~~~~~~~~~~~~~
+
+.. autofunction:: torchmetrics.functional.si_sdr
+    :noindex:
+
+
+si_snr [func]
+~~~~~~~~~~~~~
+
+.. autofunction:: torchmetrics.functional.si_snr
+    :noindex:
+
+
+snr [func]
+~~~~~~~~~~
+
+.. autofunction:: torchmetrics.functional.snr
+    :noindex:
+
+
 **********************
 Classification Metrics
 **********************

diff --git a/docs/source/references/modules.rst b/docs/source/references/modules.rst
@@ -18,6 +18,46 @@ your own metric type might be too burdensome.
 .. autoclass:: torchmetrics.AverageMeter
  :noindex:
 
+*************
+Audio Metrics
+*************
+
+About Audio Metrics
+~~~~~~~~~~~~~~~~~~~
+
+For the purposes of audio metrics, inputs (predictions, targets) must have the same size.
+If the inputs are 1D tensors, the output will be a scalar. If the inputs are multi-dimensional with shape `[..., time]`, the metric will be computed over the `time` dimension.
+
+.. doctest::
+
+    >>> import torch
+    >>> from torchmetrics import SNR
+    >>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
+    >>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
+    >>> snr = SNR()
+    >>> snr_val = snr(preds, target)
+    >>> snr_val
+    tensor(16.1805)
+
+SI_SDR
+~~~~~~
+
+.. autoclass:: torchmetrics.SI_SDR
+    :noindex:
+
+SI_SNR
+~~~~~~
+
+.. autoclass:: torchmetrics.SI_SNR
+    :noindex:
+
+SNR
+~~~
+
+.. autoclass:: torchmetrics.SNR
+    :noindex:
+
+
 **********************
 Classification Metrics
 **********************
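
The `tensor(16.1805)` value in the `SNR` doctest added to modules.rst can be checked by hand. The following is a minimal sketch, assuming the conventional definition SNR = 10·log10(Σ target² / Σ (target − preds)²) and leaving out the small `eps` term mentioned in the commit message. `snr_db` is a hypothetical plain-Python helper written for illustration, not the torchmetrics implementation.

```python
import math
from typing import Sequence


def snr_db(preds: Sequence[float], target: Sequence[float]) -> float:
    """Hypothetical helper: SNR in dB over a single time axis (no eps term)."""
    signal_power = sum(t * t for t in target)
    noise_power = sum((t - p) ** 2 for p, t in zip(preds, target))
    return 10.0 * math.log10(signal_power / noise_power)


# Values taken from the doctest in modules.rst.
target = [3.0, -0.5, 2.0, 7.0]
preds = [2.5, 0.0, 2.0, 8.0]
print(round(snr_db(preds, target), 4))  # 16.1805

# For inputs shaped [..., time], the reduction runs over the last axis only,
# so a batch of two signals yields two values (one per row).
batch_preds = [preds, [3.0, 0.0, 2.0, 7.0]]
batch_target = [target, target]
per_row = [snr_db(p, t) for p, t in zip(batch_preds, batch_target)]
```

Here the signal energy is 62.25 and the noise energy is 1.5, so the ratio is 41.5 and 10·log10(41.5) ≈ 16.1805 dB, matching the doctest.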

diff --git a/requirements/test.txt b/requirements/test.txt
@@ -19,3 +19,10 @@ nltk>=3.6
 
 # add extra requirements
 -r image.txt
+
+# audio
+pypesq
+mir_eval>=0.6
+#pesq @ https://github.com/ludlows/python-pesq/archive/refs/heads/master.zip
+#SRMRpy @ https://github.com/jfsantos/SRMRpy/archive/refs/heads/master.zip
+speechmetrics @ https://github.com/aliutkus/speechmetrics/archive/refs/heads/master.zip
diff --git a/tests/audio/__init__.py b/tests/audio/__init__.py
diff --git a/tests/audio/test_si_sdr.py b/tests/audio/test_si_sdr.py
@@ -0,0 +1,131 @@
+# Copyright The PyTorch Lightning team.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from collections import namedtuple
+from functools import partial
+
+import pytest
+import speechmetrics
+import torch
+from torch import Tensor
+
+from tests.helpers import seed_all
+from tests.helpers.testers import BATCH_SIZE, NUM_BATCHES, MetricTester
+from torchmetrics.audio import SI_SDR
+from torchmetrics.functional import si_sdr
+from torchmetrics.utilities.imports import _TORCH_GREATER_EQUAL_1_6
+
+seed_all(42)
+
+Time = 100
+
+Input = namedtuple('Input', ["preds", "target"])
+
+inputs = Input(
+    preds=torch.rand(NUM_BATCHES, BATCH_SIZE, 1, Time),
+    target=torch.rand(NUM_BATCHES, BATCH_SIZE, 1, Time),
+)
+
+speechmetrics_sisdr = speechmetrics.load('sisdr')
+
+
+def speechmetrics_si_sdr(preds: Tensor, target: Tensor, zero_mean: bool):
+    # shape: preds [BATCH_SIZE, 1, Time] , target [BATCH_SIZE, 1, Time]
+    # or shape: preds [NUM_BATCHES*BATCH_SIZE, 1, Time] , target [NUM_BATCHES*BATCH_SIZE, 1, Time]
+    if zero_mean:
+        preds = preds - preds.mean(dim=2, keepdim=True)
+        target = target - target.mean(dim=2, keepdim=True)
+    target = target.detach().cpu().numpy()
+    preds = preds.detach().cpu().numpy()
+    mss = []
+    for i in range(preds.shape[0]):
+        ms = []
+        for j in range(preds.shape[1]):
+            metric = speechmetrics_sisdr(preds[i, j], target[i, j], rate=16000)
+            ms.append(metric['sisdr'][0])
+        mss.append(ms)
+    return torch.tensor(mss)
+
+
+def average_metric(preds, target, metric_func):
+    # shape: preds [BATCH_SIZE, 1, Time] , target [BATCH_SIZE, 1, Time]
+    # or shape: preds [NUM_BATCHES*BATCH_SIZE, 1, Time] , target [NUM_BATCHES*BATCH_SIZE, 1, Time]
+    return metric_func(preds, target).mean()
+
+
+speechmetrics_si_sdr_zero_mean = partial(speechmetrics_si_sdr, zero_mean=True)
+speechmetrics_si_sdr_no_zero_mean = partial(speechmetrics_si_sdr, zero_mean=False)
+
+
+@pytest.mark.parametrize(
+    "preds, target, sk_metric, zero_mean",
+    [
+        (inputs.preds, inputs.target, speechmetrics_si_sdr_zero_mean, True),
+        (inputs.preds, inputs.target, speechmetrics_si_sdr_no_zero_mean, False),
+    ],
+)
+class TestSISDR(MetricTester):
+    atol = 1e-2
+
+    @pytest.mark.parametrize("ddp", [True, False])
+    @pytest.mark.parametrize("dist_sync_on_step", [True, False])
+    def test_si_sdr(self, preds, target, sk_metric, zero_mean, ddp, dist_sync_on_step):
+        self.run_class_metric_test(
+            ddp,
+            preds,
+            target,
+            SI_SDR,
+            sk_metric=partial(average_metric, metric_func=sk_metric),
+            dist_sync_on_step=dist_sync_on_step,
+            metric_args=dict(zero_mean=zero_mean),
+        )
+
+    def test_si_sdr_functional(self, preds, target, sk_metric, zero_mean):
+        self.run_functional_metric_test(
+            preds,
+            target,
+            si_sdr,
+            sk_metric,
+            metric_args=dict(zero_mean=zero_mean),
+        )
+
+    def test_si_sdr_differentiability(self, preds, target, sk_metric, zero_mean):
+        self.run_differentiability_test(
+            preds=preds,
+            target=target,
+            metric_module=SI_SDR,
+            metric_functional=si_sdr,
+            metric_args={'zero_mean': zero_mean}
+        )
+
+    @pytest.mark.skipif(
+        not _TORCH_GREATER_EQUAL_1_6, reason='half support of core operations is not supported before pytorch v1.6'
+    )
+    def test_si_sdr_half_cpu(self, preds, target, sk_metric, zero_mean):
+        pytest.xfail("SI-SDR metric does not support cpu + half precision")
+
+    @pytest.mark.skipif(not torch.cuda.is_available(), reason='test requires cuda')
+    def test_si_sdr_half_gpu(self, preds, target, sk_metric, zero_mean):
+        self.run_precision_test_gpu(
+            preds=preds,
+            target=target,
+            metric_module=SI_SDR,
+            metric_functional=si_sdr,
+            metric_args={'zero_mean': zero_mean}
+        )
+
+
+def test_error_on_different_shape(metric_class=SI_SDR):
+    metric = metric_class()
+    with pytest.raises(RuntimeError, match='Predictions and targets are expected to have the same shape'):
+        metric(torch.randn(100, ), torch.randn(50, ))
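
The tests in tests/audio/test_si_sdr.py compare `SI_SDR` against `speechmetrics`, with and without `zero_mean`. To illustrate the quantity being tested, here is a minimal sketch that assumes the conventional SI-SDR definition: project `preds` onto `target` to find the best scaling factor, then compare the energy of the scaled target with the energy of the residual. `si_sdr_db` is a hypothetical plain-Python helper, not the torchmetrics implementation, and it omits any `eps` stabilisation the real code may apply.

```python
import math
from typing import List, Sequence


def si_sdr_db(preds: Sequence[float], target: Sequence[float], zero_mean: bool = False) -> float:
    """Hypothetical helper: scale-invariant SDR in dB for one 1-D signal pair."""
    p: List[float] = list(preds)
    t: List[float] = list(target)
    if zero_mean:
        # Remove the mean of each signal before comparing them.
        p_mean = sum(p) / len(p)
        t_mean = sum(t) / len(t)
        p = [x - p_mean for x in p]
        t = [x - t_mean for x in t]
    # Optimal scaling: projection of preds onto target.
    alpha = sum(a * b for a, b in zip(p, t)) / sum(b * b for b in t)
    scaled_target = [alpha * b for b in t]
    noise = [s - a for s, a in zip(scaled_target, p)]
    signal_power = sum(s * s for s in scaled_target)
    noise_power = sum(n * n for n in noise)
    return 10.0 * math.log10(signal_power / noise_power)


target = [3.0, -0.5, 2.0, 7.0]
preds = [2.5, 0.0, 2.0, 8.0]

base = si_sdr_db(preds, target)
# Rescaling the prediction should leave the score unchanged.
rescaled = si_sdr_db([2.0 * x for x in preds], target)
# The zero_mean=True variant subtracts each signal's mean first.
centered = si_sdr_db(preds, target, zero_mean=True)
```

Because the scaling factor absorbs any gain applied to `preds`, `base` and `rescaled` agree up to floating-point error. This scale invariance is what distinguishes the metric from plain SNR.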