add paddle audio dataset && backend #45939
Merged
ZeyuChen
merged 131 commits into
PaddlePaddle:develop
from
SmileGoat:audio_dataset_test
Oct 20, 2022
f792267
add audio feature dataset
SmileGoat 6a678ae
fix coding style
SmileGoat ad8e05d
fix coding style2
SmileGoat 725506d
rm librosa
SmileGoat e768d9b
rm voxceleb
SmileGoat cf2a687
rm librosa in test
SmileGoat b1833ba
add scipy fftpack
SmileGoat 417b5a5
add functional
SmileGoat 9d6b69c
fix setup
SmileGoat 0a98960
fix setup2
SmileGoat 6da8584
rm colorlog
SmileGoat 0b55b7a
refactor dataset __init__.py
SmileGoat 0f1344d
fix converage
SmileGoat 9219a55
fix librosa import error
SmileGoat 2198443
fix windows test
SmileGoat fcfa6be
fix windows ci
SmileGoat 7318db4
rm datasets
SmileGoat d446d92
fix setup
SmileGoat c4b34a6
Merge branch 'develop' of github.com:SmileGoat/Paddle into dataset_pa…
SmileGoat eaa5ceb
remove testdata
SmileGoat b83cc0c
add librosa in requirement
SmileGoat 057711b
add librosa in requirement2
SmileGoat 93ff919
change librosa to 0.8.1
SmileGoat f308d4d
update ci docker
SmileGoat 6b95712
fix ci error
SmileGoat 5fbae14
fix ci error2
SmileGoat 1782cb7
fix ci coverage
SmileGoat 57c77d4
fix converage
SmileGoat 1b1d84a
fix coverage
SmileGoat b7e3d4d
rm audio_base in test, notest,test=coverage
SmileGoat cf071c2
fix copyright
SmileGoat 8ae3226
rm backend
SmileGoat 5dab485
add datast in __init__
SmileGoat 15b789d
rm compliance&&add function test
SmileGoat 3ac7c7b
fix setup
SmileGoat cce788b
fix windows
SmileGoat ba67011
fix windows2
SmileGoat 38247c4
rm compliance
SmileGoat edb0872
fix test timeout
SmileGoat e5d9ef5
Merge branch 'dataset_paddle' into audio_backend_paddle2
SmileGoat 54da906
add backend & datasets
SmileGoat 8b5ab74
fix bugs
SmileGoat ede2da7
fix ci time issue
SmileGoat e698099
Merge branch 'dataset_paddle' into audio_backend_paddle2
SmileGoat 85fd94c
add dataset test
SmileGoat bc23946
rm test_audio_feature
SmileGoat c2a86d9
avoid windows isssue, tmp
SmileGoat 4914c02
note windows isssue
SmileGoat e5fa7eb
Merge branch 'dataset_paddle' into audio_backend_paddle2
SmileGoat 0f5f928
skip windows issue
SmileGoat 97c2e93
refactor dataset test
SmileGoat a9de571
Merge branch 'dataset_paddle' into audio_backend_paddle2
SmileGoat 2c5ad2b
Merge branch 'develop' of github.com:SmileGoat/Paddle into audio_back…
SmileGoat b194c09
add dataset.py
SmileGoat 5dfd340
fix dtype in layers.mfcc
SmileGoat d0ad3cb
Merge branch 'dataset_paddle' into audio_backend_paddle2
SmileGoat abd2df6
fix ci-static-check
SmileGoat e604a0f
fix dtype in layers.mfcc && fix ci-static-check
SmileGoat 79c387a
Merge branch 'develop' of github.com:SmileGoat/Paddle into dataset_pa…
SmileGoat 7e17b54
add relative accuracy
SmileGoat ca562be
modity API.spec
SmileGoat 0678a3c
skip cuda11.2 test
SmileGoat ab40ff0
skip cuda11.2 test2
SmileGoat 2faf87d
skip cuda11.2
SmileGoat 0f9f531
rm dataset test
SmileGoat ac4c6bc
change dataset name
SmileGoat 4cc9a8d
merge develop
SmileGoat dbaa398
fix format
SmileGoat d6a027d
update api.spec
SmileGoat ba611f3
update api.spec2
SmileGoat a7fdd78
fix coverage
SmileGoat c0aeb92
Merge branch 'develop' of github.com:SmileGoat/Paddle into audio_back…
SmileGoat 0469385
add dataset test
SmileGoat 3592058
rm download load dict
SmileGoat 1f9bff7
rm download load dict in init
SmileGoat 0e8fd14
update api.spec3
SmileGoat 1b88963
fix dataset coverage
SmileGoat d82dea9
fix coverage
SmileGoat 050dea1
fix coverage2
SmileGoat bb66649
restore api.spec
SmileGoat e3dcce6
restore api.spec2
SmileGoat 1479346
Merge branch 'develop' of github.com:SmileGoat/Paddle into audio_data…
SmileGoat c332d78
fix api-spec 3
SmileGoat 048fe8e
fix api-spec 4
SmileGoat 4ffe233
fix api.spec
SmileGoat 13efe0b
fix api.spec6
SmileGoat b3c4837
refactor init_backend
SmileGoat 0726233
fix typo
SmileGoat 009ee33
change paddleaudio backend set
SmileGoat 9e896bd
fix get_current_audio_backend()
SmileGoat a7d0fb8
fix format
SmileGoat dca47da
fix format2
SmileGoat 707578a
remove format in parameters
SmileGoat 40963a7
fix format2
SmileGoat 5499d32
add warning massage in wave_backend && remove redundant audio util
SmileGoat 122b832
merge develop
SmileGoat ab4175a
rm audio util in print_signatures
SmileGoat 55fe432
Merge branch 'develop' of github.com:SmileGoat/Paddle into audio_data…
SmileGoat c25c9cd
fix format3
SmileGoat d442072
add tess dataset license
SmileGoat b56e3ca
format warning
SmileGoat 87a58c8
add more info in warning msg
SmileGoat 39a93f0
add paddleaudio version check
SmileGoat 86e0196
replace dataset esc50 with tess
SmileGoat ef5b921
add tess dataset && rm numpy transform in dataset.py
SmileGoat f5e070c
fix set audio backend bug
SmileGoat 79d5464
fix equal error
SmileGoat a687516
fix format && coverage error
SmileGoat 35d465a
add api example
SmileGoat 4bf22b0
Merge branch 'develop' of github.com:SmileGoat/Paddle into audio_data…
SmileGoat ee97ba5
fix format
SmileGoat 5c66e88
fix error
SmileGoat e6f5008
fix typo
SmileGoat f681319
add noqa in __init__
SmileGoat 7b9be7d
fix backend doc example error
SmileGoat 17b493b
rm seed in dataset
SmileGoat 7ace395
update bakcend example
SmileGoat 40566c0
fix typo
SmileGoat b0aae4e
fix typo
SmileGoat 4238663
fix example err
SmileGoat 1c20ff2
fix typo
SmileGoat 7e359a6
fix ci dataset test
SmileGoat 2a0a3bf
fix example fil
SmileGoat d1da85a
try to fix ci
SmileGoat 8cbcff0
clean dataset doc
SmileGoat 7ee4a58
change get_current_audio_backend to get_current_backend
SmileGoat b33893d
creplace paddle.audio.backends.info with paddle.audio.info, same with…
SmileGoat 3a40c83
fix ci error
SmileGoat 1ca215a
Merge branch 'develop' of github.com:SmileGoat/Paddle into audio_data…
SmileGoat 160b50e
repalce api in test_audio_backend
SmileGoat 2c6742f
fix save&&set_backend exmaple
SmileGoat
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
from . import init_backend | ||
from .init_backend import get_current_backend # noqa: F401 | ||
from .init_backend import list_available_backends # noqa: F401 | ||
from .init_backend import set_backend | ||
|
||
init_backend._init_set_audio_backend() | ||
|
||
__all__ = [ | ||
'get_current_backend', | ||
'list_available_backends', | ||
'set_backend', | ||
] |
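The `__init__.py` above binds a default backend at import time by calling `init_backend._init_set_audio_backend()`. A minimal sketch of that binding pattern follows, using stand-in namespaces rather than the real Paddle modules. The names `placeholder_api`, `wave_impl`, `other_impl`, and `bind_backend` are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def _unset(*args, **kwargs):
    # Mirrors the doc-only placeholder functions that raise until a backend is bound.
    raise NotImplementedError("please set audio backend")


# Hypothetical stand-in for the module that exposes save/load/info.
placeholder_api = SimpleNamespace(save=_unset, load=_unset, info=_unset)

# Hypothetical stand-ins for two backend implementations.
wave_impl = SimpleNamespace(
    save=lambda path: f"wave save {path}",
    load=lambda path: f"wave load {path}",
    info=lambda path: f"wave info {path}",
)
other_impl = SimpleNamespace(
    save=lambda path: f"other save {path}",
    load=lambda path: f"other load {path}",
    info=lambda path: f"other info {path}",
)


def bind_backend(target, impl):
    """Copy the three I/O functions from impl onto target."""
    for name in ("save", "load", "info"):
        setattr(target, name, getattr(impl, name))


# Bind the default implementation once, then swap it later.
bind_backend(placeholder_api, wave_impl)
first = placeholder_api.load("a.wav")   # "wave load a.wav"
bind_backend(placeholder_api, other_impl)
second = placeholder_api.load("a.wav")  # "other load a.wav"
print(first, "|", second)
```

Because the functions are rebound on a shared object, callers that hold a reference to the object see the new backend, while callers that imported a function by name keep the old one.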
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,146 @@ | ||
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License | ||
|
||
import paddle | ||
|
||
from pathlib import Path | ||
from typing import Optional, Tuple, Union | ||
|
||
|
||
class AudioInfo: | ||
""" Audio info, return type of backend info function """ | ||
|
||
def __init__(self, sample_rate: int, num_samples: int, num_channels: int, | ||
bits_per_sample: int, encoding: str): | ||
self.sample_rate = sample_rate | ||
self.num_samples = num_samples | ||
self.num_channels = num_channels | ||
self.bits_per_sample = bits_per_sample | ||
self.encoding = encoding | ||
|
||
|
||
def info(filepath: str) -> AudioInfo: | ||
"""Get signal information of input audio file. | ||
|
||
Args: | ||
filepath: audio path or file object. | ||
|
||
Returns: | ||
AudioInfo: info of the given audio. | ||
|
||
Example: | ||
|
||
.. code-block:: python | ||
|
||
import os | ||
import paddle | ||
|
||
sample_rate = 16000 | ||
wav_duration = 0.5 | ||
num_channels = 1 | ||
num_frames = sample_rate * wav_duration | ||
wav_data = paddle.linspace(-1.0, 1.0, num_frames) * 0.1 | ||
waveform = wav_data.tile([num_channels, 1]) | ||
base_dir = os.getcwd() | ||
filepath = os.path.join(base_dir, "test.wav") | ||
|
||
paddle.audio.save(filepath, waveform, sample_rate) | ||
wav_info = paddle.audio.info(filepath) | ||
""" | ||
# for API doc | ||
raise NotImplementedError("please set audio backend") | ||
|
||
|
||
def load(filepath: Union[str, Path], | ||
frame_offset: int = 0, | ||
num_frames: int = -1, | ||
normalize: bool = True, | ||
channels_first: bool = True) -> Tuple[paddle.Tensor, int]: | |
"""Load audio data from file. Load the audio content starting from frame_offset, and read num_frames frames. | |
|
||
Args: | ||
|
||
frame_offset: the frame to start reading from, from 0 to the total number of frames, | |
num_frames: -1 (read all frames) or the number of frames to read, | |
normalize: | |
if True: return audio normalized to (-1, 1), dtype=float32 | |
if False: return the raw audio data, dtype=int16 | |
|
||
channels_first: | ||
if True: return audio with shape (channels, time) | ||
|
||
Return: | ||
Tuple[paddle.Tensor, int]: (audio_content, sample rate) | ||
|
||
Examples: | |
.. code-block:: python | ||
|
||
import os | ||
import paddle | ||
|
||
sample_rate = 16000 | ||
wav_duration = 0.5 | ||
num_channels = 1 | ||
num_frames = sample_rate * wav_duration | ||
wav_data = paddle.linspace(-1.0, 1.0, num_frames) * 0.1 | ||
waveform = wav_data.tile([num_channels, 1]) | ||
base_dir = os.getcwd() | ||
filepath = os.path.join(base_dir, "test.wav") | ||
|
||
paddle.audio.save(filepath, waveform, sample_rate) | ||
wav_data_read, sr = paddle.audio.load(filepath) | ||
""" | ||
# for API doc | ||
raise NotImplementedError("please set audio backend") | ||
|
||
|
||
def save( | ||
filepath: str, | ||
src: paddle.Tensor, | ||
sample_rate: int, | ||
channels_first: bool = True, | ||
encoding: Optional[str] = None, | ||
bits_per_sample: Optional[int] = 16, | ||
): | ||
""" | ||
Save audio tensor to file. | ||
|
||
Args: | ||
filepath: saved path | ||
src: the audio tensor | ||
sample_rate: the number of samples of audio per second. | ||
channels_first: channel layout of src | |
if True, means input tensor is (channels, time) | |
if False, means input tensor is (time, channels) | |
encoding: encoding format; wave_backend only supports PCM16 for now. | |
bits_per_sample: bits per sample; wave_backend only supports 16 bits for now. | |
|
||
Returns: | ||
None | ||
|
||
Examples: | ||
|
||
.. code-block:: python | ||
|
||
import paddle | ||
|
||
sample_rate = 16000 | ||
wav_duration = 0.5 | ||
num_channels = 1 | ||
num_frames = sample_rate * wav_duration | ||
wav_data = paddle.linspace(-1.0, 1.0, num_frames) * 0.1 | ||
waveform = wav_data.tile([num_channels, 1]) | ||
filepath = "./test.wav" | ||
|
||
paddle.audio.save(filepath, waveform, sample_rate) | ||
""" | ||
# for API doc | ||
raise NotImplementedError("please set audio backend") |
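The functions in this file are placeholders for the API docs; the real work is done by whichever backend is bound. As a rough illustration of what an `info`-style call could return for a PCM16 WAV file, here is a sketch built only on the Python standard library `wave` module. It is not the PR's `wave_backend`; `WavInfo` and `wav_info` are hypothetical names, and the `encoding` label is chosen for this sketch.

```python
import os
import struct
import tempfile
import wave
from dataclasses import dataclass


@dataclass
class WavInfo:
    # Hypothetical stand-in mirroring the fields of AudioInfo above.
    sample_rate: int
    num_samples: int
    num_channels: int
    bits_per_sample: int
    encoding: str


def wav_info(filepath: str) -> WavInfo:
    """Read the header fields of a PCM WAV file with the stdlib wave module."""
    with wave.open(filepath, "rb") as f:
        bits = f.getsampwidth() * 8
        return WavInfo(
            sample_rate=f.getframerate(),
            num_samples=f.getnframes(),
            num_channels=f.getnchannels(),
            bits_per_sample=bits,
            encoding=f"PCM{bits}",  # label chosen for this sketch
        )


# Write a short mono 16-bit file of silence so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "test.wav")
sample_rate = 16000
num_frames = int(sample_rate * 0.5)
with wave.open(path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 2 bytes = 16 bits per sample
    f.setframerate(sample_rate)
    f.writeframes(struct.pack(f"<{num_frames}h", *([0] * num_frames)))

meta = wav_info(path)
print(meta)
```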
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,185 @@ | ||
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import sys | ||
import warnings | ||
from . import wave_backend | ||
from . import backend | ||
from typing import List | ||
|
||
import paddle | ||
|
||
|
||
def _check_version(version: str) -> bool: | ||
# require paddleaudio >= 1.0.2 | ||
ver_arr = version.split('.') | ||
v0 = int(ver_arr[0]) | ||
v1 = int(ver_arr[1]) | ||
v2 = int(ver_arr[2]) | ||
if v0 < 1: | ||
return False | ||
if v0 == 1 and v1 == 0 and v2 <= 1: | ||
return False | ||
return True | ||
|
||
|
||
def list_available_backends() -> List[str]: | ||
""" List available backends, the backends in paddleaudio and the default backend. | ||
|
||
Returns: | ||
List[str]: The list of available backends. | ||
|
||
Examples: | ||
.. code-block:: python | ||
|
||
import paddle | ||
|
||
sample_rate = 16000 | ||
wav_duration = 0.5 | ||
num_channels = 1 | ||
num_frames = sample_rate * wav_duration | ||
wav_data = paddle.linspace(-1.0, 1.0, num_frames) * 0.1 | ||
waveform = wav_data.tile([num_channels, 1]) | ||
wav_path = "./test.wav" | ||
|
||
current_backend = paddle.audio.backends.get_current_backend() | ||
print(current_backend) # wave_backend, the default backend. | ||
backends = paddle.audio.backends.list_available_backends() | ||
# by default, backends is ['wave_backend'] | |
# backends is ['wave_backend', 'soundfile'] if paddleaudio >= 1.0.2 is installed | |
if 'soundfile' in backends: | ||
paddle.audio.backends.set_backend('soundfile') | ||
|
||
paddle.audio.save(wav_path, waveform, sample_rate) | ||
|
||
""" | ||
backends = [] | ||
try: | ||
import paddleaudio | ||
except ImportError: | ||
package = "paddleaudio" | ||
warn_msg = ( | |
"Failed importing {}. \n" | |
"Only wave_backend (which can only handle PCM16 WAV) is supported.\n" | |
"If you want soundfile_backend (which supports more audio types),\n" | |
"please install it manually (usually with `pip install '{}>=1.0.2'`). " | |
).format(package, package) | ||
warnings.warn(warn_msg) | ||
|
||
if "paddleaudio" in sys.modules: | ||
version = paddleaudio.__version__ | ||
if not _check_version(version): | |
err_msg = ( | ||
"the version of paddleaudio installed is {},\n" | ||
"please ensure the paddleaudio >= 1.0.2.").format(version) | ||
raise ImportError(err_msg) | ||
backends = paddleaudio.backends.list_audio_backends() | ||
backends.append("wave_backend") | ||
return backends | ||
|
||
|
||
def get_current_backend() -> str: | ||
""" Get the name of the current audio backend | ||
|
||
Returns: | ||
str: The name of the current backend, | ||
the wave_backend or backend imported from paddleaudio | ||
|
||
Examples: | ||
.. code-block:: python | ||
|
||
import paddle | ||
|
||
sample_rate = 16000 | ||
wav_duration = 0.5 | ||
num_channels = 1 | ||
num_frames = sample_rate * wav_duration | ||
wav_data = paddle.linspace(-1.0, 1.0, num_frames) * 0.1 | ||
waveform = wav_data.tile([num_channels, 1]) | ||
wav_path = "./test.wav" | ||
|
||
current_backend = paddle.audio.backends.get_current_backend() | ||
print(current_backend) # wave_backend, the default backend. | ||
backends = paddle.audio.backends.list_available_backends() | ||
# by default, backends is ['wave_backend'] | |
# backends is ['wave_backend', 'soundfile'] if paddleaudio >= 1.0.2 is installed | |
|
||
if 'soundfile' in backends: | ||
paddle.audio.backends.set_backend('soundfile') | ||
|
||
paddle.audio.save(wav_path, waveform, sample_rate) | ||
|
||
""" | ||
current_backend = None | ||
if "paddleaudio" in sys.modules: | ||
import paddleaudio | ||
current_backend = paddleaudio.backends.get_audio_backend() | ||
if paddle.audio.load == paddleaudio.load: | ||
return current_backend | ||
return "wave_backend" | ||
|
||
|
||
def set_backend(backend_name: str): | |
"""Set the backend to one of the names returned by list_available_backends. | |
|
||
Args: | |
backend_name (str): one of the names returned by list_available_backends. "wave_backend" is the default; "soundfile" is imported from paddleaudio. | |
|
||
Returns: | ||
None | ||
|
||
Examples: | ||
.. code-block:: python | ||
|
||
import paddle | ||
|
||
sample_rate = 16000 | ||
wav_duration = 0.5 | ||
num_channels = 1 | ||
num_frames = sample_rate * wav_duration | ||
wav_data = paddle.linspace(-1.0, 1.0, num_frames) * 0.1 | ||
waveform = wav_data.tile([num_channels, 1]) | ||
wav_path = "./test.wav" | ||
|
||
current_backend = paddle.audio.backends.get_current_backend() | ||
print(current_backend) # wave_backend, the default backend. | ||
backends = paddle.audio.backends.list_available_backends() | ||
# by default, backends is ['wave_backend'] | |
# backends is ['wave_backend', 'soundfile'] if paddleaudio >= 1.0.2 is installed | |
|
||
if 'soundfile' in backends: | ||
paddle.audio.backends.set_backend('soundfile') | ||
|
||
paddle.audio.save(wav_path, waveform, sample_rate) | ||
|
||
""" | ||
if backend_name not in list_available_backends(): | ||
raise NotImplementedError() | ||
|
||
if backend_name == "wave_backend": | ||
module = wave_backend | ||
else: | ||
import paddleaudio | ||
paddleaudio.backends.set_audio_backend(backend_name) | ||
module = paddleaudio | ||
|
||
for func in ["save", "load", "info"]: | ||
setattr(backend, func, getattr(module, func)) | ||
setattr(paddle.audio, func, getattr(module, func)) | ||
|
||
|
||
def _init_set_audio_backend(): | ||
# init the default wave_backend. | ||
for func in ["save", "load", "info"]: | ||
setattr(backend, func, getattr(wave_backend, func)) |
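`_check_version` above compares the three version fields one at a time. The same ">= 1.0.2" rule can be expressed with tuple comparison, which may be easier to read and extend. This is only a sketch of an alternative, not the code in the PR; `parse_version` and `meets_minimum` are hypothetical names, and like the original it assumes purely numeric, dot-separated fields.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse the first three dot-separated numeric fields of a version string."""
    parts = version.split(".")[:3]
    # int() raises ValueError on non-numeric fields such as "2rc1".
    return tuple(int(p) for p in parts)


def meets_minimum(version: str,
                  minimum: Tuple[int, ...] = (1, 0, 2)) -> bool:
    """Return True if version is at least minimum (tuples compare field by field)."""
    return parse_version(version) >= minimum


for candidate in ("0.9.9", "1.0.1", "1.0.2", "1.1.0", "2.0.0"):
    print(candidate, meets_minimum(candidate))
```

For the five versions printed here, the result agrees with `_check_version`: only the first two are rejected.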
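I have a few questions:
paddle.audio.backends.get_current_audio_backend()
Could we consider a more concise form, for example:
paddle.audio.get_current_beckend()
paddle.audio.list_available_beckends()
paddle.audio.set_backend()
Does this mean that the object being saved is a backend?
The backend is a piece of global state set through set_backend().
If it means saving a file, wouldn't paddle.audio.save be enough?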