tf.audio.decode_wav not working properly? #659

Closed
ark626 opened this issue May 26, 2020 · 5 comments

ark626 commented May 26, 2020

🐛 Bug

x, sr = tf.audio.decode_wav(tf.io.read_file(ls[i]), 1) doesn't return the proper values.

To Reproduce

A function like the one below gives me wrong values to work with.

from glob import glob

import numpy as np
import tensorflow as tf

# Waveform array from the path of a folder containing wav files
def audio_array(path):
  ls = glob(f'{path}/*.wav')
  adata = []
  for i in range(len(ls)):
    # Decode each file as a single-channel waveform
    x, sr = tf.audio.decode_wav(tf.io.read_file(ls[i]), 1)
    print(str(x))
    x = np.array(x, dtype=np.float32)
    adata.append(x)
  return np.array(adata)

The print call outputs a really odd tensor:
Tensor("DecodeWav:0", shape=(?, 1), dtype=float32)

Expected behavior

The print(str(x)) call should give something like the output below.

...
[ 0.0000000e+00]
[ 0.0000000e+00]
[ 0.0000000e+00]], shape=(55921, 1), dtype=float32)

Environment

Collecting environment information...
PyTorch version: 1.4.0a0+7f73f1d
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.10.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/aarch64-linux-gnu/libcudnn.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_infer.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_train.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_infer.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_train.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_etc.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_infer.so.8.0.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_train.so.8.0.0

Versions of relevant libraries:
[pip3] numpy==1.18.4
[pip3] torch==1.5.0
[pip3] torchaudio==0.6.0a0+313f4f5
[pip3] torchvision==0.6.0a0+b68adcf
[conda] Could not collect

This could of course be connected to issues in my general setup. See issue #658.

mthrok (Collaborator) commented May 26, 2020

Hi @ark626

Looks like your snippet is TensorFlow code. Are you using torchaudio?

ark626 (Author) commented May 26, 2020

@mthrok yes, sadly this script uses both torch and tensorflow.
You can see the original code here: https://github.com/marcoppasini/MelGAN-VC/blob/master/MelGAN_VC.ipynb

mthrok (Collaborator) commented May 26, 2020

@ark626

Well, since the snippet you posted concerns TensorFlow and not torchaudio, please ask in TensorFlow's support channels.

vincentqb (Contributor) commented

Based on the previous comment by @mthrok, I will close this issue. Please feel free to re-open if this relates to torchaudio.

mthrok pushed a commit to mthrok/audio that referenced this issue Feb 26, 2021

ark626 (Author) commented Jul 27, 2022

Just in case I ever run into this issue again:
Just call numpy() or eval()!
Which one depends on the TensorFlow version: for <=1.15 (graph mode by default) use eval(); for >1.15 (2.x, eager mode by default) use numpy().

x, sr = tf.audio.decode_wav(tf.io.read_file(ls[i]), 1)
#print(x.get_shape())
#x = x.eval(session=tf.compat.v1.Session())  # TensorFlow <=1.15
x = x.numpy()
x = np.array(x).astype(dtype=np.float32)

This completely solves the issue.
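
For later readers, the version-dependent choice between numpy() and eval() could be wrapped in one small helper. This is only a sketch and not part of the original notebook: the name tensor_to_values is hypothetical, and it assumes that eager tensors expose a numpy() method while symbolic graph tensors do not and must be evaluated in a session instead.

```python
def tensor_to_values(x, session=None):
    """Return concrete values for a tensor (hypothetical helper, a sketch).

    Assumption: eager tensors (the default in TensorFlow 2.x) expose
    .numpy(), while symbolic graph tensors (the default in TensorFlow 1.x)
    do not and have to be evaluated in a session via .eval().
    """
    if hasattr(x, "numpy"):
        # Eager tensor: the data is already there, just convert it.
        return x.numpy()
    # Graph tensor: it only describes a computation, so run it in a session.
    # With session=None, .eval() falls back to the default session.
    return x.eval(session=session)
```

In the loop above, x = tensor_to_values(x) would replace the x.numpy() line under TensorFlow 2.x, and x = tensor_to_values(x, session=tf.compat.v1.Session()) would replace the commented-out eval() line under 1.x graph mode.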

mthrok added the triaged label Jul 29, 2022