Add TorchScript-able "save" func to sox_io backend #732

Merged
merged 3 commits into from
Jul 1, 2020

@mthrok mthrok commented Jun 18, 2020

This PR is part of the series adding the new "sox_io" backend (#726). It depends on #718, #728 and #731.

This PR adds a save function to the "sox_io" backend, which saves a Tensor to a file in one of the following audio formats:

  • wav
  • mp3
  • flac
  • ogg/vorbis *

* Note: The current binary distribution of torchaudio does not include the ogg/vorbis codecs. To handle these files, torchaudio must be built from source.

@mthrok mthrok force-pushed the save branch 2 times, most recently from fcc7e45 to 5ed66a3 Compare June 18, 2020 19:10
@mthrok mthrok mentioned this pull request Jun 18, 2020
6 tasks
@mthrok mthrok force-pushed the save branch 21 times, most recently from e96727e to c9fd168 Compare June 20, 2020 14:21
codecov bot commented Jun 20, 2020

Codecov Report

Merging #732 into master will decrease coverage by 0.21%.
The diff coverage is 56.25%.

@@            Coverage Diff             @@
##           master     #732      +/-   ##
==========================================
- Coverage   89.23%   89.02%   -0.22%     
==========================================
  Files          32       32              
  Lines        2517     2532      +15     
==========================================
+ Hits         2246     2254       +8     
- Misses        271      278       +7     
Impacted Files Coverage Δ
torchaudio/backend/sox_io_backend.py 73.07% <56.25%> (-26.93%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mthrok mthrok force-pushed the save branch 5 times, most recently from 2b50660 to 1f49bd9 Compare June 22, 2020 22:40
@mthrok mthrok force-pushed the save branch 17 times, most recently from e88aba5 to b9f8732 Compare June 26, 2020 17:21
@mthrok mthrok marked this pull request as ready for review June 26, 2020 20:51
@mthrok mthrok requested a review from vincentqb June 26, 2020 20:51
elif ext in ['ogg', 'vorbis']:
    compression = 3.
else:
    raise RuntimeError(f'Unsupported file type: "{ext}"')
mthrok (Collaborator, Author) commented:

@vincentqb I considered having the error message invite users to open an issue requesting support for other audio formats. However, distinguishing the extensions that libsox supports from invalid ones is not trivial, and many formats require additional libraries to be installed. This means that adding support in our get_signalinfo function may not be enough for users to actually use a new format, depending on the format. So I think we should be cautious and avoid casually raising users' expectations. I therefore decided not to include a call for opening a PR in the error message.
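
For readers skimming the thread, here is a minimal sketch of the extension-based default discussed above. Only the ogg/vorbis value (3.) and the `RuntimeError` message come from the diff; the helper name `resolve_compression`, the lookup table, and the values for the other formats are assumptions made for illustration and are not necessarily what the PR implements.

```python
from typing import Optional

# Only the ogg/vorbis entries (3.) and the error message are taken from the
# diff in this thread. All other values are assumptions for illustration.
_DEFAULT_COMPRESSION = {
    "wav": 0.0,     # assumed
    "mp3": -4.5,    # assumed
    "flac": 8.0,    # assumed
    "ogg": 3.0,     # from the diff
    "vorbis": 3.0,  # from the diff
}


def resolve_compression(filepath: str, compression: Optional[float] = None) -> float:
    """Hypothetical helper: choose a default compression from the extension."""
    if compression is not None:
        # An explicit value from the caller always wins.
        return compression
    ext = filepath.rsplit(".", 1)[-1].lower()
    if ext not in _DEFAULT_COMPRESSION:
        # Same plain message as in the diff: no invitation to file an issue.
        raise RuntimeError(f'Unsupported file type: "{ext}"')
    return _DEFAULT_COMPRESSION[ext]
```

Keeping the message to a bare "Unsupported file type" matches the reasoning above: the error states the limitation without suggesting that a new format can be enabled by a small change.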

vincentqb (Contributor) commented:

Can you remind me how the current backend exposes more formats? From our discussion, the user had to provide extra internal information, right?

mthrok (Collaborator, Author) commented:

The existing implementation asks users to construct the following data structures, which requires knowledge of libsox internals.

https://fossies.org/dox/sox-14.4.2/structsox__signalinfo__t.html
https://fossies.org/dox/sox-14.4.2/structsox__encodinginfo__t.html
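
To give a rough sense of what that burden looks like, the sketch below mirrors a subset of the fields of the two libsox structs linked above as plain Python dataclasses. The class names `SignalInfo` and `EncodingInfo` are hypothetical and the field types are simplified; this is an illustration of the information a caller would have to supply, not torchaudio's or libsox's actual API.

```python
from dataclasses import dataclass


@dataclass
class SignalInfo:
    """Simplified, partial mirror of libsox's sox_signalinfo_t (illustration only)."""
    rate: float      # samples per second
    channels: int    # number of sound channels
    precision: int   # bits per sample
    length: int = 0  # samples * channels in the file; 0 if unknown


@dataclass
class EncodingInfo:
    """Simplified, partial mirror of libsox's sox_encodinginfo_t (illustration only)."""
    encoding: str         # libsox uses the sox_encoding_t enum; a string stands in here
    bits_per_sample: int  # 0 if unknown or variable
    compression: float    # format-specific compression factor


# A caller would need to know all of these details up front,
# for example for a 16-bit stereo file at 44.1 kHz:
signal = SignalInfo(rate=44100.0, channels=2, precision=16)
encoding = EncodingInfo(encoding="signed-integer", bits_per_sample=16, compression=0.0)
```

By contrast, the save function in this PR only asks for the Tensor, the sample rate, and optionally a compression value.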

Comment on lines +250 to +251
# note: torchaudio can load large vorbis file, but cannot save large vorbis file
# the following test causes Segmentation fault
vincentqb (Contributor) commented:

Is this meant to be fixed eventually?

mthrok (Collaborator, Author) commented:

Nope, I do not think this can be fixed. However, I wanted to leave a record of the known limitation.

vincentqb (Contributor) left a comment:

LGTM. I've added minor clarifying questions below.

@mthrok mthrok merged commit 3324283 into pytorch:master Jul 1, 2020
@mthrok mthrok deleted the save branch July 1, 2020 18:41

mthrok commented Jul 1, 2020

thanks

mpc001 pushed a commit to mpc001/audio that referenced this pull request Aug 4, 2023
* fix loss calculation for RNN

* fixes loss for both RNN & Transformer