Replace runners prefix amz2023. #3819

Open
wants to merge 1,144 commits into
base: master
from

Conversation

jeanschmidt

testing new runners

nateanl and others added 30 commits May 23, 2023 13:42
Summary:
resolve #3347

`position_bias` is ignored in `extract_features` method, this doesn't affect Wav2Vec2 or HuBERT models, but it changes the output of transformer layers (except the first layer) in WavLM model. This PR fixes it by adding `position_bias` to the method.
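
A toy sketch of why dropping `position_bias` changes every layer except the first. `ToyLayer` and both loop functions are hypothetical stand-ins, not torchaudio's implementation; the only assumption is that the first layer computes the bias and later layers reuse whatever they are given.

```python
from typing import List, Optional, Tuple


class ToyLayer:
    """Hypothetical stand-in for a transformer layer (not torchaudio code)."""

    def __init__(self, computes_bias: bool) -> None:
        self.computes_bias = computes_bias

    def __call__(
        self, x: float, position_bias: Optional[float] = None
    ) -> Tuple[float, Optional[float]]:
        # Assumption: only a layer flagged with `computes_bias` creates the bias;
        # every other layer can only reuse a bias that is passed in.
        if position_bias is None and self.computes_bias:
            position_bias = 0.5
        if position_bias is not None:
            x = x + position_bias
        return 2.0 * x, position_bias


def extract_features_without_bias(layers: List[ToyLayer], x: float) -> List[float]:
    """Discards the bias after each layer, mirroring the reported behaviour."""
    outputs = []
    for layer in layers:
        x, _ = layer(x)
        outputs.append(x)
    return outputs


def extract_features_with_bias(layers: List[ToyLayer], x: float) -> List[float]:
    """Threads the bias from one layer to the next."""
    outputs = []
    position_bias = None
    for layer in layers:
        x, position_bias = layer(x, position_bias)
        outputs.append(x)
    return outputs


layers = [ToyLayer(True), ToyLayer(False), ToyLayer(False)]
buggy = extract_features_without_bias(layers, 1.0)
fixed = extract_features_with_bias(layers, 1.0)
```

In this toy setup the first entry of `buggy` and `fixed` agrees, while the later entries differ, which matches the symptom described in the summary.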

Pull Request resolved: #3350

Reviewed By: mthrok

Differential Revision: D46112148

Pulled By: nateanl

fbshipit-source-id: 3d21aa4b32b22da437b440097fd9b00238152596
Summary: Pull Request resolved: #3366

Reviewed By: nateanl

Differential Revision: D46136238

Pulled By: mthrok

fbshipit-source-id: 3432f5d007293831bab21460a79ae26b1bbc81a8
Summary:
CC atalman malfet

Pull Request resolved: #3360

Reviewed By: mthrok

Differential Revision: D46150898

Pulled By: atalman

fbshipit-source-id: 985a0ef69406f48fb15f239d6b16616c0a5379f5
Summary:
This commit changes the way the docs are pushed.
It amends the existing commit instead of adding a new one.

Currently, each commit in gh-pages contains about 100MB of data. The gh-pages branch is fetched by default by `git clone`, so the size of the torchaudio repo grows significantly.

Pull Request resolved: #3345

Reviewed By: nateanl

Differential Revision: D46136612

Pulled By: mthrok

fbshipit-source-id: 39479ee5d1a6888254ef50f0db252453d976d183
Summary:
* Delay the import of torchaudio until the CLI options are parsed.
* Add option to set log level to DEBUG so that it's easy to see the issue with external libraries.
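
A minimal sketch of the pattern described in the two bullets above, assuming a hypothetical `--debug` flag and a hypothetical `main` entry point (neither name is taken from the actual script). The heavy module is imported only after the arguments are parsed and logging is configured, so any messages emitted during import are visible at the DEBUG level.

```python
import argparse
import importlib
import logging
from types import ModuleType
from typing import List, Optional


def main(argv: Optional[List[str]] = None, module_name: str = "torchaudio") -> ModuleType:
    parser = argparse.ArgumentParser(description="CLI that imports its heavy dependency late")
    # Hypothetical flag name; the real script may spell it differently.
    parser.add_argument("--debug", action="store_true", help="set the log level to DEBUG")
    args = parser.parse_args(argv)

    # Configure logging first, so that import-time log records are not lost.
    logging.basicConfig()
    logging.getLogger().setLevel(logging.DEBUG if args.debug else logging.INFO)

    # Delayed import: nothing heavy is loaded when argument parsing fails or --help is used.
    module = importlib.import_module(module_name)
    logging.getLogger(__name__).debug("imported %s", module.__name__)
    return module
```

Passing a lightweight module name, for example `main(["--debug"], module_name="json")`, exercises the same flow without requiring torchaudio.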

Pull Request resolved: #3346

Reviewed By: nateanl

Differential Revision: D46022546

Pulled By: mthrok

fbshipit-source-id: 9f988bbd770c2fd2bb260c3cfe02b238a9da2808
Summary:
Follow-up #3045
- Revert the removal of HW acceleration doc
- comment out FFmpeg CLI test run

Pull Request resolved: #3349

Reviewed By: nateanl

Differential Revision: D46121899

Pulled By: mthrok

fbshipit-source-id: dfc030a69f05addec73637cfb6a720c184e37323
Summary: Pull Request resolved: #3367

Reviewed By: nateanl

Differential Revision: D46148139

Pulled By: mthrok

fbshipit-source-id: 50f297ac69bb95562976eb452e4e382b8c064c3c
Summary:
This PR adds an AV-ASR recipe, which contains sample implementations of training and evaluation pipelines for RNNT-based automatic, visual, and audio-visual speech recognition (ASR, VSR, AV-ASR) models on LRS3. The recipe includes both streaming and non-streaming modes.

CC stavros99 xiaohui-zhang YumengTao mthrok nateanl hwangjeff

Pull Request resolved: #3278

Reviewed By: nateanl

Differential Revision: D46121550

Pulled By: mpc001

fbshipit-source-id: bb44b97ae25e87df2a73a707008be46af4ad0fc6
Summary:
This commit fixes the following issues affecting streaming decoding quality
1. The `init_b` hypothesis is only regenerated from blank token if no initial hypotheses are provided.
2. Allows the decoder to receive the top-K hypotheses to continue decoding from, instead of using just the top hypothesis at each decoding step. This dramatically affects decoding quality, especially for speech with long pauses and disfluencies.
3. Some minor errors regarding shape checking for length.

This also means that the resulting output is the entire transcript up until that time step, instead of just the incremental change in transcript.
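
A toy sketch of carrying the top-K hypotheses across chunks. It is not the `RNNTBeamSearch` implementation and ignores blank tokens and model state; `decode_chunk`, the `Hypothesis` alias, and the per-step score dictionaries are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (tokens decoded so far, cumulative log-probability); a hypothetical structure.
Hypothesis = Tuple[Tuple[str, ...], float]


def decode_chunk(
    chunk: List[Dict[str, float]],
    hypotheses: Optional[List[Hypothesis]] = None,
    beam_width: int = 2,
) -> List[Hypothesis]:
    """Extend the given hypotheses over one chunk and return the top `beam_width`."""
    # Start from a single empty hypothesis only when no prior hypotheses are provided.
    beam: List[Hypothesis] = list(hypotheses) if hypotheses else [((), 0.0)]
    for step in chunk:
        candidates = [
            (tokens + (token,), score + log_prob)
            for tokens, score in beam
            for token, log_prob in step.items()
        ]
        candidates.sort(key=lambda hyp: hyp[1], reverse=True)
        beam = candidates[:beam_width]
    return beam


# Two chunks of a stream; each step maps a token to a made-up log-probability.
chunk_1 = [{"a": -1.0, "b": -2.0}]
chunk_2 = [{"c": -0.5}]

hyps = decode_chunk(chunk_1)        # no prior hypotheses: start from scratch
hyps = decode_chunk(chunk_2, hyps)  # continue from all top-K hypotheses, not only the best
```

Because every hypothesis keeps its full token history, the result after the second chunk is the whole transcript so far rather than only the newly decoded part.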

Pull Request resolved: #3295

Reviewed By: nateanl

Differential Revision: D46216113

Pulled By: hwangjeff

fbshipit-source-id: 8f7efae28dcca4a052f434ca55a2795c9e5ec0b0
Summary:
This reverts commit d38a785.

This is a temporary revert to unblock the unit test migration from CircleCI to GitHub.

Pull Request resolved: #3377

Reviewed By: mthrok

Differential Revision: D46230498

Pulled By: atalman

fbshipit-source-id: 000d8a9ca00750fc1ca61f4c2cdd6e930a5ce46d
Summary:
The tests failed for several bundles. Remove them for now; they will be re-added once the root cause is figured out.

Pull Request resolved: #3378

Reviewed By: atalman

Differential Revision: D46230884

Pulled By: nateanl

fbshipit-source-id: 42056a29b2ec2335268b273d3e37fb517035be92
Summary:
Use CUDA 11.8 for CircleCI tests.
CUDA 11.7 has been deprecated.

Pull Request resolved: #3381

Reviewed By: osalpekar

Differential Revision: D46236223

Pulled By: atalman

fbshipit-source-id: 6d6a8e09603807a07241f31c1bd1e6d3a2b67d9d
Summary:
CUDA 11.7 uses cuDNN 8.5.0; 11.8 uses 8.7.0; 12.1 uses 8.8.1. Otherwise, the Windows vision job (8.5.0) would overwrite the cuDNN version set up by PyTorch (8.7.0), leading to flaky failures such as https://github.com/pytorch/pytorch/actions/runs/5088860652/jobs/9146641450

```
RuntimeError: cuDNN version incompatibility: PyTorch was compiled  against (8, 7, 0) but found runtime version (8, 5, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.
```
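
The version pairing listed in this summary can be written as a small lookup. This is only a sketch: the table holds the three pairs quoted above, and `expected_cudnn` and `is_compatible` are hypothetical helper names, not part of any CI script.

```python
# CUDA version -> cuDNN version, exactly as listed in the commit summary.
CUDNN_FOR_CUDA = {
    "11.7": "8.5.0",
    "11.8": "8.7.0",
    "12.1": "8.8.1",
}


def expected_cudnn(cuda_version: str) -> str:
    """Return the cuDNN version paired with `cuda_version` in the table above."""
    try:
        return CUDNN_FOR_CUDA[cuda_version]
    except KeyError:
        raise ValueError(f"no cuDNN version recorded for CUDA {cuda_version}") from None


def is_compatible(cuda_version: str, runtime_cudnn: str) -> bool:
    """True when the cuDNN found at runtime matches the one expected for the CUDA version."""
    return expected_cudnn(cuda_version) == runtime_cudnn
```

With this table, a CUDA 11.8 build that finds cuDNN 8.5.0 at runtime is flagged as incompatible, which is the situation the error message reports.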

Pull Request resolved: #3380

Reviewed By: atalman

Differential Revision: D46236286

Pulled By: huydhn

fbshipit-source-id: 9ca12d5068c3029688347d52c5c284488f33728d
Summary:
The g722 format only supports a 16 kHz sample rate, but AVCodec does not list this. The implementation does not insert resampling, so the resulting audio can be slowed down or sped up.
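
A minimal sketch of the check this implies, assuming a hypothetical helper name; it only decides whether the caller has to resample before encoding and does not touch FFmpeg.

```python
# The only sample rate g722 supports, per the summary above.
G722_SAMPLE_RATE = 16_000


def needs_resampling_for_g722(sample_rate: int) -> bool:
    """Return True when audio at `sample_rate` must be resampled before g722 encoding."""
    return sample_rate != G722_SAMPLE_RATE
```

A caller would resample to 16 kHz whenever this returns True, since the encoder itself does not do so.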

Pull Request resolved: #3373

Reviewed By: hwangjeff

Differential Revision: D46233181

Pulled By: mthrok

fbshipit-source-id: 902b3f862a8f7269dc35bc871e868b0e78326c6c
Summary:
When encoding audio with mulaw, the resulting data does not have a header, and StreamReader defaults to 16 kHz, which can stretch or shrink the resulting waveform.
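
The size of the distortion follows from simple arithmetic. The sketch below uses hypothetical helper names and assumes only that headerless data is interpreted at 16 kHz regardless of the rate it was encoded at.

```python
ASSUMED_RATE = 16_000  # the default rate mentioned in the summary above


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of `num_samples` samples when interpreted at `sample_rate`."""
    return num_samples / sample_rate


def stretch_factor(true_rate: int, assumed_rate: int = ASSUMED_RATE) -> float:
    """Ratio of the interpreted duration to the true duration.

    Values below 1 mean the audio is shortened (sped up); values above 1 mean it is stretched.
    """
    return true_rate / assumed_rate
```

For example, one second of audio encoded at 8 kHz contains 8000 samples; interpreted at 16 kHz it lasts half a second, so the stretch factor is 0.5.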

Pull Request resolved: #3372

Reviewed By: hwangjeff

Differential Revision: D46234772

Pulled By: mthrok

fbshipit-source-id: 942c89a8cfe29b0b6f57b3e5b6c9dfd3524ca552
Summary:
Continuing with the job migrations from CCI to Nova, this PR introduces the Windows CPU Unittest job as a Nova workflow.

The job is passing: https://github.com/pytorch/audio/actions/runs/5094569687/jobs/9159020192?pr=3329.

Pull Request resolved: #3329

Reviewed By: huydhn

Differential Revision: D46265649

Pulled By: atalman

fbshipit-source-id: 7659dfbcc8ad400f2e109ff64530e1f768e82ef9
Summary:
Pull Request resolved: #3383

This commit consolidates the `torchaudio::sox_*` namespaces into `torchaudio::sox`.
It also puts the Pybind11 registration and the TorchBind registration into anonymous namespaces.

Differential Revision: D46257367

fbshipit-source-id: 0f0f181eaa72036916e223263daf4b7c298fca0d
Summary:
Pull Request resolved: #3389

Use const references more widely in the sox source code.

Differential Revision: D46264068

fbshipit-source-id: 809d34a6e16f621c856d4278ef7ce45a5868a717
Summary:
Disable failing GPU unit test.
See associated issue: #3376

Pull Request resolved: #3384

Reviewed By: mthrok

Differential Revision: D46279324

Pulled By: atalman

fbshipit-source-id: 3a606bb992e0261451f48d1fb458e054f7fd5583
Summary:
Pull Request resolved: #3379

Fixes `RNNTBeamSearch.infer`'s docstring and removes unused import from tutorial.

Reviewed By: mthrok

Differential Revision: D46227174

fbshipit-source-id: 7c1c3f05a6476cb0437622dea6f3ae6cb3ea9468
Summary:
Windows GPU workflows

Pull Request resolved: #3364

Reviewed By: mthrok

Differential Revision: D46292403

Pulled By: atalman

fbshipit-source-id: ee3c6f8082ca77bdc1ffdb930c59fa5a9cb25a4a
Summary:
Nova - Deprecate windows circleci unit tests

Pull Request resolved: #3393

Reviewed By: malfet

Differential Revision: D46315608

Pulled By: atalman

fbshipit-source-id: 3d7b5d0618b9d2e12e5f97e21d7becdc61d85c69
Summary:
Set the directory of the JUnit test XML files to the one that test-infra picks up, so that the results are included in the summary.

Example: https://github.com/pytorch/audio/actions/runs/5136305988

Pull Request resolved: #3394

Differential Revision: D46328832

Pulled By: mthrok

fbshipit-source-id: f0b5020a911ca4ec09345a965bdec769300859f0
Summary:
See title. If all is well, we can deprecate the CCI job in a few days.

Pull Request resolved: #3341

Reviewed By: mthrok

Differential Revision: D46324265

Pulled By: osalpekar

fbshipit-source-id: bc706c6ae4285d4085dc5f0223ea41d8fc290f1c
Summary:
Introducing the stylecheck job on Nova. It seems like it is failing on trunk, but the functionality of this job itself is working and it fails with the same error as it does on trunk with CCI.

Pull Request resolved: #3390

Reviewed By: mthrok

Differential Revision: D46324223

Pulled By: osalpekar

fbshipit-source-id: 1324202e53569d610559ef6f1b90cb5c364e6909
Summary:
Deprecates the Linux and MacOS Unittest jobs now that they've been running on Nova for over a week.

Aside: There was also a stylecheck job that was dependent on the Linux Unittest job. I also put up #3390 to move that stylecheck job to Nova. I'm happy to reintroduce the CCI stylecheck job standalone in CCI if we want the Nova version to run on main for a week.

Pull Request resolved: #3391

Reviewed By: mthrok

Differential Revision: D46324198

Pulled By: osalpekar

fbshipit-source-id: 2115748e153c5dee1a38db2b6230acebc4f56927
Summary:
To prepare for the upcoming removal of file-like object support from sox_io backend,
this commit changes apply_codec function to use tempfile.

The `apply_codec` function is now deprecated, and users are encouraged to use `torchaudio.io.AudioEffector`.
We will not remove the function itself, but will remove the entry from the doc.
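
A generic sketch of the temporary-file pattern, not the actual `apply_codec` code. `roundtrip_via_tempfile`, `write_bytes`, and `read_bytes` are hypothetical names; the point is that the encoder and decoder receive a file path rather than a file-like object.

```python
import os
import tempfile
from typing import Callable


def roundtrip_via_tempfile(
    payload: bytes,
    write: Callable[[str, bytes], None],
    read: Callable[[str], bytes],
    suffix: str = ".bin",
) -> bytes:
    """Write `payload` to a temporary path with `write`, then load it back with `read`."""
    # A temporary directory avoids reopening an already-open file, which can fail on Windows.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "codec" + suffix)
        write(path, payload)
        return read(path)


def write_bytes(path: str, data: bytes) -> None:
    """Stand-in for an encoder that saves to a path."""
    with open(path, "wb") as f:
        f.write(data)


def read_bytes(path: str) -> bytes:
    """Stand-in for a decoder that loads from a path."""
    with open(path, "rb") as f:
        return f.read()
```

The directory and the file inside it are removed when the `with` block exits, so nothing is left on disk after the round trip.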

Pull Request resolved: #3386

Reviewed By: hwangjeff

Differential Revision: D46330610

Pulled By: mthrok

fbshipit-source-id: 3071bdefa05b4cbb9f00629bef50f0981eae89b4
Summary:
The arguments of TorchAudio's save function ("format", "bits_per_sample" and "encoding")
do not map one-to-one to the arguments of FFmpeg encoding.

For example, to use vorbis codec, FFmpeg expects "ogg" container/extension with "vorbis"
encoder. It does not recognize "vorbis" extension like TorchAudio (libsox) does.

This commit refactors the logic to parse/map the arguments.

As a result it now properly works with vorbis and mp3 extension.
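
A minimal sketch of the kind of mapping described here, covering only the vorbis case spelled out above. `resolve_container_and_encoder` is a hypothetical name, and the pass-through fallback is an assumption rather than the actual logic in this commit.

```python
from typing import Optional, Tuple


def resolve_container_and_encoder(fmt: str) -> Tuple[str, Optional[str]]:
    """Map a TorchAudio-style format name to an FFmpeg (container, encoder) pair."""
    if fmt == "vorbis":
        # FFmpeg expects the "ogg" container with the "vorbis" encoder.
        return "ogg", "vorbis"
    # Assumption: other names are passed through and FFmpeg picks its default encoder.
    return fmt, None
```

Keeping the mapping in one function means each special case, such as vorbis, is handled in a single place instead of being spread over the save path.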

Pull Request resolved: #3387

Reviewed By: hwangjeff

Differential Revision: D46328787

Pulled By: mthrok

fbshipit-source-id: 36f993952a062bfec58a8b51be6aa86297571f90
Summary:
Follow-up to #3386. The intended change was to use the path of a temporary file instead of a file-like object.

Pull Request resolved: #3397

Reviewed By: hwangjeff

Differential Revision: D46346189

Pulled By: mthrok

fbshipit-source-id: 44da799c6587bcb63a118a6313b7299bad742a40
Summary: Pull Request resolved: #3398

Reviewed By: nateanl

Differential Revision: D46354862

Pulled By: mthrok

fbshipit-source-id: b86dcdfeff8ed9db87b0b78eca20f6f18117e97e
mthrok and others added 28 commits October 31, 2023 19:05
When the input is zero Tensor, the result should be empty.
PyTorch Lightning is having issues with the nightly PyTorch.
Let the other tests still run.
The global audio backend has been removed, so this is a no-op.
Back port from release/2.1 branch.
Need to git-fetch source code to get the version number dynamically
* Update doc

* Update citation
Summary:
This is not needed anymore after pytorch/test-infra#4865.


Reviewed By: malfet, jeanschmidt, clee2000, NicolasHug

Differential Revision: D52735187

Pulled By: huydhn
* add golf and dynonet paper

* doc: add references

* add EOF

* fix: line too long

* remove line end space

* remove indentation

Co-authored-by: moto <855818+mthrok@users.noreply.github.com>

---------

Co-authored-by: moto <855818+mthrok@users.noreply.github.com>
Differential Revision: D53606067

Pull Request resolved: #3740
The lengths of targets and log_probs should be reversed.
Differential Revision: D54263224

Pull Request resolved: #3751
* Update tacotron2_pipeline_tutorial.py

- Fixed typo
- Clarified what was being done in different sections
…ing pybind11 (#3766)

Unpin mkl version and install pybind11 to get the windows CI working again

This fixes #3767
…o/ffmpeg/stream_reader/stream_processor.h +20

Differential Revision: D57294285

Pull Request resolved: #3792
…c/decoders/TransducerDecoder.h +20

Differential Revision: D57294284

Pull Request resolved: #3793
…haudio/sox/effects.cpp +20

Differential Revision: D57294298

Pull Request resolved: #3791
Summary:
Pull Request resolved: #3803

The model checkpoint path cannot be created for Squim models. Use the latest `download_asset` method to fix it.

Reviewed By: moto-meta

Differential Revision: D59061348

pytorch-bot bot commented Jul 25, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/audio/3819

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures, 1 Unrelated Failure

As of commit 54ef2f9 with merge base 69b2a0a:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.
