Update this branch #786

Merged 103 commits on Dec 13, 2020

Commits
f0f8f86
Merge pull request #1 from NVIDIA/master
byshiue Jun 30, 2020
1e21e53
[FT] 1. Push the FasterTransformer v2.1
byshiue Jun 30, 2020
d8cc4a9
remove pretrained aligns and update readme accordingly.
andabi Aug 3, 2020
1aa6813
[FT] 1. Fix the bug of TensorRT plugin of FasterTransformer encoder. …
byshiue Aug 6, 2020
36ad5fe
Update .gitmodules
nv-kkudrynski Aug 6, 2020
769843e
Merge pull request #11 from NVIDIA/master
swethmandava Aug 10, 2020
b82c372
triton v2 api, download mrpc fix, update for mpi 4.2
Aug 10, 2020
efd6384
pointing to wikiextractor commit
Aug 10, 2020
e8f87ac
Keep wikiextractor version fixed
sharathts Aug 10, 2020
1069a73
converge to pyt
Aug 10, 2020
c8bbdb5
Merge pull request #644 from swethmandava/master
swethmandava Aug 11, 2020
41a0891
Merge pull request #645 from NVIDIA/sharathts-patch-4
nv-kkudrynski Aug 11, 2020
fb40734
Remove autobench scripts (#647)
pribalta Aug 12, 2020
9d4c9f3
tfrecords with correct name
Aug 13, 2020
7c0afee
Merge pull request #648 from swethmandava/master
swethmandava Aug 13, 2020
88864b9
[BERT/PyT] MRPC and SST-2 support
nv-kkudrynski Aug 14, 2020
ff7e38b
Merge pull request #650 from NVIDIA/bert_pyt_mrpc
nv-kkudrynski Aug 14, 2020
3745b49
[DLRM/PyT] Update
nv-kkudrynski Aug 17, 2020
d875531
Merge pull request #654 from NVIDIA/dlrm_update
nv-kkudrynski Aug 17, 2020
0d15a95
[DLRM/PyT] Readme fixes
nv-kkudrynski Aug 18, 2020
bbbc823
Merge pull request #655 from NVIDIA/gh/release
nv-kkudrynski Aug 18, 2020
446c878
[ELECTRA/TF2] Update inference latency (#657)
sharathts Aug 20, 2020
8bd6dd1
Document synthetic dataset options
hXl3s Aug 20, 2020
0e6cfbd
Merge pull request #659 from hXl3s/RN50/readme-update
nv-kkudrynski Aug 20, 2020
5cc03ca
[BERT/PyT] Update pretrained checkpoint links (#660)
sharathts Aug 21, 2020
8588e98
[BERT/PyT] specify GPU for triton (#666)
sharathts Sep 2, 2020
21fcdd6
[DLRM/PyT] Triton updates
nv-kkudrynski Sep 7, 2020
323005c
Merge pull request #676 from NVIDIA/gh/release
nv-kkudrynski Sep 8, 2020
5d36b4f
Fixing hyperlinks
nv-kkudrynski Sep 8, 2020
7a4c425
[BERT/PyT] Fix dataloader typo
Sep 9, 2020
cf54b78
fixed link
nv-kkudrynski Sep 10, 2020
1402e94
Update CUDA-Optimized/FastSpeech/README.md
nv-kkudrynski Sep 11, 2020
49e387c
Merge pull request #633 from andabi/master
nv-kkudrynski Sep 11, 2020
152d0c0
Merge pull request #684 from gpauloski/bert_pytorch_fix
nv-kkudrynski Sep 11, 2020
6b82d3a
[TXL/PyT] Minor update for PyTorch Transformer-XL (#688)
szmigacz Sep 14, 2020
437b950
Fixed links in readme
nv-kkudrynski Sep 14, 2020
482fe9a
[BERT/PyT] fix onnx export (#689)
sharathts Sep 15, 2020
aacbda6
Update Jasper sample to TensorRT 7.1.3.4 (#687)
rajeevsrao Sep 15, 2020
a74236a
[BERT/PyT] remove redundant section (#690)
sharathts Sep 17, 2020
751bca1
Update README.md
mmarcinkiewicz Sep 18, 2020
72f40b8
Fixed distributed checkpoint loading
hXl3s Sep 18, 2020
94518be
Merge pull request #693 from hXl3s/RN50/ngc-checkpoint-update
nv-kkudrynski Sep 18, 2020
66d1891
Merge branch 'master' into master
byshiue Sep 20, 2020
b2e89e6
[FT] FasterTransformer 3.0 Release (#696)
byshiue Sep 23, 2020
421c839
fix div
KsenijaS Sep 25, 2020
d057bab
[FastPitch/PyT] Updating for 20.08
nv-kkudrynski Sep 30, 2020
0b27e35
Merge pull request #708 from NVIDIA/gh/release
nv-kkudrynski Sep 30, 2020
550123f
updated convai
grzegorz-k-karch Sep 7, 2020
385d81e
a few fixes
grzegorz-k-karch Oct 7, 2020
b1ce24a
Merge pull request #677 from GrzegorzKarchNV/convai-update
nv-kkudrynski Oct 7, 2020
e76b900
Merge pull request #692 from NVIDIA/unetmed_add_nccl_to_known_issues
nv-kkudrynski Oct 7, 2020
70f247f
Merge pull request #700 from KsenijaS/fix_div
Oct 7, 2020
f217ab1
[ConvNets/Pyt] Triton Deployment
nv-kkudrynski Oct 14, 2020
ac05902
Merge pull request #714 from NVIDIA/gh/release
nv-kkudrynski Oct 14, 2020
3cf3a5c
commit on 'master'
alvarognvidia Oct 15, 2020
0a120b2
Merge pull request #715 from alvarognvidia/master
nv-kkudrynski Oct 15, 2020
9061083
[ConvNets/Pyt] Pretrained weights usage guidelines
nv-kkudrynski Oct 21, 2020
d1b1854
Merge pull request #718 from NVIDIA/gh/release
nv-kkudrynski Oct 21, 2020
b2c72b2
[FastPitch/PyT] Adding notebooks
nv-kkudrynski Oct 21, 2020
8cbac00
Merge pull request #719 from NVIDIA/gh/release
nv-kkudrynski Oct 21, 2020
533f744
[TXL/PyT] Fixed issue with AMP training together with gradient accumu…
szmigacz Oct 23, 2020
96e1700
[ELECTRA/TF2] Pretraining and other updates
nv-kkudrynski Oct 26, 2020
e159774
Merge pull request #722 from NVIDIA/gh/release
nv-kkudrynski Oct 26, 2020
36a6985
[BERT/TF] TRT int8 and Triton
nv-kkudrynski Oct 29, 2020
03c5a9f
Merge pull request #728 from NVIDIA/gh/release
nv-kkudrynski Oct 29, 2020
bec8259
[FastPitch/PyT] updated checkpoints, multispeaker and text processing
nv-kkudrynski Oct 30, 2020
475256f
Merge pull request #731 from NVIDIA/gh/release
nv-kkudrynski Oct 30, 2020
2af0f03
Update README.md
alvarognvidia Nov 2, 2020
fd32b99
[CUDA-Optimized/FastSpeech]
Nov 2, 2020
81e8636
Add --gpus flag to docker run
pribalta Nov 2, 2020
6f20c08
Merge pull request #735 from NVIDIA/pribalta-fix-unetind-readme
nv-kkudrynski Nov 2, 2020
b2e7f4a
[ConvNets/TF] Performance fix
nv-kkudrynski Nov 3, 2020
308925e
Merge pull request #737 from NVIDIA/gh/release
nv-kkudrynski Nov 3, 2020
1d4211f
Merge pull request #12 from NVIDIA/master
swethmandava Nov 3, 2020
c72196b
fix copying perf numbers mistake
Nov 3, 2020
b5741a9
Merge pull request #733 from alvarognvidia/master
nv-kkudrynski Nov 4, 2020
799660f
[Kaldi] Adding Jupyter notebook
nv-kkudrynski Nov 4, 2020
64ea93d
Merge pull request #734 from andabi/master
nv-kkudrynski Nov 4, 2020
2749a80
Merge branch 'gh/master' into gh/release
nv-kkudrynski Nov 5, 2020
ff86473
[FastPitch/PyT] Fixed ckpt handling
nv-kkudrynski Nov 5, 2020
0b34777
Merge pull request #740 from NVIDIA/gh/release
nv-kkudrynski Nov 5, 2020
a095658
Fix: Fix the bugs of allocating workspace (#746,#747)
byshiue Nov 9, 2020
c9846ca
Fix merge issues
hXl3s Nov 9, 2020
0b455ff
Merge pull request #748 from hXl3s/RN50/argparse_fix
nv-kkudrynski Nov 9, 2020
002bcd8
update gluon version to with bert in readme
swethmandava Nov 9, 2020
5ec39fc
Merge pull request #739 from swethmandava/master
swethmandava Nov 9, 2020
1113674
remove links to old ngc checkpoints
swethmandava Nov 9, 2020
7e85343
Merge pull request #749 from swethmandava/master
swethmandava Nov 9, 2020
fa1ddc9
[WideAndDeep/TF] library version fix
nv-kkudrynski Nov 10, 2020
3ddcba4
Merge pull request #751 from NVIDIA/gh/release
nv-kkudrynski Nov 10, 2020
4ef867a
Update a link to spark 3.0.0.
mkfilipiuk Nov 10, 2020
ad49eae
Merge pull request #752 from mkfilipiuk/patch-2
nv-kkudrynski Nov 10, 2020
cca4828
Fixed mrcnn weights downloading script
jan-golda Nov 18, 2020
61d9adc
Merge pull request #755 from jan-golda/mrcnn/fix_weights
nv-kkudrynski Nov 18, 2020
4a64c5b
Fix broken link to Spark pre-processing
jconwayNV Nov 19, 2020
9a6c524
fixing rng_state for backward compatibility
grzegorz-k-karch Nov 19, 2020
d17b10e
Merge pull request #759 from GrzegorzKarchNV/fix_rng_state
Nov 19, 2020
94a8f28
[UNet medical/TF2] Fix
nv-kkudrynski Nov 23, 2020
f3c6bdf
Merge pull request #764 from NVIDIA/gh/release
nv-kkudrynski Nov 23, 2020
478d565
[WideAndDeep/TF] Update for 20.10
nv-kkudrynski Nov 26, 2020
66667f1
Merge pull request #769 from NVIDIA/gh/release
nv-kkudrynski Nov 26, 2020
33ea90e
remove trt, fix queuing delay typo in triton readme for bert
Dec 3, 2020
99b1c89
Merge pull request #773 from swethmandava/master
swethmandava Dec 3, 2020
Note: the diff below is too large to display in full; only the first 3000 changed files are loaded.
11 changes: 4 additions & 7 deletions .gitmodules
@@ -1,7 +1,4 @@
-[submodule "PyTorch/Translation/Transformer/cutlass"]
-	path = PyTorch/Translation/Transformer/cutlass
-	url = https://github.com/NVIDIA/cutlass.git
-[submodule "PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server"]
-	path = PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server
-	url = https://github.com/NVIDIA/tensorrt-inference-server.git
-	branch = r19.06
+[submodule "PyTorch/SpeechRecognition/Jasper/external/triton-inference-server"]
+	path = PyTorch/SpeechRecognition/Jasper/external/triton-inference-server
+	url = https://github.com/NVIDIA/triton-inference-server.git
+	branch = r19.12
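After checking out this change, the renamed Jasper submodule can be re-synced with standard git commands (a minimal sketch; the submodule path is taken from the diff above):
```
git submodule sync
git submodule update --init PyTorch/SpeechRecognition/Jasper/external/triton-inference-server
```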
6 changes: 0 additions & 6 deletions CUDA-Optimized/FastSpeech/.gitmodules

This file was deleted.

11 changes: 9 additions & 2 deletions CUDA-Optimized/FastSpeech/Dockerfile
@@ -1,7 +1,14 @@
-ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:20.03-py3
+ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:20.10-py3
 FROM ${FROM_IMAGE_NAME}

+# ARG UNAME
+# ARG UID
+# ARG GID
+# RUN groupadd -g $GID -o $UNAME
+# RUN useradd -m -u $UID -g $GID -o -s /bin/bash $UNAME
+# USER $UNAME
+
 ADD . /workspace/fastspeech
 WORKDIR /workspace/fastspeech

-RUN sh ./scripts/install.sh
+RUN sh ./scripts/install.sh
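For context, a typical build-and-run sequence for this container might look like the following (a sketch only: the fastspeech image tag and the dataset bind mount are assumptions, not names from this repo; the base image and working directory follow the ARG and WORKDIR lines above):
```
# Build the image; FROM_IMAGE_NAME can be overridden at build time.
docker build --build-arg FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:20.10-py3 -t fastspeech .

# Run interactively with GPU access, mounting the dataset directory from the host.
docker run --gpus all -it --rm --ipc=host \
  -v $PWD/LJSpeech-1.1:/workspace/fastspeech/LJSpeech-1.1 fastspeech bash
```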
64 changes: 38 additions & 26 deletions CUDA-Optimized/FastSpeech/README.md
@@ -95,9 +95,9 @@ and encapsulates some dependencies. Aside from these dependencies, ensure you
have the following components:

* [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker)
-* [PyTorch 20.03-py3+ NGC container](https://ngc.nvidia.com/registry/nvidia-pytorch)
+* [PyTorch 20.10-py3 NGC container](https://ngc.nvidia.com/registry/nvidia-pytorch)
or newer
-* [NVIDIA Volta](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/) or [Turing](https://www.nvidia.com/en-us/geforce/turing/) based GPU
+* [NVIDIA Volta](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/), [Turing](https://www.nvidia.com/en-us/geforce/turing/)<!--, or [Ampere](https://www.nvidia.com/en-us/data-center/nvidia-ampere-gpu-architecture/) based GPU-->

For more information about how to get started with NGC containers, see the
following sections from the NVIDIA GPU Cloud Documentation and the Deep Learning
@@ -120,11 +120,6 @@ To train your model using mixed precision with Tensor Cores or using FP32, perfo
git clone https://github.com/NVIDIA/DeepLearningExamples.git
cd DeepLearningExamples/CUDA-Optimized/FastSpeech
```
-and pull submodules.
-```
-git submodule init
-git submodule update
-```

2. Download and preprocess the dataset. Data is downloaded to the ./LJSpeech-1.1 directory (on the host). The ./LJSpeech-1.1 directory is mounted to the /workspace/fastspeech/LJSpeech-1.1 location in the NGC container.
```
@@ -146,7 +141,15 @@ To train your model using mixed precision with Tensor Cores or using FP32, perfo
python fastspeech/dataset/ljspeech_dataset.py --dataset_path="./LJSpeech-1.1" --mels_path="./mels_ljspeech1.1"
```

-The preprocessed mel-spectrograms are stored in the ./mels_ljspeech1.1 directory. Also, the preprocessed alignments are prepared in ./aligns_ljspeech1.1 directory. For more information, refer to the [training process section](#training-process).
+The preprocessed mel-spectrograms are stored in the ./mels_ljspeech1.1 directory.
+
+Next, preprocess alignments on the LJSpeech dataset by feed-forwarding through the teacher model. Download the NVIDIA [pretrained Tacotron2 checkpoint](https://drive.google.com/file/d/1c5ZTuT7J08wLUoVZ2KkUs_VdZuJ86ZqA/view) to get a pretrained teacher model. Set --tacotron2_path to the Tacotron2 checkpoint file path; the resulting alignments are stored in the path given by --aligns_path.
+```
+python fastspeech/align_tacotron2.py --dataset_path="./LJSpeech-1.1" --tacotron2_path="tacotron2_statedict.pt" --aligns_path="aligns_ljspeech1.1"
+```
+
+The preprocessed alignments are stored in the ./aligns_ljspeech1.1 directory. For more information, refer to the [training process section](#training-process).


Finally, run the training script:

@@ -161,23 +164,23 @@ To train your model using mixed precision with Tensor Cores or using FP32, perfo
python fastspeech/train.py --dataset_path="./LJSpeech-1.1" --mels_path="./mels_ljspeech1.1" --aligns_path="./aligns_ljspeech1.1" --log_path="./logs" --checkpoint_path="./checkpoints" --use_amp
```

-6. Start generation. To generate waveforms with WaveGlow Vocoder, Get [pretrained WaveGlow model](https://drive.google.com/open?id=1rpK8CzAAirq9sWZhe9nlfvxMF1dRgFbF) in the home directory, for example, ./waveglow_256channels.pt.
+6. Start generation. To generate waveforms with the WaveGlow vocoder, get the [pretrained WaveGlow model](https://ngc.nvidia.com/catalog/models/nvidia:waveglow_ckpt_amp_256/files?version=19.10.0) from NGC into the home directory, for example, ./nvidia_waveglow256pyt_fp16.

After you have trained the FastSpeech model, you can perform generation using the checkpoint stored in ./checkpoints. Then run:
```
-python generate.py --waveglow_path="./waveglow_256channels.pt" --checkpoint_path="./checkpoints" --text="./test_sentences.txt"
+python generate.py --waveglow_path="./nvidia_waveglow256pyt_fp16" --checkpoint_path="./checkpoints" --text="./test_sentences.txt"
```

The script automatically loads the latest checkpoint (if any exists), or you can pass a checkpoint file through --ckpt_file. It loads input texts from ./test_sentences.txt and stores the results in the ./results directory. You can also set the result directory path with --results_path.
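For instance, to pin a specific checkpoint and result directory explicitly (a sketch; the checkpoint filename below is hypothetical, use one produced by your training run):
```
# checkpoint_100000.pt is a hypothetical filename from ./checkpoints
python generate.py --waveglow_path="./nvidia_waveglow256pyt_fp16" --checkpoint_path="./checkpoints" --ckpt_file="checkpoint_100000.pt" --results_path="./results" --text="./test_sentences.txt"
```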

You can also run with a sample text:
```
-python generate.py --waveglow_path="./waveglow_256channels.pt" --checkpoint_path="./checkpoints" --text="The more you buy, the more you save."
+python generate.py --waveglow_path="./nvidia_waveglow256pyt_fp16" --checkpoint_path="./checkpoints" --text="The more you buy, the more you save."
```

-7. Accelerate generation(inferencing of FastSpeech and WaveGlow) with TensorRT. Set parameters config file with --hparam=trt.yaml to enable TensorRT inference mode. To prepare for running WaveGlow on TensorRT, first extract a TensorRT engine file via [DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/trt](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2/trt) and copy this in the home directory, for example, ./waveglow.fp16.trt. Then run with --waveglow_engine_path:
+7. Accelerate generation (inference of FastSpeech and WaveGlow) with TensorRT. Set the parameters config file with --hparam=trt.yaml to enable TensorRT inference mode. To prepare for running WaveGlow on TensorRT, first get an ONNX file via [DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/tensorrt](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2/tensorrt), convert it to a TensorRT engine using scripts/waveglow/convert_onnx2trt.py, and copy it into the home directory, for example, ./waveglow.fp16.trt. Then run with --waveglow_engine_path:
```
-python generate.py --hparam=trt.yaml --waveglow_path="./waveglow_256channels.pt" --checkpoint_path="./checkpoints" --text="./test_sentences.txt" --waveglow_engine_path="waveglow.fp16.trt"
+python generate.py --hparam=trt.yaml --waveglow_path="./nvidia_waveglow256pyt_fp16" --checkpoint_path="./checkpoints" --text="./test_sentences.txt" --waveglow_engine_path="waveglow.fp16.trt"
```
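As a generic alternative for the ONNX-to-engine step, TensorRT's trtexec tool can also build an engine (a sketch under assumed filenames; scripts/waveglow/convert_onnx2trt.py is the conversion path this README actually documents):
```
# waveglow.onnx is an assumed filename for the exported ONNX model.
trtexec --onnx=waveglow.onnx --fp16 --saveEngine=waveglow.fp16.trt
```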

## Advanced
@@ -285,33 +288,29 @@ For more details, refer to [accelerating inference with TensorRT](fastspeech/trt…

#### Generation

-To generate waveforms with WaveGlow Vocoder, 1) Make sure to pull [Nvidia WaveGlow](https://github.com/NVIDIA/waveglow) through git submodule, 2) get [pretrained WaveGlow model](https://drive.google.com/open?id=1rpK8CzAAirq9sWZhe9nlfvxMF1dRgFbF) in the home directory, for example, ./waveglow_256channels.pt.
-```
-git submodule init
-git submodule update
-```
+To generate waveforms with the WaveGlow vocoder, get the [pretrained WaveGlow model](https://ngc.nvidia.com/catalog/models/nvidia:waveglow_ckpt_amp_256/files?version=19.10.0) from NGC into the home directory, for example, ./nvidia_waveglow256pyt_fp16.

Run generate.py with:
* --text - an input text or the text file path.
* --results_path - result waveforms directory path (default=./results).
* --ckpt_file - checkpoint file path (default is the latest file in --checkpoint_path).
```
-python generate.py --waveglow_path="./waveglow_256channels.pt" --text="The more you buy, the more you save."
+python generate.py --waveglow_path="./nvidia_waveglow256pyt_fp16" --text="The more you buy, the more you save."
```
or
```
-python generate.py --waveglow_path="./waveglow_256channels.pt" --text=test_sentences.txt
+python generate.py --waveglow_path="./nvidia_waveglow256pyt_fp16" --text=test_sentences.txt
```

-Sample result waveforms are [here](https://gitlab-master.nvidia.com/dahn/fastspeech/tree/master/samples).
+Sample result waveforms are [here](samples).

-To generate waveforms with the whole pipeline of FastSpeech and WaveGlow with TensorRT, extract a WaveGlow TRT engine file through https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2/trt and run generate.py with --hparam=trt.yaml and --waveglow_engine_path.
+To generate waveforms with the whole pipeline of FastSpeech and WaveGlow with TensorRT, extract a WaveGlow TRT engine file through https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2/tensorrt and run generate.py with --hparam=trt.yaml and --waveglow_engine_path.

```
-python generate.py --hparam=trt.yaml --waveglow_path="./waveglow_256channels.pt" --waveglow_engine_path="waveglow.fp16.trt" --text="The more you buy, the more you save."
+python generate.py --hparam=trt.yaml --waveglow_path="./nvidia_waveglow256pyt_fp16" --waveglow_engine_path="waveglow.fp16.trt" --text="The more you buy, the more you save."
```

-Sample result waveforms are [FP32](https://gitlab-master.nvidia.com/dahn/fastspeech/-/tree/master/fastspeech/trt/samples) and [FP16](https://gitlab-master.nvidia.com/dahn/fastspeech/-/tree/master/fastspeech/trt/samples_fp16).
+Sample result waveforms are [FP32](fastspeech/trt/samples) and [FP16](fastspeech/trt/samples_fp16).


## Performance
@@ -383,7 +382,17 @@ The following sections provide details on how we achieved our performance and ac

#### Training performance results

-Our results were obtained by running the script in [training performance benchmark](#training-performance-benchmark) in the PyTorch-20.03-py3 NGC container on NVIDIA DGX-1 with 8x V100 16G GPUs. Performance numbers (in number of mels per second) were averaged over an entire training epoch.
+Our results were obtained by running the script in [training performance benchmark](#training-performance-benchmark) on <!--NVIDIA DGX A100 with 8x A100 40G GPUs and -->NVIDIA DGX-1 with 8x V100 16G GPUs. Performance numbers (in number of mels per second) were averaged over an entire training epoch.
+
+<!-- ##### Training performance: NVIDIA DGX A100 (8x A100 40GB)
+
+| GPUs | Batch size / GPU | Throughput(mels/s) - FP32 | Throughput(mels/s) - mixed precision | Throughput speedup (FP32 - mixed precision) | Multi-GPU Weak scaling - FP32 | Multi-GPU Weak scaling - mixed precision
+|---|----|--------|--------|------|-----|------|
+| 1 | 32 | | | | | 1 |
+| 4 | 32 | | | | | |
+| 8 | 32 | | | | | | -->
+
+##### Training performance: NVIDIA DGX-1 (8x V100 16GB)

| GPUs | Batch size / GPU | Throughput(mels/s) - FP32 | Throughput(mels/s) - mixed precision | Throughput speedup (FP32 - mixed precision) | Multi-GPU Weak scaling - FP32 | Multi-GPU Weak scaling - mixed precision
|---|----|--------|--------|------|-----|------|
@@ -393,7 +402,7 @@ Our results were obtained by running the script in [training performance benchma

#### Inference performance results

-Our results were obtained by running the script in [inference performance benchmark](#inference-performance-benchmark) in the PyTorch-20.03-py3 NGC container on NVIDIA DGX-1 with 1x V100 16GB GPU and a NVIDIA T4. The following tables show inference statistics for the FastSpeech and WaveGlow text-to-speech system on PyTorch and comparisons by framework with batch size 1 in FP16, gathered from 1000 inference runs. Latency is measured from the start of FastSpeech inference to the end of WaveGlow inference. The tables include average latency, latency standard deviation, and latency confidence intervals. Throughput is measured as the number of generated audio samples per second. RTF is the real-time factor which tells how many seconds of speech are generated in 1 second of compute. The used WaveGlow model is a 256-channel model. The numbers reported below were taken with a moderate length of 128 characters.
+Our results were obtained by running the script in [inference performance benchmark](#inference-performance-benchmark) on NVIDIA DGX-1 with 1x V100 16GB GPU and an NVIDIA T4. The following tables show inference statistics for the FastSpeech and WaveGlow text-to-speech system on PyTorch and comparisons by framework with batch size 1 in FP16, gathered from 1000 inference runs. Latency is measured from the start of FastSpeech inference to the end of WaveGlow inference. The tables include average latency, latency standard deviation, and latency confidence intervals. Throughput is measured as the number of generated audio samples per second. RTF is the real-time factor, which tells how many seconds of speech are generated in 1 second of compute. The WaveGlow model used is a 256-channel model. The numbers reported below were taken with a moderate input length of 128 characters.

##### Inference performance: NVIDIA DGX-1 (1x V100 16GB)

@@ -434,6 +443,9 @@ Our results were obtained by running the script in [inference performance benchm
## Release notes

### Changelog
+Oct 2020
+- PyTorch 1.7, TensorRT 7.2 support <!--and Nvidia Ampere architecture support-->
+
July 2020
- Initial release

296 binary files not shown.