download from BRANCH (#11206)
Signed-off-by: monica-sekoyan <msekoyan@nvidia.com>
monica-sekoyan authored and NeMo Bot committed Nov 7, 2024
1 parent 5182968 commit 8997ba2
Showing 1 changed file with 20 additions and 18 deletions.
38 changes: 20 additions & 18 deletions tutorials/asr/Self_Supervised_Pre_Training.ipynb
@@ -17,7 +17,9 @@
     "3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select \"GPU\" for hardware accelerator)\n",
     "4. Run this cell to set up dependencies.\n",
     "5. Restart the runtime (Runtime -> Restart Runtime) for any upgraded packages to take effect\n",
-    "\n\nNOTE: User is responsible for checking the content of datasets and the applicable licenses and determining if suitable for the intended use.\n",
+    "\n",
+    "\n",
+    "NOTE: User is responsible for checking the content of datasets and the applicable licenses and determining if suitable for the intended use.\n",
     "\"\"\"\n",
     "# If you're using Google Colab and not running locally, run this cell.\n",
     "\n",
@@ -272,8 +274,8 @@
    "source": [
     "## Grab the configs we'll use in this example\n",
     "!mkdir configs\n",
-    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/main/examples/asr/conf/ssl/citrinet/citrinet_ssl_1024.yaml\n",
-    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/main/examples/asr/conf/citrinet/citrinet_1024.yaml\n"
+    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/asr/conf/ssl/citrinet/citrinet_ssl_1024.yaml\n",
+    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/asr/conf/citrinet/citrinet_1024.yaml\n"
    ]
   },
   {
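The `$BRANCH` in the updated URLs is not a shell environment variable: inside a notebook, IPython expands Python variables prefixed with `$` in `!` shell commands. A minimal sketch of the setup this assumes (the variable name matches the commit title; the value shown is a placeholder):

```python
# Assumed earlier setup cell: NeMo tutorials conventionally pin the branch
# or tag that matches the installed NeMo version, so the configs fetched
# below stay in sync with the code.
BRANCH = 'main'  # placeholder; e.g. 'r2.0.0' for a release branch

# IPython substitutes the Python variable into the shell command:
# !wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/asr/conf/ssl/citrinet/citrinet_ssl_1024.yaml
```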
@@ -380,16 +382,16 @@
   },
   {
    "cell_type": "markdown",
+   "metadata": {
+    "id": "4JnepitBZ3ta"
+   },
    "source": [
     "Note that for this loss the outputs must match the inputs, so since we are using the Citrinet architecture with 8x stride, we would need to either set \"combine_time_steps\" to 8, or put additional stride layers in the decoder. By default for Citrinet with 8x stride we use \"combine_time_steps=4\" and \"stride_layers=1\" to match the 8x stride.\n",
     "\n",
     "Since in MaskedPatchAugmentation we set mask_patches to 0.5 and our min_durations are set to 3.2, we are guaranteed to have 1.6 masked seconds per utterance, or 160 masked steps. Since combine_time_steps is set to 4, this means that 160 / 4 = 40 total negatives can be sampled, so we set num_negatives to 40 (unless you set sample_from_same_utterance_only to false or sample_from_non_masked to true, but this tends to make results worse).\n",
     "\n",
     "In the default configs we assume that min_duration for samples is higher (8 seconds by default), so there we can set patch_size to 48 for a total of 480 masked steps, and use 100 sampled negatives. If the min_duration of samples that you are training on allows, the amount of masked steps as well as negatives can be increased further (masking around 50% of the sample duration tends to work well)."
-   ],
-   "metadata": {
-    "id": "4JnepitBZ3ta"
-   }
+   ]
   },
   {
    "cell_type": "markdown",
@@ -482,7 +484,7 @@
    "outputs": [],
    "source": [
     "!mkdir scripts\n",
-    "!wget -P scripts/ https://raw.githubusercontent.com/NVIDIA/NeMo/main/scripts/tokenizers/process_asr_text_tokenizer.py\n",
+    "!wget -P scripts/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/scripts/tokenizers/process_asr_text_tokenizer.py\n",
     "\n",
     "!python ./scripts/process_asr_text_tokenizer.py \\\n",
     " --manifest=\"{data_dir}/an4/train_manifest.json\" \\\n",
@@ -650,23 +652,23 @@
   },
   {
    "cell_type": "markdown",
-   "source": [
-    "We can optionally freeze the encoder and only fine-tune the decoder during training. This can be done to lower the memory and time requirements of fine-tuning, but will likely result in a higher word error rate."
-   ],
    "metadata": {
     "id": "S5aVb2F8WuAR"
-   }
+   },
+   "source": [
+    "We can optionally freeze the encoder and only fine-tune the decoder during training. This can be done to lower the memory and time requirements of fine-tuning, but will likely result in a higher word error rate."
+   ]
   },
   {
    "cell_type": "code",
-   "source": [
-    "#asr_model.encoder.freeze()"
-   ],
+   "execution_count": null,
    "metadata": {
     "id": "LpF_YQUmXUR8"
    },
-   "execution_count": null,
-   "outputs": []
+   "outputs": [],
+   "source": [
+    "#asr_model.encoder.freeze()"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -711,7 +713,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.7.7"
+   "version": "3.10.12"
   },
   "pycharm": {
    "stem_cell": {