From 8997ba2d248d63f3884ec237d471beef8d18c8b2 Mon Sep 17 00:00:00 2001
From: monica-sekoyan <166123533+monica-sekoyan@users.noreply.github.com>
Date: Thu, 7 Nov 2024 19:43:50 +0400
Subject: [PATCH] download from BRANCH (#11206)

Signed-off-by: monica-sekoyan
---
 .../asr/Self_Supervised_Pre_Training.ipynb | 38 ++++++++++---------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/tutorials/asr/Self_Supervised_Pre_Training.ipynb b/tutorials/asr/Self_Supervised_Pre_Training.ipynb
index 91709d230c2c..18861759237b 100644
--- a/tutorials/asr/Self_Supervised_Pre_Training.ipynb
+++ b/tutorials/asr/Self_Supervised_Pre_Training.ipynb
@@ -17,7 +17,9 @@
     "3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select \"GPU\" for hardware accelerator)\n",
     "4. Run this cell to set up dependencies.\n",
     "5. Restart the runtime (Runtime -> Restart Runtime) for any upgraded packages to take effect\n",
-    "\n\nNOTE: User is responsible for checking the content of datasets and the applicable licenses and determining if suitable for the intended use.\n",
+    "\n",
+    "\n",
+    "NOTE: User is responsible for checking the content of datasets and the applicable licenses and determining if suitable for the intended use.\n",
     "\"\"\"\n",
     "# If you're using Google Colab and not running locally, run this cell.\n",
     "\n",
@@ -272,8 +274,8 @@
    "source": [
     "## Grab the configs we'll use in this example\n",
     "!mkdir configs\n",
-    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/main/examples/asr/conf/ssl/citrinet/citrinet_ssl_1024.yaml\n",
-    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/main/examples/asr/conf/citrinet/citrinet_1024.yaml\n"
+    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/asr/conf/ssl/citrinet/citrinet_ssl_1024.yaml\n",
+    "!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/asr/conf/citrinet/citrinet_1024.yaml\n"
    ]
   },
   {
@@ -380,16 +382,16 @@
   },
   {
    "cell_type": "markdown",
+   "metadata": {
+    "id": "4JnepitBZ3ta"
+   },
    "source": [
     "Note that for this loss the outputs must match the inputs, so since we are using Citrinet architecture with 8x stride, we would need to either set \"combine_time_steps\" to 8, or put additional stride layers in the decoder. By default for Citrinet with 8x stride we use \"combine_time_steps=4\" and \"stride_layers=1\" to match the 8x stride.\n",
     "\n",
     "Since in MaskedPatchAugmentation we set mask_patches to 0.5 and our min_durations are set to 3.2, we are guaranteed to have 1.6 masked second per utterance, or 160 masked steps. Since combine_time_steps is set to 4, this means that 160 / 4 = 40 total negatives can be sampled, so we set num_negatives to 40 (unless you set sample_from_same_utterance_only to false or sample_from_non_masked to true, but this tends to make results worse).\n",
     "\n",
     "In the default configs we assume that min_duration for samples is higher (8 seconds by default), so there we can set patch_size to 48 for a total of 480 masked steps, and use 100 sampled negatives. If the min_duration of samples that you are training on allows, the amount of masked steps as well as negatives can be increased further (masking around 50% of the sample duration tends to work well)."
-    ],
-   "metadata": {
-    "id": "4JnepitBZ3ta"
-   }
+    ]
   },
   {
    "cell_type": "markdown",
@@ -482,7 +484,7 @@
    "outputs": [],
    "source": [
     "!mkdir scripts\n",
-    "!wget -P scripts/ https://raw.githubusercontent.com/NVIDIA/NeMo/main/scripts/tokenizers/process_asr_text_tokenizer.py\n",
+    "!wget -P scripts/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/scripts/tokenizers/process_asr_text_tokenizer.py\n",
     "\n",
     "!python ./scripts/process_asr_text_tokenizer.py \\\n",
     " --manifest=\"{data_dir}/an4/train_manifest.json\" \\\n",
@@ -650,23 +652,23 @@
   },
   {
    "cell_type": "markdown",
-   "source": [
-    "We can optionally freeze the encoder and only fine-tune the decoder during training. This can be done to lower the memory and time requirements of fine-tuning, but will likely result in a higher word error rate."
-   ],
    "metadata": {
     "id": "S5aVb2F8WuAR"
-   }
+   },
+   "source": [
+    "We can optionally freeze the encoder and only fine-tune the decoder during training. This can be done to lower the memory and time requirements of fine-tuning, but will likely result in a higher word error rate."
+   ]
   },
   {
    "cell_type": "code",
-   "source": [
-    "#asr_model.encoder.freeze()"
-   ],
+   "execution_count": null,
    "metadata": {
     "id": "LpF_YQUmXUR8"
    },
-   "execution_count": null,
-   "outputs": []
+   "outputs": [],
+   "source": [
+    "#asr_model.encoder.freeze()"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -711,7 +713,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.7.7"
+   "version": "3.10.12"
   },
   "pycharm": {
    "stem_cell": {
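The $BRANCH placeholders introduced by this patch depend on IPython expanding a Python variable named BRANCH inside "!" shell commands; that variable is set in the notebook's setup cell, which lies outside the hunks shown here. A minimal sketch of the pattern in plain Python, where the branch value and the local file name are assumptions for illustration:

# Illustrative sketch only: how a $BRANCH placeholder in the patched wget cells resolves.
# In the notebook, IPython expands the Python variable BRANCH inside "!" commands, e.g.
#   !wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/asr/conf/ssl/citrinet/citrinet_ssl_1024.yaml
# Plain-Python equivalent of one of the patched downloads:
import os
import urllib.request

BRANCH = "main"  # assumption: set this to the NeMo branch or tag matching your install

url = ("https://raw.githubusercontent.com/NVIDIA/NeMo/"
       f"{BRANCH}/examples/asr/conf/ssl/citrinet/citrinet_ssl_1024.yaml")
os.makedirs("configs", exist_ok=True)
urllib.request.urlretrieve(url, os.path.join("configs", "citrinet_ssl_1024.yaml"))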