From 7887128bb444188d106d67e1f5ef8812c16e52b2 Mon Sep 17 00:00:00 2001
From: Yang Zhang <yzhang123@users.noreply.github.com>
Date: Tue, 7 Feb 2023 17:53:46 -0500
Subject: [PATCH] Tn doc 16 (#5954)

* fix new repo links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix new repo links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix links

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix spelling

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add warning

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add comment

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

---------

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
---
 .../nlp/text_normalization/wfst/intro.rst     |  4 +--
 .../wfst/wfst_customization.rst               |  6 ++--
 .../wfst/wfst_resources.rst                   |  9 ++++--
 .../wfst/wfst_text_normalization.rst          | 29 ++++++++++++-------
 .../wfst/wfst_text_processing_deployment.rst  | 19 +++++++-----
 nemo_text_processing/README.md                |  4 +++
 .../inverse_text_normalization/README.md      |  4 +++
 .../text_normalization/README.md              |  3 ++
 .../Text_(Inverse)_Normalization.ipynb        | 12 ++++++--
 tutorials/text_processing/WFST_Tutorial.ipynb | 16 +++++++---
 10 files changed, 76 insertions(+), 30 deletions(-)
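
Note on the link changes: every tutorial URL touched by this patch appears to follow one rewrite rule, from `NVIDIA/NeMo/blob/<stable|main>/tutorials/text_processing/<notebook>` to `NVIDIA/NeMo-text-processing/blob/main/tutorials/<notebook>`. The sketch below is one way to apply or check that rule on other files. It is not part of either repository: the helper name `migrate_tutorial_links` and the regular expression are assumptions made for illustration, and they cover only the GitHub and Colab link shapes seen in this diff.

```python
import re

# Assumed pattern: old tutorial links under NVIDIA/NeMo, on either the
# "stable" or "main" branch, inside tutorials/text_processing/.
# The trailing "/blob" after "NeMo" keeps already-migrated
# NVIDIA/NeMo-text-processing links from matching.
_OLD_TUTORIAL = re.compile(
    r'(?P<host>https://(?:colab\.research\.google\.com/github|github\.com))'
    r'/NVIDIA/NeMo/blob/(?:stable|main)/tutorials/text_processing/'
    r'(?P<notebook>[^\s>`"]+\.ipynb)'
)


def migrate_tutorial_links(text: str) -> str:
    """Rewrite old NeMo tutorial URLs to the NeMo-text-processing layout.

    Hypothetical helper: it only mirrors the substitutions made by hand
    in this patch and leaves every other URL untouched.
    """
    return _OLD_TUTORIAL.sub(
        r'\g<host>/NVIDIA/NeMo-text-processing/blob/main/tutorials/\g<notebook>',
        text,
    )


if __name__ == "__main__":
    old = (
        "https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/"
        "tutorials/text_processing/WFST_Tutorial.ipynb"
    )
    print(migrate_tutorial_links(old))
```

Running the function over a file's contents and diffing the result against the original is a quick way to spot links that still point at the old repository.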

diff --git a/docs/source/nlp/text_normalization/wfst/intro.rst b/docs/source/nlp/text_normalization/wfst/intro.rst
index 05fff8391915..526ffa24c279 100644
--- a/docs/source/nlp/text_normalization/wfst/intro.rst
+++ b/docs/source/nlp/text_normalization/wfst/intro.rst
@@ -1,9 +1,9 @@
 WFST-based (Inverse) Text Normalization
 =======================================
 
-NeMo supports Text Normalization (TN), audio-based TN and Inverse Text Normalization (ITN) tasks.
+NeMo-text-processing supports Text Normalization (TN), audio-based TN and Inverse Text Normalization (ITN) tasks.
 
-.. note::
+.. warning::
 
     *TN/ITN is transitioning from [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo) repository to a standalone [NVIDIA/NeMo-text-processing](https://github.com/NVIDIA/NeMo-text-processing) repository. All updates and discussions/issues should go to the new repository.*
 
diff --git a/docs/source/nlp/text_normalization/wfst/wfst_customization.rst b/docs/source/nlp/text_normalization/wfst/wfst_customization.rst
index c4c1b7cb47fe..7c00c0c4b06b 100644
--- a/docs/source/nlp/text_normalization/wfst/wfst_customization.rst
+++ b/docs/source/nlp/text_normalization/wfst/wfst_customization.rst
@@ -3,7 +3,7 @@
 Grammar customization
 =====================
 
-.. note::
+.. warning::
 
     TN/ITN is transitioning from `NVIDIA/NeMo <https://github.com/NVIDIA/NeMo>`_ repository to a standalone `NVIDIA/NeMo-text-processing <https://github.com/NVIDIA/NeMo-text-processing>`_ repository. All updates and discussions/issues should go to the new repository.
 
@@ -15,8 +15,8 @@ Steps to customize grammars
 ---------------------------
 
 1. Install `NeMo-TN from source <https://github.com/NVIDIA/NeMo-text-processing#from-source>`_
-2. Run `nemo_text_processing/text_normalization/normalize.py <https://github.com/NVIDIA/NeMo-text-processing/blob/main/nemo_text_processing/text_normalization/normalize.py>`_ or `nemo_text_processing/inverse_text_normalization/inverse_normalize.py <https://github.com/NVIDIA/NeMo-text-processing/blob/main/nemo_text_processing/inverse_text_normalization/inverse_normalize.py>`_ with `--verbose` flag to evaluate current behavior on the target case, see argument details in the scripts and `this tutorial <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb>`_
-3. Modify existing grammars or add new grammars to cover the target case using `Tutorial on how to write new grammars <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/WFST_Tutorial.ipynb>`_
+2. Run `nemo_text_processing/text_normalization/normalize.py <https://github.com/NVIDIA/NeMo-text-processing/blob/main/nemo_text_processing/text_normalization/normalize.py>`_ or `nemo_text_processing/inverse_text_normalization/inverse_normalize.py <https://github.com/NVIDIA/NeMo-text-processing/blob/main/nemo_text_processing/inverse_text_normalization/inverse_normalize.py>`_ with `--verbose` flag to evaluate current behavior on the target case, see argument details in the scripts and `this tutorial <https://colab.research.google.com/github/NVIDIA/NeMo-text-processing/blob/main/tutorials/Text_(Inverse)_Normalization.ipynb>`_
+3. Modify existing grammars or add new grammars to cover the target case using `Tutorial on how to write new grammars <https://colab.research.google.com/github/NVIDIA/NeMo-text-processing/blob/main/tutorials/WFST_Tutorial.ipynb>`_
 4. Add new test cases `here <https://github.com/NVIDIA/NeMo-text-processing/tree/main/tests/nemo_text_processing>`_:
     - Run python tests:
 
diff --git a/docs/source/nlp/text_normalization/wfst/wfst_resources.rst b/docs/source/nlp/text_normalization/wfst/wfst_resources.rst
index 97870770ffa4..f83a79e58a22 100644
--- a/docs/source/nlp/text_normalization/wfst/wfst_resources.rst
+++ b/docs/source/nlp/text_normalization/wfst/wfst_resources.rst
@@ -3,11 +3,16 @@
 Resources and Documentation
 ===========================
 
+.. warning::
+
+    TN/ITN is transitioning from `NVIDIA/NeMo <https://github.com/NVIDIA/NeMo>`_ repository to a standalone `NVIDIA/NeMo-text-processing <https://github.com/NVIDIA/NeMo-text-processing>`_ repository. All updates and discussions/issues should go to the new repository.
+
+
 - List of `TN/ITN issues <https://github.com/NVIDIA/NeMo/issues?q=is%3Aissue+label%3ATN%2FITN+>`_, use `TN/ITN` label
 - TN/ITN related `discussions <https://github.com/NVIDIA/NeMo/discussions?discussions_q=label%3ATN%2FITN>`_, use `TN/ITN` label
 - Documentation on how to generate `.far files for deployment in Riva (via Sparrowhawk) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/text_normalization/wfst/wfst_text_processing_deployment.html>`_
-- Tutorial that provides an `Overview of NeMo-TN/ITN <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb>`_
-- Tutorial on `how to write new grammars <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/WFST_Tutorial.ipynb>`_ in `Pynini <https://www.opengrm.org/twiki/bin/view/GRM/Pynini>`_
+- Tutorial that provides an `Overview of NeMo-TN/ITN <https://colab.research.google.com/github/NVIDIA/NeMo-text-processing/blob/main/tutorials/Text_(Inverse)_Normalization.ipynb>`_
+- Tutorial on `how to write new grammars <https://colab.research.google.com/github/NVIDIA/NeMo-text-processing/blob/main/tutorials/WFST_Tutorial.ipynb>`_ in `Pynini <https://www.opengrm.org/twiki/bin/view/GRM/Pynini>`_
 
 
 
diff --git a/docs/source/nlp/text_normalization/wfst/wfst_text_normalization.rst b/docs/source/nlp/text_normalization/wfst/wfst_text_normalization.rst
index 3abdcdbb2d57..922780328e76 100644
--- a/docs/source/nlp/text_normalization/wfst/wfst_text_normalization.rst
+++ b/docs/source/nlp/text_normalization/wfst/wfst_text_normalization.rst
@@ -3,6 +3,11 @@
 Text (Inverse) Normalization
 ============================
 
+.. warning::
+
+    TN/ITN is transitioning from `NVIDIA/NeMo <https://github.com/NVIDIA/NeMo>`_ repository to a standalone `NVIDIA/NeMo-text-processing <https://github.com/NVIDIA/NeMo-text-processing>`_ repository. All updates and discussions/issues should go to the new repository.
+
+
 The `nemo_text_processing` Python package is based on WFST grammars :cite:`textprocessing-norm-mohri2005weighted` and supports:
 
 1. Text Normalization (TN) converts text from written form into its verbalized form. It is used as a preprocessing step before Text to Speech (TTS). For example,
@@ -11,7 +16,7 @@ The `nemo_text_processing` Python package is based on WFST grammars :cite:`textp
 
      "123" -> "one hundred twenty three"
 
-NeMo has both a fast version which is deterministic :cite:`textprocessing-norm-zhang2021nemo` which has more language support and a context-aware version :cite:`textprocessing-norm-bakhturina2022shallow`.
+`nemo_text_processing` has both a fast, deterministic version :cite:`textprocessing-norm-zhang2021nemo`, which supports more languages, and a context-aware version :cite:`textprocessing-norm-bakhturina2022shallow`.
 In case of ambiguous input, e.g. 
 
 .. code-block:: bash
@@ -47,13 +52,17 @@ Audio-based TN can be used to normalize ASR training data.
 Installation
 ------------
 
-`nemo_text_processing` is automatically installed with `NeMo <https://github.com/NVIDIA/NeMo>`_. But it relies on `pynini` python library, which you need to install following below steps,
+If you have already installed `nemo_text_processing <https://github.com/NVIDIA/NeMo-text-processing>`_, the `pynini` Python library should already be available. Otherwise, install it explicitly:
 
 .. code-block:: shell-session
 
-    wget https://raw.githubusercontent.com/NVIDIA/NeMo/stable/nemo_text_processing/install_pynini.sh
-    bash install_pynini.sh
+    pip install pynini==2.1.5
+
+or, if this fails because of missing OpenFst headers:
+
+.. code-block:: shell-session
 
+    conda install -c conda-forge pynini=2.1.5
 
 
 Quick Start Guide
@@ -66,14 +75,14 @@ The standard text normalization based on WFST  :cite:`textprocessing-norm-zhang2
 
 .. code-block:: bash
 
-    cd NeMo/nemo_text_processing/text_normalization/
+    cd NeMo-text-processing/nemo_text_processing/text_normalization/
     python normalize.py --text="123" --language=en
 
 if you want to normalize a string. To normalize a text file split into sentences, run the following:
 
 .. code-block:: bash
 
-    cd NeMo/nemo_text_processing/text_normalization/
+    cd NeMo-text-processing/nemo_text_processing/text_normalization/
     python normalize.py --input_file=INPUT_FILE_PATH --output_file=OUTPUT_FILE_PATH --language=en
 
 The context-aware version :cite:`textprocessing-norm-bakhturina2022shallow` is a shallow fusion of non-deterministic WFST and pretrained masked language model.
@@ -86,7 +95,7 @@ The context-aware version :cite:`textprocessing-norm-bakhturina2022shallow` is a
 
 .. code-block:: bash
 
-    cd NeMo/nemo_text_processing/
+    cd NeMo-text-processing/nemo_text_processing/
     python wfst_lm_rescoring.py
 
 
@@ -97,7 +106,7 @@ Inverse Text Normalization
 
 .. code-block:: bash
 
-    cd NeMo/nemo_text_processing/inverse_text_normalization/
+    cd NeMo-text-processing/nemo_text_processing/inverse_text_normalization/
     python inverse_normalize.py --text="one hundred twenty three" --language=en
 
 
@@ -124,7 +133,7 @@ Audio-based TN
 
 .. code-block:: bash
 
-    cd NeMo/nemo_text_processing/text_normalization/
+    cd NeMo-text-processing/nemo_text_processing/text_normalization/
     python normalize_with_audio.py --text="123" --language="en" --n_tagged=10 --cache_dir="cache_dir" --audio_data="example.wav" --model="stt_en_conformer_ctc_large" 
 
 Additional Arguments:
@@ -137,7 +146,7 @@ Additional Arguments:
 
 .. note::
 
-    More details can be found in `NeMo/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb>`_.
+    More details can be found in `NeMo-text-processing/tutorials/Text_(Inverse)_Normalization.ipynb <https://github.com/NVIDIA/NeMo-text-processing/blob/main/tutorials/Text_(Inverse)_Normalization.ipynb>`__ in `Google's Colab <https://colab.research.google.com/github/NVIDIA/NeMo-text-processing/blob/main/tutorials/Text_(Inverse)_Normalization.ipynb>`_.
 
 Language Support Matrix
 ------------------------
diff --git a/docs/source/nlp/text_normalization/wfst/wfst_text_processing_deployment.rst b/docs/source/nlp/text_normalization/wfst/wfst_text_processing_deployment.rst
index 188ab81bd2fd..76e8abba212a 100644
--- a/docs/source/nlp/text_normalization/wfst/wfst_text_processing_deployment.rst
+++ b/docs/source/nlp/text_normalization/wfst/wfst_text_processing_deployment.rst
@@ -3,9 +3,14 @@
 Deploy to Production with C++ backend
 =====================================
 
-NeMo provides tools to deploy :doc:`TN and ITN <wfst_text_normalization>` for production :cite:`textprocessing-deployment-zhang2021nemo`.
+.. warning::
+
+    TN/ITN is transitioning from `NVIDIA/NeMo <https://github.com/NVIDIA/NeMo>`_ repository to a standalone `NVIDIA/NeMo-text-processing <https://github.com/NVIDIA/NeMo-text-processing>`_ repository. All updates and discussions/issues should go to the new repository.
+
+
+NeMo-text-processing provides tools to deploy :doc:`TN and ITN <wfst_text_normalization>` for production :cite:`textprocessing-deployment-zhang2021nemo`.
 It uses `Sparrowhawk <https://github.com/google/sparrowhawk>`_ :cite:`textprocessing-deployment-sparrowhawk` -- an open-source C++ framework by Google.
-The grammars written with NeMo can be exported into an `OpenFST <https://www.openfst.org/>`_ Archive File (FAR) and dropped into Sparrowhawk.
+The grammars written with NeMo-text-processing can be exported into an `OpenFST <https://www.openfst.org/>`_ Archive File (FAR) and dropped into Sparrowhawk.
 
     .. image:: images/deployment_pipeline.png
         :align: center
@@ -18,7 +23,7 @@ Requirements
 
 * :doc:`nemo_text_processing <wfst_text_normalization>` package
 * `Docker <https://www.docker.com/>`_
-* `NeMo source code <https://github.com/NVIDIA/NeMo>`_
+* `NeMo-text-processing source code <https://github.com/NVIDIA/NeMo-text-processing>`_
 
 
 .. _wfst_deployment_quick_start:
@@ -31,11 +36,11 @@ Examples how to run:
 .. code-block:: bash
 
     # export English TN grammars and return prompt inside docker container  
-    cd NeMo/tools/text_processing_deployment
+    cd NeMo-text-processing/tools/text_processing_deployment
     bash export_grammars.sh --GRAMMARS=tn_grammars --LANGUAGE=en --INPUT_CASE=cased
 
     # export English ITN grammars and return prompt inside docker container  
-    cd NeMo/tools/text_processing_deployment
+    cd NeMo-text-processing/tools/text_processing_deployment
     bash export_grammars.sh --GRAMMARS=itn_grammars --LANGUAGE=en
 
 
@@ -44,7 +49,7 @@ Arguments:
 * ``GRAMMARS`` - ``tn_grammars`` or ``itn_grammars`` to export either TN or ITN grammars.
 * ``LANGUAGE`` - `en` for English. Click :doc:`here <wfst_text_normalization>` for full list of languages.
 * ``INPUT_CASE`` - ``cased`` or ``lower_cased`` (ITN has no differentiation between these two, only used for TN).
-* ``MODE`` - By default ``export`` which returns prompt inside the docker. If ``--MODE=test`` runs NeMo pytests inside container.
+* ``MODE`` - ``export`` by default, which returns a prompt inside the Docker container. ``--MODE=test`` runs the NeMo-text-processing pytests inside the container.
 * ``OVERWRITE_CACHE`` - Whether to re-export grammars or load from cache. By default ``True``. 
 * ``FORCE_REBUILD`` - Whether to rebuild docker image in cased of updated dependencies. By default ``False``.
 
@@ -57,7 +62,7 @@ Go to script folder:
 
 .. code-block:: bash
 
-    cd NeMo/tools/text_processing_deployment
+    cd NeMo-text-processing/tools/text_processing_deployment
 
 1. Grammars written in Python are exported to `OpenFST <https://www.openfst.org/>`_ archive files (FAR). Specifically, grammars `ClassifyFst` and `VerbalizeFst` from :doc:`nemo_text_processing <wfst_text_normalization>` are exported and saved to `./LANGUAGE/classify/tokenize_and_classify.far` and `./LANGUAGE/verbalize/verbalize.far` respectively.
 
diff --git a/nemo_text_processing/README.md b/nemo_text_processing/README.md
index 181d6ddc8e3b..14ca7e5a1e90 100644
--- a/nemo_text_processing/README.md
+++ b/nemo_text_processing/README.md
@@ -1,6 +1,10 @@
 **nemo_text_processing**
 ==========================
 
+> **Warning**
+> *TN/ITN is transitioning from [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo) repository to a standalone [NVIDIA/NeMo-text-processing](https://github.com/NVIDIA/NeMo-text-processing) repository. All updates and discussions/issues should go to the new repository.*
+
+
 Introduction
 ------------
 
diff --git a/nemo_text_processing/inverse_text_normalization/README.md b/nemo_text_processing/inverse_text_normalization/README.md
index 8bf5a0fcf929..3d1220400143 100644
--- a/nemo_text_processing/inverse_text_normalization/README.md
+++ b/nemo_text_processing/inverse_text_normalization/README.md
@@ -1,5 +1,9 @@
 # Inverse Text Normalization
 
+> **Warning**
+> *TN/ITN is transitioning from [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo) repository to a standalone [NVIDIA/NeMo-text-processing](https://github.com/NVIDIA/NeMo-text-processing) repository. All updates and discussions/issues should go to the new repository.*
+
+
 Inverse Text Normalization is part of NeMo's `nemo_text_processing` - a Python package that is installed with the `nemo_toolkit`. 
 It converts text from spoken form into written form, e.g. "one hundred twenty three" -> "123".
 
diff --git a/nemo_text_processing/text_normalization/README.md b/nemo_text_processing/text_normalization/README.md
index d14e4d1fa06e..94d347dc8df5 100644
--- a/nemo_text_processing/text_normalization/README.md
+++ b/nemo_text_processing/text_normalization/README.md
@@ -1,5 +1,8 @@
 # Text Normalization
 
+> **Warning**
+> *TN/ITN is transitioning from [NVIDIA/NeMo](https://github.com/NVIDIA/NeMo) repository to a standalone [NVIDIA/NeMo-text-processing](https://github.com/NVIDIA/NeMo-text-processing) repository. All updates and discussions/issues should go to the new repository.*
+
 Text Normalization is part of NeMo's `nemo_text_processing` - a Python package that is installed with the `nemo_toolkit`. 
 It converts text from written form into its verbalized form, e.g. "123" -> "one hundred twenty three".
 
diff --git a/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb b/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb
index 4982104b3c23..f2b546ed0d49 100755
--- a/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb
+++ b/tutorials/text_processing/Text_(Inverse)_Normalization.ipynb
@@ -18,6 +18,14 @@
     "\"\"\""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> **_WARNING:_** \n",
+    "TN/ITN is transitioning from https://github.com/NVIDIA/NeMo to a standalone https://github.com/NVIDIA/NeMo-text-processing repository. Please use https://github.com/NVIDIA/NeMo-text-processing/blob/main/tutorials/Text_(Inverse)_Normalization.ipynb instead."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -142,7 +150,7 @@
     "print(normalized)"
    ]
   },
-    {
+  {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -465,4 +473,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 1
-}
\ No newline at end of file
+}
diff --git a/tutorials/text_processing/WFST_Tutorial.ipynb b/tutorials/text_processing/WFST_Tutorial.ipynb
index 97b70edaa503..f3ffa79aa503 100644
--- a/tutorials/text_processing/WFST_Tutorial.ipynb
+++ b/tutorials/text_processing/WFST_Tutorial.ipynb
@@ -22,6 +22,14 @@
     "\"\"\""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> **_WARNING:_** \n",
+    "TN/ITN is transitioning from https://github.com/NVIDIA/NeMo to a standalone https://github.com/NVIDIA/NeMo-text-processing repository. Please use https://github.com/NVIDIA/NeMo-text-processing/blob/main/tutorials/WFST_Tutorial.ipynb instead."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -41,10 +49,10 @@
     "## Install NeMo, which installs both nemo and nemo_text_processing package\n",
     "BRANCH = 'r1.16.0'\n",
     "!python -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[nemo_text_processing]\n",
-          "\n",
-          "# install Pynini for text normalization\n",
-          "! wget https://raw.githubusercontent.com/NVIDIA/NeMo/main/nemo_text_processing/install_pynini.sh\n",
-          "! bash install_pynini.sh"
+    "\n",
+    "# install Pynini for text normalization\n",
+    "! wget https://raw.githubusercontent.com/NVIDIA/NeMo/main/nemo_text_processing/install_pynini.sh\n",
+    "! bash install_pynini.sh"
    ]
   },
   {