Fix notebook generation (#227)

* Add Gradio nb links
huggingface · May 30, 2022 · 80b0fbd
1 parent f66182e
commit 80b0fbd

Showing 8 changed files with 120 additions and 59 deletions.
diff --git a/chapters/en/chapter9/2.mdx b/chapters/en/chapter9/2.mdx
@@ -1,5 +1,12 @@
 # Building your first demo
 
+<DocNotebookDropdown
+  classNames="absolute z-10 right-0 top-0"
+  options={[
+    {label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section2.ipynb"},
+    {label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section2.ipynb"},
+]} />
+
 Let's start by installing Gradio! Since it is a Python package, simply run:
 
 `$ pip install gradio `

diff --git a/chapters/en/chapter9/3.mdx b/chapters/en/chapter9/3.mdx
@@ -1,5 +1,12 @@
 # Understanding the Interface class
 
+<DocNotebookDropdown
+  classNames="absolute z-10 right-0 top-0"
+  options={[
+    {label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section3.ipynb"},
+    {label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section3.ipynb"},
+]} />
+
 In this section, we will take a closer look at the `Interface` class, and understand the
 main parameters used to create one.
 

diff --git a/chapters/en/chapter9/4.mdx b/chapters/en/chapter9/4.mdx
@@ -1,5 +1,12 @@
 # Sharing demos with others
 
+<DocNotebookDropdown
+  classNames="absolute z-10 right-0 top-0"
+  options={[
+    {label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section4.ipynb"},
+    {label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section4.ipynb"},
+]} />
+
 Now that you've built a demo, you'll probably want to share it with others. Gradio demos
 can be shared in two ways: using a ***temporary share link*** or ***permanent hosting on Spaces***.
 
@@ -21,10 +28,9 @@ To add additional content to your demo, the `Interface` class supports some opti
     - `live`: if you want to make your demo "live", meaning that your model reruns every time the input changes, you can set `live=True`. This makes sense to use with quick models (we'll see an example at the end of this section)
 Using the options above, we end up with a more complete interface. Run the code below so you can chat with Rick and Morty:
 
-```python out
+```py
 title = "Ask Rick a Question"
-description =
-"""
+description = """
 The bot was trained to answer questions based on Rick and Morty dialogues. Ask Rick anything!
 <img src="https://huggingface.co/spaces/course-demos/Rick_and_Morty_QA/resolve/main/rick.png" width=200px>
 """
@@ -38,7 +44,7 @@ gr.Interface(
     title=title,
     description=description,
     article=article,
-    examples=[["What are you doing?"], ["Where should we time travel to?"]]
+    examples=[["What are you doing?"], ["Where should we time travel to?"]],
 ).launch()
 ```
 
@@ -111,6 +117,7 @@ def predict(im):
 ```
 
 Now that we have a `predict()` function. The next step is to define and launch our gradio interface:
+
 ```py
 interface = gr.Interface(
     predict,

diff --git a/chapters/en/chapter9/5.mdx b/chapters/en/chapter9/5.mdx
@@ -1,5 +1,12 @@
 # Integrations with the Hugging Face Hub
 
+<DocNotebookDropdown
+  classNames="absolute z-10 right-0 top-0"
+  options={[
+    {label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section5.ipynb"},
+    {label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section5.ipynb"},
+]} />
+
 To make your life even easier, Gradio integrates directly with Hugging Face Hub and Hugging Face Spaces.
 You can load demos from the Hub and Spaces with only *one line of code*.
 

diff --git a/chapters/en/chapter9/6.mdx b/chapters/en/chapter9/6.mdx
@@ -1,5 +1,12 @@
 # Advanced Interface features
 
+<DocNotebookDropdown
+  classNames="absolute z-10 right-0 top-0"
+  options={[
+    {label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section6.ipynb"},
+    {label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section6.ipynb"},
+]} />
+
 Now that we can build and share a basic interface, let's explore some more advanced features such as state, and interpretation.
 
 ### Using state to persist data

diff --git a/chapters/en/chapter9/7.mdx b/chapters/en/chapter9/7.mdx
@@ -1,5 +1,12 @@
 # Introduction to Gradio Blocks
 
+<DocNotebookDropdown
+  classNames="absolute z-10 right-0 top-0"
+  options={[
+    {label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section7.ipynb"},
+    {label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section7.ipynb"},
+]} />
+
 In the previous sections we have explored and created demos using the `Interface` class. In this section we will introduce our **newly developed** low-level API called `gradio.Blocks`.
 
 Now, what's the difference between `Interface` and `Blocks`?

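Note on the notebook links: the dropdown URLs added to the chapter 9 files all end in `section<N>.ipynb`, which lines up with the `section{stem}.ipynb` naming that `build_notebook` switches to in `utils/generate_notebooks.py` in this same commit. The following is a minimal sketch of that correspondence, not code from the repository. The helper names `notebook_name` and `colab_url` are hypothetical, and the sketch assumes the notebooks live under `course/<chapter>/` in the `huggingface/notebooks` repository, as the URLs in this diff suggest.

```python
from pathlib import Path

# Prefix copied from the Colab links added in this commit.
COLAB_PREFIX = "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course"


def notebook_name(fname, framework=None):
    """Hypothetical helper: mirror the new f"section{stem}.ipynb" naming."""
    stem = Path(fname).stem
    if framework is None:
        return f"section{stem}.ipynb"
    # Framework-specific notebooks get a "_pt" or "_tf" suffix.
    return f"section{stem}_{framework}.ipynb"


def colab_url(chapter, fname):
    """Hypothetical helper: build the Colab link for one course section."""
    return f"{COLAB_PREFIX}/{chapter}/{notebook_name(fname)}"


url = colab_url("chapter9", "chapters/en/chapter9/4.mdx")
```

With the old `f"{stem}.ipynb"` naming, the same file would have produced `4.ipynb`, which none of the links above point to.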
diff --git a/utils/code_formatter.py b/utils/code_formatter.py
@@ -4,6 +4,7 @@
 import re
 from pathlib import Path
 
+
 def blackify(filename, check_only=False):
     # Read the content of the file
     with open(filename, "r", encoding="utf-8") as f:
@@ -20,16 +21,12 @@ def blackify(filename, check_only=False):
             start_index = line_index
             while line_index < len(lines) and lines[line_index].strip() != "```":
                 line_index += 1
-            
-            code = "\n".join(lines[start_index: line_index])
+
+            code = "\n".join(lines[start_index:line_index])
             # Deal with ! instructions
             code = re.sub(r"^!", r"## !", code, flags=re.MULTILINE)
-
-            code_samples.append({
-                "start_index": start_index,
-                "end_index": line_index - 1,
-                "code": code
-            })
+
+            code_samples.append({"start_index": start_index, "end_index": line_index - 1, "code": code})
             line_index += 1
         else:
             line_index += 1
@@ -39,29 +36,28 @@ def blackify(filename, check_only=False):
     full_code = delimiter.join([sample["code"] for sample in code_samples])
     formatted_code = full_code.replace("\t", "    ")
     formatted_code = black.format_str(formatted_code, mode=black.FileMode({black.TargetVersion.PY37}, line_length=90))
-    
+
     # Black adds last new lines we don't want, so we strip individual code samples.
     cells = formatted_code.split(delimiter)
     cells = [cell.strip() for cell in cells]
     formatted_code = delimiter.join(cells)
-    
+
     if check_only:
         return full_code == formatted_code
     elif full_code == formatted_code:
         # Nothing to do, all is good
         return
-    
+
     formatted_code = re.sub(r"^## !", r"!", formatted_code, flags=re.MULTILINE)
     print(f"Formatting {filename}")
     # Re-build the content with formatted code
     new_lines = []
     start_index = 0
     for sample, code in zip(code_samples, formatted_code.split(delimiter)):
-        new_lines.extend(lines[start_index:sample["start_index"]])
+        new_lines.extend(lines[start_index : sample["start_index"]])
         new_lines.append(code)
         start_index = sample["end_index"] + 1
     new_lines.extend(lines[start_index:])
-
 
     with open(filename, "w", encoding="utf-8") as f:
         f.write("\n".join(new_lines))
@@ -77,14 +73,18 @@ def format_all_files(check_only=False):
         except Exception:
             print(f"Failed to format {filename}.")
             raise
-    
+
     if check_only and len(failures) > 0:
         raise ValueError(f"{len(failures)} files need to be formatted, run `make style`.")
 
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
-    parser.add_argument("--check_only", action="store_true", help="Just check files are properly formatted.")
+    parser.add_argument(
+        "--check_only",
+        action="store_true",
+        help="Just check files are properly formatted.",
+    )
     args = parser.parse_args()
 
     format_all_files(check_only=args.check_only)
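Note on the `!` handling in `blackify`: notebook shell lines such as `!pip install gradio` are not valid Python, so the formatter rewrites them as comments before calling Black and restores them afterwards. The round trip can be sketched on its own as below. The function names are hypothetical, and the Black call is left out so the snippet only depends on the standard library.

```python
import re


def escape_shell_lines(code):
    """Turn leading "!" lines into comments so the code parses as Python."""
    return re.sub(r"^!", r"## !", code, flags=re.MULTILINE)


def restore_shell_lines(code):
    """Undo escape_shell_lines after formatting."""
    return re.sub(r"^## !", r"!", code, flags=re.MULTILINE)


cell = "!pip install gradio\nimport gradio as gr"
escaped = escape_shell_lines(cell)

# The escaped cell is syntactically valid Python (compile only parses it).
compile(escaped, "<cell>", "exec")

restored = restore_shell_lines(escaped)
```

The `## !` marker is presumably chosen because an ordinary comment is unlikely to start with it, which keeps the restore step from touching real comments.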
diff --git a/utils/generate_notebooks.py b/utils/generate_notebooks.py
@@ -22,15 +22,16 @@
 
 frameworks = {"pt": "PyTorch", "tf": "TensorFlow"}
 
+
 def read_and_split_frameworks(fname):
     """
     Read the MDX in fname and creates two versions (if necessary) for each framework.
     """
     with open(fname, "r") as f:
         content = f.readlines()
-    
+
     contents = {"pt": [], "tf": []}
-    
+
     differences = False
     current_content = []
     line_idx = 0
@@ -54,12 +55,13 @@ def read_and_split_frameworks(fname):
     if len(current_content) > 0:
         for key in contents:
             contents[key].extend(current_content)
-    
+
     if differences:
         return {k: "".join(content) for k, content in contents.items()}
     else:
         return "".join(content)
 
+
 def extract_cells(content):
     """
     Extract the code/output cells from content.
@@ -96,12 +98,16 @@ def convert_to_nb_cell(cell):
     nb_cell = {"cell_type": "code", "execution_count": None, "metadata": {}}
     if isinstance(cell, tuple):
         nb_cell["source"] = cell[0]
-        nb_cell["outputs"] = [nbformat.notebooknode.NotebookNode({
-            'data': {'text/plain': cell[1]},
-            'execution_count': None,
-            'metadata': {},
-            'output_type': 'execute_result',
-        })]
+        nb_cell["outputs"] = [
+            nbformat.notebooknode.NotebookNode(
+                {
+                    "data": {"text/plain": cell[1]},
+                    "execution_count": None,
+                    "metadata": {},
+                    "output_type": "execute_result",
+                }
+            )
+        ]
     else:
         nb_cell["source"] = cell
         nb_cell["outputs"] = []
@@ -110,9 +116,7 @@ def convert_to_nb_cell(cell):
 
 def nb_cell(source, code=True):
     if not code:
-        return nbformat.notebooknode.NotebookNode(
-            {"cell_type": "markdown", "source": source, "metadata": {}}
-        )
+        return nbformat.notebooknode.NotebookNode({"cell_type": "markdown", "source": source, "metadata": {}})
     return nbformat.notebooknode.NotebookNode(
         {"cell_type": "code", "metadata": {}, "source": source, "execution_count": None, "outputs": []}
     )
@@ -152,50 +156,71 @@ def build_notebook(fname, title, output_dir="."):
         "What to do when you get an error",
     ]
     sections_with_faiss = ["Semantic search with FAISS (PyTorch)", "Semantic search with FAISS (TensorFlow)"]
+    sections_with_gradio = [
+        "Building your first demo",
+        "Understanding the Interface class",
+        "Sharing demos with others",
+        "Integrations with the Hugging Face Hub",
+        "Advanced Interface features",
+        "Introduction to Blocks",
+    ]
     stem = Path(fname).stem
     if not isinstance(sections, dict):
         contents = [sections]
         titles = [title]
-        fnames = [f"{stem}.ipynb"]
+        fnames = [f"section{stem}.ipynb"]
     else:
         contents = []
         titles = []
         fnames = []
         for key, section in sections.items():
             contents.append(section)
             titles.append(f"{title} ({frameworks[key]})")
-            fnames.append(f"{stem}_{key}.ipynb")
-    
+            fnames.append(f"section{stem}_{key}.ipynb")
+
     for title, content, fname in zip(titles, contents, fnames):
         cells = extract_cells(content)
         if len(cells) == 0:
             continue
-        
+
         nb_cells = [
             nb_cell(f"# {title}", code=False),
-            nb_cell("Install the Transformers and Datasets libraries to run this notebook.", code=False)
+            nb_cell("Install the Transformers and Datasets libraries to run this notebook.", code=False),
         ]
 
         # Install cell
         installs = ["!pip install datasets transformers[sentencepiece]"]
         if title in sections_with_accelerate:
             installs.append("!pip install accelerate")
             installs.append("# To run the training on TPU, you will need to uncomment the followin line:")
-            installs.append("# !pip install cloud-tpu-client==0.10 torch==1.9.0 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl")
+            installs.append(
+                "# !pip install cloud-tpu-client==0.10 torch==1.9.0 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl"
+            )
         if title in sections_with_hf_hub:
             installs.append("!apt install git-lfs")
         if title in sections_with_faiss:
             installs.append("!pip install faiss-gpu")
-
+        if title in sections_with_gradio:
+            installs.append("!pip install gradio")
+
         nb_cells.append(nb_cell("\n".join(installs)))
 
         if title in sections_with_hf_hub:
-            nb_cells.extend([
-                nb_cell("You will need to setup git, adapt your email and name in the following cell.", code=False),
-                nb_cell("!git config --global user.email \"you@example.com\"\n!git config --global user.name \"Your Name\""),
-                nb_cell("You will also need to be logged in to the Hugging Face Hub. Execute the following and enter your credentials.", code=False),
-                nb_cell("from huggingface_hub import notebook_login\n\nnotebook_login()"),
-            ])
+            nb_cells.extend(
+                [
+                    nb_cell(
+                        "You will need to setup git, adapt your email and name in the following cell.", code=False
+                    ),
+                    nb_cell(
+                        '!git config --global user.email "you@example.com"\n!git config --global user.name "Your Name"'
+                    ),
+                    nb_cell(
+                        "You will also need to be logged in to the Hugging Face Hub. Execute the following and enter your credentials.",
+                        code=False,
+                    ),
+                    nb_cell("from huggingface_hub import notebook_login\n\nnotebook_login()"),
+                ]
+            )
         nb_cells += [convert_to_nb_cell(cell) for cell in cells]
         metadata = {"colab": {"name": title, "provenance": []}}
         nb_dict = {"cells": nb_cells, "metadata": metadata, "nbformat": 4, "nbformat_minor": 4}
@@ -206,26 +231,20 @@ def build_notebook(fname, title, output_dir="."):
 
 def get_titles():
     """
-    Parse the yaml _chapters.yml to get the correspondence filename to title
+    Parse the _toctree.yml file to get the correspondence filename to title
     """
-    table = yaml.safe_load(open(os.path.join(PATH_TO_COURSE, "_chapters.yml"), "r"))
+    table = yaml.safe_load(open(os.path.join(PATH_TO_COURSE, "_toctree.yml"), "r"))
     result = {}
     for entry in table:
-        chapter_name = entry["local"]
-        sections = []
-        for i, section in enumerate(entry["sections"]):
-            if isinstance(section, str):
-                result[os.path.join(chapter_name, f"section{i+1}")] = section
+        for section in entry["sections"]:
+            section_title = section["title"]
+            if "local_fw" in section:
+                section_names = section["local_fw"]
+                result[section_names["pt"]] = section_title
+                result[section_names["tf"]] = section_title
             else:
                 section_name = section["local"]
-                section_title = section["title"]
-                if isinstance(section_name, str):
-                    result[os.path.join(chapter_name, section_name)] = section_title
-                else:
-                    if isinstance(section_title, str):
-                        section_title = {key: section_title for key in section_name.keys()}
-                    for key in section_name.keys():
-                        result[os.path.join(chapter_name, section_name[key])] = section_title[key]
+                result[section_name] = section_title
     return {k: v for k, v in result.items() if "quiz" not in v}
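
Note on the new `get_titles` logic: the lookup is now keyed directly by each section's `local` path, or by both `local_fw` paths when a section has framework-specific versions. Below is a minimal sketch of the same loop run on an in-memory table instead of the parsed `_toctree.yml`. The function name `titles_from_toctree` and the sample entries are hypothetical, chosen only to exercise both branches and the quiz filter.

```python
def titles_from_toctree(table):
    """Map each section path to its title, dropping quiz sections."""
    result = {}
    for entry in table:
        for section in entry["sections"]:
            section_title = section["title"]
            if "local_fw" in section:
                # One path per framework, both sharing the same title.
                section_names = section["local_fw"]
                result[section_names["pt"]] = section_title
                result[section_names["tf"]] = section_title
            else:
                result[section["local"]] = section_title
    return {k: v for k, v in result.items() if "quiz" not in v}


# Hypothetical data shaped like the output of yaml.safe_load on a toctree.
sample_table = [
    {
        "title": "Sample chapter",
        "sections": [
            {"local": "chapter9/2", "title": "Building your first demo"},
            {
                "local_fw": {"pt": "chapter7/2_pt", "tf": "chapter7/2_tf"},
                "title": "Token classification",
            },
            {"local": "chapter9/9", "title": "End-of-chapter quiz"},
        ],
    }
]

titles = titles_from_toctree(sample_table)
```

Note that the quiz filter is a case-sensitive substring check on the title, so a title spelled "Quiz" would not be filtered out.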