Fix notebook generation (#227)
* Add Gradio nb links
lewtun authored May 30, 2022
1 parent f66182e commit 80b0fbd
Showing 8 changed files with 120 additions and 59 deletions.
7 changes: 7 additions & 0 deletions chapters/en/chapter9/2.mdx
@@ -1,5 +1,12 @@
# Building your first demo

<DocNotebookDropdown
classNames="absolute z-10 right-0 top-0"
options={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section2.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section2.ipynb"},
]} />

Let's start by installing Gradio! Since it is a Python package, simply run:

`$ pip install gradio `
7 changes: 7 additions & 0 deletions chapters/en/chapter9/3.mdx
@@ -1,5 +1,12 @@
# Understanding the Interface class

<DocNotebookDropdown
classNames="absolute z-10 right-0 top-0"
options={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section3.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section3.ipynb"},
]} />

In this section, we will take a closer look at the `Interface` class, and understand the
main parameters used to create one.

15 changes: 11 additions & 4 deletions chapters/en/chapter9/4.mdx
@@ -1,5 +1,12 @@
# Sharing demos with others

<DocNotebookDropdown
classNames="absolute z-10 right-0 top-0"
options={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section4.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section4.ipynb"},
]} />

Now that you've built a demo, you'll probably want to share it with others. Gradio demos
can be shared in two ways: using a ***temporary share link*** or ***permanent hosting on Spaces***.

@@ -21,10 +28,9 @@ To add additional content to your demo, the `Interface` class supports some opti
- `live`: if you want to make your demo "live", meaning that your model reruns every time the input changes, you can set `live=True`. This makes sense to use with quick models (we'll see an example at the end of this section)
Using the options above, we end up with a more complete interface. Run the code below so you can chat with Rick and Morty:

```python out
```py
title = "Ask Rick a Question"
description =
"""
description = """
The bot was trained to answer questions based on Rick and Morty dialogues. Ask Rick anything!
<img src="https://huggingface.co/spaces/course-demos/Rick_and_Morty_QA/resolve/main/rick.png" width=200px>
"""
@@ -38,7 +44,7 @@ gr.Interface(
title=title,
description=description,
article=article,
examples=[["What are you doing?"], ["Where should we time travel to?"]]
examples=[["What are you doing?"], ["Where should we time travel to?"]],
).launch()
```

@@ -111,6 +117,7 @@ def predict(im):
```

Now that we have a `predict()` function. The next step is to define and launch our gradio interface:

```py
interface = gr.Interface(
predict,
7 changes: 7 additions & 0 deletions chapters/en/chapter9/5.mdx
@@ -1,5 +1,12 @@
# Integrations with the Hugging Face Hub

<DocNotebookDropdown
classNames="absolute z-10 right-0 top-0"
options={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section5.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section5.ipynb"},
]} />

To make your life even easier, Gradio integrates directly with Hugging Face Hub and Hugging Face Spaces.
You can load demos from the Hub and Spaces with only *one line of code*.

7 changes: 7 additions & 0 deletions chapters/en/chapter9/6.mdx
@@ -1,5 +1,12 @@
# Advanced Interface features

<DocNotebookDropdown
classNames="absolute z-10 right-0 top-0"
options={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section6.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section6.ipynb"},
]} />

Now that we can build and share a basic interface, let's explore some more advanced features such as state, and interpretation.

### Using state to persist data
7 changes: 7 additions & 0 deletions chapters/en/chapter9/7.mdx
@@ -1,5 +1,12 @@
# Introduction to Gradio Blocks

<DocNotebookDropdown
classNames="absolute z-10 right-0 top-0"
options={[
{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/chapter9/section7.ipynb"},
{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/chapter9/section7.ipynb"},
]} />

In the previous sections we have explored and created demos using the `Interface` class. In this section we will introduce our **newly developed** low-level API called `gradio.Blocks`.

Now, what's the difference between `Interface` and `Blocks`?
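
The notebook links added to the chapter files above all end in `section<N>.ipynb`, which lines up with the `f"section{stem}.ipynb"` naming introduced in `utils/generate_notebooks.py` in this commit. A minimal sketch of that correspondence, assuming the chapter directory name is reused as the notebook folder; the helper names `notebook_name` and `colab_url` are hypothetical and not part of the repository:

```python
from pathlib import Path

# Prefix copied from the Colab links in the diff.
COLAB_PREFIX = "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course"


def notebook_name(mdx_path: str) -> str:
    # Mirrors the f"section{stem}.ipynb" pattern: "2.mdx" -> "section2.ipynb".
    return f"section{Path(mdx_path).stem}.ipynb"


def colab_url(mdx_path: str) -> str:
    # Assumption: the parent directory of the MDX file (e.g. "chapter9")
    # is also the folder name used in the notebooks repository.
    chapter = Path(mdx_path).parent.name
    return f"{COLAB_PREFIX}/{chapter}/{notebook_name(mdx_path)}"


url = colab_url("chapters/en/chapter9/2.mdx")
print(url)
```

With the old `f"{stem}.ipynb"` naming the generated file would have been `2.ipynb`, which would not match the links in the dropdowns.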
30 changes: 15 additions & 15 deletions utils/code_formatter.py
@@ -4,6 +4,7 @@
import re
from pathlib import Path


def blackify(filename, check_only=False):
# Read the content of the file
with open(filename, "r", encoding="utf-8") as f:
@@ -20,16 +21,12 @@ def blackify(filename, check_only=False):
start_index = line_index
while line_index < len(lines) and lines[line_index].strip() != "```":
line_index += 1
code = "\n".join(lines[start_index: line_index])

code = "\n".join(lines[start_index:line_index])
# Deal with ! instructions
code = re.sub(r"^!", r"## !", code, flags=re.MULTILINE)

code_samples.append({
"start_index": start_index,
"end_index": line_index - 1,
"code": code
})

code_samples.append({"start_index": start_index, "end_index": line_index - 1, "code": code})
line_index += 1
else:
line_index += 1
@@ -39,29 +36,28 @@ def blackify(filename, check_only=False):
full_code = delimiter.join([sample["code"] for sample in code_samples])
formatted_code = full_code.replace("\t", " ")
formatted_code = black.format_str(formatted_code, mode=black.FileMode({black.TargetVersion.PY37}, line_length=90))

# Black adds last new lines we don't want, so we strip individual code samples.
cells = formatted_code.split(delimiter)
cells = [cell.strip() for cell in cells]
formatted_code = delimiter.join(cells)

if check_only:
return full_code == formatted_code
elif full_code == formatted_code:
# Nothing to do, all is good
return

formatted_code = re.sub(r"^## !", r"!", formatted_code, flags=re.MULTILINE)
print(f"Formatting {filename}")
# Re-build the content with formatted code
new_lines = []
start_index = 0
for sample, code in zip(code_samples, formatted_code.split(delimiter)):
new_lines.extend(lines[start_index:sample["start_index"]])
new_lines.extend(lines[start_index : sample["start_index"]])
new_lines.append(code)
start_index = sample["end_index"] + 1
new_lines.extend(lines[start_index:])


with open(filename, "w", encoding="utf-8") as f:
f.write("\n".join(new_lines))
@@ -77,14 +73,18 @@ def format_all_files(check_only=False):
except Exception:
print(f"Failed to format {filename}.")
raise

if check_only and len(failures) > 0:
raise ValueError(f"{len(failures)} files need to be formatted, run `make style`.")


if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--check_only", action="store_true", help="Just check files are properly formatted.")
parser.add_argument(
"--check_only",
action="store_true",
help="Just check files are properly formatted.",
)
args = parser.parse_args()

format_all_files(check_only=args.check_only)
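
For readers skimming `blackify`: the two `re.sub` calls in the diff hide notebook shell lines (those starting with `!`) behind a `## ` comment prefix before formatting and restore them afterwards, presumably because such lines are not valid Python. A small sketch of that round trip in isolation; the function names `hide_shell_lines` and `restore_shell_lines` are hypothetical, and the formatting step in between is left out:

```python
import re


def hide_shell_lines(code: str) -> str:
    # Same pattern as in the diff: prefix every line that starts with "!"
    # so the sample parses as plain Python comments.
    return re.sub(r"^!", r"## !", code, flags=re.MULTILINE)


def restore_shell_lines(code: str) -> str:
    # Inverse substitution, applied after formatting.
    return re.sub(r"^## !", r"!", code, flags=re.MULTILINE)


sample = "!pip install gradio\nimport gradio as gr"
hidden = hide_shell_lines(sample)
restored = restore_shell_lines(hidden)
print(hidden)
print(restored == sample)
```

`re.MULTILINE` is what makes `^` match at the start of every line rather than only at the start of the string, so a `!` in the middle of a line is left alone.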
99 changes: 59 additions & 40 deletions utils/generate_notebooks.py
@@ -22,15 +22,16 @@

frameworks = {"pt": "PyTorch", "tf": "TensorFlow"}


def read_and_split_frameworks(fname):
"""
Read the MDX in fname and creates two versions (if necessary) for each framework.
"""
with open(fname, "r") as f:
content = f.readlines()

contents = {"pt": [], "tf": []}

differences = False
current_content = []
line_idx = 0
@@ -54,12 +55,13 @@ def read_and_split_frameworks(fname):
if len(current_content) > 0:
for key in contents:
contents[key].extend(current_content)

if differences:
return {k: "".join(content) for k, content in contents.items()}
else:
return "".join(content)


def extract_cells(content):
"""
Extract the code/output cells from content.
@@ -96,12 +98,16 @@ def convert_to_nb_cell(cell):
nb_cell = {"cell_type": "code", "execution_count": None, "metadata": {}}
if isinstance(cell, tuple):
nb_cell["source"] = cell[0]
nb_cell["outputs"] = [nbformat.notebooknode.NotebookNode({
'data': {'text/plain': cell[1]},
'execution_count': None,
'metadata': {},
'output_type': 'execute_result',
})]
nb_cell["outputs"] = [
nbformat.notebooknode.NotebookNode(
{
"data": {"text/plain": cell[1]},
"execution_count": None,
"metadata": {},
"output_type": "execute_result",
}
)
]
else:
nb_cell["source"] = cell
nb_cell["outputs"] = []
@@ -110,9 +116,7 @@ def convert_to_nb_cell(cell):

def nb_cell(source, code=True):
if not code:
return nbformat.notebooknode.NotebookNode(
{"cell_type": "markdown", "source": source, "metadata": {}}
)
return nbformat.notebooknode.NotebookNode({"cell_type": "markdown", "source": source, "metadata": {}})
return nbformat.notebooknode.NotebookNode(
{"cell_type": "code", "metadata": {}, "source": source, "execution_count": None, "outputs": []}
)
@@ -152,50 +156,71 @@ def build_notebook(fname, title, output_dir="."):
"What to do when you get an error",
]
sections_with_faiss = ["Semantic search with FAISS (PyTorch)", "Semantic search with FAISS (TensorFlow)"]
sections_with_gradio = [
"Building your first demo",
"Understanding the Interface class",
"Sharing demos with others",
"Integrations with the Hugging Face Hub",
"Advanced Interface features",
"Introduction to Blocks",
]
stem = Path(fname).stem
if not isinstance(sections, dict):
contents = [sections]
titles = [title]
fnames = [f"{stem}.ipynb"]
fnames = [f"section{stem}.ipynb"]
else:
contents = []
titles = []
fnames = []
for key, section in sections.items():
contents.append(section)
titles.append(f"{title} ({frameworks[key]})")
fnames.append(f"{stem}_{key}.ipynb")
fnames.append(f"section{stem}_{key}.ipynb")

for title, content, fname in zip(titles, contents, fnames):
cells = extract_cells(content)
if len(cells) == 0:
continue

nb_cells = [
nb_cell(f"# {title}", code=False),
nb_cell("Install the Transformers and Datasets libraries to run this notebook.", code=False)
nb_cell("Install the Transformers and Datasets libraries to run this notebook.", code=False),
]

# Install cell
installs = ["!pip install datasets transformers[sentencepiece]"]
if title in sections_with_accelerate:
installs.append("!pip install accelerate")
installs.append("# To run the training on TPU, you will need to uncomment the followin line:")
installs.append("# !pip install cloud-tpu-client==0.10 torch==1.9.0 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl")
installs.append(
"# !pip install cloud-tpu-client==0.10 torch==1.9.0 https://storage.googleapis.com/tpu-pytorch/wheels/torch_xla-1.9-cp37-cp37m-linux_x86_64.whl"
)
if title in sections_with_hf_hub:
installs.append("!apt install git-lfs")
if title in sections_with_faiss:
installs.append("!pip install faiss-gpu")

if title in sections_with_gradio:
installs.append("!pip install gradio")

nb_cells.append(nb_cell("\n".join(installs)))

if title in sections_with_hf_hub:
nb_cells.extend([
nb_cell("You will need to setup git, adapt your email and name in the following cell.", code=False),
nb_cell("!git config --global user.email \"you@example.com\"\n!git config --global user.name \"Your Name\""),
nb_cell("You will also need to be logged in to the Hugging Face Hub. Execute the following and enter your credentials.", code=False),
nb_cell("from huggingface_hub import notebook_login\n\nnotebook_login()"),
])
nb_cells.extend(
[
nb_cell(
"You will need to setup git, adapt your email and name in the following cell.", code=False
),
nb_cell(
'!git config --global user.email "you@example.com"\n!git config --global user.name "Your Name"'
),
nb_cell(
"You will also need to be logged in to the Hugging Face Hub. Execute the following and enter your credentials.",
code=False,
),
nb_cell("from huggingface_hub import notebook_login\n\nnotebook_login()"),
]
)
nb_cells += [convert_to_nb_cell(cell) for cell in cells]
metadata = {"colab": {"name": title, "provenance": []}}
nb_dict = {"cells": nb_cells, "metadata": metadata, "nbformat": 4, "nbformat_minor": 4}
@@ -206,26 +231,20 @@ def build_notebook(fname, title, output_dir="."):

def get_titles():
"""
Parse the yaml _chapters.yml to get the correspondence filename to title
Parse the _toctree.yml file to get the correspondence filename to title
"""
table = yaml.safe_load(open(os.path.join(PATH_TO_COURSE, "_chapters.yml"), "r"))
table = yaml.safe_load(open(os.path.join(PATH_TO_COURSE, "_toctree.yml"), "r"))
result = {}
for entry in table:
chapter_name = entry["local"]
sections = []
for i, section in enumerate(entry["sections"]):
if isinstance(section, str):
result[os.path.join(chapter_name, f"section{i+1}")] = section
for section in entry["sections"]:
section_title = section["title"]
if "local_fw" in section:
section_names = section["local_fw"]
result[section_names["pt"]] = section_title
result[section_names["tf"]] = section_title
else:
section_name = section["local"]
section_title = section["title"]
if isinstance(section_name, str):
result[os.path.join(chapter_name, section_name)] = section_title
else:
if isinstance(section_title, str):
section_title = {key: section_title for key in section_name.keys()}
for key in section_name.keys():
result[os.path.join(chapter_name, section_name[key])] = section_title[key]
result[section_name] = section_title
return {k: v for k, v in result.items() if "quiz" not in v}


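
Because this view interleaves removed and added lines without `+`/`-` markers, the new `get_titles` body is hard to read off the hunk. The sketch below is one reading of the added lines, run against an in-memory table instead of `_toctree.yml`. The sample entries, their `local` and `local_fw` values, and the function name `titles_from_toctree` are illustrative assumptions, not taken from the repository:

```python
def titles_from_toctree(table):
    # Map each section's file reference to its title, as the added lines appear to do.
    result = {}
    for entry in table:
        for section in entry["sections"]:
            section_title = section["title"]
            if "local_fw" in section:
                # Framework-specific sections list one file per framework.
                section_names = section["local_fw"]
                result[section_names["pt"]] = section_title
                result[section_names["tf"]] = section_title
            else:
                result[section["local"]] = section_title
    # Quiz sections are dropped, as in the return statement of the diff.
    return {k: v for k, v in result.items() if "quiz" not in v}


# Hypothetical table shaped like the parsed YAML (values invented for illustration).
sample_table = [
    {
        "title": "Chapter A",
        "sections": [
            {"local": "chapter9/2", "title": "Building your first demo"},
            {"local": "chapter9/9", "title": "End-of-chapter quiz"},
        ],
    },
    {
        "title": "Chapter B",
        "sections": [
            {
                "local_fw": {"pt": "chapter5/6_pt", "tf": "chapter5/6_tf"},
                "title": "Semantic search with FAISS",
            },
        ],
    },
]

titles = titles_from_toctree(sample_table)
print(titles)
```

Note that the filter checks the title string, so it is case-sensitive: a title containing "Quiz" with a capital Q would not be removed by it.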