Add auto-generated component hub page to docs
RobbeSneyders committed Oct 4, 2023
1 parent 8e80509 commit 5eeed82
Showing 7 changed files with 200 additions and 2 deletions.
5 changes: 4 additions & 1 deletion docs/.readthedocs.yaml
@@ -21,4 +21,7 @@ build:
       - poetry config virtualenvs.create false
     post_install:
       # Install dependencies with 'docs' dependency group
-      - poetry install --with docs
+      - poetry install --with docs
+    pre_build:
+      # Generate hub documentation
+      - python scripts/component_readme/generate_hub.py
2 changes: 1 addition & 1 deletion docs/components/components.md
@@ -2,7 +2,7 @@

 Fondant makes it easy to build data preparation pipelines leveraging reusable components. Fondant
 provides a lot of components out of the box
-([overview](https://github.com/ml6team/fondant/tree/main/components)), but you can also define your
+([overview](hub.md)), but you can also define your
 own custom components.

 ## The anatomy of a component
88 changes: 88 additions & 0 deletions docs/components/hub.md
@@ -0,0 +1,88 @@
---
disable_toc: True
---

# Component Hub

Below you can find the reusable components offered by Fondant.

??? "caption_images"

    --8<-- "components/caption_images/README.md:1"

??? "download_images"

    --8<-- "components/download_images/README.md:1"

??? "embed_images"

    --8<-- "components/embed_images/README.md:1"

??? "embedding_based_laion_retrieval"

    --8<-- "components/embedding_based_laion_retrieval/README.md:1"

??? "filter_comments"

    --8<-- "components/filter_comments/README.md:1"

??? "filter_image_resolution"

    --8<-- "components/filter_image_resolution/README.md:1"

??? "filter_line_length"

    --8<-- "components/filter_line_length/README.md:1"

??? "image_cropping"

    --8<-- "components/image_cropping/README.md:1"

??? "image_resolution_extraction"

    --8<-- "components/image_resolution_extraction/README.md:1"

??? "language_filter"

    --8<-- "components/language_filter/README.md:1"

??? "load_from_files"

    --8<-- "components/load_from_files/README.md:1"

??? "load_from_hf_hub"

    --8<-- "components/load_from_hf_hub/README.md:1"

??? "load_from_parquet"

    --8<-- "components/load_from_parquet/README.md:1"

??? "minhash_generator"

    --8<-- "components/minhash_generator/README.md:1"

??? "pii_redaction"

    --8<-- "components/pii_redaction/README.md:1"

??? "prompt_based_laion_retrieval"

    --8<-- "components/prompt_based_laion_retrieval/README.md:1"

??? "segment_images"

    --8<-- "components/segment_images/README.md:1"

??? "text_length_filter"

    --8<-- "components/text_length_filter/README.md:1"

??? "text_normalization"

    --8<-- "components/text_normalization/README.md:1"

??? "write_to_hf_hub"

    --8<-- "components/write_to_hf_hub/README.md:1"

56 changes: 56 additions & 0 deletions docs/overrides/partials/toc.html
@@ -0,0 +1,56 @@
<!--
Copyright (c) 2016-2023 Martin Donath <martin.donath@squidfunk.com>
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to
deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
IN THE SOFTWARE.
-->

<!-- Determine title -->
{% set title = lang.t("toc") %}
{% if config.mdx_configs.toc and config.mdx_configs.toc.title %}
{% set title = config.mdx_configs.toc.title %}
{% endif %}

<!-- Table of contents -->
<nav class="md-nav md-nav--secondary" aria-label="{{ title }}">
{% set toc = page.toc %}

<!--
Check whether the content starts with a level 1 headline. If it does, the
top-level anchor must be skipped, since it would be redundant to the link
to the current page that is located just above the anchor. Therefore we
directly continue with the children of the anchor.
-->
{% set first = toc | first %}
{% if first and first.level == 1 %}
{% set toc = first.children %}
{% endif %}

<!-- Table of contents title and list -->
{% if toc and not page.meta.disable_toc %}
<label class="md-nav__title" for="__toc">
<span class="md-nav__icon md-icon"></span>
{{ title }}
</label>
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
{% for toc_item in toc %}
{% include "partials/toc-item.html" %}
{% endfor %}
</ul>
{% endif %}
</nav>
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -42,6 +42,7 @@ nav:
- Creating custom components: components/custom_component.md
- Read / write components: components/generic_component.md
- Component spec: components/component_spec.md
- Hub: components/hub.md
- Data explorer: data_explorer.md
- Infrastructure: infrastructure.md
- Manifest: manifest.md
36 changes: 36 additions & 0 deletions scripts/component_readme/generate_hub.py
@@ -0,0 +1,36 @@
import typing as t
from pathlib import Path
from glob import glob

import jinja2


def find_components() -> t.List[str]:
    return [Path(d).name for d in sorted(glob("components/*", recursive=True))]


def generate_hub(components) -> str:
    env = jinja2.Environment(
        loader=jinja2.loaders.FileSystemLoader(Path(__file__).parent),
        trim_blocks=True
    )
    template = env.get_template("hub_template.md")

    return template.render(
        components=components
    )


def write_hub(hub: str) -> None:
    with open("docs/components/hub.md", "w") as f:
        f.write(hub)


def main():
    components = find_components()
    hub = generate_hub(components)
    write_hub(hub)


if __name__ == "__main__":
    main()
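
Note on `find_components`: `glob("components/*", recursive=True)` matches every non-hidden entry directly under `components/`, files included, and `recursive=True` only changes behavior when the pattern contains `**`. A minimal sketch of a stricter variant follows, assuming each component lives in its own directory with a `README.md`. The name `find_component_dirs` and its `root` parameter are hypothetical and not part of this commit.

```python
import typing as t
from pathlib import Path


def find_component_dirs(root: t.Union[str, Path] = "components") -> t.List[str]:
    """Return sorted names of subdirectories of `root` that contain a README.md.

    Hypothetical helper for illustration; the commit itself uses glob().
    """
    root = Path(root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        # Skip stray files and directories that have no README to include.
        if entry.is_dir() and (entry / "README.md").is_file()
    )
```

Filtering on the README keeps the generated hub page from referencing a snippet file that does not exist.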
14 changes: 14 additions & 0 deletions scripts/component_readme/hub_template.md
@@ -0,0 +1,14 @@
---
disable_toc: True
---

# Component Hub

Below you can find the reusable components offered by Fondant.

{% for component in components %}
??? "{{ component }}"

    --8<-- "components/{{ component }}/README.md:1"

{% endfor %}
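
To make the template's output concrete, here is a plain-Python approximation of what it renders for a list of component names. This is only a sketch: the commit renders through Jinja2, and `render_hub` and `HEADER` are hypothetical names. The four-space indent on the snippet line is an assumption based on the collapsible-block (`???`) syntax, which expects indented content.

```python
import typing as t

# Static front matter and intro, copied from the template.
HEADER = """---
disable_toc: True
---

# Component Hub

Below you can find the reusable components offered by Fondant.

"""


def render_hub(components: t.Iterable[str]) -> str:
    """Build one collapsible block per component, each including its README."""
    entries = [
        f'??? "{name}"\n\n    --8<-- "components/{name}/README.md:1"\n\n'
        for name in components
    ]
    return HEADER + "".join(entries)


if __name__ == "__main__":
    print(render_hub(["caption_images", "download_images"]))
```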
