
Commit

fix docs, make sure docs build
gokulavasan committed Jul 30, 2024
1 parent 2750510 commit 9ad0094
Showing 3 changed files with 39 additions and 21 deletions.
36 changes: 27 additions & 9 deletions docs/Makefile
@@ -1,23 +1,41 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
ifneq ($(EXAMPLES_PATTERN),)
EXAMPLES_PATTERN_OPTS := -D sphinx_gallery_conf.filename_pattern="$(EXAMPLES_PATTERN)"
endif

# You can set these variables from the command line.
SPHINXOPTS = -W -j auto $(EXAMPLES_PATTERN_OPTS)
SPHINXBUILD = sphinx-build
SPHINXPROJ = torchdata
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

doctest: html
	$(SPHINXBUILD) -b doctest $(SPHINXOPTS) "$(SOURCEDIR)" "$(BUILDDIR)"/doctest
	@echo "Testing of doctests in the sources finished, look at the " \
	      "results in $(BUILDDIR)/doctest/output.txt."
docset: html
	doc2dash --name $(SPHINXPROJ) --icon $(SOURCEDIR)/_static/img/pytorch-logo-flame.png --enable-js --online-redirect-url http://pytorch.org/data/ --force $(BUILDDIR)/html/

	# Manually fix because Zeal doesn't deal well with `icon.png`-only at 2x resolution.
	cp $(SPHINXPROJ).docset/icon.png $(SPHINXPROJ).docset/icon@2x.png
	convert $(SPHINXPROJ).docset/icon@2x.png -resize 16x16 $(SPHINXPROJ).docset/icon.png

html-noplot: # Avoids running the gallery examples, which may take time
	$(SPHINXBUILD) -D plot_gallery=0 -b html "${SOURCEDIR}" "$(BUILDDIR)"/html
	@echo
	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."

clean:
	rm -rf $(BUILDDIR)/*
	rm -rf $(SOURCEDIR)/generated_examples/ # sphinx-gallery
	rm -rf $(SOURCEDIR)/gen_modules/ # sphinx-gallery
	rm -rf $(SOURCEDIR)/sg_execution_times.rst # sphinx-gallery
	rm -rf $(SOURCEDIR)/generated/ # autosummary

.PHONY: help doctest Makefile
.PHONY: help Makefile docset

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
22 changes: 11 additions & 11 deletions docs/source/dp_tutorial.rst
@@ -321,7 +321,7 @@ Accessing AWS S3 with ``fsspec`` DataPipes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This requires the installation of the libraries ``fsspec``
(`documentation <https://filesystem-spec.readthedocs.io/en/latest/>`_) and ``s3fs``
(`documentation <https://filesystem-spec.readthedocs.io/en/latest/>`__) and ``s3fs``
(`s3fs GitHub repo <https://github.com/fsspec/s3fs>`_).

You can list out the files within an S3 bucket directory by passing a path that starts
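
A minimal sketch of this listing pattern, assuming ``fsspec`` and ``s3fs`` are installed; the bucket and prefix below are hypothetical placeholders, and ``anon=True`` applies only to public buckets:

.. code:: python

    from torchdata.datapipes.iter import IterableWrapper

    # List objects under a hypothetical public bucket/prefix; extra keyword
    # arguments are forwarded to fsspec/s3fs when the filesystem is created.
    dp = IterableWrapper(["s3://my-bucket/my-prefix"]).list_files_by_fsspec(anon=True)
    print(list(dp))
    # e.g. ['s3://my-bucket/my-prefix/file1.csv', 's3://my-bucket/my-prefix/file2.csv']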
@@ -363,7 +363,7 @@ is also available for writing data to cloud.
Accessing Google Cloud Storage (GCS) with ``fsspec`` DataPipes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This requires the installation of the libraries ``fsspec``
(`documentation <https://filesystem-spec.readthedocs.io/en/latest/>`_) and ``gcsfs``
(`documentation <https://filesystem-spec.readthedocs.io/en/latest/>`__) and ``gcsfs``
(`gcsfs GitHub repo <https://github.com/fsspec/gcsfs>`_).

You can list out the files within a GCS bucket directory by specifying a path that starts
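
A similar sketch for GCS, here also opening the listed files as binary streams; it assumes ``gcsfs`` is installed and the bucket path is a hypothetical placeholder:

.. code:: python

    from torchdata.datapipes.iter import IterableWrapper

    # List files under a hypothetical GCS path, then open each one as a binary stream.
    dp = IterableWrapper(["gcs://my-public-bucket/data"]) \
        .list_files_by_fsspec() \
        .open_files_by_fsspec(mode="rb")
    for path, stream in dp:
        print(path, stream.read()[:32])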
@@ -400,11 +400,11 @@ Accessing Azure Blob storage with ``fsspec`` DataPipes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This requires the installation of the libraries ``fsspec``
(`documentation <https://filesystem-spec.readthedocs.io/en/latest/>`_) and ``adlfs``
(`documentation <https://filesystem-spec.readthedocs.io/en/latest/>`__) and ``adlfs``
(`adlfs GitHub repo <https://github.com/fsspec/adlfs>`_).
You can access data in Azure Data Lake Storage Gen2 by providing URIs starting with ``abfs://``.
For example,
`FSSpecFileLister <generated/torchdata.datapipes.iter.FSSpecFileLister.html>`_ (``.list_files_by_fsspec(...)``)
can be used to list files in a directory in a container:

.. code:: python
@@ -430,11 +430,11 @@ directory ``curated/covid-19/ecdc_cases/latest``, belonging to account ``pandemicdatalake``
.open_files_by_fsspec(account_name='pandemicdatalake') \
.parse_csv()
print(list(dp)[:3])
# [['date_rep', 'day', ..., 'iso_country', 'daterep'],
# ['2020-12-14', '14', ..., 'AF', '2020-12-14'],
# ['2020-12-13', '13', ..., 'AF', '2020-12-13']]
If necessary, you can also access data in Azure Data Lake Storage Gen1 by using URIs starting with
``adl://`` and ``abfs://``, as described in `README of adlfs repo <https://github.com/fsspec/adlfs/blob/main/README.md>`_

Accessing Azure ML Datastores with ``fsspec`` DataPipes
@@ -446,11 +446,11 @@ An Azure ML datastore is a *reference* to an existing storage account on Azure.
- Authentication is automatically handled - both *credential-based* access (service principal/SAS/key) and *identity-based* access (Azure Active Directory/managed identity) are supported. When using credential-based authentication, you do not need to expose secrets in your code.

This requires the installation of the library ``azureml-fsspec``
(`documentation <https://learn.microsoft.com/python/api/azureml-fsspec/?view=azure-ml-py>`_).
(`documentation <https://learn.microsoft.com/python/api/azureml-fsspec/?view=azure-ml-py>`__).

You can access data in an Azure ML datastore by providing URIs starting with ``azureml://``.
For example,
`FSSpecFileLister <generated/torchdata.datapipes.iter.FSSpecFileLister.html>`_ (``.list_files_by_fsspec(...)``)
can be used to list files in a directory in a container:

.. code:: python
@@ -470,7 +470,7 @@ can be used to list files in a directory in a container:
dp = IterableWrapper([uri]).list_files_by_fsspec()
print(list(dp))
# ['azureml:///<sub_id>/resourcegroups/<rg_name>/workspaces/<ws_name>/datastores/<datastore>/paths/<folder>/file1.txt',
# 'azureml:///<sub_id>/resourcegroups/<rg_name>/workspaces/<ws_name>/datastores/<datastore>/paths/<folder>/file2.txt', ...]
You can also open files using `FSSpecFileOpener <generated/torchdata.datapipes.iter.FSSpecFileOpener.html>`_
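
A minimal sketch of that step, chaining the listing above with ``FSSpecFileOpener`` via its functional form ``.open_files_by_fsspec(...)``; the angle-bracket segments are hypothetical workspace values, and the exact URI shape should follow the ``azureml-fsspec`` documentation:

.. code:: python

    from torchdata.datapipes.iter import IterableWrapper

    # Hypothetical datastore URI; substitute your own subscription, resource group,
    # workspace, datastore, and folder.
    uri = "azureml:///<sub_id>/resourcegroups/<rg_name>/workspaces/<ws_name>/datastores/<datastore>/paths/<folder>"
    # List the files in the datastore folder, then open each one as a binary stream.
    dp = IterableWrapper([uri]).list_files_by_fsspec().open_files_by_fsspec(mode="rb")
    for path, stream in dp:
        print(path, stream.read()[:32])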
2 changes: 1 addition & 1 deletion torchdata/dataloader2/reading_service.py
@@ -149,7 +149,7 @@ def __new__(cls, *args, **kwargs):

class InProcessReadingService(ReadingServiceInterface):
    r"""
    Default ReadingService to serve the ``DataPipe` graph in the main process,
    Default ReadingService to serve the ``DataPipe`` graph in the main process,
    and apply graph settings like determinism control to the graph.
    Args:
