Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provider method autodoc and sample autogeneration #1128

Merged
merged 33 commits into from
Mar 31, 2020

Conversation

malefice
Copy link
Contributor

@malefice malefice commented Mar 12, 2020

What does this change

If accepted, the old build_docs will be replaced with a new autodoc system that will allow provider method docs to be more comprehensive.

What was wrong

Current docs are lacking as discussed in #1083.

How this fixes it

The project will now use Sphinx's AutoDoc extension to generate docs for provider methods from docstrings. Furthermore, it is possible to specify sample invocations inside the docstrings using a very simple syntax. Those will be parsed during docstring preprocessing, and the results of each valid invocation will be presented as well. A sample docs site using this system can be viewed here

Further thoughts

One side effect of the old build_docs system was the implicit inability to create provider methods with required arguments without default values. Attempting to do so will cause the docs building process to fail.

For example, the bothify method accepts a text argument with the default value of '## ??'. Semantically, it does not really make sense for that method to have a default value for text anyway, since users will most likely supply their own values for text. Should this new system be accepted, it will be possible to create such provider methods without causing the autodoc system to fail while still being able to showcase what each provider method is capable of.

Fixes #1083

@malefice
Copy link
Contributor Author

@fcurella i made some more changes, specifically the ability to set the number of samples per batch and the initial seed for that batch.

Oversimplified example, if provider method method has this in its docstring:

:sample 10 seed=1000: foo="bar"

Something like this will be autogenerated under the Example section of its docs.

>>> Faker.seed(1000)
>>> for _ in range(10):
...     fake.method(foo="bar")
...
# Output from 10 runs will be displayed here

I made this change, because over the course of rewriting some of the docs, I realized the generated examples will still be lacking under the old design. For example, you cannot properly demonstrate how random sampling methods actually work if you can only generate one sample at a time.

Anyway, I updated the BaseProvider docs first, and you can view it here. Please let me know if this looks good before I proceed with the other providers.

@fcurella
Copy link
Collaborator

@malefice my only issue is that sample may be a little vague. I'd rather be more specific and use sample_size instead.

@malefice
Copy link
Contributor Author

malefice commented Mar 18, 2020

@fcurella I am not so sure if sample_size is the correct role name either. For :sample 10 seed=1000: foo="bar", I personally read it as "sample 10 using seed=1000". Maybe generate or since these are provider methods, provide? Or maybe we are overthinking this and should just use fake?

What do you think?

:provide 10 seed=1000:

:generate 10 seed=1000:

:sample_size 10 seed=1000:

:fake 10 seed=1000:

Or maybe these?

:sample count=10 seed=1000:

:sample size=10 seed=1000:

@fcurella
Copy link
Collaborator

I like :sample size=10 seed=1000:

@malefice
Copy link
Contributor Author

@fcurella Already made the change.

Furthermore, I have also investigated the pypy3.5-6.0 build failure. Something in that pypy version triggers a segmentation fault if sphinx.util.logging.getLogger is used, but not if the standard library's logging.getLogger is used. You can view the build job for this here. I also tried updating the pypy3 version to the latest, and it does not trigger the segmentation fault with sphinx.util.logging.getLogger. Build job for this here.

By the way, sphinx logging is used to show warnings on invalid samples and failed sample generation. How do you want to proceed with this?

@fcurella
Copy link
Collaborator

Thank you so much for all your work @malefice !

I would just switch to the standard logging library, if there aren't any repercussions.

@malefice
Copy link
Contributor Author

if there aren't any repercussions.

Personally, I would just opt to use Sphinx logging, because I prefer to keep the logging part uniform with how Sphinx handles it. There are also some minor differences, but as to how important the differences are for our use case, it will be your call.

Sphinx logging already uses its own logging configuration as shown here, and it collects all logging output and neatly shows them after all source files have been read. Warning messages will also be in red.

Out of the box, standard library logging will output messages as source files are being read, and warnings will not be in red like so:

image

@fcurella
Copy link
Collaborator

Thank you @malefice , let's go with the standard logging module :)

@malefice
Copy link
Contributor Author

malefice commented Mar 25, 2020

@fcurella Despite the coronavirus outbreak, I have reason to believe that I will become increasingly busy over the next few weeks. I am thinking of wrapping this one up by just finishing the docstrings for modules I already started and writing some docs on how to write samples. Hopefully it will be possible for you to merge this PR in that state. At least that way, other people can also start writing docs as well VS waiting for me to finish this huge task.

I am also thinking that we should also exclude the faker.sphinx module when packaging faker for release. Since the autodoc functionality is not something people will need in production, I think we will be able to sleep better at night knowing that module (which uses eval() for sample generation) will not be present in production.

Anyway, your thoughts?

@fcurella
Copy link
Collaborator

@malefice I wuld be ok with excluding faker.sphinx from the package, as long as we can still run tests without it. Some distro maintainers run tests against the package before distribution (see #905, #917).

I'm ok with merging the PR even if we don't have all docstrings.

Thank you so much for all your hard work. Your help has really been invaluable. Take care and take it easy during these hard times :)

@malefice
Copy link
Contributor Author

@fcurella Thank you, and please do take care of yourself as well!

Anyway, I placed all the sphinx docs stuff inside faker.sphinx such that faker at large is not even aware of its existence. The tests.sphinx module is the only thing aware of faker.sphinx, and nothing else depends on either of the two modules, so excluding those two modules should still result in a good package.

@malefice malefice changed the title [WIP] Provider method autodoc and sample autogeneration Provider method autodoc and sample autogeneration Mar 26, 2020
@malefice
Copy link
Contributor Author

@fcurella I think I am done with this PR. Can you please review my work? You can view the sample site here.

@fcurella
Copy link
Collaborator

Merging this in. Thank you so much!

@fcurella fcurella merged commit 0e1ddfd into joke2k:master Mar 31, 2020
IlfirinPL pushed a commit to IlfirinPL/faker that referenced this pull request Apr 9, 2021
* Add sample code validator class

* Create build_docs replacement

* Enforce sorting on AVAILABLE_LOCALES

* Create provider method docstring preprocessor

* Configure sphinx to use new autodoc setup

* Reverse part of cdd79d5

* Prettify barcode provider docstring

* Assortment of changes to ProviderMethodDocstring

* Add more tests

* Allow use of OrderedDict in samples

* Fix flake8 errors

* Fix isort errors

* Add sphinx to tox.ini testenv deps

* Exclude autodoc files from coverage

* Fix one last isort error

* Allow multiple results and seeding

* Remove old build_docs

* Update base provider docs

* Fix flake8 and isort errors again

* Force usage of size kwarg in sample generation

* Revert typo introduced in 4d00704

* Improve logging

* Improve results stringification

* Finalize barcode provider docs

* Finalize standard provider docs

* Finalize misc provider docs

* Improve color provider docs

* Autogenerate default sample for insufficient docstrings

* Exclude faker.sphinx and tests.sphinx from release distributions

* Add docs on writing docs

* Fix check-manifest errors

* Fix faker.sphinx not found error in RTD

* fix small grammar issues

Co-authored-by: Flavio Curella <89607+fcurella@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Revamping the Documentation
2 participants