Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Doc] Documentation about kfp generated api python sdk #3216

Closed
Bobgy opened this issue Mar 5, 2020 · 24 comments · Fixed by #3787, #3898 or #4047
Closed

[Doc] Documentation about kfp generated api python sdk #3216

Bobgy opened this issue Mar 5, 2020 · 24 comments · Fixed by #3787, #3898 or #4047
Assignees
Labels
area/docs area/sdk/client priority/p0 status/triaged Whether the issue has been explicitly triaged

Comments

@Bobgy
Copy link
Contributor

Bobgy commented Mar 5, 2020

There has been quite some pain for @jingzhang36 and me to figure out how to use generated sdk client. Also, many users clearly don't know how to use it either.

Related context

I wonder how we can improve it. Can we do the following?

@rmgogogo
Copy link
Contributor

Joe, is https://kubeflow-pipelines.readthedocs.io/en/latest/index.html owned by us?
We had another thread on it, but no one replied it.

@Bobgy Bobgy added the status/triaged Whether the issue has been explicitly triaged label Mar 18, 2020
@Bobgy
Copy link
Contributor Author

Bobgy commented Mar 18, 2020

@rmgogogo I think it is generated from here https://github.com/kubeflow/pipelines/tree/master/docs.

@Ark-kun should know more about it.

@Bobgy
Copy link
Contributor Author

Bobgy commented Mar 20, 2020

@neuromage @Ark-kun I feel like this is a must have for KFP 1.0, also it would be the best if we can automate api spec doc generation and release regularly (in addition to some manually maintained examples).

What do you think? Can we add this into sdk/dsl group planning for Q2?

@neuromage
Copy link
Contributor

@Bobgy Yes, this makes sense for Q2. Would you mind creating an issue and adding it to the KFP 1.0 planning Kanban? I'll sync up with @jessiezcc on resourcing this.
/cc @Ark-kun who may already know how to do this with existing tools.

@Bobgy
Copy link
Contributor Author

Bobgy commented Mar 25, 2020

Thanks @neuromage! I created #3354 to track this.

@Ark-kun
Copy link
Contributor

Ark-kun commented Apr 11, 2020

Joe, is https://kubeflow-pipelines.readthedocs.io/en/latest/index.html owned by us?
We had another thread on it, but no one replied it.

Sorry. I own that.

Without tweaks, sphinx does not know about the generated members of Client, because they do not exist on the Client class - only on objects. I can try to refactor the generation, so that it adds members to the class, not just the object. But that has some cons, since the generation will run during the import.

Maybe there is another way...

@Bobgy
Copy link
Contributor Author

Bobgy commented Apr 13, 2020

Without tweaks, sphinx does not know about the generated members of Client, because they do not exist on the Client class - only on objects. I can try to refactor the generation, so that it adds members to the class, not just the object. But that has some cons, since the generation will run during the import.

I'm trying to understand, why is generation happening during import a con? When does it currently happen? at object init phase?

@Ark-kun
Copy link
Contributor

Ark-kun commented Apr 13, 2020

I'm trying to understand, why is generation happening during import a con? When does it currently happen? at object init phase?

Currently the generation happens during kfp.Client() init.
Usually the importing phase should be as small and fast as possible and have no complicated logic. Normally, if there is an error in some function, then only that function is broken. However if there is some error during the module level execution, then import kfp stops working which is significantly more serious. This could happen for example if some new version of kfp-server-api package is for some reason not compatible with older SDKs.

@Ark-kun
Copy link
Contributor

Ark-kun commented Apr 13, 2020

I'm digging into sphinx. Maybe autoattribute or autodata can generate the documentation for a Client instance.

@Bobgy
Copy link
Contributor Author

Bobgy commented Apr 13, 2020

Thanks! makes sense to me. Would be great if doc can be supported with current approach.

@jnmclarty
Copy link

Maybe there is another way...

I tried hacking on this a few different ways today. Indeed, there would be draw-backs to call anything like the current implementation of _add_generated_apis on import.

I looked around the features, events, extensions, autodoc directives of sphinx. I even attempted subclassing a Documenter (eg to create, for instance, autoinstance:: kfp.client_instance) to see what might be possible.

I think one approach worth considering would involve instantiating a kfp.Client during the build process for the docs and then telling sphinx explicitly, albeit dynamically, about the generated api.

I have a solution which works, and generates an expanded version of the docs. Only snag, is I'm guessing my approach, isn't compatible with readthedocs.io.

I'd send a PR tomorrow, or next weekend, if people are okay with instantiating a Client during the doc build. The doc builder would have to pass a host, I guess. (Or, there would need to be some refactoring of the constructor).

Since the kubeflow parent project has a registered domain and host, I'm guessing stuffing some extra static html somewhere, couldn't be too hard.

@Bobgy
Copy link
Contributor Author

Bobgy commented May 11, 2020

@jnmclarty Super thanks! This is a big step forward!

After reviewing the generated doc, there are still some usage frictions:

@Bobgy
Copy link
Contributor Author

Bobgy commented May 11, 2020

I've been searching if any solutions exist generating doc for swagger clients.

I tried swagger-api/swagger-codegen#1387 (comment).

Demo: https://bobgy.github.io/pipelines-api-doc/ for job api.

It's a lot easier to read, but we'd need to figure out how to patch/document using our sdk with it.

@Ark-kun
Copy link
Contributor

Ark-kun commented May 12, 2020

I have a solution which works, and generates an expanded version of the docs. Only snag, is I'm guessing my approach, isn't compatible with readthedocs.io.

I'd send a PR tomorrow, or next weekend, if people are okay with instantiating a Client during the doc build. The doc builder would have to pass a host, I guess. (Or, there would need to be some refactoring of the constructor).

This looks great! Thanks for helping.
Did you eventually just generate the lists of methods using a script? (e.g. .. automethod:: kfp.Client.runs.list_runs)

@Bobgy
Copy link
Contributor Author

Bobgy commented May 21, 2020

@Ark-kun Thanks! I can see new methods in the website now: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp-client-experiments

/reopen
to discuss further usability problems

Questions:

  • Why does every method have a corresponding method with suffix with_http_info? Will users ever need it? e.g. delete_job_with_http_info(id, **kwargs)[source]
  • Is there any way we can inject comment for each method so the long misleading comments can be removed?

@k8s-ci-robot
Copy link
Contributor

@Bobgy: Reopened this issue.

In response to this:

@Ark-kun Thanks! I can see new methods in the website now: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp-client-experiments

/reopen
to discuss further usability problems

Questions:

  • Why does every method have a corresponding method with suffix with_http_info? Will users ever need it? e.g. delete_job_with_http_info(id, **kwargs)[source]
  • Is there any way we can inject comment for each method so the long misleading comments can be removed?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this May 21, 2020
@Ark-kun
Copy link
Contributor

Ark-kun commented May 21, 2020

  • Why does every method have a corresponding method with suffix with_http_info? Will users ever need it? e.g. delete_job_with_http_info(id, **kwargs)[source]

That's how the API client in the generated kfp-server-apipackage looks like. Swagger codegen generates those methods.

  • Is there any way we can inject comment for each method so the long misleading comments can be removed?

The documentation reflects the generated kfp-server-api package. To change the documentation, the package needs to be changed.

P.S. The docs will look nicer once the swagger-codegen bug fix is merge swagger-api/swagger-codegen#10259

@Ark-kun Ark-kun closed this as completed May 21, 2020
@Bobgy
Copy link
Contributor Author

Bobgy commented May 22, 2020

/reopen
@Ark-kun Sorry, I'd want to keep this open until we have a usable end-to-end user journey for understanding how to use generated APIs.

e.g.

create_job(body, **kwargs)[source]
Create a new job. # noqa: E501

This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, please pass async_req=True >>> thread = api.create_job(body, async_req=True) >>> result = thread.get()

:param async_req bool :param ApiJob body: The job to be created (required) :return: ApiJob

If the method is called asynchronously, returns the request thread.

With current documentation, users can sort of find the body param should be of type ApiJob, but the documentation page doesn't have a reference to ApiJob.

ApiJob is also importable from users in the package, is it possible to include documentation for ApiJob there? (No need to be in the description, but if the object's documentation can be searchable on the page, it's a lot more usable)

@k8s-ci-robot k8s-ci-robot reopened this May 22, 2020
@k8s-ci-robot
Copy link
Contributor

@Bobgy: Reopened this issue.

In response to this:

/reopen
@Ark-kun Sorry, I'd want to keep this open until we have a usable end-to-end user journey for understanding how to use generated APIs.

e.g.

create_job(body, **kwargs)[source]
Create a new job. # noqa: E501

This method makes a synchronous HTTP request by default. To make an asynchronous HTTP request, please pass async_req=True >>> thread = api.create_job(body, async_req=True) >>> result = thread.get()

:param async_req bool :param ApiJob body: The job to be created (required) :return: ApiJob

If the method is called asynchronously, returns the request thread.

With current documentation, users can sort of find the body param should be of type ApiJob, but the documentation page doesn't have a reference to ApiJob.

ApiJob is also importable from users in the package, is it possible to include documentation for ApiJob there?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Ark-kun
Copy link
Contributor

Ark-kun commented May 22, 2020

I'd want to keep this open until we have a usable end-to-end user journey for understanding how to use generated APIs.

I thought the main issue we were solving was discovery of the auto-generated API of the kfp-server-API package. I agree that we need better docs, but the documentation pages you're seeing just reflect the docstrings of the generated package. You can check that by using help(kfp.Client().jobs.create_job)

With current documentation, users can sort of find the body param should be of type ApiJob, but the documentation page doesn't have a reference to ApiJob.

There would be a link if the parameter had a type annotation, but without the annotation it's not possible.

ApiJob is also importable from users in the package, is it possible to include documentation for ApiJob there?

Yes, it's possible. Thank you for reminding. I forgot to include the models in the final version.
I'll include them.

I should probably add a separate page so that the kfp.Client page is not too long.

In the future we can try upgrading from swagger-codegen to https://github.com/OpenAPITools/openapi-generator which may generate nicer docstrings.

@Bobgy
Copy link
Contributor Author

Bobgy commented May 22, 2020

Thanks!

There would be a link if the parameter had a type annotation, but without the annotation it's not possible.

No problem if we don't have the link now, just want to first make ApiJob searchable on that page.

Also +1 in trying openapi-generator, I think swagger has moved on to v3, v2 probably won't get enough attention.

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 19, 2020

Hi @Ark-kun, would you mind cherry picking your PR in OpenAPITools/openapi-generator#6391 to our template folder here: https://github.com/kubeflow/pipelines/tree/master/backend/api/python_http_client_template

We can get the fix right away.

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 19, 2020

/reopen

@k8s-ci-robot
Copy link
Contributor

@Bobgy: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment