Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataflow client library #2450

Conversation

davidcavazos
Copy link
Contributor

  • Use client libraries instead of subprocess
  • Updated README formatting
  • Updated requirements

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Oct 7, 2019
@davidcavazos davidcavazos changed the title Dataflow cancel client library Dataflow client library Oct 7, 2019
@davidcavazos davidcavazos added dependencies Pull requests that update a dependency file kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Oct 7, 2019
@davidcavazos davidcavazos requested a review from gguuss October 7, 2019 21:41
@davidcavazos davidcavazos force-pushed the dataflow-cancel-client-library branch from b02c4e1 to aa076f1 Compare October 7, 2019 21:44
@davidcavazos davidcavazos removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Oct 7, 2019
Copy link
Contributor

@gguuss gguuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, if you can check the cancel request response and warn / fail as appropriate, consider adding it in. Nice job removing the subprocess stuff, it was a little environment-sensitive.

jobId=job_id,
body={'requestedState': 'JOB_STATE_CANCELLED'}
)
request.execute()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the subprocess approach, you check for success, can you check here or is this "fire and forget"? Probably fine either way as it's a small job.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, after closer inspection that shouldn't actually be the case. If launching the template fails it will raise an exception which will cause the test to fail.

The job can only be cancelled after it changes status from not started into ready or something like that which takes a couple seconds (usually). That's why we're sleeping for 5 seconds. However, if for some reason it takes longer and it cannot be cancelled, the job will still run for 5-8 minutes which is not a big deal, but that shouldn't cause the test to fail.

So we're cancelling as a best effort to try to minimize resources used, so I think we're fine in not checking the output of the cancel.

@davidcavazos davidcavazos force-pushed the dataflow-cancel-client-library branch from aa076f1 to 9ddcdc9 Compare October 8, 2019 17:39
@davidcavazos davidcavazos merged commit cbb910e into GoogleCloudPlatform:master Oct 8, 2019
@davidcavazos davidcavazos deleted the dataflow-cancel-client-library branch October 8, 2019 17:42
bradmiro added a commit that referenced this pull request Oct 8, 2019
* Adds updates for samples profiler ... vision (#2439)

* cloud-sql: add Cloud Run support to sqlalchemy sample (#2449)

* Adds split updates for Firebase ... opencensus (#2438)

* App engine upversion (#2437)

* App engine upversion

* Python 2 compatible pytest

* Dataflow client library (#2450)

* Updated requirements

* Update service naming convention

* Prefer client libraries over shell commands

* Update README format
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement. dependencies Pull requests that update a dependency file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants