Add temp_format argument to pack_partitions_to_parquet #22

Merged
jonmmease merged 7 commits into master from pack_partitions_tmp_dir
Jan 11, 2020

Conversation

jonmmease
Collaborator

This PR adds a new argument to DaskGeoDataFrame.pack_partitions_to_parquet named temp_format. This argument may be set to a format string containing a {partition} replacement field. If provided, this string is formatted with the output partition number to generate the temporary directory path for that partition.

For example, temp_format="/tmp/spatial/part-{partition}" would create the temporary directories:

  • /tmp/spatial/part-0
  • /tmp/spatial/part-1
  • /tmp/spatial/part-2
    ...
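
The expansion described above can be sketched with Python's built-in `str.format`. This only illustrates the formatting behavior and is not the library's internal code; the helper name `expand_temp_format` and the partition count of 3 are hypothetical.

```python
def expand_temp_format(temp_format, n_partitions):
    """Return one temporary directory path per output partition.

    Hypothetical helper: fills the {partition} replacement field
    with each output partition number in turn.
    """
    return [temp_format.format(partition=i) for i in range(n_partitions)]


paths = expand_temp_format("/tmp/spatial/part-{partition}", 3)
for path in paths:
    print(path)
```

Because the partition number is part of the path, each output partition gets its own temporary directory.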

The temp_format string may also contain a {uuid} replacement field. If present, this field is replaced by a randomly generated UUID string, which makes it possible to reuse the same temp_format string in multiple simultaneous jobs without conflict.
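
A minimal sketch of how a {uuid} field keeps simultaneous jobs apart, using the standard `uuid` module. The helper `expand_temp_format` is hypothetical, and generating one UUID per job (shared by all of that job's partitions) is an assumption; the description above does not say how often the UUID is generated.

```python
import uuid


def expand_temp_format(temp_format, partition, job_uuid):
    """Fill the {partition} and {uuid} replacement fields.

    Hypothetical helper. str.format ignores unused keyword
    arguments, so a template without {uuid} still works.
    """
    return temp_format.format(partition=partition, uuid=job_uuid)


temp_format = "/tmp/spatial/{uuid}/part-{partition}"

# Assumption: each job draws its own random UUID once.
job_a = str(uuid.uuid4())
job_b = str(uuid.uuid4())

path_a = expand_temp_format(temp_format, 0, job_a)
path_b = expand_temp_format(temp_format, 0, job_b)

# Same template and partition number, but different directories.
print(path_a)
print(path_b)
```

Both jobs write partition 0 to different directories, so the same temp_format string can be shared between them.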

@jonmmease jonmmease merged commit d4e4297 into master Jan 11, 2020
@jonmmease jonmmease deleted the pack_partitions_tmp_dir branch January 11, 2020 10:04