Use bulk-delete API for recursive delete in lakeFSFS #4203

Closed
arielshaqed opened this issue Sep 19, 2022 · 2 comments · Fixed by #4204 or #4205
Assignees: arielshaqed
Labels: team/ecosystem

Comments

@arielshaqed
Contributor

Measurements indicate that using the bulk-delete API can speed up the cleanupJob operation of FileOutputCommitter (which performs a recursive delete) by 7x.

@arielshaqed self-assigned this Sep 19, 2022
@arielshaqed
Contributor Author

Plan

Be more like (modern) S3AFileSystem, which uses a queue and a separate thread for deletion.
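
A rough sketch of what the queue-plus-thread shape could look like. Everything here is hypothetical: `BulkDeleteCall` stands in for whatever call actually issues the bulk delete, `QueuedBulkDeleter` and `batchSize` are illustrative names, and this is not how S3AFileSystem or lakeFSFS is actually written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a bulk-delete call; returns the paths that failed to delete. */
interface BulkDeleteCall {
    List<String> delete(List<String> paths) throws Exception;
}

/** Sketch: buffers paths and deletes each full batch on a separate thread. */
class QueuedBulkDeleter implements AutoCloseable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final List<Future<List<String>>> pending = new ArrayList<>();
    private final BulkDeleteCall call;
    private final int batchSize; // assumed maximum number of paths per bulk request
    private List<String> batch = new ArrayList<>();

    QueuedBulkDeleter(BulkDeleteCall call, int batchSize) {
        this.call = call;
        this.batchSize = batchSize;
    }

    /** Queue one path; once the batch is full it is handed to the deleter thread. */
    void add(String path) {
        batch.add(path);
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    /** Submit the current batch (if any) and start a fresh one. */
    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        final List<String> toDelete = batch;
        batch = new ArrayList<>();
        pending.add(executor.submit(() -> call.delete(toDelete)));
    }

    /** Flush the remainder, wait for all batches, and return every failed path. */
    List<String> awaitFailures() throws InterruptedException, ExecutionException {
        flush();
        List<String> failed = new ArrayList<>();
        for (Future<List<String>> f : pending) {
            failed.addAll(f.get());
        }
        pending.clear();
        return failed;
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

The recursive lister calls `add` for each path it finds and keeps listing while earlier batches are deleted in the background; `awaitFailures` is the point where the caller blocks.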

@arielshaqed
Contributor Author

Implementation

Bulk-delete makes it really hard to report errors accurately. Instead, we shall report the errors from the first bulk request that fails and stop. S3AFileSystem appears to do something similar.
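
To make "report the first bulk of errors and stop" concrete, here is a minimal sketch. `BulkDeleteRequest` and `FirstErrorsReporter` are made-up names for illustration, and the error strings are whatever the real bulk-delete call would return; none of this is taken from the lakeFS client.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bulk-delete call: returns one message per path that could not be deleted. */
interface BulkDeleteRequest {
    List<String> deleteAll(List<String> paths) throws IOException;
}

class FirstErrorsReporter {
    /**
     * Deletes paths in batches. Stops at the first batch that reports errors
     * and throws an IOException listing only that batch's errors.
     * Returns the number of batches attempted when every batch succeeds.
     */
    static int deleteInBatches(List<String> paths, int batchSize, BulkDeleteRequest request)
            throws IOException {
        int attempted = 0;
        for (int start = 0; start < paths.size(); start += batchSize) {
            int end = Math.min(start + batchSize, paths.size());
            List<String> errors = request.deleteAll(new ArrayList<>(paths.subList(start, end)));
            attempted++;
            if (!errors.isEmpty()) {
                // Later batches are never sent, so their paths are left untouched.
                throw new IOException("bulk delete failed for " + errors.size()
                        + " path(s) in batch " + attempted + ": " + String.join("; ", errors));
            }
        }
        return attempted;
    }
}
```

The trade-off is that the caller learns about at most one batch's worth of failures, and paths in later batches are simply not attempted.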

Additionally, we should add a config flag to turn off bulk deletion, for when things really start confusing the user! 👻
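
One possible shape for such a flag, as a sketch only. The key name `fs.lakefs.delete.bulk.enabled` is invented for this example, a plain `Map` stands in for the Hadoop `Configuration` object (where `Configuration.getBoolean(String, boolean)` would be the natural call), and falling back to one path per request is just one way to "turn off" bulk deletion.

```java
import java.util.HashMap;
import java.util.Map;

class BulkDeleteSettings {
    // Hypothetical key name, invented for this sketch; not an actual lakeFSFS setting.
    static final String BULK_DELETE_ENABLED_KEY = "fs.lakefs.delete.bulk.enabled";

    /** Bulk deletion stays on unless the flag is explicitly set to "false". */
    static boolean bulkDeleteEnabled(Map<String, String> conf) {
        String value = conf.get(BULK_DELETE_ENABLED_KEY);
        return value == null || !value.trim().equalsIgnoreCase("false");
    }

    /** Effective batch size: one path per request when bulk deletion is turned off. */
    static int effectiveBatchSize(Map<String, String> conf, int bulkSize) {
        return bulkDeleteEnabled(conf) ? bulkSize : 1;
    }

    /** Convenience for building a configuration with the flag turned off. */
    static Map<String, String> withBulkDeleteDisabled() {
        Map<String, String> conf = new HashMap<>();
        conf.put(BULK_DELETE_ENABLED_KEY, "false");
        return conf;
    }
}
```

Defaulting to enabled keeps the speedup for everyone, while a confused user can set the one key to get back per-object errors.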
