Properly handle failed attempts to delete a file in Hive connector #12314

Merged: 1 commit, May 13, 2022

Member

@losipiuk losipiuk commented May 10, 2022

Description

Is this change a fix, improvement, new feature, refactoring, or other?

bugfix

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Hive connector

How would you describe this change to a non-technical end user or system administrator?

Related issues, pull requests, and links

Relates to: #12306
Fixes: #12296

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Hive
* Fix an issue where leftover files could remain in a table when a delete command issued to the underlying distributed filesystem failed.
  The issue could appear during a call to the `OPTIMIZE` table procedure or during an `INSERT`/`UPDATE`/`CREATE TABLE AS` query
  executed with `retry-policy` set to `TASK` or `QUERY`. ({issue}`12314`)

@@ -2247,7 +2248,10 @@ private void finishOptimize(ConnectorSession session, ConnectorTableExecuteHandl
if (firstScannedPath.isEmpty()) {
firstScannedPath = Optional.of(scannedPath);
}
-retry().run("delete " + scannedPath, () -> fs.delete(scannedPath, false));
+retry().run("delete " + scannedPath, () -> {
Contributor

Can we have an integration test based on minio showcasing this fix?

https://docs.min.io/docs/minio-multi-user-quickstart-guide.html

Member Author

Would be nice - but I do not have resources to do that.

Member Author

Also, the tricky part is that these delete call sites are themselves in error-handling code paths - which would make test setup cumbersome.
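
For readers following this thread, here is a minimal sketch of the general idea under discussion: treating a `false` result from a delete call as a failure instead of silently ignoring it, and exercising that with a stub rather than a real object store. It is not the code from this PR. The `Deleter` interface, the `deleteOrThrow` helper, and the paths are hypothetical names introduced only for illustration; `Deleter` stands in for a filesystem whose delete method reports failure through a boolean return value, as Hadoop's `FileSystem.delete(Path, boolean)` does.

```java
import java.io.IOException;

public class DeleteStatusSketch {
    // Hypothetical stand-in for a filesystem whose delete reports failure
    // through its boolean return value instead of throwing.
    interface Deleter {
        boolean delete(String path) throws IOException;
    }

    // Hypothetical helper: convert a "false" delete result into an exception,
    // so the caller (or a surrounding retry wrapper) actually sees the failure.
    static void deleteOrThrow(Deleter fs, String path) throws IOException {
        if (!fs.delete(path)) {
            throw new IOException("Failed to delete " + path);
        }
    }

    public static void main(String[] args) throws IOException {
        Deleter working = path -> true;   // stub: every delete succeeds
        Deleter failing = path -> false;  // stub: every delete reports failure

        // Completes normally because the stub reports success.
        deleteOrThrow(working, "/tmp/ok");

        try {
            deleteOrThrow(failing, "/tmp/leftover");
        } catch (IOException e) {
            // prints "Failed to delete /tmp/leftover"
            System.out.println(e.getMessage());
        }
    }
}
```

A stub like `failing` sidesteps the setup cost mentioned above, at the price of not covering the real error-handling paths in the connector. One question the sketch leaves open is how to treat a `false` result when the file is already gone.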

Member

findepi commented May 10, 2022

Relates to: #12306

"Fixes"?
Also, does it "fixes" #12296 ?

Member

@findepi findepi left a comment

LGTM, but let's wait a bit in case ignoring the delete status was deliberate. Let's give folks a chance to chime in.

Relates to: #12306

"Fixes"?

If we want an exception - then yeah - I would say it fixes it.

Actually no - different code path.

@findepi findepi requested review from dain, electrum and phd3 May 10, 2022 14:26
Member Author

losipiuk commented May 10, 2022

Relates to: #12306

"Fixes"?

If we want an exception - then yeah - I would say it fixes it.

Also, does it "fixes" #12296 ?

Yeah

@losipiuk losipiuk force-pushed the lo/fix-delete-optimize branch from 3b76f9d to 3e58eb9 Compare May 13, 2022 12:43
@losipiuk losipiuk merged commit 70d411d into trinodb:master May 13, 2022
@github-actions github-actions bot added this to the 381 milestone May 13, 2022
Development

Successfully merging this pull request may close these issues.

Hive ALTER TABLE Optimize does not fail when deleting files from S3 failed
4 participants