Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kernel] Handle KernelEngineException when reading the _last_checkpoint file #3086

Merged
merged 5 commits into from
May 16, 2024

Conversation

vkorukanti
Copy link
Collaborator

@vkorukanti vkorukanti commented May 10, 2024

There is an issue with the CloseableIterator interface that Kernel is using. Currently, it extends Java's iterator, which doesn't throw any exceptions. We use CloseableIterator when returning data read from a file or any incremental data access. Any IOException in hasNext or next is wrapped in a UncheckedIOException or RuntimeException. Users of the CloseableIterator need to catch for UncheckedIOException or RuntimeException explicitly and look at the cause if they are interested in the IOException. This is not consistent and causes problems for the code that want to handle exceptions like FileNotFoundException (subclass of IOException) and take further actions.

  • Change the CloseableIterator.{next, hasNext} contract to expect KernelEngineException for any exceptions that occur while executing in the Engine.
  • Update the DefaultParquetHandler and DefaultJsonHandler to throw KernelEngineException instead of UncheckedIOException or RuntimeException.
  • In the checkpoint metadata loading method, catch KernelEngineException and see if the cause is FileNotFoundException. If yes, don't retry loading.

@vkorukanti vkorukanti changed the title [Kernel] Handle UncheckedIOException when reading the _last_checkpoint file [Kernel] Handle KernelEngineException when reading the _last_checkpoint file May 16, 2024
Copy link
Collaborator

@scottsand-db scottsand-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comments

* Throws when the {@link Engine} encountered an error while executing an operation.
*/
public class KernelEngineException extends RuntimeException {
private static final String msgT = "Encountered an error from the underlying engine " +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does the capital T mean in msgT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a shortened messageTemplate. Renaming it to msgTemplate

* any) wrapped in this exception as cause. E.g.
* {@link IOException} thrown while trying to read from a Delta
* log file. It will be wrapped in this exception as cause.
* @throws KernelException When encountered an operation or state that is invalid or
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for KernelEngineException you state i'ts for exceptions that occur in the Engine.

Am I correct in assuming that KernelException is for exceptions that occur in the Kernel, not in the engine? Can you add that very short clarification here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct. Clarified.

@vkorukanti vkorukanti merged commit c024269 into delta-io:master May 16, 2024
10 checks passed
vkorukanti added a commit that referenced this pull request May 16, 2024
…checkpoint` file (#3086)

There is an issue with the `CloseableIterator` interface that Kernel is
using. Currently, it extends Java's `iterator`, which doesn't throw any
exceptions. We use `CloseableIterator` when returning data read from a
file or any incremental data access. Any `IOException` in `hasNext` or
`next` is wrapped in a `UncheckedIOException` or `RuntimeException`.
Users of the `CloseableIterator` need to catch for
`UncheckedIOException` or `RuntimeException` explicitly and look at the
cause if they are interested in the `IOException`. This is not
consistent and causes problems for the code that want to handle
exceptions like `FileNotFoundException` (subclass of `IOException`) and
take further actions.

* Change the `CloseableIterator.{next, hasNext}` contract to expect
`KernelEngineException` for any exceptions that occur while executing in
the `Engine`.
* Update the `DefaultParquetHandler` and `DefaultJsonHandler` to throw
`KernelEngineException` instead of `UncheckedIOException` or
`RuntimeException`.
* In the checkpoint metadata loading method, catch
`KernelEngineException` and see if the cause is `FileNotFoundException.`
If yes, don't retry loading.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants