Skip to content

Conversation

@panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Nov 18, 2024

What changes were proposed in this pull request?

The pr aims to

  • extract common logic for reading the descriptor of PB file to one place.
  • at the same time, when using the from_protobuf or to_protobuf function in connect-client and spark-sql (or spark-shell), the spark error-condition thrown when the PB file is not found or read fails will be aligned.

Why are the changes needed?

I found that the logic for reading the descriptor of PB file is scattered in various places in the spark code repository, eg:

def readDescriptorFileContent(filePath: String): Array[Byte] = {
try {
FileUtils.readFileToByteArray(new File(filePath))
} catch {
case ex: FileNotFoundException =>
throw new RuntimeException(s"Cannot find descriptor file at path: $filePath", ex)
case ex: NoSuchFileException =>
throw new RuntimeException(s"Cannot find descriptor file at path: $filePath", ex)
case NonFatal(ex) =>
throw new RuntimeException(s"Failed to read the descriptor file: $filePath", ex)
}
}

private def readDescriptorFileContent(filePath: String): Array[Byte] = {
try {
Files.readAllBytes(Paths.get(filePath))
} catch {
case ex: FileNotFoundException =>
throw CompilationErrors.cannotFindDescriptorFileError(filePath, ex)
case ex: NoSuchFileException =>
throw CompilationErrors.cannotFindDescriptorFileError(filePath, ex)
case NonFatal(ex) =>
throw CompilationErrors.descriptorParseError(ex)
}
}

def readDescriptorFileContent(filePath: String): Array[Byte] = {
try {
FileUtils.readFileToByteArray(new File(filePath))
} catch {
case ex: FileNotFoundException =>
throw QueryCompilationErrors.cannotFindDescriptorFileError(filePath, ex)
case ex: NoSuchFileException =>
throw QueryCompilationErrors.cannotFindDescriptorFileError(filePath, ex)
case NonFatal(ex) => throw QueryCompilationErrors.descriptorParseError(ex)
}
}

  • I think we should gather it together to reduce the cost of maintenance.
  • Align spark error-condition to improve consistency in end-user experience.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA.

Was this patch authored or co-authored using generative AI tooling?

No.

@panbingkun panbingkun changed the title [SPARK-50334][SQL] Extract common logic for reading PB files [SPARK-50334][SQL] Extract common logic for reading the descriptor of PB file Nov 18, 2024
@panbingkun panbingkun marked this pull request as ready for review November 19, 2024 06:12
FileUtils.readFileToByteArray(new File(filePath))
} catch {
case ex: FileNotFoundException =>
throw new RuntimeException(s"Cannot find descriptor file at path: $filePath", ex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, you migrated the errors on error conditions too. Could you reflect this in PR's description and in the title.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, very detailed disclosure, has been updated!
thanks!

@MaxGekk
Copy link
Member

MaxGekk commented Nov 21, 2024

+1, LGTM. Merging to master.
Thank you, @panbingkun.

@MaxGekk MaxGekk closed this in 136c722 Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants