[Kernel] Fix issue querying tables with spaces in the name #3291
/**
 * Escapes the given string to be used as a partition value in the path. Basically this escapes
 * - characters that can't be in a file path. E.g. `a\nb` will be escaped to `a%0Ab`. -
What's up with the `-` usage in this comment? Is it a dot job? The "-" literal character?
Auto-format. Changed it to use proper `<ul>` lists.
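For readers following the thread, here is a minimal sketch of the kind of escaping the Javadoc above describes (`a\nb` becoming `a%0Ab`). The class name, method names, and the exact set of special characters are assumptions made for illustration; the set Kernel actually uses may differ.

```java
public class PartitionEscapeSketch {
    // Assumed set of characters to escape; not taken from the Kernel source.
    private static final String SPECIAL = "\"#%'*/:=?\\{[]^";

    // Control characters and the assumed special characters get escaped.
    static boolean needsEscaping(char c) {
        return c < 0x20 || c == 0x7F || SPECIAL.indexOf(c) >= 0;
    }

    // Replaces each character that needs escaping with '%' followed by
    // its two-digit uppercase hex code; all other characters pass through.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (needsEscaping(c)) {
                sb.append('%').append(String.format("%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a\nb")); // prints a%0Ab
    }
}
```

Note that a space is not in the assumed set, so a partition value containing spaces passes through unchanged in this sketch.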
@@ -56,7 +56,9 @@ public interface Scan {
  * <li>name: {@code add}, type: {@code struct}</li>
  * <li>Description: Represents `AddFile` DeltaLog action</li>
  * <li><ul>
- * <li>name: {@code path}, type: {@code string}, description: location of the file.</li>
+ * <li>name: {@code path}, type: {@code string}, description: location of the file.
+ * The path is a URI as specified by RFC 2396 URI Generic Syntax, which needs to be decoded
Should the input to `Table.forPath` also be a String that represents a URI? Have we updated that documentation?
It is not at the moment, and we can't change that. If we want to take a URI as input, we should be explicit about it, and it should be another API, something like `Table.forURI(URI tableURI)`.
The `path` here comes from the Delta Log and is stored as a URI in the Delta Log according to the protocol. The name is `path`, but it is actually a URI. Just updating the documentation to reflect that.
Before the next release, I will have a design decision to change the `path` string to `URI` everywhere else (basically in the `Engine` interfaces). Until then this is the fix.
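To illustrate the point about `path` being a URI that needs decoding, here is a rough sketch of turning a relative `path` value from an `AddFile` action into a file system path under the table root. The class and the `resolve` helper are hypothetical and not part of the Kernel API. Absolute URIs are deliberately left untouched to keep the example small.

```java
import java.net.URI;

public class AddFilePathSketch {
    // Hypothetical helper: decodes a relative, URI-encoded AddFile path and
    // appends it to the (unescaped) table root path.
    static String resolve(String tableRoot, String addFilePath) {
        URI uri = URI.create(addFilePath);
        if (uri.isAbsolute()) {
            // Absolute URIs are out of scope for this sketch; returned as-is.
            return addFilePath;
        }
        // getPath() returns the percent-decoded path component.
        String decoded = uri.getPath();
        String root = tableRoot.endsWith("/") ? tableRoot : tableRoot + "/";
        return root + decoded;
    }

    public static void main(String[] args) {
        // prints /tmp/my table/part a/file.parquet
        System.out.println(resolve("/tmp/my table", "part%20a/file.parquet"));
    }
}
```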
Thanks for explaining!
Description
(Stacked on top of #3289 and #3290 )
Currently, Kernel uses a mix of path (file system path) or URI (in string format) in API interfaces, which causes confusion and bugs.
Context:
Path refers to a file system path, which could have some characters that should be escaped when converted to a URI.
E.g. path: `s3:/bucket/path to file/`, URI for the same path: `s3:/bucket/path%20to%20file/`
Make it uniform everywhere to just use the paths (file system path).
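The path versus URI distinction above can be reproduced with plain `java.net.URI`. This is only a sketch to show the two forms; the class and helper names are made up for illustration and are not Kernel APIs.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathVsUriSketch {
    // Hypothetical helper: builds the URI string for a scheme plus an
    // unescaped file system path. The multi-argument URI constructor
    // percent-encodes characters that are illegal in a URI path.
    static String toUriString(String scheme, String fsPath) {
        try {
            return new URI(scheme, null, fsPath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: recovers the unescaped path from a URI string.
    static String toFsPath(String uriString) {
        return URI.create(uriString).getPath();
    }

    public static void main(String[] args) {
        String uri = toUriString("s3", "/bucket/path to file/");
        System.out.println(uri);           // prints s3:/bucket/path%20to%20file/
        System.out.println(toFsPath(uri)); // prints /bucket/path to file/
    }
}
```

Passing the URI form where a plain path is expected (or the reverse) is the kind of mix-up that leaves `%20` in a file name or a raw space in a URI, which is why using one form everywhere in the interfaces is the simpler contract.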
How was this patch tested?
Additional tests with table path containing spaces.