S3FileIO Can Create Non-Posix Paths #6758
Comments
@amogh-jahagirdar + @jackye1995 Do you have any thoughts on this?
So internally we have a fix, where we extend Hadoop. But I agree it would be good to enforce this across all FileIOs, and just make sure removing the trailing slash is done whenever needed. I am not sure if it has to be in the spec, given it has been like this for a pretty long time.
For the spec I was just considering something like defining "path": a path is a POSIX-normalized path.
Mostly agree with @jackye1995 here. I think at minimum we should for sure remove the trailing slash when writing any file that's part of the Iceberg table. As for the spec change, I think it's good to define this at the spec level. Even though our systems will need to enforce backwards compatibility for files written with a double slash, at least for future Iceberg releases we can establish the guarantee. Standardizing file paths reduces the complexity of having to understand the nuances of each FileIO implementation. Iceberg users could be confident, just from the spec, that all paths written will be normalized POSIX paths.
This is also an issue for data files here: iceberg/core/src/main/java/org/apache/iceberg/LocationProviders.java, lines 126 to 135 in 223177f.
@findepi Is this an issue with Trino's file IO rules? Or does it always require all paths to be POSIX compliant? I know you have special S3 IO code as well.
@RussellSpitzer thanks for the ping, but honestly I do not know. @RussellSpitzer, what else should I look at?
That's basically the only issue. I think you may be safe if you never use a Hadoop FileSystem implementation, but it can potentially make an Iceberg table with paths that another framework like Spark or Flink could not access. So our big thought here is: should we force all Iceberg "paths" to be POSIX normalized (no double slashes, no relative path elements, etc.)? Another example would be setting the table location to a value that ends with a trailing slash.
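For concreteness, a minimal sketch (the bucket and file names here are made up, not from this issue) of how a trailing slash in a table location turns into a double slash once path segments are concatenated:

```java
public class DoubleSlashExample {
  public static void main(String[] args) {
    // Hypothetical table location that was configured with a trailing slash.
    String tableLocation = "s3://bucket/warehouse/db/table/";
    // A naive join of location + "/data/" + filename now produces "//".
    String dataFile = tableLocation + "/data/" + "part-00000.parquet";
    System.out.println(dataFile);
    // Prints: s3://bucket/warehouse/db/table//data/part-00000.parquet
    // S3 happily stores an object under this key, but a POSIX-style reader
    // (e.g. anything going through Hadoop's Path) collapses "//" and asks
    // for a different key.
  }
}
```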
If we add this enforcement and still want backwards compatibility, it seems like we will need to do that as a feature flag. And this cannot just be an engine-level feature flag, because such enforcement will lead to some tables having hybrid file paths, with some stored in POSIX style and some stored literally. So it seems necessary to:
Would you agree with the actions above?
I think (2) we can actually do immediately without losing backwards compatibility. All readers can read paths written in the POSIX-compliant manner, so implementing (2) even without a feature flag should be valid. The only issue is that new files will be in a slightly different location in S3 than they would have been previously, so I don't think this even needs a flag. I.e., if we say that from this point on all files that are created will have POSIX-normalized paths, it should have no effect on compatibility. So I don't think we need (1). We let everyone read paths as they are in the metadata; we just make sure they are all written in the metadata in a way that is universally compatible. Then I do think we may need a (3), but this is just for tables which have been written with non-POSIX-compatible FileIOs that need to be used with a POSIX-compatible FileIO. This is a bit more intensive.
I see, I thought you were saying the file paths will still be stored in the original way. So instead, you actually mean we will never even store non-POSIX paths in Iceberg metadata going forward?
Yeah, my plan would be: regardless of what string ends up at the FileIO, the FileIO is responsible for only writing a POSIX-compatible path. So just because I have a table location "foo/bar/" doesn't mean I get files at "foo/bar//data/". I'm open to other suggestions, but this seems like a way we can resolve the issue. For our users, I'm trying to deal with the fact that they have many, many tables and don't want to go through them modifying the table location in all of them. Ideally I think we should be able to go back and forth between FileIOs without causing a failure. This is a slightly unique situation because I'm not sure if there are any other FileIOs which allow access to the same underlying filesystem.
I think this is probably a problem for all the S3-compatible storage systems. We already have a few implemented in this repo, like the one from Dell ECS, so I agree we should solve compatibility here.
Yes, I agree this is a simpler plan than my original thought. Are you currently already working on this? If not, I can take it. I think (3) as a procedure is probably still needed for tables already in that situation, and once we have that we can probably enforce it as a requirement, and users of tables with this issue will have a clear path forward.
I have not started, so you are free to take it if you like. I do think we need confirmation from the Trino folks that this is something they can also follow. I know they have a completely different pathway for opening and creating files.
I think Trino is mostly fine, because all the file paths are converted to Hadoop paths before writing, so technically it already enforces POSIX style.
Oh, then this is actually much needed for Trino too, since by default they can't read a non-POSIX file either?
Correct, that's why I said there is an internal patch in Athena to support non-POSIX files.
We have a hack in Trino to allow reading non-standard paths. All of our writing code currently goes through Hadoop, so paths in S3 will be normalized, but we have a project to decouple Trino from the Hadoop and Hive codebases, so we'll want our upcoming non-Hadoop file system code to handle this as well.

Handling

CREATE TABLE ... WITH (location = 's3://foo/bar/../baz')

raises some questions: What should the resulting object name in S3 be? What gets written to the Iceberg manifest? Where should the normalization happen? Do we need to normalize all "user input" locations?

While I like the idea of just using strings and not dealing with directories (which we copied from Iceberg), …
My call would be that every new file we create with FileIO (in Iceberg) hits this transformation:

```scala
import java.nio.file.Paths

scala> def posixNormalize(s: String) = Paths.get(s).normalize.toString
posixNormalize: (s: String)String

scala> posixNormalize("s3://foo/bar/../baz")
res41: String = s3:/foo/baz
```

So I would add that as a protected method in FileIO, and then have any calls invoke posixNormalize before doing anything else. Ideally I wish I could force this on all calls to newOutputFile, but I don't think we can manage that without breaking the API. Maybe we can do that in Iceberg 2.0 and have:

```java
final OutputFile newOutputFile(String path) {
  return newOutputFileImpl(posixNormalize(path));
}

OutputFile newOutputFileImpl(String path)
```

For now I would just add that to S3FileIO.
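One wrinkle visible in the REPL output above: java.nio.file.Paths also collapses the double slash after the URI scheme, so "s3://foo/..." comes back as "s3:/foo/...". A minimal sketch of a normalizer that keeps the scheme and bucket intact and normalizes only the path portion (the class and method names are my own illustration, not an Iceberg API, and it assumes a POSIX default filesystem like the REPL example):

```java
import java.nio.file.Paths;

public class LocationNormalizer {
  // Hypothetical helper (not Iceberg API): normalize only the path part of a
  // "scheme://authority/path" location, leaving the scheme and bucket alone.
  static String posixNormalizeLocation(String location) {
    int schemeEnd = location.indexOf("://");
    if (schemeEnd < 0) {
      return Paths.get(location).normalize().toString(); // plain path
    }
    int pathStart = location.indexOf('/', schemeEnd + 3); // first '/' after "scheme://authority"
    if (pathStart < 0) {
      return location; // e.g. "s3://bucket" has no path component to normalize
    }
    String prefix = location.substring(0, pathStart); // e.g. "s3://bucket"
    String path = location.substring(pathStart);      // e.g. "//data/../file"
    return prefix + Paths.get(path).normalize();      // e.g. "s3://bucket/file"
  }

  public static void main(String[] args) {
    System.out.println(posixNormalizeLocation("s3://foo/bar/../baz"));
    // Prints: s3://foo/baz (scheme and bucket preserved)
  }
}
```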
Yes, this case is interesting; I can see it going both ways. Sounds like we should still have a feature flag controlling this behavior then?
I still think we just set this up as: Iceberg only makes POSIX paths for any new files it creates. Anything going through FileIO.create is normalized. This means everything in the metadata is normalized unless a user explicitly and manually adds a data file that isn't.
I just realized this has implications for the specification. If a manifest file contains a location containing …
@electrum that's part of my suggestion: no path entries can contain any unnormalized POSIX elements. No "..", ".", or "//"; maybe also specify no symbolic links, although I'm not sure how we would enforce that.
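To illustrate the kind of constraint being proposed, a hedged sketch (my own illustration, not spec text or Iceberg code) of a check that a location's path portion is already in normalized POSIX form:

```java
public class PosixPathCheck {
  // Sketch of the proposed constraint: the path portion of a location
  // (everything after "scheme://authority") must already be normalized.
  static boolean isPosixNormalized(String path) {
    if (path.contains("//") || path.endsWith("/")) {
      return false; // doubled or trailing slash
    }
    for (String segment : path.split("/")) {
      if (segment.equals(".") || segment.equals("..")) {
        return false; // relative path elements
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isPosixNormalized("/warehouse/db/table/data/file.parquet"));  // true
    System.out.println(isPosixNormalized("/warehouse/db/table//data/file.parquet")); // false
    System.out.println(isPosixNormalized("/warehouse/db/../db/table/file.parquet")); // false
  }
}
```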
I am concerned about security implications of handling …
I put up a draft for discussion. There are different ways to achieve this; the idea in the PR is to normalize all the places where we write a location, which includes the table location, metadata file location, and data location.
How do Hadoop file systems currently handle such a use case? I believe it will also support …
Consensus at the Community Sync was that for now we will just add a strip-trailing-slash call (sketched below) to remove any chance of accidentally adding the double slash. For all the other oddities, Jack's PR will allow for manually setting a flag to forbid/only write POSIX-compatible files. This would cover "..", ".", and some other things, but it still does not help us in the case of symbolic links, which ends up being a weirder proposition depending on the FileIO.
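A minimal sketch of that trailing-slash stripping (the class and method names are illustrative; the actual change may look different):

```java
public class TrailingSlash {
  // Illustrative helper: drop trailing slashes from a location so that
  // joining it with "/data/..." cannot produce a double slash.
  static String stripTrailingSlash(String location) {
    String result = location;
    while (result.endsWith("/")) {
      result = result.substring(0, result.length() - 1);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(stripTrailingSlash("s3://bucket/warehouse/table/"));
    // Prints: s3://bucket/warehouse/table
  }
}
```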
Apache Iceberg version
1.1.0 (latest release)
Query engine
None
Please describe the bug 🐞
An interesting thing we ran into:
Our FileIO API contains this method: iceberg/api/src/main/java/org/apache/iceberg/io/FileIO.java, lines 44 to 46 in c07f2aa.
Which uses a "String" as the path for creating a new file reference. Now in general this is not an issue, but there are some edge cases here. For example, S3FileIO doesn't enforce POSIX rules when creating paths or directories (since neither of those really exist in S3). This means that two locations that differ only by a doubled slash (for example foo/bar/file and foo//bar/file) are actually different objects, but within POSIX systems these two should refer to the exact same thing. See https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_03_266
We can see this holds true in Java's Path class, and more importantly in Hadoop's Path class.
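The original snippets are not reproduced here, but the equivalence is easy to check with the JDK on a POSIX filesystem (the paths below are illustrative); Hadoop's Path applies the same collapsing when it parses the string:

```java
import java.nio.file.Paths;

public class PathEquivalence {
  public static void main(String[] args) {
    // The JDK collapses the redundant slash when the string is parsed, so both
    // spellings resolve to the same Path and compare equal.
    System.out.println(Paths.get("foo//bar/file"));                                   // foo/bar/file
    System.out.println(Paths.get("foo/bar/file").equals(Paths.get("foo//bar/file"))); // true
  }
}
```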
This leads to an issue when a table is written to by S3FileIO but then read with HadoopFileIO. HadoopFileIO cannot read from the special foo//bar/file path, because this isn't a valid POSIX path. This means that if for some reason we end up generating double slashes in our metadata_location (or other paths) when using S3FileIO, those files will be inaccessible if S3FileIO is swapped with HadoopFileIO.

I think in this case we should probably add to the spec that all files (and paths) must comply with POSIX standards.