S3FileIO Can Create Non-Posix Paths #6758

RussellSpitzer · 2023-02-06T15:54:19Z

Apache Iceberg version

1.1.0 (latest release)

Query engine

None

Please describe the bug 🐞

An interesting thing we ran into:

Our FileIo API contains this method

iceberg/api/src/main/java/org/apache/iceberg/io/FileIO.java

Lines 44 to 46 in c07f2aa

    
           /** Get a {@link OutputFile} instance to write bytes to the file at the given path. */ 
        
           OutputFile newOutputFile(String path);

Which uses a "String" as the Path for creating a new file reference. Now in general this is not an issue but there are some edge cases here. For example, S3FileIO doesn't enforce posix rules when creating paths or directories (since neither of those really exist in S3.) This means we the two following locations are actually different:

foo//bar/file
foo/bar/file

But within posix systems these two should refer to the exact same thing. See https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_03_266

A pathname may optionally contain one or more trailing slashes. Multiple successive slashes are considered to be the same as one slash.

Which we can see holds true in Java's Path class.

scala> import java.nio.file.Paths
import java.nio.file.Paths

scala> Paths.get("/foo//bar/file").equals(Paths.get("/foo/bar/file"))
res29: Boolean = true

Or more importantly in Hadoop's Path Class

scala> val p = new Path("foo/bar//bazz")
p: org.apache.hadoop.fs.Path = foo/bar/bazz

scala> p.toUri
res38: java.net.URI = foo/bar/bazz

This leads to an issue when a table is written to by S3FileIO but then read with HadoopFileIO. HadoopFileIO cannot read from the special foo//bar/file path because this isn't a valid posix path. This means if for some reason we end up generating double slashes in our path's metadata_location (or other paths) when using S3FileIO those files will be inaccessible if S3FileIO is swapped with HadoopFileIO.

I think in this case we should probably add to the spec that all files (and paths) must comply with posix standards.

The text was updated successfully, but these errors were encountered:

RussellSpitzer · 2023-02-06T15:54:37Z

@amogh-jahagirdar + @jackye1995 Do you have any thoughts on this?

jackye1995 · 2023-02-06T16:47:59Z

So internally we have a fix, where we extend Hadoop Path to work with double slash instead of the posix standard, and use that Path object for Iceberg for the specific FileSystem implementation. For existing tables written with //, it seems like a case that has to continue to be supported in this way.

But I agree it would be good to enforce this across all FileIOs, and just make sure removing trailing slash is done whenever needed. I am not sure if it has to be in the spec, given it has been like this for a pretty long time.

RussellSpitzer · 2023-02-06T18:59:40Z

iceberg/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java

Line 226 in cecb10b

    
           return String.format("%s/%s/%s", metadata.location(), METADATA_FOLDER_NAME, filename);

I think it makes sense to do the remove trailing slash here. I'm not sure why we only do it for custom metadata locations

RussellSpitzer · 2023-02-06T19:01:43Z

For the Spec I was just considering something like "path" a path is a POSIX normalized path

amogh-jahagirdar · 2023-02-06T19:04:14Z

Mostly agree with @jackye1995 here. I think at minimum, we for sure should remove trailing slash when writing any file that's part of the Iceberg table.

As for the spec change, I think it's good to define at the spec level. Even though our systems will need to enforce backwards compatibility for the files written with double slash, at least for future Iceberg releases we can establish the guarantee.

Standardizing file paths reduces this complexity of having to understand nuances of each FileIO implementation. Iceberg users would be confident just from the spec that all paths written will be a normalized posix path.

RussellSpitzer · 2023-02-06T21:03:09Z

This is also an issue for data files here

iceberg/core/src/main/java/org/apache/iceberg/LocationProviders.java

Lines 126 to 135 in 223177f

    
           String dataLocation = properties.get(TableProperties.WRITE_DATA_LOCATION); 
        
           if (dataLocation == null) { 
        
             dataLocation = properties.get(TableProperties.OBJECT_STORE_PATH); 
        
             if (dataLocation == null) { 
        
               dataLocation = properties.get(TableProperties.WRITE_FOLDER_STORAGE_LOCATION); 
        
               if (dataLocation == null) { 
        
                 dataLocation = String.format("%s/data", tableLocation); 
        
               } 
        
             } 
        
           }

RussellSpitzer · 2023-02-07T17:10:27Z

@findepi Is this an issue with Trino's file IO rules? Or does it always require all paths be Posix compliant? I know you have special S3 IO code as well

findepi · 2023-02-07T17:22:01Z

@RussellSpitzer thanks for the ping, but honestly I do not know.
I guess by non-POSIX you mean handling of double-slashes. For that @electrum @ebyhr are experts and I think we differentiate between paths containing repeated slashes.

@RussellSpitzer what else I should look at?

RussellSpitzer · 2023-02-07T17:25:18Z

That's basically the only issue, I think you may be safe if you never use a Hadoop File System implementation but it can potentially make an Iceberg table with paths that another framework like Spark or Flink could not access. So our big thought here is should we force all Iceberg "paths" to be posix normalized (no double slashes, no relative path elements, etc ...)

Another example would be setting the table location like foo/bar/../baz
In our HadoopFileIo this would be equivelent to foo/baz
But S3FileIO treats it like a string an would probably make foo/bar/../baz with a .. directory

jackye1995 · 2023-02-07T18:09:51Z

If we add this enforcement and still want backwards compatibility, it seems like we will need to do that as a feature flag.

And this cannot just be an engine level feature flag, because such enforcement will lead to some tables having hybrid file paths, and some are stored in posix style and some are stored literally.

So it seems necessary to:

introduce this flag in all the places that stores a file path (catalog, table metadata, manifest list, manifest file), so the reader understands how to parse the file paths differently.
add an engine level config to turn this feature on for subsequent writes.
add some migration procedure to fix tables with broken file paths.

Would you agree with the actions above?

RussellSpitzer · 2023-02-07T18:14:33Z

I think (2) we can actually do immediately without losing backwards compatibility. All readers can read paths written in the posix compliant matter, so implementing 2 even without a feature flag should be valid. The only issue is that new files will be in a slightly different location in S3 than they would have been previously. So I don't think this even needs a flag.

IE If we say from this point on, all files that are created will have posix normalized paths, should have no effect on compatibility. So I don't think we need (1). We let everyone read paths as they are in the metadata, we just make sure they are all written in the metadata in a way that is universally compatible.

Then I do think we may need a (3) but this is just for tables which have been written with non-posix compatible FileIO's that need to be used with a posix compatible file IO. This is a bit more intensive.

jackye1995 · 2023-02-07T18:19:22Z

I see, I thought you were saying the file paths will still be stored in the original way like s3://foo/bar//baz, but will store data at s3://foo/bar/baz. Reader also sees s3://foo/bar//baz in the manifest, but will read the file at s3://foo/bar/baz.

So instead, you actually mean we will never even store non-posix paths in Iceberg metadata going forward?

RussellSpitzer · 2023-02-07T18:22:51Z

Yeah my plan would be, regardless of what string ends up at the FileIO. The FileIO is responsible for only writing a posix compatible path. So just because I have a table location "foo/bar/" doesn't mean I get files at "foo/bar//data/". I'm open to other suggestions but this seems like a way we can resolve the issue.

For our users I'm trying to deal with the fact that they have many many tables and don't want to go through them modifying the table location in all of them. Ideally I think we should be able to go back and forth between FileIO's without causing a failure. This is a slightly unique situation because I'm not sure if there are any other FileIO's which allow access to the same underlying filesystem.

jackye1995 · 2023-02-07T18:42:36Z

This is a slightly unique situation because I'm not sure if there are any other FileIO's which allow access to the same underlying filesystem.

I think this is probably a problem for all the S3-compatible storage systems, we already have a few implemented in this repo like the one from Dell ECS, so agree we should solve compatibility of it.

Yeah my plan would be, regardless of what string ends up at the FileIO. The FileIO is responsible for only writing a posix compatible path.

Yes I agree this is a simpler plan than my original thought. Are you currently already working on this? If not I can take it.

I think (3) as a procedure I think is probably still needed for tables already in that situation, and once we have that we can probably enforce it as a requirement and users for tables with issue can have a clear path forward.

RussellSpitzer · 2023-02-07T19:03:28Z

I have not started, so you are free to take it if you like. I do think we need confirmation from the Trino folks that this is something they can also follow. I know they have a completely different pathway for opening and creating files.

jackye1995 · 2023-02-07T19:13:35Z

I think Trino is mostly fine, because all the file paths are converted to Hadoop paths before writing, so technically it already enforces posix style.

https://github.com/trinodb/trino/blob/master/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSink.java#L336

https://github.com/trinodb/trino/blob/master/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergFileWriterFactory.java#L173

https://github.com/trinodb/trino/blob/master/lib/trino-filesystem/src/main/java/io/trino/filesystem/hdfs/HdfsInputFile.java#L46

https://github.com/trinodb/trino/blob/master/lib/trino-filesystem/src/main/java/io/trino/filesystem/hdfs/HadoopPaths.java

RussellSpitzer · 2023-02-07T19:14:40Z

Oh then this is actually much needed for trino too, since by default they can't read a non-posix file either?

jackye1995 · 2023-02-07T19:20:27Z

Correct, that's why I said there is an internal patch in Athena to support non-posix files.

electrum · 2023-02-07T22:19:22Z

We have a hack in Trino to allow reading non-standard paths. That's the HadoopPaths code referenced above, along with this code in TrinoS3FileSystem.

All of our writing code currently goes through Hadoop, so paths in S3 will be normalized, but we have a project to decouple Trino from Hadoop and Hive codebases, so we'll want our upcoming non-Hadoop TrinoFileSystem implementations to handle this correctly.

Handling . and .. is tricky, though. If a user does this:

CREATE TABLE ... WITH (location = 's3://foo/bar/../baz')

What should the resulting object name in S3 be? What gets written to the Iceberg manifest? Where should the normalization happen? Do we need to normalize all "user input" locations?

While I like the idea of just using strings and not dealing with directories (which we copied from Iceberg FileIO), it certainly has annoyances and problems when interacting with POSIX-like systems.

RussellSpitzer · 2023-02-07T22:30:11Z

My call would be that everything new file we create with FileIO (in iceberg) hits this transformation

import java.nio.file.Paths
scala> def posixNormalize(s: String) = Paths.get(s).normalize.toString
posixNormalize: (s: String)String
scala> posixNormalize("s3://foo/bar/../baz")
res41: String = s3:/foo/baz

So I would add that as an protected method in FileIO and then any calls to
OutputFile newOutputFile(String path);

call posixNormalize before doing anything else. Ideally I wish I could force this on all calls to newOutputFile but I don't think we can manage that without breaking the API. Maybe we can do that in Iceberg 2.0 and have

final OutputFile newOutputFile(String path) {
  return newOutputFileImpl(posixNormalize(path))
}

OutputFile newOutputFileImpl(String path)

For now I would just add that to S3FileIO

jackye1995 · 2023-02-07T22:34:49Z

Handling . and .. is tricky, though. If a user does this:

CREATE TABLE ... WITH (location = 's3://foo/bar/../baz')

What should the resulting object name in S3 be? What gets written to the Iceberg manifest? Where should the normalization happen? Do we need to normalize all "user input" locations?

Yes this case is interesting, I can see it go both ways. Sounds like we should still have a feature flag controlling this behavior then?

RussellSpitzer · 2023-02-07T22:38:09Z

I still think we just set this up as Iceberg only makes posix paths for any new files it creates. Anything going through FileIO.create is normalized. This means everything in the metadata is normalized unless a user explicitly and manually adds a data file that isn't.

electrum · 2023-02-07T23:53:00Z

I just realized this has implications for the specification. If a manifest file contains a location containing . or .. path segments, how are readers supposed to interpret that? The current implementations that use Hadoop code will normalize them. Is this correct behavior? Should we formalize this behavior in the specification?

RussellSpitzer · 2023-02-08T03:20:04Z

@electrum thats part of my suggestion, no path entries can contain any unnormailzed posix characters. No .., ., //, maybe also specify no symbolic links although I'm not sure how we enforce that.

findepi · 2023-02-08T15:15:15Z

I am concerned about security implications of handling .. in paths. At the same time I don't see any practical implications of that. If we gonna do any form of normalization, we should forbid such paths.
I think we should forbid . too.

jackye1995 · 2023-02-08T15:48:48Z

I put up a draft for discussion, there are different ways to achieve that, the idea in the PR is to normalizes all the places that we write a location, which includes table location, metadata file location and data location.

I am concerned about security implications of handling .. in paths

How do Hadoop file systems currently handle such use case? I believe it will also support .. and . by default and derive the posix path of it right?

RussellSpitzer · 2023-02-08T17:56:06Z

Consensus at Community Sync was that for now we will just add a strip trailing slash to

iceberg/core/src/main/java/org/apache/iceberg/TableMetadata.java

Line 451 in 6697129

return location;

To remove any chance of accidentally adding the double slash.

For all the other oddities Jack's PR will allow for manually setting a flag to forbid/only write posix compatible files. This would cover ".., ." and some other things but still not help us in the case of symbolic links which ends up being a weirder proposition based on the FileIO

jackye1995 mentioned this issue Feb 8, 2023

Core: enforce writing POSIX compatible paths for data location #6772

Closed

RussellSpitzer mentioned this issue Feb 8, 2023

Core: TableMetadata Always Strips Trailing Slash From Location #6777

Merged

RussellSpitzer closed this as completed Mar 30, 2023

przemekd mentioned this issue Aug 15, 2023

remove_orphan_files throws reached maximum depth exception in AWS EMR-6.11.0 #8022

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

S3FileIO Can Create Non-Posix Paths #6758

S3FileIO Can Create Non-Posix Paths #6758

RussellSpitzer commented Feb 6, 2023

RussellSpitzer commented Feb 6, 2023

jackye1995 commented Feb 6, 2023 •

edited

Loading

RussellSpitzer commented Feb 6, 2023

RussellSpitzer commented Feb 6, 2023

amogh-jahagirdar commented Feb 6, 2023 •

edited

Loading

RussellSpitzer commented Feb 6, 2023

RussellSpitzer commented Feb 7, 2023

findepi commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

electrum commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023 •

edited

Loading

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023 •

edited

Loading

electrum commented Feb 7, 2023

RussellSpitzer commented Feb 8, 2023

findepi commented Feb 8, 2023

jackye1995 commented Feb 8, 2023

RussellSpitzer commented Feb 8, 2023

S3FileIO Can Create Non-Posix Paths #6758

S3FileIO Can Create Non-Posix Paths #6758

Comments

RussellSpitzer commented Feb 6, 2023

Apache Iceberg version

Query engine

Please describe the bug 🐞

RussellSpitzer commented Feb 6, 2023

jackye1995 commented Feb 6, 2023 • edited Loading

RussellSpitzer commented Feb 6, 2023

RussellSpitzer commented Feb 6, 2023

amogh-jahagirdar commented Feb 6, 2023 • edited Loading

RussellSpitzer commented Feb 6, 2023

RussellSpitzer commented Feb 7, 2023

findepi commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023

jackye1995 commented Feb 7, 2023

electrum commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023 • edited Loading

jackye1995 commented Feb 7, 2023

RussellSpitzer commented Feb 7, 2023 • edited Loading

electrum commented Feb 7, 2023

RussellSpitzer commented Feb 8, 2023

findepi commented Feb 8, 2023

jackye1995 commented Feb 8, 2023

RussellSpitzer commented Feb 8, 2023

jackye1995 commented Feb 6, 2023 •

edited

Loading

amogh-jahagirdar commented Feb 6, 2023 •

edited

Loading

RussellSpitzer commented Feb 7, 2023 •

edited

Loading

RussellSpitzer commented Feb 7, 2023 •

edited

Loading