Allow mapping object store paths without going through inventory #1848
Labels
area/lakectl
Issues related to lakeFS' command line interface (lakectl)
area/tools
Improvements or additions to tooling and scripting
Currently there are 2 ways of "importing" data into lakeFS without actually copying it:
1. lakectl fs stage, or the equivalent stageObject API endpoint.
2. lakefs import, which utilizes the S3 inventory to read an entire bucket (and potentially only load a subset into lakeFS).

While (1) provides a reasonable solution for importing a single object, reading a directory or common prefix requires scripting that may or may not be trivial for the user. On the other hand, (2) is great for loading a big (>1M objects) bucket into lakeFS, but the ops overhead is substantial because:
We're missing a middle ground - the ability to ingest data directly from the object store into lakeFS, by using native object listing. This provides a relatively easy way to load a common prefix, a small table or a set of partitions (<1M objects) in a way that is more accessible to a data engineer.
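
To make the proposal concrete, here is a minimal Python sketch of the "list, then stage" flow described above. It is only an illustration under stated assumptions, not how lakeFS implements or would implement this: ListedObject, StageRequest, plan_staging and ingest are hypothetical names, the listing is assumed to come from the object store's native list call, and the stage callable stands in for whatever performs the actual mapping (for example a wrapper around lakectl fs stage or the stageObject endpoint). No real lakeFS or S3 client is called.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class ListedObject:
    """One entry from a native object-store listing (hypothetical shape)."""
    key: str    # full key inside the bucket
    size: int   # size in bytes
    etag: str   # checksum reported by the store


@dataclass(frozen=True)
class StageRequest:
    """What a single stage call needs (hypothetical shape)."""
    lakefs_path: str        # destination path inside the lakeFS branch
    physical_address: str   # where the data already lives, e.g. s3://bucket/key
    size: int
    checksum: str


def plan_staging(
    objects: Iterable[ListedObject],
    bucket: str,
    source_prefix: str,
    dest_prefix: str,
) -> Iterator[StageRequest]:
    """Map every listed object under source_prefix to a lakeFS path."""
    dest = dest_prefix.strip("/")
    for obj in objects:
        if not obj.key.startswith(source_prefix):
            continue  # outside the requested prefix
        if obj.key.endswith("/"):
            continue  # skip "directory marker" placeholder keys
        relative = obj.key[len(source_prefix):].lstrip("/")
        if not relative:
            continue
        yield StageRequest(
            lakefs_path=f"{dest}/{relative}" if dest else relative,
            physical_address=f"s3://{bucket}/{obj.key}",
            size=obj.size,
            checksum=obj.etag,
        )


def ingest(
    objects: Iterable[ListedObject],
    bucket: str,
    source_prefix: str,
    dest_prefix: str,
    stage: Callable[[StageRequest], None],
) -> int:
    """Stage each planned object via the supplied callable; return the count."""
    count = 0
    for request in plan_staging(objects, bucket, source_prefix, dest_prefix):
        stage(request)  # assumption: this maps the object without copying data
        count += 1
    return count
```

Keeping the listing and the stage call as inputs means the same loop could cover a common prefix, a small table or a set of partitions, with the object store and the lakeFS client swapped in by the caller.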