Catalog API #1354
-
Are the versions of a table a singly linked list, or could they form a DAG?
-
I think it's a single chain, like git.
-
How does DDL work?
-
You're right. None of these cases needs a DAG history; I was overthinking it.
-
"v0 <- v1 <- v2 " is enought for time travel.
The above figure may describe:
Transaction operation log will contain an item/entry to record the |
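A minimal Rust sketch of such a singly linked version chain, where time travel walks backwards from the head; all type and field names here are hypothetical, not Datafuse's actual API:

```rust
// Hypothetical sketch of a linear table-version chain ("v0 <- v1 <- v2").
// Names (TableSnapshot, VersionChain, ...) are illustrative only.

type VersionId = u64;

#[derive(Debug, Clone)]
struct TableSnapshot {
    version: VersionId,
    // The previous snapshot in the chain; None for v0.
    prev: Option<VersionId>,
    // Immutable metadata captured at this version (schema digest, file list, ...).
    schema_digest: String,
}

struct VersionChain {
    snapshots: std::collections::HashMap<VersionId, TableSnapshot>,
    // The latest version is the head of the chain.
    head: VersionId,
}

impl VersionChain {
    /// Time travel: walk backwards from the head until `target` is found.
    fn snapshot_at(&self, target: VersionId) -> Option<&TableSnapshot> {
        let mut cur = Some(self.head);
        while let Some(v) = cur {
            let snap = self.snapshots.get(&v)?;
            if snap.version == target {
                return Some(snap);
            }
            cur = snap.prev;
        }
        None
    }
}
```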
-
Roughly speaking, …
Any specific concerns or suggestions (sincerely welcome)?
-
I've just realized something, and I would appreciate it if you could give some feedback:
-
It's a pleasure to discuss with you (and all the kind people who care about Datafuse).

Each version of a given table is immutable, including its schema, partitions, and other metadata. When the computation layer accesses a table without specifying a version, the latest version of the table will be returned. There might be some exceptions, like re-orging a table (background merging); these parts are not fully settled, and we need some deeper thought here. I agree with you that immutability will benefit the cache policy and statistics (at an affordable cost, if the statistics index we are trying to maintain is not "heavy").

I think we should have them eventually, at least the ANSI … As always, suggestions are sincerely welcome.
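A minimal sketch of the "latest version unless specified" rule described above, assuming a simple append-only version list (type names hypothetical):

```rust
// Hypothetical sketch: resolving a table read to an immutable version.
// If the caller does not specify a version, the latest one is returned.

#[derive(Debug, Clone)]
struct TableVersion {
    version: u64,
    // (column name, type) pairs; frozen once the version is created.
    schema: Vec<(String, String)>,
}

struct Catalog {
    // Ordered oldest -> newest; versions are only ever appended.
    versions: Vec<TableVersion>,
}

impl Catalog {
    fn resolve(&self, requested: Option<u64>) -> Option<&TableVersion> {
        match requested {
            // Explicit version: exact lookup (time travel).
            Some(v) => self.versions.iter().find(|t| t.version == v),
            // No version given: hand back the latest snapshot.
            None => self.versions.last(),
        }
    }
}
```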
-
What file format and table format do we want to support?
-
Thanks for the reply.
Correct me if I'm wrong: for each write operation, like inserting a single row with … this scheme would bring serious overhead; that is, users would have to batch the records and ingest them with bulk loads. How do you handle this case?
-
Besides, I have the same question. Maybe we can have our own unified columnar format? As https://www.firebolt.io/performance-at-scale says, they transform external data (e.g. JSON, Parquet, Avro, CSV) into their own data format, F3, through an ETL pipeline.
-
@leiysky I agree, and I have thought about this too. Maybe it is a good idea to borrow some ideas from ClickHouse to define our own file format. My impression is that ClickHouse is very good at filtering data so that queries can be fast. A Parquet file has some stats or summaries for the data it stores; we may want to extend a Parquet file's stats or summaries with the ClickHouse tricks.
-
Yes, I think it's mainly because of the sparse index, which can help skip some data units (depending on the data distribution, the effect can be surprising) using min/max values, Bloom filters, or other structures. It's convenient for us to build different indexes on a data file, even on demand, on the fly. And AFAIK, ORC does have a sparse index in its data format, but I haven't done any benchmarks on it, so just FYI. Anyway, we can choose a major data format for now to bootstrap fuse-store, but eventually there should be a unified data format designed by ourselves.
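A minimal sketch of that kind of data skipping, assuming one min/max entry per data block in the sparse index (names hypothetical; a Bloom filter would play the same role for point lookups):

```rust
// Hypothetical sketch of min/max pruning with a sparse index:
// one entry per data block; blocks whose value range cannot contain
// the predicate value are skipped without being read.

struct BlockIndexEntry {
    block_id: u64,
    min: i64,
    max: i64,
}

/// Return the ids of blocks that may contain `value` for an equality
/// predicate `col = value`; all other blocks are pruned.
fn prune_blocks(index: &[BlockIndexEntry], value: i64) -> Vec<u64> {
    index
        .iter()
        .filter(|e| e.min <= value && value <= e.max)
        .map(|e| e.block_id)
        .collect()
}

fn main() {
    let index = vec![
        BlockIndexEntry { block_id: 0, min: 1, max: 100 },
        BlockIndexEntry { block_id: 1, min: 101, max: 200 },
        BlockIndexEntry { block_id: 2, min: 201, max: 300 },
    ];
    // Only block 1 can contain 150; blocks 0 and 2 are skipped.
    assert_eq!(prune_blocks(&index, 150), vec![1]);
}
```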
-
ClickHouse's MergeTree engine, and the way it organizes its index, is definitely a valuable reference, but IMHO I am afraid it is not "cloud-native" friendly; @zhang2014 @sundy-li may have more insightful comments on this.

We do consider using something like Iceberg's table format, which is more batch-write friendly, as @leiysky mentioned: if small parts keep being ingested without batching, something like ClickHouse's "Too many parts" error may happen (and the metadata grows with each small ingestion, which burdens the meta layer, as Iceberg/Uber mentioned).

Totally agreed with you that the statistics of Parquet and ORC are a kind of sparse index. But I am afraid they cannot be used directly, since the data files are supposed to be stored in object storage, which is not that efficient to access. The sparse index should be maintained by the meta service, and ideally it could be accessed by the computation layers as a normal data source (relation), so that we can do computations with the sparse index in parallel (or even distributed, but that might be too much). Also, we are considering utilizing some fancy data structures to accelerate queries, like BloomRF etc.

May I transfer this issue to a discussion? DataFuse is young, and constructive discussions like this are preferred.
-
Personally, I think we should choose a simple & workable solution to implement first. The sparse index is required here; it can be extracted from Parquet files, or even built directly by ourselves, and written to our metadata service.
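A sketch of that flow, assuming we compute min/max entries ourselves while writing a data block and then register them with the metadata service; the MetaService trait and all field names are illustrative, not an existing API:

```rust
// Hypothetical sketch: build a sparse-index entry while writing a data
// block, then hand it to the metadata service.

struct SparseIndexEntry {
    table: String,
    block_id: u64,
    column: String,
    min: i64,
    max: i64,
    row_count: usize,
}

trait MetaService {
    fn put_index_entry(&mut self, entry: SparseIndexEntry);
}

/// Compute min/max over the column values of a freshly written block
/// and register the resulting entry with the meta service.
fn index_block(
    meta: &mut dyn MetaService,
    table: &str,
    column: &str,
    block_id: u64,
    values: &[i64],
) {
    if let (Some(&min), Some(&max)) = (values.iter().min(), values.iter().max()) {
        meta.put_index_entry(SparseIndexEntry {
            table: table.to_string(),
            block_id,
            column: column.to_string(),
            min,
            max,
            row_count: values.len(),
        });
    }
}
```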
-
@dantengsky Shall we have any catalog object provider like …
-
Summary
Currently the `DataSource` API provides us the essential functionality for manipulating metadata, but as we move toward v0.5, some extra requirements should be fulfilled, including but not limited to (see the sketch below):
- periodically syncing metadata with (from) the meta-store
- per-statement, implicit transactions

Note: in v0.5 we are not going to support distributed/parallel insertion/deletion.
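A hedged sketch of what such an extension could look like; the trait and its methods are hypothetical, not the actual `DataSource` API:

```rust
// Hypothetical sketch of the extra requirements above; none of these
// names belong to the actual DataSource API.

use std::time::Duration;

type TxnId = u64;

trait CatalogExt {
    /// Periodically pull metadata from the meta-store so the local
    /// catalog view does not go stale.
    fn sync_metadata(&mut self) -> Result<(), String>;

    /// Begin an implicit, per-statement transaction.
    fn begin_statement(&mut self) -> Result<TxnId, String>;

    /// Commit (ok == true) or roll back the implicit transaction when
    /// the statement finishes.
    fn finish_statement(&mut self, txn: TxnId, ok: bool) -> Result<(), String>;

    /// How often the background metadata sync should run.
    fn sync_interval(&self) -> Duration {
        Duration::from_secs(10)
    }
}
```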