Native capability to write to Unity catalog #3305
-
Hi @anilmenon14, thanks for bringing this up! I think what we could do is also allow users to pass in a
-
I just started work on a We should think very hard about the API here. Spark's DataFrame writer v2 API could be a good reference point: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrameWriterV2.html It's a pretty complex writer that handles a ton of corner cases. I think we could have a simpler API, but we should definitely think about it 😛
-
Hi @kevinzwang, that was exactly my line of thinking too, and it is how I have written the sample utility function that performs the writes (I have pasted a snippet of the code from the utility function in my Daft project that writes to UC). @jaychia has a very valid point about needing to handle a variety of cases. For example, here we are assuming that the table is in Delta format when the user provides an arbitrary table name to any function that writes to UC. It is highly likely that the table is a Delta table; however, that cannot be guaranteed, and the engine should be able to handle different table formats (e.g., CSV, JSON, ORC), especially when overwriting existing tables.
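To make the format concern concrete, here is a minimal sketch of dispatching on a table's declared format instead of assuming Delta. The names `WRITERS` and `write_by_format` are hypothetical, and it assumes the catalog reports a format string such as "DELTA" or "CSV" for each table; only the Delta path is wired up, through Daft's existing `df.write_deltalake`.

```python
from typing import Callable, Dict

# Hypothetical writer signature: (dataframe, table_uri, mode) -> None
Writer = Callable[[object, str, str], None]


def _write_delta(df, table_uri: str, mode: str) -> None:
    # Daft's existing Delta Lake writer; the only format wired up in this sketch.
    df.write_deltalake(table_uri, mode=mode)


# Formats without an entry are rejected instead of being silently treated as Delta.
WRITERS: Dict[str, Writer] = {"DELTA": _write_delta}


def write_by_format(df, table_uri: str, data_source_format: str, mode: str = "append") -> str:
    """Dispatch on the table's declared format and return the normalized format name."""
    fmt = data_source_format.strip().upper()
    writer = WRITERS.get(fmt)
    if writer is None:
        raise NotImplementedError(f"no writer registered for table format {fmt!r}")
    writer(df, table_uri, mode)
    return fmt
```

Failing loudly on an unregistered format is a deliberate choice here: an overwrite against a CSV or ORC table would otherwise be attempted as a Delta write.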
-
Adding it to
-
This is more of an idea than a feature request, so I thought I would start the discussion here.
Since Daft has good native capability to read from Databricks Unity Catalog using the credential vending in UC OSS (version 0.1.1, as of Daft 0.3.13) and has native integration with Delta Lake, implemented through df.write_deltalake, does it make sense for your team to add write capability into UC as well? I have been playing around with the existing capabilities in Daft (version 0.3.11) and could use the existing APIs to write out to a new or existing Unity Catalog table, as shown in this sample snippet.
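As a rough illustration of the kind of write described above (this is not the original snippet), here is a minimal sketch for an existing Delta table. It assumes `unity` is a `daft.unity_catalog.UnityCatalog` instance and that the object returned by `load_table` exposes `table_uri` and `io_config`; the helper names `split_table_name` and `write_to_unity_table` are made up for this example, and registering a brand-new table in UC is not covered.

```python
from typing import Tuple


def split_table_name(full_name: str) -> Tuple[str, str, str]:
    """Split 'catalog.schema.table' into its three parts."""
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'catalog.schema.table', got {full_name!r}")
    return parts[0], parts[1], parts[2]


def write_to_unity_table(df, unity, full_name: str, mode: str = "append") -> None:
    """Write a Daft DataFrame to an existing UC table, assuming it is Delta."""
    split_table_name(full_name)  # validate the three-part name before any network call
    # Assumption: load_table resolves the storage location and vends credentials.
    table = unity.load_table(full_name)
    # Assumption: the loaded table exposes `table_uri` and `io_config`.
    df.write_deltalake(table.table_uri, mode=mode, io_config=table.io_config)
```

Usage would look something like `write_to_unity_table(df, unity, "my_catalog.my_schema.my_table", mode="overwrite")`, where the table name is a placeholder.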
@jaychia, I know you have had some past conversations on delta-rs vs delta-kernel-rs that might play into this discussion, hence tagging you for your thoughts on this.
I also know @kevinzwang has been contributing towards making the new UC OSS 0.2 available as a Python client, so I am tagging you as well, in case you have some context on whether there is a dependency on using the new Python client.