The following code snippet works for tables on an older Delta protocol version:
Line1: dt = DeltaTable(deltaPathName)
Line2: df = dt.to_pandas()
But for tables on the latest protocol, ProtocolVersions(min_reader_version=2, min_writer_version=5), all column data comes back as null or None after the to_pandas conversion.
Storage backend: adls gen2
For some other tables, executing Line1 gives a warning:
For some other tables, executing Line2 throws an error:
File "/home/ankit/SICDPDataAnalyticsPipeline/deltalakeservices/.venv/lib/python3.8/site-packages/deltalake/table.py", line 334, in to_pyarrow_table
return self.to_pyarrow_dataset(
File "pyarrow/_dataset.pyx", line 331, in pyarrow._dataset.Dataset.to_table
File "pyarrow/_dataset.pyx", line 2577, in pyarrow._dataset.Scanner.to_table
File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
pyo3_runtime.PanicException: dispatch dropped without returning error
When will support for the latest Delta protocol version be added?
Oh well first, it's definitely a bug that it doesn't error on reader protocol version 2. I'll create a separate issue for that, and we'll consider this issue about supporting the higher reader protocol version.
In Python, we probably won't support reading those tables until late next year. Supporting column mapping is going to require a major refactor of the implementation.
And for higher delta writer protocols, I don't think we have any particular timeline for that. Supporting more operations (upsert, merge) is more important to us than the higher protocol versions. That being said, if someone wanted to implement the support we would definitely take PRs for it.
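Until the reader rejects such tables on its own, a caller-side guard along these lines might help avoid silently reading nulls. This is only a sketch: it assumes that `DeltaTable.protocol()` returns the `ProtocolVersions(min_reader_version=..., min_writer_version=...)` tuple quoted in the report, and that reader version 1 is the highest version that can be read safely. `ensure_readable`, `UnsupportedProtocolError`, and `FakeTable` are hypothetical names for illustration and are not part of deltalake.

```python
from typing import NamedTuple

# Assumption: only reader protocol version 1 is read correctly at this release.
MAX_SUPPORTED_READER_VERSION = 1


class ProtocolVersions(NamedTuple):
    """Local stand-in with the same fields as the tuple shown in the report."""
    min_reader_version: int
    min_writer_version: int


class UnsupportedProtocolError(RuntimeError):
    """Hypothetical error type; deltalake does not define this."""


def ensure_readable(table, max_reader=MAX_SUPPORTED_READER_VERSION):
    """Raise if the table requires a newer reader protocol than max_reader.

    `table` is any object with a protocol() method whose result exposes
    min_reader_version, for example a deltalake DeltaTable.
    """
    protocol = table.protocol()
    if protocol.min_reader_version > max_reader:
        raise UnsupportedProtocolError(
            f"table requires reader protocol {protocol.min_reader_version}, "
            f"but only versions up to {max_reader} are supported"
        )
    return protocol


class FakeTable:
    """Hypothetical stand-in for a DeltaTable so the sketch runs without storage."""

    def __init__(self, reader, writer):
        self._protocol = ProtocolVersions(reader, writer)

    def protocol(self):
        return self._protocol


if __name__ == "__main__":
    ensure_readable(FakeTable(1, 2))      # passes: reader version 1
    try:
        ensure_readable(FakeTable(2, 5))  # the protocol from the report
    except UnsupportedProtocolError as exc:
        print(f"refusing to read: {exc}")
```

With a real table, the idea would be to call `ensure_readable(dt)` between Line1 and Line2 so that an unsupported table fails loudly before `to_pandas()` runs.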
Python release version: deltalake==0.6.2