write_delta merge issue: Generic DeltaTable error: Unable to convert expression to string #20597
Comments
Same issue for me. It started happening today (2025-01-08), but only because I upgraded my environment. I reverted the environment and the merge works again. The working env is Python 3.11 with a requirements.txt containing: adlfs==2024.7.0 |
This is likely related to the interaction with view types. I will look into this over the weekend and make a PR for delta-rs. In the meantime, you can use DeltaTable.merge and pass it df.to_arrow() |
In our experience, the issue was with the deltalake (delta-rs) version. Polars 1.19.0 worked with deltalake 0.22.3, but failed (with the same error as above) with deltalake 0.23.0. |
@ion-elgreco - Could you give a simple example of that proposed workaround for a merge? My current merge is like so:
df.write_delta(
    tgt_table_path,
    storage_options=storage_options,
    mode='merge',
    delta_merge_options={
        "predicate": "s.TxnDate = t.TxnDate and s.fdTxnKey = t.fdTxnKey",
        "source_alias": "s",
        "target_alias": "t",
    },
    delta_write_options={
        "partition_by": 'TxnDate',
    },
).when_matched_update_all().when_not_matched_insert_all().execute()
I've tried @HectorPascual's combination of polars and deltalake versions and still get that generic decoding error. Also, polars rocks. |
@toby01234 use the merge on the DeltaTable directly: https://delta-io.github.io/delta-rs/api/delta_table/#deltalake.DeltaTable.merge |
This works fine.
pl.DataFrame({
    'id': ['a', 'b', 'c', 'd'],
    'val': [41, 51, 61, 71]
}).write_delta(
    tgt_table_path,
    mode='merge',
    delta_merge_options={
        "predicate": "src.id == tgt.id",
        "source_alias": "src",
        "target_alias": "tgt"
    }
).when_matched_update_all(
).when_not_matched_insert_all(
).when_not_matched_by_source_delete(
).execute() |
This works fine.
pl.DataFrame({
    'id': ['a', 'b', 'c', 'd'],
    'val': [41, 51, 61, 71]
}).write_delta(
    tgt_table_path,
    mode='merge',
    delta_merge_options={
        "predicate": "src.id = tgt.id",
        "source_alias": "src",
        "target_alias": "tgt"
    }
).when_not_matched_insert_all(
).when_matched_update_all(
).when_not_matched_by_source_update(updates={"y": "0"}
).execute() |
Fix should be in the next release for deltalake :) |
I can confirm it is working now, with deltalake==0.24.0 🎉 |
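To summarize the version reports in this thread, here is a small sketch that checks the installed deltalake against them. The function names are made up for illustration, and the version sets contain only what was reported above (0.22.3 and 0.24.0 working, 0.23.0 failing). Other releases are not covered by this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken only from the reports in this thread; anything else is untested here.
REPORTED_FAILING = {"0.23.0"}
REPORTED_WORKING = {"0.22.3", "0.24.0"}


def merge_status(deltalake_version: str) -> str:
    """Classify a deltalake version by what this thread reported for merge."""
    if deltalake_version in REPORTED_FAILING:
        return "reported failing"
    if deltalake_version in REPORTED_WORKING:
        return "reported working"
    return "not reported in this thread"


def installed_deltalake_status() -> str:
    """Look up the installed deltalake version and classify it."""
    try:
        return merge_status(version("deltalake"))
    except PackageNotFoundError:
        return "deltalake is not installed"
```

Calling installed_deltalake_status() before a write_delta(mode='merge') gives a quick hint about whether the environment matches one of the reported combinations.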
Issue description
The error is self-explanatory.
Also, I don't know if this is useful information, but it seems to happen with every data type in the source dataframe (int, float, datetime, etc.).
Expected behavior
The merge operation completes successfully.