update doesn't seem to be working #1740
Comments
The values of the `updates` dictionary are SQL expressions, so a string literal has to carry its own quotes:
>>> fruits_dl.update(predicate="a = 5", updates={"fruits": "'banana'"})
{'num_added_files': 1, 'num_removed_files': 1, 'num_updated_rows': 1, 'num_copied_rows': 4, 'execution_time_ms': 26, 'scan_time_ms': 17}
>>> fruits_dl.to_pandas()
   a  fruits
0  1  banana
1  2  orange
2  3   mango
3  4   apple
4  5  banana
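As a rough analogy only, the sketch below uses Python's built-in `sqlite3` rather than delta-rs or DataFusion to show why the inner quotes matter when a value is treated as a SQL expression: a bare `banana` is parsed as a column reference, while `'banana'` is a string literal. The table name, the `set_fruit` helper, and the starting value `pear` are made up for illustration.

```python
import sqlite3

# In-memory table shaped like the one in the session above (names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits_tbl (a INTEGER, fruits TEXT)")
conn.executemany(
    "INSERT INTO fruits_tbl VALUES (?, ?)",
    [(1, "banana"), (2, "orange"), (3, "mango"), (4, "apple"), (5, "pear")],
)


def set_fruit(expression):
    """Splice `expression` into the SET clause as SQL text, then read row a = 5."""
    conn.execute(f"UPDATE fruits_tbl SET fruits = {expression} WHERE a = 5")
    return conn.execute("SELECT fruits FROM fruits_tbl WHERE a = 5").fetchone()[0]


# Without inner quotes the value is read as an identifier and the statement fails.
try:
    set_fruit("banana")
    bare_error = None
except sqlite3.OperationalError as exc:
    bare_error = str(exc)

# With inner quotes the value is a string literal and the update succeeds.
quoted_result = set_fruit("'banana'")

print(bare_error)
print(quoted_result)
```

The exact error raised by delta-rs for an unquoted value may differ; the point is only that the string is parsed as an expression, not taken as data.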
It does feel like it would be nice to be able to pass Python objects here though. Something like:
table.update(
    predicate="a = 5",
    new_values={
        "fruits": "banana",
        "count": 2,
        "last_updated": datetime.now()
    }
)
But this should be provided by a different parameter. Otherwise we could end up interpreting the SQL strings as straight string values, and that wouldn't be great. cc @ion-elgreco
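One way to picture the separate-parameter idea is a small translation step that turns Python objects into SQL literal strings before they reach `updates`. The sketch below is hypothetical and is not how delta-rs implements `new_values`; the helper name `to_sql_literal` is invented here, and `datetime` values are deliberately left out because the right literal form depends on the engine.

```python
def to_sql_literal(value):
    """Render a plain Python value as a SQL literal string (illustrative only)."""
    if value is None:
        return "NULL"
    # bool must be checked before int, since bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Wrap in single quotes and double any embedded single quote.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"no literal rule for {type(value).__name__}")


# Python objects in, SQL expression strings out.
new_values = {"fruits": "banana", "count": 2}
updates = {column: to_sql_literal(value) for column, value in new_values.items()}

print(updates)  # {'fruits': "'banana'", 'count': '2'}
```

Keeping the two inputs apart means a caller who writes `"banana"` in `new_values` gets a string value, while the same text in `updates` would still be parsed as an expression.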
That could be useful, and then we could apply this to MERGE as well. I'll pick it up next week! Want to wrap the vacuum commit thing first :)
Slightly related: is there any documentation on how to use merge in Python?
For now you can refer to the examples under `.merge()` and the `TableMerger` class: https://delta-io.github.io/delta-rs/python/api_reference.html#deltalake.table.TableMerger
I will write some better usage documentation on this tomorrow, plus a blog post.
…date()` (#1749)

# Description
A user can now pass a `new_values` dictionary whose values are Python objects.

Some odd behaviour I noticed, probably related to DataFusion: updating a timestamp column has to be done by providing a Unix timestamp in microseconds. I personally find this very confusing; I was expecting to be able to pass "2012-10-01", for example, in the updates.

Another odd behaviour is with list-of-string columns. I can pass `{"list_of_string_col":"[1,2,3]"}` or `{"list_of_string_col":"['1','2','3']"}` and both will work. I would expect the first one to raise an exception on invalid datatypes. Mixed datatypes such as `"[1,2,'3']"` luckily do raise an error in DataFusion.

# Related Issue(s)
- closes #1740

Co-authored-by: Will Jones <willjones127@gmail.com>
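For the timestamp behaviour described in the commit message, a date such as "2012-10-01" can be converted to a Unix timestamp in microseconds with the standard library. This is a minimal sketch that assumes the date is meant as midnight UTC; the helper name `to_unix_micros` is made up for illustration, and whether the table column expects UTC is not stated in the thread.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_micros(date_string):
    """Convert a YYYY-MM-DD date to microseconds since the Unix epoch.

    Assumption: the date is interpreted as midnight UTC.
    """
    moment = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids float rounding.
    return (moment - EPOCH) // timedelta(microseconds=1)


micros = to_unix_micros("2012-10-01")
print(micros)  # 1349049600000000
```

The resulting integer would then be passed as the value for the timestamp column, as the commit message describes.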
Environment
0.12, Google Colab
using this code
I get this error