Left Join becomes Inner Join for inequality conditions #190
Comments
Hi @flcong
Yeah. I can take a look tomorrow. It would be helpful to have some guidance for developers.
Super cool! Here are some first pointers (but of course, feel free to ask for more information):
I think what currently happens is the following. Take a very simple example with the first dataframe [1, 2, 3] and the second one [4, 5, 6]. After the cross join, we end up with all combinations: [1, 4], [1, 5], ... [3, 6]. We then apply the join condition as a filter, but at that point there are simply no entries like [1, NULL], which a LEFT join needs for every left row without a match. Honestly, I am not 100% sure how best to solve this...
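To make the cross-join-then-filter behaviour concrete, here is a minimal pandas sketch. It is not dask-sql's actual implementation; the join condition `b < a + 2`, the column names, and the helper column `_left_id` are all made up for illustration. It shows why the filter alone yields inner-join semantics, and one possible way to restore the unmatched left rows by merging the filtered pairs back onto the left side.

```python
import pandas as pd

# Toy inputs from the comment above; column names are assumptions.
left = pd.DataFrame({"a": [1, 2, 3]})
right = pd.DataFrame({"b": [4, 5, 6]})

# Tag each left row so it can be found again after filtering.
left_id = left.assign(_left_id=range(len(left)))

# Step 1: cross join (requires pandas >= 1.2 for how="cross").
crossed = left_id.merge(right, how="cross")

# Step 2: apply a hypothetical inequality condition as a plain filter.
# Only (a=3, b=4) satisfies b < a + 2, so rows a=1 and a=2 vanish.
filtered = crossed[crossed["b"] < crossed["a"] + 2]
inner_like = filtered[["a", "b"]].reset_index(drop=True)

# Step 3 (one possible fix): left-merge the surviving pairs back onto
# the tagged left frame, so unmatched left rows reappear with NaN.
left_join = (
    left_id.merge(filtered[["_left_id", "b"]], on="_left_id", how="left")
    .drop(columns="_left_id")
)

print(inner_like)
print(left_join)
```

The second frame keeps all three left rows, with missing values in `b` where no right row satisfied the condition. Whether a similar "merge back" step fits into dask-sql's join plugin is an open question here.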
Thank you for the information. I'll look into it. I'm very glad this project exists. Previously, I wrote a small function that loads pandas DataFrames into a sqlite3 database and then runs queries there, but it turned out to be very slow for large data sets. With the development of this project, maybe some day I will finally give up SAS completely.
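For reference, the sqlite3 approach described above could look roughly like the sketch below. The helper name `query_frames`, the sample data, and the join condition are hypothetical and are not the original function from this thread. It is also a convenient way to check what a LEFT JOIN with an inequality condition is expected to return.

```python
import sqlite3

import pandas as pd


def query_frames(sql, **frames):
    """Load DataFrames into an in-memory SQLite database and run a query."""
    conn = sqlite3.connect(":memory:")
    try:
        for name, frame in frames.items():
            # Each keyword argument becomes a table with that name.
            frame.to_sql(name, conn, index=False)
        return pd.read_sql_query(sql, conn)
    finally:
        conn.close()


# Made-up data; the frames in the issue itself are not reproduced here.
df1 = pd.DataFrame({"a": [1, 2, 3]})
df2 = pd.DataFrame({"b": [4, 5, 6]})

result = query_frames(
    "SELECT df1.a, df2.b FROM df1 "
    "LEFT JOIN df2 ON df2.b < df1.a + 2 "
    "ORDER BY df1.a",
    df1=df1,
    df2=df2,
)
print(result)
```

SQLite keeps every row of `df1` and fills `b` with NULL where no row of `df2` satisfies the condition, which is the behaviour the issue asks for.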
I am glad it helps you :-)
Hi. I followed the steps to install dask-sql in development mode, including installing the JDK and Maven and compiling the Java classes. I'm on Win10 64-bit. I encountered a series of
Thanks for getting back to me! I also see those on the GitHub Windows build, but since I work on Linux I am not an expert in Windows (and debugging on a GitHub worker is quite tedious).
Hi @flcong! Did you have time to look into the issue with the joins further? Is there anything I can help you with?
Hi, @nils-braun. Yeah, I've finished editing the
Code:
Results:
df1:
df2:
df3:
This is an Inner Join, not a Left Join.
The correct output should be as follows, using sqlite3:
where df3 is