Remove DuckDB from lakeFS Docker image #6001
Comments
@rmoff might be able to add more context, but the reason we introduced DuckDB into the image in the first place was to support the quickstart: "Making a Change to The Data" flow. We wanted to ensure 2 things:
Including the DuckDB binary in the image solves for 1,2 and reasonably well for 3. The PR linked above was recently released - it's possible that we can now use DuckDB-WASM to write data, so I believe that's perhaps the most promising path forward. |
Thanks @arielshaqed for raising this, and @ozkatz for the accurate description of the requirements. I realise that it's not a perfect solution, but for now, it's a "good enough" one to help address the immediate requirement (let users get a feel for what lakeFS is all about). As Oz says, the DuckDB-WASM route looks like the best option. It looks like there's been a release since the PR was merged so I'll try it out to see if it solves our requirement. |
I understand that it provides a path to run DuckDB with lakeFS that requires the least immediate effort by us. That is the benefit. I opened this issue because of the cost: it requires Treeverse to support a DuckDB build, with all that that entails. Even MotherDuck doesn't do this. Plus we want to do it inside a build for a separate project. If we are willing to pay the price:
|
+1 Agree with @arielshaqed on this one - I don't think we should maintain DuckDB as part of our Docker image. I understand that we need to enable these capabilities for the new user. It is possible that we will need to support these capabilities in the future, as they seem to be an integrated part of lakeFS, rather than a separate tool that works with lakeFS. |
Reopening as I don't see any {Docker,Make}file changes in #6044. (But please reclose if I am missing something, especially possible today!) |
Thanks for re-opening @arielshaqed - I guess the mere mention of the issue on the PR was enough for our bot overlords to automatically close this. Indeed, the Dockerfile, Makefile and quickstart docs need to change for this to close. Will try and push for it to happen very soon. |
Read your comment carefully 🤯: according to the docs, you specifically asked it to "close #6001" at the bottom of the description of #6044. 🤖 Probably not where you expected it to take effect. |
Why
We recently started maintaining a DuckDB binary as part of a set of lakeFS images (lakefs:...-with-duckdb). This makes no sense:
I believe the best way out is to remove DuckDB from lakeFS Docker images.
Alternatives
If we decide we do need DuckDB, there are multiple options. A few easy ones: