DuckDB - Access or create a persistent database #159

armetiz · 2023-10-03T15:16:28Z

Feature description

By default, DuckDB starts with an in-memory database.

To avoid out-of-memory errors, it could be useful to connect DuckDB to a database file.

When using the jdbc:duckdb: URL alone, an in-memory database is created. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the Java program). If you would like to access or create a persistent database, append its file name after the path. For example, if your database is stored in /tmp/my_database, use the JDBC URL jdbc:duckdb:/tmp/my_database to create a connection to it.
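As a rough illustration of the URL rule quoted above, here is a minimal Python sketch. The helper name `duckdb_jdbc_url` is hypothetical and is not part of DuckDB or Kestra; it only builds the URL string and does not open a connection.

```python
from typing import Optional

# Prefix taken from the documentation excerpt above.
JDBC_PREFIX = "jdbc:duckdb:"


def duckdb_jdbc_url(database_file: Optional[str] = None) -> str:
    """Build a DuckDB JDBC URL (hypothetical helper).

    With no file, the bare prefix selects an in-memory database.
    With a file path, the path is appended to select a persistent database.
    """
    if not database_file:
        return JDBC_PREFIX
    return JDBC_PREFIX + database_file


if __name__ == "__main__":
    print(duckdb_jdbc_url())                    # in-memory database
    print(duckdb_jdbc_url("/tmp/my_database"))  # persistent database
```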

armetiz · 2023-10-04T14:45:08Z

On a MacBook Pro, using DuckDB 0.9.0 with an in-memory database, I tried to fetch a large dataset.

Here is the DuckDB error:

Error: near line 1: Out of Memory Error: could not allocate block of size 262KB (27.4GB/27.4GB used)
Database is launched in in-memory mode and no temporary directory is specified.
Unused blocks cannot be offloaded to disk.

Launch the database with a persistent storage back-end
Or set PRAGMA temp_directory='/path/to/tmp.tmp'

IMHO, using an in-memory database with temp_directory set is well suited to a stateless task.
This should be the default behavior.
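A minimal sketch of what such a default could look like, assuming the plugin is able to prepend a statement to the task's SQL. The helper `with_temp_directory` and the spill path are hypothetical; only the `PRAGMA temp_directory` statement itself comes from the error message above.

```python
def with_temp_directory(sql: str, temp_dir: str) -> str:
    """Prefix a task's SQL so an in-memory database can spill to disk.

    Hypothetical helper: it only builds the script text and leaves
    execution to whatever runs the task.
    """
    # Double any single quote so the path stays a valid SQL string literal.
    escaped = temp_dir.replace("'", "''")
    return f"PRAGMA temp_directory='{escaped}';\n{sql}"


if __name__ == "__main__":
    import tempfile

    # A throwaway directory fits a stateless task: nothing outlives the run.
    with tempfile.TemporaryDirectory(prefix="duckdb-spill-") as spill_dir:
        print(with_temp_directory("SELECT 42 AS i;", spill_dir))
```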

Using DuckDB with a persistent storage back-end, on the other hand, is only useful if the database file can be reused between tasks.
This should be a Kestra option.

I mean something like this.
Tasks:

Create and import SQL table - echo "CREATE TABLE t1 AS SELECT 42 AS i, 84 AS j;" | duckdb database.file
Export the analysis - echo "COPY t1 TO 'output.parquet' (FORMAT PARQUET)" | duckdb database.file

This could be useful because SQL operations could be split across dedicated tasks, which would improve debugging, maintenance, and readability.
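The two tasks above could be sketched like this in Python, assuming the `duckdb` CLI is available on the PATH. `TASKS`, `duckdb_cli_args`, and `run_task` are hypothetical names, not an existing Kestra or DuckDB API; the SQL statements and the `database.file` name are taken from the example.

```python
import subprocess
from typing import List

# One SQL script per task; both tasks share the same database file.
TASKS = [
    "CREATE TABLE t1 AS SELECT 42 AS i, 84 AS j;",
    "COPY t1 TO 'output.parquet' (FORMAT PARQUET);",
]


def duckdb_cli_args(database_file: str) -> List[str]:
    """Command line equivalent to `duckdb database.file`."""
    return ["duckdb", database_file]


def run_task(database_file: str, sql: str) -> None:
    """Pipe one SQL script to the DuckDB CLI, like `echo "..." | duckdb file`."""
    subprocess.run(duckdb_cli_args(database_file), input=sql, text=True, check=True)


def run_all(database_file: str = "database.file") -> None:
    """Run each task as a separate process against the same persistent file."""
    for sql in TASKS:
        run_task(database_file, sql)
```

Because each task runs in its own process, the table created by the first task is only visible to the second one through the shared database file, which is the reuse this request is about.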

armetiz added the enhancement New feature or request label Oct 3, 2023

anna-geller added the plugin label Oct 5, 2023

anna-geller added this to the v0.19.0 milestone Dec 5, 2023

tchiotludo removed the plugin label Jul 5, 2024

anna-geller added area/plugin Plugin-related issue or feature request area/backend Needs backend code changes labels Aug 7, 2024

anna-geller added the kind/good-first-issue label Aug 15, 2024

anna-geller removed this from the v0.19.0 milestone Aug 15, 2024

tchiotludo added good first issue Great issue for new contributors and removed kind/good-first-issue labels Oct 1, 2024
