pg_duckdb conflicts with citus #444

Open
2 tasks done
dpxcc opened this issue Nov 18, 2024 · 3 comments
Labels: enhancement (New feature or request)

@dpxcc
Contributor

dpxcc commented Nov 18, 2024

What happens?

Server crashes when running citus and pg_duckdb extensions on the same Postgres instance

To Reproduce

Built Citus 12.1.4 and pg_duckdb (latest git) on Postgres 16.4, and enabled both of them. The following query

SET duckdb.force_execution = ON;
CREATE TABLE t (id BIGINT, log TEXT, ts TIMESTAMP);
INSERT INTO t VALUES (1, 'test', '2024-01-01 08:00:00');
SELECT * FROM t;   -- crash

crashes with backtrace

#0  __pthread_kill_implementation (threadid=281472891845536, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x0000ffff834cf254 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x0000ffff8348a67c in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x0000ffff83477130 in __GI_abort () at ./stdlib/abort.c:79
#4  0x0000aaaae68d42e4 in ExceptionalCondition (conditionName=0xffff809c4710 "plannerRestrictionContextList != NIL", fileName=0xffff809c4130 "planner/distributed_planner.c", lineNumber=2473) at assert.c:66
#5  0x0000ffff80904af8 in CurrentPlannerRestrictionContext () at planner/distributed_planner.c:2473
#6  0x0000ffff80903d34 in multi_relation_restriction_hook (root=0xaaab090808a8, relOptInfo=0xaaab090811e8, restrictionIndex=1, rte=0xaaab090813f8) at planner/distributed_planner.c:1984
#7  0x0000ffff7fe48120 in ColumnarSetRelPathlistHook (root=0xaaab090808a8, rel=0xaaab090811e8, rti=1, rte=0xaaab090813f8) at columnar_customscan.c:272
#8  0x0000aaaae653a548 in set_rel_pathlist (root=0xaaab090808a8, rel=0xaaab090811e8, rti=1, rte=0xaaab090813f8) at allpaths.c:542
#9  0x0000aaaae653a03c in set_base_rel_pathlists (root=0xaaab090808a8) at allpaths.c:354
#10 0x0000aaaae6539cf4 in make_one_rel (root=0xaaab090808a8, joinlist=0xaaab08bb6538) at allpaths.c:224
#11 0x0000aaaae657a1e4 in query_planner (root=0xaaab090808a8, qp_callback=0xaaaae658006c <standard_qp_callback>, qp_extra=0xffffd9dd1c98) at planmain.c:278
#12 0x0000aaaae657c834 in grouping_planner (root=0xaaab090808a8, tuple_fraction=0) at planner.c:1495
#13 0x0000aaaae657c034 in subquery_planner (glob=0xaaab09081528, parse=0xaaab09080ea8, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:1064
#14 0x0000aaaae657a918 in standard_planner (parse=0xaaab09080ea8, query_string=0xaaab087528d8 "SELECT * FROM t;", cursorOptions=2048, boundParams=0x0) at planner.c:413
#15 0x0000ffff7fdabfa8 in DuckdbPlanNode (parse=parse@entry=0xaaab08753750, query_string=query_string@entry=0xaaab087528d8 "SELECT * FROM t;", cursor_options=cursor_options@entry=2048, bound_params=bound_params@entry=0x0, throw_error=throw_error@entry=false) at src/pgduckdb_planner.cpp:137
#16 0x0000ffff7fda373c in DuckdbPlannerHook_Cpp (bound_params=0x0, cursor_options=2048, query_string=0xaaab087528d8 "SELECT * FROM t;", parse=0xaaab08753750) at src/pgduckdb_hooks.cpp:204
#17 pgduckdb::__CPPFunctionGuard__<PlannedStmt* (*)(Query*, char const*, int, ParamListInfoData*), DuckdbPlannerHook_Cpp, Query*, char const*, int, ParamListInfoData*> (func_name=0xffff7fde3568 "DuckdbPlannerHook") at src/pgduckdb_hooks.cpp:221
#18 0x0000aaaae657a6a4 in planner (parse=0xaaab08753750, query_string=0xaaab087528d8 "SELECT * FROM t;", cursorOptions=2048, boundParams=0x0) at planner.c:279
#19 0x0000aaaae66df0d8 in pg_plan_query (querytree=0xaaab08753750, query_string=0xaaab087528d8 "SELECT * FROM t;", cursorOptions=2048, boundParams=0x0) at postgres.c:908
#20 0x0000aaaae66df234 in pg_plan_queries (querytrees=0xaaab08754198, query_string=0xaaab087528d8 "SELECT * FROM t;", cursorOptions=2048, boundParams=0x0) at postgres.c:1000
#21 0x0000aaaae66df5c4 in exec_simple_query (query_string=0xaaab087528d8 "SELECT * FROM t;") at postgres.c:1197
#22 0x0000aaaae66e5064 in PostgresMain (dbname=0xaaab087f6138 "postgres", username=0xaaab087f6118 "postgres") at postgres.c:4701
#23 0x0000aaaae65efcc0 in BackendRun (port=0xaaab087e5900) at postmaster.c:4464
#24 0x0000aaaae65ef5c8 in BackendStartup (port=0xaaab087e5900) at postmaster.c:4192
#25 0x0000aaaae65eb01c in ServerLoop () at postmaster.c:1782
#26 0x0000aaaae65ea7f8 in PostmasterMain (argc=3, argv=0xaaab0874ce50) at postmaster.c:1466
#27 0x0000aaaae64a599c in main (argc=3, argv=0xaaab0874ce50) at main.c:198

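The abort is an assertion failure: it hits Assert(plannerRestrictionContextList != NIL); in Citus (frames #4 and #5) at https://github.com/citusdata/citus/blob/main/src/backend/distributed/planner/distributed_planner.c#L2473. The backtrace shows the assertion being reached from Citus's relation path-list hooks (frames #6 and #7) while standard_planner is running under pg_duckdb's DuckdbPlanNode (frames #14 and #15).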

OS:

Linux

pg_duckdb Version (if built from source use commit hash):

2da4473

Postgres Version (if built from source use commit hash):

16.4

Hardware:

No response

Full Name:

Cheng Chen

Affiliation:

Mooncake Labs

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a source build

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?

  • Yes, I have
JelteF added this to the Long term milestone on Nov 19, 2024
JelteF added the bug (Something isn't working) label on Nov 19, 2024
@JelteF
Collaborator

JelteF commented Nov 19, 2024

In what order did you include pg_duckdb and citus in shared_preload_libraries? Maybe try swapping the order to see if that resolves this specific issue.
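To answer the load-order question, the relevant settings can be inspected from any session. This is a minimal sketch using only standard Postgres commands and catalogs; it assumes both extensions were created in the current database under the names `citus` and `pg_duckdb`, as used in this thread.

```sql
-- Order in which preloaded libraries are loaded at server start.
SHOW shared_preload_libraries;

-- Confirm which of the two extensions are installed in this database.
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('citus', 'pg_duckdb');

-- Check whether DuckDB execution is being forced for this session
-- (only available once pg_duckdb is loaded).
SHOW duckdb.force_execution;
```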

More generally though, both Citus and pg_duckdb greatly change how Postgres plans and executes its queries. Having them work together well is not an easy task and is not on the current priority list.

@dpxcc
Contributor Author

dpxcc commented Nov 19, 2024

citus has to be placed first in shared_preload_libraries, otherwise pg_ctl -D /usr/local/pgsql/data restart fails with

waiting for server to start....2024-11-19 20:00:08.935 UTC [11029] FATAL:  Citus has to be loaded first
2024-11-19 20:00:08.935 UTC [11029] HINT:  Place citus at the beginning of shared_preload_libraries.
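Given that startup check, one way to set the required order is sketched below. It assumes superuser access and that no other libraries need to be preloaded, since `ALTER SYSTEM` replaces the whole list. Because the server does not start with any other order, this is presumably the configuration under which the crash was reproduced, so it should not be read as a fix.

```sql
-- Citus must come first; pg_duckdb follows.
-- Note: this overwrites any previously configured preload list.
ALTER SYSTEM SET shared_preload_libraries = 'citus', 'pg_duckdb';

-- The setting only takes effect after a server restart
-- (for example, the pg_ctl restart command quoted above). Then verify:
SHOW shared_preload_libraries;
```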

wuputah added the enhancement (New feature or request) label and removed the bug (Something isn't working) label on Nov 26, 2024
@hosting2000me

> Having them work together well is not an easy task and is not on the current priority list.

I can confirm the issue: with Citus enabled, we experience server crashes. This is a serious problem because Citus and DuckDB complement each other well for analytical tasks. If I have to choose between the two, Citus offers more diverse possibilities.
