
Distributing tables with thousands of partitions & large shard_count triggers an OOM error #6572

Closed
lucasbfernandes opened this issue Dec 21, 2022 · 0 comments · Fixed by #6722


@lucasbfernandes

PostgreSQL version: 14

Citus version: 11.1

Coordinator node: 4 vCores / 16 GiB RAM, 512 GiB storage

Worker nodes: 4 nodes, 16 vCores / 512 GiB RAM, 4096 GiB storage

Problem description

When running create_distributed_table on a table with thousands of partitions while a large citus.shard_count is configured, the following error is triggered:

ERROR:  out of memory
DETAIL:  Failed on request of size 56 in memory context "ExprContext".

How to reproduce

Execute the following SQL statements:

CREATE SCHEMA oom;

CREATE TABLE oom.orders (
    id bigint,
    order_time timestamp without time zone NOT NULL,
    region_id bigint NOT NULL
)
PARTITION BY RANGE (order_time);

SELECT create_time_partitions(
  table_name         := 'oom.orders',
  partition_interval := '1 day',
  start_from         := now() - '5 years'::interval,
  end_at             := now() + '5 years'::interval
);
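
This call creates one daily partition for each day in the ten-year window, 3653 in all. To double-check the partition count before distributing (a routine catalog query, not part of the original report), you can count the children in the standard PostgreSQL pg_inherits catalog:

SELECT count(*)
FROM pg_inherits
WHERE inhparent = 'oom.orders'::regclass;
-- expected: 3653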

All of these partitions are empty. Next, run the following commands to increase the citus.shard_count value and distribute the newly created table.

SET citus.shard_count TO 200;

SELECT create_distributed_table('oom.orders', 'region_id');

After some time, you should see the OOM error.
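
If you want to watch the leak as it builds (a diagnostic suggestion, not part of the original report), PostgreSQL 14 can dump a backend's memory-context usage to the server log via pg_log_backend_memory_stats. From a second session, find the pid of the backend running create_distributed_table and snapshot it repeatedly; this lets you check whether the "ExprContext" named in the error is the context that keeps growing:

-- find the backend that is distributing the table
SELECT pid, query
FROM pg_stat_activity
WHERE query ILIKE '%create_distributed_table%';

-- write that backend's memory-context statistics to the server log
-- (requires superuser or an explicit grant; replace 12345 with the pid above)
SELECT pg_log_backend_memory_stats(12345);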

aykut-bozkurt added a commit that referenced this issue Feb 17, 2023
…ns (#6722)

We had a memory leak during distribution of a table with a lot of
partitions because we did not release memory in the ExprContext until all
partitions had been distributed. We improved two things to resolve the
issue:

1. We create and delete a MemoryContext for each per-partition call to
`CreateDistributedTable`,
2. We rebuild the cache once after all the placements are inserted, instead
of after each placement of a shard.

DESCRIPTION: Fixes a memory leak during distribution of a table with a lot
of partitions and shards.

Fixes #6572.
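
For a rough sense of scale (back-of-the-envelope arithmetic, not taken from the issue itself): with 3653 partitions and citus.shard_count = 200, the repro asks Citus to create on the order of 3653 × 200 = 730,600 shards, each with its own placement metadata, which is why per-partition allocations that are never freed add up quickly. On a Citus build that includes #6722, the repro should complete, and the resulting shard metadata can be inspected afterwards:

-- count shard metadata rows after a successful run
-- (assumes an otherwise-empty cluster; the exact count may be slightly
-- higher if the parent table itself also gets metadata rows)
SELECT count(*) FROM pg_dist_shard;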
emelsimsek pushed a commit that referenced this issue Mar 6, 2023
…ns (#6722)