Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for creating distributed tables with a null shard key #6745

Merged

Conversation

onurctirtir
Copy link
Member

@onurctirtir onurctirtir commented Mar 3, 2023

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:

  • Specifying shard_count is not allowed because it is assumed to be 1.
  • We mostly call such tables as "null shard-key" table in code / comments.
  • To avoid doing a breaking layout change in create_distributed_table();
    instead of throwing an error, it will inform the user that distribution_type
    param is ignored unless it's explicitly set to NULL.
  • colocate_with param allows colocating such null shard-key tables to
    each other.
  • We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass of
    DISTRIBUTED_TABLE because we mostly want to treat them as distributed
    tables in terms of SQL / DDL / operation support.
  • Metadata for such tables look like:
    • distribution method => DISTRIBUTE_BY_NONE
    • replication model => REPLICATION_MODEL_STREAMING
    • colocation id => != INVALID_COLOCATION_ID (distinguishes from Citus local tables)
  • We assign colocation groups for such tables to different nodes in a
    round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a preliminary
API.

@codecov
Copy link

codecov bot commented Mar 3, 2023

Codecov Report

Merging #6745 (7b94697) into main (cf55136) will decrease coverage by 1.12%.
The diff coverage is 92.66%.

❗ Current head 7b94697 differs from pull request most recent head e16df14. Consider uploading reports for the commit e16df14 to get more accurate results

@@            Coverage Diff             @@
##             main    #6745      +/-   ##
==========================================
- Coverage   93.17%   92.05%   -1.12%     
==========================================
  Files         260      259       -1     
  Lines       56106    56031      -75     
==========================================
- Hits        52275    51579     -696     
- Misses       3831     4452     +621     

Comment on lines 221 to 226
if (targetCacheEntry->partitionMethod != DISTRIBUTE_BY_NONE)
{
/* make sure that tables are hash partitioned */
CheckHashPartitionedTable(targetRelationId);
CheckHashPartitionedTable(sourceRelationId);
}
Copy link
Member Author

@onurctirtir onurctirtir Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even more, we can drop this check because at some point we anyway check whether two tables can be colocated (in ColocationIdForNewTable())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose range-partitioned tables can be co-located

Copy link
Member Author

@onurctirtir onurctirtir Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so do you think we shouldn't have had those checks already?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe change to an assert

Copy link
Member Author

@onurctirtir onurctirtir Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On main, this is what happens if try using colocate_with option for range distributed tables:

SELECT create_distributed_table('tbl_1', 'a', distribution_type=>'range', colocate_with=>'tbl');
ERROR:  cannot distribute relation
DETAIL:  Currently, colocate_with option is only supported for hash distributed tables.

So I think removing those checks / converting them to Assert()s wouldn't look like something that this PR is expected to do, what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not this error though, can we really get here? it does not appear in tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those are more like sanity checks, so let me simply make them asserts then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh we can't make them asserts because their return type is void, so let me leave them as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert(targetCacheEntry->partitionMethod == DISTRIBUTE_BY_HASH ||
       targetCacheEntry->partitionMethod == DISTRIBUTE_BY_NONE);

@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch from ed967a6 to 2d88ba6 Compare March 3, 2023 14:18
@onurctirtir onurctirtir requested a review from marcocitus March 7, 2023 09:53
@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch 3 times, most recently from fef7e99 to 52d11cb Compare March 7, 2023 15:43
@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch from 52d11cb to 6399baf Compare March 8, 2023 10:43
@onurctirtir onurctirtir changed the base branch from main to clarify-dist-by-none March 8, 2023 10:44
@onurctirtir onurctirtir force-pushed the clarify-dist-by-none branch from 15d8b48 to 538541d Compare March 8, 2023 23:29
@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch 2 times, most recently from 8dce689 to b2d6fbf Compare March 9, 2023 09:17
@onurctirtir onurctirtir force-pushed the clarify-dist-by-none branch 4 times, most recently from 7ee5b87 to bd2228a Compare March 9, 2023 11:53
}

static int roundRobinNodeIdx = -1;
roundRobinNodeIdx++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems local to the session. Is there a reliable way to make it round robin across sessions? (e.g. using shard ID)

LockRelationOid(DistNodeRelationId(), RowShareLock);

/* load and sort the worker node list for deterministic placement */
List *workerNodeList = DistributedTablePlacementNodeList(NoLock);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like we could use RowShareLock and omit the line above

}
else
{
CreateNullKeyShard(relationId, useExclusiveConnection);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if we should add something like RoundRobin to the name to indiciate that this function is not to be used for co-located null key table creation

Copy link
Member

@marcocitus marcocitus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks reasonable. Do we want to merge soon and add tests in subsequent PRs or is this the beginning of a feature branch? (I think I'm fine either way)

Base automatically changed from clarify-dist-by-none to main March 10, 2023 10:55
@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch 3 times, most recently from d778329 to efe2124 Compare March 13, 2023 09:13
(2 rows)

-- This should fail due to missing OVERRIDING SYSTEM VALUE
-- after merging https://github.com/citusdata/citus/pull/6738.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base automatically changed from decide-dist-params-in-create-citus-table to main March 14, 2023 11:24
@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch from 3396fe0 to 4249163 Compare March 14, 2023 11:25
@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch 2 times, most recently from eb205d2 to 7b94697 Compare March 15, 2023 11:47
@onurctirtir
Copy link
Member Author

onurctirtir commented Mar 15, 2023

crash:

create table s1 (x int, y int);
create table s2 (x int, y int);
create table s3 (x int, y int);
create table s4 (x int, y int);
select create_distributed_table('s1');
select create_distributed_table('s2');
select create_distributed_table('s3', NULL, colocate_with := 'none');
select create_distributed_table('s4', NULL, colocate_with := 's3');
insert into s2 select * from s3;

stack trace:

#0  0x00007f9831fafa25 in InsertPartitionColumnMatchesSelect (query=query@entry=0x564ba1740448, insertRte=insertRte@entry=0x564ba14e0fe8, 
    subqueryRte=subqueryRte@entry=0x564ba1740930, selectPartitionColumnTableId=selectPartitionColumnTableId@entry=0x7fff9c820664) at planner/insert_select_planner.c:1285
1285                    if (originalAttrNo != insertPartitionColumn->varattno)
#0  0x00007f9831fafa25 in InsertPartitionColumnMatchesSelect (query=query@entry=0x564ba1740448, insertRte=insertRte@entry=0x564ba14e0fe8, 
    subqueryRte=subqueryRte@entry=0x564ba1740930, selectPartitionColumnTableId=selectPartitionColumnTableId@entry=0x7fff9c820664) at planner/insert_select_planner.c:1285
#1  0x00007f9831fb007b in DistributedInsertSelectSupported (queryTree=queryTree@entry=0x564ba1740448, insertRte=insertRte@entry=0x564ba14e0fe8, 
    subqueryRte=subqueryRte@entry=0x564ba1740930, allReferenceTables=allReferenceTables@entry=false) at planner/insert_select_planner.c:721
#2  0x00007f9831fb0ad9 in CreateDistributedInsertSelectPlan (originalQuery=originalQuery@entry=0x564ba1740448, 
    plannerRestrictionContext=plannerRestrictionContext@entry=0x564ba16b1d18) at planner/insert_select_planner.c:292
#3  0x00007f9831fb0f5e in CreateInsertSelectPlanInternal (planId=planId@entry=1, originalQuery=originalQuery@entry=0x564ba1740448, 
    plannerRestrictionContext=plannerRestrictionContext@entry=0x564ba16b1d18, boundParams=boundParams@entry=0x0) at planner/insert_select_planner.c:244
#4  0x00007f9831fb104c in CreateInsertSelectPlan (planId=planId@entry=1, originalQuery=originalQuery@entry=0x564ba1740448, 
    plannerRestrictionContext=plannerRestrictionContext@entry=0x564ba16b1d18, boundParams=boundParams@entry=0x0) at planner/insert_select_planner.c:211
#5  0x00007f9831faaa91 in CreateDistributedPlan (planId=planId@entry=1, allowRecursivePlanning=allowRecursivePlanning@entry=true, originalQuery=0x564ba1740448, 
    query=0x564ba14e1100, boundParams=0x0, hasUnresolvedParams=hasUnresolvedParams@entry=false, plannerRestrictionContext=0x564ba16b1d18) at planner/distributed_planner.c:995
#6  0x00007f9831fab68d in CreateDistributedPlannedStmt (planContext=planContext@entry=0x7fff9c820970) at planner/distributed_planner.c:794
#7  0x00007f9831fabaa9 in PlanDistributedStmt (planContext=planContext@entry=0x7fff9c820970, rteIdCounter=<optimized out>, rteIdCounter@entry=3)
    at planner/distributed_planner.c:730
#8  0x00007f9831fac536 in distributed_planner (parse=0x564ba14e1100, query_string=<optimized out>, cursorOptions=<optimized out>, boundParams=0x0)
    at planner/distributed_planner.c:264
#9  0x00007f9830d1f0e3 in PgmongoPlanner (parse=0x564ba14e1100, queryString=0x564ba14e0100 "insert into s2 select * from s3;", cursorOptions=2048, boundParams=0x0)
    at src/planner/pgmongo_planner.c:111
#10 0x00007f9831eac79d in pgss_planner (parse=0x564ba14e1100, query_string=0x564ba14e0100 "insert into s2 select * from s3;", cursorOptions=2048, boundParams=0x0)
    at pg_stat_statements.c:953
#11 0x0000564ba0756b3a in planner (parse=parse@entry=0x564ba14e1100, query_string=query_string@entry=0x564ba14e0100 "insert into s2 select * from s3;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:275
#12 0x0000564ba084268c in pg_plan_query (querytree=querytree@entry=0x564ba14e1100, query_string=query_string@entry=0x564ba14e0100 "insert into s2 select * from s3;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:883
#13 0x0000564ba0842768 in pg_plan_queries (querytrees=0x564ba1740398, query_string=query_string@entry=0x564ba14e0100 "insert into s2 select * from s3;", 
    cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:975
#14 0x0000564ba0842c99 in exec_simple_query (query_string=query_string@entry=0x564ba14e0100 "insert into s2 select * from s3;") at postgres.c:1169
#15 0x0000564ba0844d14 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4593
#16 0x0000564ba079d456 in BackendRun (port=port@entry=0x564ba1581250) at postmaster.c:4511
#17 0x0000564ba07a064b in BackendStartup (port=port@entry=0x564ba1581250) at postmaster.c:4239
#18 0x0000564ba07a0894 in ServerLoop () at postmaster.c:1806
#19 0x0000564ba07a1e66 in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1478
#20 0x0000564ba06e289c in main (argc=3, argv=0x564ba14d8d00) at main.c:202

Added a temporary check to prevent crashing: https://github.com/citusdata/citus/blob/create-null-shard-key-tbl/src/backend/distributed/planner/insert_select_planner.c#L718-L724

@onurctirtir
Copy link
Member Author

fwiw, this errors nicely:

CREATE OR REPLACE FUNCTION test(x int)
RETURNS void
AS $$
DECLARE
BEGIN
    RAISE NOTICE 'x %', x;
END;
$$ LANGUAGE plpgsql;
SELECT create_distributed_function('test(int)', 'x', colocate_with :='s4');
ERROR:  cannot colocate function "test" and table "s4" because colocate_with option is only supported for hash distributed tables and reference tables.

(we should think a bit about function call delegation for single shard tables, but not right now)

Pushed some changes to throw an error when the shard key is NULL, though we might want to add support for create_distributed_table_concurrently() later on.

@onurctirtir onurctirtir requested a review from marcocitus March 15, 2023 12:00
@marcocitus
Copy link
Member

apart from 2 latest comments, this looks pretty good to me

@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch from 7b94697 to e16df14 Compare March 17, 2023 09:59
@onurctirtir onurctirtir force-pushed the create-null-shard-key-tbl branch from e16df14 to fbb8545 Compare March 17, 2023 11:12
@onurctirtir onurctirtir changed the base branch from main to null-shard-key-devel-protected March 17, 2023 11:15
@onurctirtir onurctirtir merged commit a4cfc32 into null-shard-key-devel-protected Mar 17, 2023
@onurctirtir onurctirtir deleted the create-null-shard-key-tbl branch March 17, 2023 11:17
onurctirtir added a commit that referenced this pull request Mar 27, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
onurctirtir added a commit that referenced this pull request Apr 17, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
onurctirtir added a commit that referenced this pull request Apr 18, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
onurctirtir added a commit that referenced this pull request Apr 18, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
onurctirtir added a commit that referenced this pull request Apr 18, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
onurctirtir added a commit that referenced this pull request Apr 19, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
onurctirtir added a commit that referenced this pull request Apr 24, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
onurctirtir added a commit that referenced this pull request Apr 26, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
hanefi added a commit that referenced this pull request May 2, 2023
In this release, I tried something different. I experimented with adding
the PR number and title to the changelog right before each changelog
entry. This way, it is easier to track where a particular changelog
entry comes from. After reviews are over, I plan to remove those lines
with PR numbers and titles.

I went through all the PRs that are merged after 11.2.0 release and came
up with a list of PRs that may need help with changelog entries. You can
see details on PRs grouped in several sections below.

## PRs with missing entries

The following PRs below do not have a changelog entry. If you think that
this is a mistake, please share it in this PR along with a suggestion on
what the changelog item should be.

PR #6846 : fix 3 flaky tests in failure schedule
PR #6844 : Add CPU usage to citus_stat_tenants
PR #6833 : Fix citus_stat_tenants period updating bug
PR #6787 : Add more tests for ddl coverage
PR #6842 : Add build-cdc-* temporary directories to .gitignore
PR #6841 : Add build-cdc-* temporary directories to .gitignore
PR #6840 : Bump Citus to 12.0devel
PR #6824 : Fixes flakiness in multi_metadata_sync test
PR #6811 : Backport identity column improvements to v11.2
PR #6830 : In run_test.py actually return worker_count
PR #6825 : Fixes flakiness in multi_cluster_management test
PR #6816 : Refactor run_test.py
PR #6817 : Explicitly disallow local rels when inserting into dist table
PR #6821 : Rename citus stats tenants
PR #6822 : Add some more tests for initial sql support
PR #6819 : Fix flakyness in
citus_split_shard_by_split_points_deferred_drop
PR #6814 : Make python-regress based tests runnable with run_test.py
PR #6813 : Fix flaky multi_mx_schema_support test
PR #6720 : Convert columnar tap tests to pytest
PR #6812 : Revoke statistics permissions from public and grant them to
pg_monitor
PR #6769 : Citus stats tenants guc
PR #6807 : Fix the incorrect (constant) value passed to pointer-to-bool
parameter, pass a NULL as the value is not used
PR #6797 : Attribute local queries and cached plans on local execution
PR #6796 : Parse the annotation string correctly
PR #6762 : Add logs to citus_stats_tenants
PR #6773 : Add initial sql support for distributed tables that don't
have a shard key
PR #6792 : Disentangle MERGE planning code from the modify-planning code
path
PR #6761 : Citus stats tenants collector view
PR #6791 : Make 8 more tests runnable multiple times via run_test.py
PR #6786 : Refactor some of the planning code to accommodate a new
planning path for MERGE SQL
PR #6789 : Rename AllRelations.. functions to AllDistributedRelations..
PR #6788 : Actually skip arbitrary_configs_router & nested_execution for
AllNullDistKeyDefaultConfig
PR #6783 : Add a config for arbitrary config tests where all the tables
are null-shard-key tables
PR #6784 : Fix attach partition: citus local to null distributed
PR #6782 : Add an arbitrary config test heavily based on
multi_router_planner_fast_path.sql
PR #6781 : Decide what to do with router planner error at one place
PR #6778 : Support partitioning for dist tables with null dist keys
PR #6766 : fix pip lock file
PR #6764 : Make workerCount configurable for regression tests
PR #6745 : Add support for creating distributed tables with a null shard
key
PR #6696 : This implements MERGE phase-III
PR #6767 : Add pytest depedencies to Pipfile
PR #6760 : Decide core distribution params in CreateCitusTable
PR #6759 : Add multi_create_fdw into minimal_schedule
PR #6743 : Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with
HasDistributionKey()
PR #6751 : Stabilize single_node.sql and others that report illegal node
removal
PR #6742 : Refactor CreateDistributedTable()
PR #6747 : Remove unused lock functions
PR #6744 : Fix multiple output version arbitrary config tests
PR #6741 : Stabilize single node tests
PR #6740 : Fix string eval bug in migration files check
PR #6736 : Make run_test.py and create_test.py importable without errors
PR #6734 : Don't blanket ignore flake8 E402 error
PR #6737 : Fixes bookworm packaging pipeline problem
PR #6735 : Fix run_test.py on python 3.9
PR #6733 : MERGE: In deparser, add missing check for RETURNING clause.
PR #6714 : Remove auto_explain workaround in citus explain hook for
ALTER TABLE
PR #6719 : Fix flaky test
PR #6718 : Add more powerfull dependency tracking to run_test.py
PR #6710 : Install non-vulnerable cryptography package
PR #6711 : Support compilation and run tests on latest PG versions
PR #6700 : Add auto-formatting and linting to our python code
PR #6707 : Allow multi_insert_select to run repeatably
PR #6708 : Fix flakyness in failure_create_distributed_table_non_empty
PR #6698 : Miscellaneous cleanup
PR #6704 : Update README for 11.2
PR #6703 : Fix dubious ownership error from git
PR #6690 : Bump Citus to 11.3devel

## Too long changelog entries

The following PRs have changelog entries that are too long to fit in a
single line. I'd expect authors to supply at changelog entries in
`DESCRIPTION:` lines that are at most 78 characters. If you want to
supply multi-line changelog items, you can have multiple lines that
start with `DESCRIPTION:` instead.

PR #6837 : fixes update propagation bug when
`citus_set_coordinator_host` is called more than once
PR #6738 :  Identity column implementation refactorings
PR #6756 : Schedule parallel shard moves in background rebalancer by
removing task dependencies between shard moves across colocation groups.
PR #6793 : Add a GUC to disallow planning the queries that reference
non-colocated tables via router planner
PR #6726 : fix memory leak during altering distributed table with a lot
of partition and shards
PR #6722 : fix memory leak during distribution of a table with a lot of
partitions
PR #6693 : prevent memory leak during ConvertTable with a lot of
partitions

## Empty changelog entries.

The following PR had an empty `DESCRIPTION:` line. This generates an
empty changelog line that needs to be removed manually. Please either
provide a short entry, or remove `DESCRIPTION:` line completely.

PR #6810 : Make CDC decoder an independent extension
PR #6827 : Makefile changes to build CDC in builddir for pgoutput and
wal2json.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
hanefi added a commit that referenced this pull request May 2, 2023
In this release, I tried something different. I experimented with adding
the PR number and title to the changelog right before each changelog
entry. This way, it is easier to track where a particular changelog
entry comes from. After reviews are over, I plan to remove those lines
with PR numbers and titles.

I went through all the PRs that are merged after 11.2.0 release and came
up with a list of PRs that may need help with changelog entries. You can
see details on PRs grouped in several sections below.

The following PRs below do not have a changelog entry. If you think that
this is a mistake, please share it in this PR along with a suggestion on
what the changelog item should be.

PR #6846 : fix 3 flaky tests in failure schedule
PR #6844 : Add CPU usage to citus_stat_tenants
PR #6833 : Fix citus_stat_tenants period updating bug
PR #6787 : Add more tests for ddl coverage
PR #6842 : Add build-cdc-* temporary directories to .gitignore
PR #6841 : Add build-cdc-* temporary directories to .gitignore
PR #6840 : Bump Citus to 12.0devel
PR #6824 : Fixes flakiness in multi_metadata_sync test
PR #6811 : Backport identity column improvements to v11.2
PR #6830 : In run_test.py actually return worker_count
PR #6825 : Fixes flakiness in multi_cluster_management test
PR #6816 : Refactor run_test.py
PR #6817 : Explicitly disallow local rels when inserting into dist table
PR #6821 : Rename citus stats tenants
PR #6822 : Add some more tests for initial sql support
PR #6819 : Fix flakyness in
citus_split_shard_by_split_points_deferred_drop
PR #6814 : Make python-regress based tests runnable with run_test.py
PR #6813 : Fix flaky multi_mx_schema_support test
PR #6720 : Convert columnar tap tests to pytest
PR #6812 : Revoke statistics permissions from public and grant them to
pg_monitor
PR #6769 : Citus stats tenants guc
PR #6807 : Fix the incorrect (constant) value passed to pointer-to-bool
parameter, pass a NULL as the value is not used
PR #6797 : Attribute local queries and cached plans on local execution
PR #6796 : Parse the annotation string correctly
PR #6762 : Add logs to citus_stats_tenants
PR #6773 : Add initial sql support for distributed tables that don't
have a shard key
PR #6792 : Disentangle MERGE planning code from the modify-planning code
path
PR #6761 : Citus stats tenants collector view
PR #6791 : Make 8 more tests runnable multiple times via run_test.py
PR #6786 : Refactor some of the planning code to accommodate a new
planning path for MERGE SQL
PR #6789 : Rename AllRelations.. functions to AllDistributedRelations..
PR #6788 : Actually skip arbitrary_configs_router & nested_execution for
AllNullDistKeyDefaultConfig
PR #6783 : Add a config for arbitrary config tests where all the tables
are null-shard-key tables
PR #6784 : Fix attach partition: citus local to null distributed
PR #6782 : Add an arbitrary config test heavily based on
multi_router_planner_fast_path.sql
PR #6781 : Decide what to do with router planner error at one place
PR #6778 : Support partitioning for dist tables with null dist keys
PR #6766 : fix pip lock file
PR #6764 : Make workerCount configurable for regression tests
PR #6745 : Add support for creating distributed tables with a null shard
key
PR #6696 : This implements MERGE phase-III
PR #6767 : Add pytest depedencies to Pipfile
PR #6760 : Decide core distribution params in CreateCitusTable
PR #6759 : Add multi_create_fdw into minimal_schedule
PR #6743 : Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with
HasDistributionKey()
PR #6751 : Stabilize single_node.sql and others that report illegal node
removal
PR #6742 : Refactor CreateDistributedTable()
PR #6747 : Remove unused lock functions
PR #6744 : Fix multiple output version arbitrary config tests
PR #6741 : Stabilize single node tests
PR #6740 : Fix string eval bug in migration files check
PR #6736 : Make run_test.py and create_test.py importable without errors
PR #6734 : Don't blanket ignore flake8 E402 error
PR #6737 : Fixes bookworm packaging pipeline problem
PR #6735 : Fix run_test.py on python 3.9
PR #6733 : MERGE: In deparser, add missing check for RETURNING clause.
PR #6714 : Remove auto_explain workaround in citus explain hook for
ALTER TABLE
PR #6719 : Fix flaky test
PR #6718 : Add more powerfull dependency tracking to run_test.py
PR #6710 : Install non-vulnerable cryptography package
PR #6711 : Support compilation and run tests on latest PG versions
PR #6700 : Add auto-formatting and linting to our python code
PR #6707 : Allow multi_insert_select to run repeatably
PR #6708 : Fix flakyness in failure_create_distributed_table_non_empty
PR #6698 : Miscellaneous cleanup
PR #6704 : Update README for 11.2
PR #6703 : Fix dubious ownership error from git
PR #6690 : Bump Citus to 11.3devel

The following PRs have changelog entries that are too long to fit in a
single line. I'd expect authors to supply at changelog entries in
`DESCRIPTION:` lines that are at most 78 characters. If you want to
supply multi-line changelog items, you can have multiple lines that
start with `DESCRIPTION:` instead.

PR #6837 : fixes update propagation bug when
`citus_set_coordinator_host` is called more than once
PR #6738 :  Identity column implementation refactorings
PR #6756 : Schedule parallel shard moves in background rebalancer by
removing task dependencies between shard moves across colocation groups.
PR #6793 : Add a GUC to disallow planning the queries that reference
non-colocated tables via router planner
PR #6726 : fix memory leak during altering distributed table with a lot
of partition and shards
PR #6722 : fix memory leak during distribution of a table with a lot of
partitions
PR #6693 : prevent memory leak during ConvertTable with a lot of
partitions

The following PR had an empty `DESCRIPTION:` line. This generates an
empty changelog line that needs to be removed manually. Please either
provide a short entry, or remove `DESCRIPTION:` line completely.

PR #6810 : Make CDC decoder an independent extension
PR #6827 : Makefile changes to build CDC in builddir for pgoutput and
wal2json.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
(cherry picked from commit 9344300)
hanefi added a commit that referenced this pull request May 2, 2023
In this release, I tried something different. I experimented with adding
the PR number and title to the changelog right before each changelog
entry. This way, it is easier to track where a particular changelog
entry comes from. After reviews are over, I plan to remove those lines
with PR numbers and titles.

I went through all the PRs that are merged after 11.2.0 release and came
up with a list of PRs that may need help with changelog entries. You can
see details on PRs grouped in several sections below.

The following PRs below do not have a changelog entry. If you think that
this is a mistake, please share it in this PR along with a suggestion on
what the changelog item should be.

PR #6846 : fix 3 flaky tests in failure schedule
PR #6844 : Add CPU usage to citus_stat_tenants
PR #6833 : Fix citus_stat_tenants period updating bug
PR #6787 : Add more tests for ddl coverage
PR #6842 : Add build-cdc-* temporary directories to .gitignore
PR #6841 : Add build-cdc-* temporary directories to .gitignore
PR #6840 : Bump Citus to 12.0devel
PR #6824 : Fixes flakiness in multi_metadata_sync test
PR #6811 : Backport identity column improvements to v11.2
PR #6830 : In run_test.py actually return worker_count
PR #6825 : Fixes flakiness in multi_cluster_management test
PR #6816 : Refactor run_test.py
PR #6817 : Explicitly disallow local rels when inserting into dist table
PR #6821 : Rename citus stats tenants
PR #6822 : Add some more tests for initial sql support
PR #6819 : Fix flakyness in
citus_split_shard_by_split_points_deferred_drop
PR #6814 : Make python-regress based tests runnable with run_test.py
PR #6813 : Fix flaky multi_mx_schema_support test
PR #6720 : Convert columnar tap tests to pytest
PR #6812 : Revoke statistics permissions from public and grant them to
pg_monitor
PR #6769 : Citus stats tenants guc
PR #6807 : Fix the incorrect (constant) value passed to pointer-to-bool
parameter, pass a NULL as the value is not used
PR #6797 : Attribute local queries and cached plans on local execution
PR #6796 : Parse the annotation string correctly
PR #6762 : Add logs to citus_stats_tenants
PR #6773 : Add initial sql support for distributed tables that don't
have a shard key
PR #6792 : Disentangle MERGE planning code from the modify-planning code
path
PR #6761 : Citus stats tenants collector view
PR #6791 : Make 8 more tests runnable multiple times via run_test.py
PR #6786 : Refactor some of the planning code to accommodate a new
planning path for MERGE SQL
PR #6789 : Rename AllRelations.. functions to AllDistributedRelations..
PR #6788 : Actually skip arbitrary_configs_router & nested_execution for
AllNullDistKeyDefaultConfig
PR #6783 : Add a config for arbitrary config tests where all the tables
are null-shard-key tables
PR #6784 : Fix attach partition: citus local to null distributed
PR #6782 : Add an arbitrary config test heavily based on
multi_router_planner_fast_path.sql
PR #6781 : Decide what to do with router planner error at one place
PR #6778 : Support partitioning for dist tables with null dist keys
PR #6766 : fix pip lock file
PR #6764 : Make workerCount configurable for regression tests
PR #6745 : Add support for creating distributed tables with a null shard
key
PR #6696 : This implements MERGE phase-III
PR #6767 : Add pytest depedencies to Pipfile
PR #6760 : Decide core distribution params in CreateCitusTable
PR #6759 : Add multi_create_fdw into minimal_schedule
PR #6743 : Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with
HasDistributionKey()
PR #6751 : Stabilize single_node.sql and others that report illegal node
removal
PR #6742 : Refactor CreateDistributedTable()
PR #6747 : Remove unused lock functions
PR #6744 : Fix multiple output version arbitrary config tests
PR #6741 : Stabilize single node tests
PR #6740 : Fix string eval bug in migration files check
PR #6736 : Make run_test.py and create_test.py importable without errors
PR #6734 : Don't blanket ignore flake8 E402 error
PR #6737 : Fixes bookworm packaging pipeline problem
PR #6735 : Fix run_test.py on python 3.9
PR #6733 : MERGE: In deparser, add missing check for RETURNING clause.
PR #6714 : Remove auto_explain workaround in citus explain hook for
ALTER TABLE
PR #6719 : Fix flaky test
PR #6718 : Add more powerfull dependency tracking to run_test.py
PR #6710 : Install non-vulnerable cryptography package
PR #6711 : Support compilation and run tests on latest PG versions
PR #6700 : Add auto-formatting and linting to our python code
PR #6707 : Allow multi_insert_select to run repeatably
PR #6708 : Fix flakyness in failure_create_distributed_table_non_empty
PR #6698 : Miscellaneous cleanup
PR #6704 : Update README for 11.2
PR #6703 : Fix dubious ownership error from git
PR #6690 : Bump Citus to 11.3devel

The following PRs have changelog entries that are too long to fit in a
single line. I'd expect authors to supply at changelog entries in
`DESCRIPTION:` lines that are at most 78 characters. If you want to
supply multi-line changelog items, you can have multiple lines that
start with `DESCRIPTION:` instead.

PR #6837 : fixes update propagation bug when
`citus_set_coordinator_host` is called more than once
PR #6738 :  Identity column implementation refactorings
PR #6756 : Schedule parallel shard moves in background rebalancer by
removing task dependencies between shard moves across colocation groups.
PR #6793 : Add a GUC to disallow planning the queries that reference
non-colocated tables via router planner
PR #6726 : fix memory leak during altering distributed table with a lot
of partition and shards
PR #6722 : fix memory leak during distribution of a table with a lot of
partitions
PR #6693 : prevent memory leak during ConvertTable with a lot of
partitions

The following PR had an empty `DESCRIPTION:` line. This generates an
empty changelog line that needs to be removed manually. Please either
provide a short entry, or remove `DESCRIPTION:` line completely.

PR #6810 : Make CDC decoder an independent extension
PR #6827 : Makefile changes to build CDC in builddir for pgoutput and
wal2json.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
(cherry picked from commit 9344300)
onurctirtir added a commit that referenced this pull request May 3, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
emelsimsek pushed a commit that referenced this pull request May 24, 2023
In this release, I tried something different. I experimented with adding
the PR number and title to the changelog right before each changelog
entry. This way, it is easier to track where a particular changelog
entry comes from. After reviews are over, I plan to remove those lines
with PR numbers and titles.

I went through all the PRs that are merged after 11.2.0 release and came
up with a list of PRs that may need help with changelog entries. You can
see details on PRs grouped in several sections below.

## PRs with missing entries

The following PRs below do not have a changelog entry. If you think that
this is a mistake, please share it in this PR along with a suggestion on
what the changelog item should be.

PR #6846 : fix 3 flaky tests in failure schedule
PR #6844 : Add CPU usage to citus_stat_tenants
PR #6833 : Fix citus_stat_tenants period updating bug
PR #6787 : Add more tests for ddl coverage
PR #6842 : Add build-cdc-* temporary directories to .gitignore
PR #6841 : Add build-cdc-* temporary directories to .gitignore
PR #6840 : Bump Citus to 12.0devel
PR #6824 : Fixes flakiness in multi_metadata_sync test
PR #6811 : Backport identity column improvements to v11.2
PR #6830 : In run_test.py actually return worker_count
PR #6825 : Fixes flakiness in multi_cluster_management test
PR #6816 : Refactor run_test.py
PR #6817 : Explicitly disallow local rels when inserting into dist table
PR #6821 : Rename citus stats tenants
PR #6822 : Add some more tests for initial sql support
PR #6819 : Fix flakyness in
citus_split_shard_by_split_points_deferred_drop
PR #6814 : Make python-regress based tests runnable with run_test.py
PR #6813 : Fix flaky multi_mx_schema_support test
PR #6720 : Convert columnar tap tests to pytest
PR #6812 : Revoke statistics permissions from public and grant them to
pg_monitor
PR #6769 : Citus stats tenants guc
PR #6807 : Fix the incorrect (constant) value passed to pointer-to-bool
parameter, pass a NULL as the value is not used
PR #6797 : Attribute local queries and cached plans on local execution
PR #6796 : Parse the annotation string correctly
PR #6762 : Add logs to citus_stats_tenants
PR #6773 : Add initial sql support for distributed tables that don't
have a shard key
PR #6792 : Disentangle MERGE planning code from the modify-planning code
path
PR #6761 : Citus stats tenants collector view
PR #6791 : Make 8 more tests runnable multiple times via run_test.py
PR #6786 : Refactor some of the planning code to accommodate a new
planning path for MERGE SQL
PR #6789 : Rename AllRelations.. functions to AllDistributedRelations..
PR #6788 : Actually skip arbitrary_configs_router & nested_execution for
AllNullDistKeyDefaultConfig
PR #6783 : Add a config for arbitrary config tests where all the tables
are null-shard-key tables
PR #6784 : Fix attach partition: citus local to null distributed
PR #6782 : Add an arbitrary config test heavily based on
multi_router_planner_fast_path.sql
PR #6781 : Decide what to do with router planner error at one place
PR #6778 : Support partitioning for dist tables with null dist keys
PR #6766 : fix pip lock file
PR #6764 : Make workerCount configurable for regression tests
PR #6745 : Add support for creating distributed tables with a null shard
key
PR #6696 : This implements MERGE phase-III
PR #6767 : Add pytest depedencies to Pipfile
PR #6760 : Decide core distribution params in CreateCitusTable
PR #6759 : Add multi_create_fdw into minimal_schedule
PR #6743 : Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with
HasDistributionKey()
PR #6751 : Stabilize single_node.sql and others that report illegal node
removal
PR #6742 : Refactor CreateDistributedTable()
PR #6747 : Remove unused lock functions
PR #6744 : Fix multiple output version arbitrary config tests
PR #6741 : Stabilize single node tests
PR #6740 : Fix string eval bug in migration files check
PR #6736 : Make run_test.py and create_test.py importable without errors
PR #6734 : Don't blanket ignore flake8 E402 error
PR #6737 : Fixes bookworm packaging pipeline problem
PR #6735 : Fix run_test.py on python 3.9
PR #6733 : MERGE: In deparser, add missing check for RETURNING clause.
PR #6714 : Remove auto_explain workaround in citus explain hook for
ALTER TABLE
PR #6719 : Fix flaky test
PR #6718 : Add more powerfull dependency tracking to run_test.py
PR #6710 : Install non-vulnerable cryptography package
PR #6711 : Support compilation and run tests on latest PG versions
PR #6700 : Add auto-formatting and linting to our python code
PR #6707 : Allow multi_insert_select to run repeatably
PR #6708 : Fix flakyness in failure_create_distributed_table_non_empty
PR #6698 : Miscellaneous cleanup
PR #6704 : Update README for 11.2
PR #6703 : Fix dubious ownership error from git
PR #6690 : Bump Citus to 11.3devel

## Too long changelog entries

The following PRs have changelog entries that are too long to fit in a
single line. I'd expect authors to supply at changelog entries in
`DESCRIPTION:` lines that are at most 78 characters. If you want to
supply multi-line changelog items, you can have multiple lines that
start with `DESCRIPTION:` instead.

PR #6837 : fixes update propagation bug when
`citus_set_coordinator_host` is called more than once
PR #6738 :  Identity column implementation refactorings
PR #6756 : Schedule parallel shard moves in background rebalancer by
removing task dependencies between shard moves across colocation groups.
PR #6793 : Add a GUC to disallow planning the queries that reference
non-colocated tables via router planner
PR #6726 : fix memory leak during altering distributed table with a lot
of partition and shards
PR #6722 : fix memory leak during distribution of a table with a lot of
partitions
PR #6693 : prevent memory leak during ConvertTable with a lot of
partitions

## Empty changelog entries.

The following PR had an empty `DESCRIPTION:` line. This generates an
empty changelog line that needs to be removed manually. Please either
provide a short entry, or remove `DESCRIPTION:` line completely.

PR #6810 : Make CDC decoder an independent extension
PR #6827 : Makefile changes to build CDC in builddir for pgoutput and
wal2json.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
emelsimsek pushed a commit that referenced this pull request May 24, 2023
)

With this PR, we allow creating distributed tables with without
specifying a shard key via create_distributed_table(). Here are the
the important details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables as "null shard-key" table in code /
comments.
* To avoid doing a breaking layout change in create_distributed_table();
instead of throwing an error, it will inform the user that
`distribution_type`
  param is ignored unless it's explicitly set to NULL or  'h'.
* `colocate_with` param allows colocating such null shard-key tables to
  each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants