Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Column order is not what user define when create table #187

Closed
jiacai2050 opened this issue Aug 11, 2022 · 7 comments · Fixed by #340
Closed

Column order is not what user define when create table #187

jiacai2050 opened this issue Aug 11, 2022 · 7 comments · Fixed by #340
Assignees
Labels
A-SQL Area: SQL layer bug Something isn't working

Comments

@jiacai2050
Copy link
Contributor

jiacai2050 commented Aug 11, 2022

Describe this problem

Table' columns order is not what user define when create table

Steps to reproduce

Execute those SQL

CREATE TABLE test_order (
    host string tag,
    v2 int,
    ts timestamp NOT NULL,
    v1 int,
    timestamp KEY (ts)
)ENGINE = Analytic WITH (
    enable_ttl = 'false'
);

desc test_order;

desc will output

ts	timestamp	1	0	0
tsid	uint64	1	0	0
host	string	0	1	1
v1	int	0	1	1
v2	int	0	1	0

Expected behavior

The column order should be same with the order when created.

Additional Information

https://github.com/CeresDB/ceresdb/blob/3e826ffbb98d2c5351e63b28dd9dd4644e7d4c1e/sql/src/planner.rs#L265

The root cause is currently we use a BTreeMap when build create table plan, it should use a vector to keep the ordering

ref #154

@jiacai2050 jiacai2050 added bug Something isn't working A-SQL Area: SQL layer labels Aug 11, 2022
@jiacai2050
Copy link
Contributor Author

After some thoughts, I find to fix this issue is a non-trivial task.

Currently CeresDB will put primary key columns at the front, and use a num_key_columns in Schema to indicate how many columns are in primary key, so if we keep the original column ordering, then we need more info to know which column belongs to primary key.

@dust1
Copy link
Contributor

dust1 commented Sep 15, 2022

@jiacai2050
Copy link
Contributor Author

We can organize columns like what you said, but the more tricky issue is the order is used for primary key identification.

The first num_key_columns columns are primary key, if we keep the original order, then we need to keep some info to tell which columns are primary key

@dust1
Copy link
Contributor

dust1 commented Sep 15, 2022

This looks like the BTree needs to be removed, will it infect the storage layer?

@jiacai2050
Copy link
Contributor Author

will it infect the storage layer?

I think it will.

The rows within one table will be ordered by primary key. For example, the key is composed of primary key + sequence in memtable

If we want to fix this, we can remove num_key_columns() method in schema, and replace all its usage with key_columns().

FYI, num_key_columns is only used in a few places, so this change may not be very large.
image

@dust1
Copy link
Contributor

dust1 commented Sep 18, 2022

maybe I can try to fix this, can it be assigned to me?

@jiacai2050
Copy link
Contributor Author

Sure, go ahead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-SQL Area: SQL layer bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants