importccl: UUID clash while importing data #61203
Labels
A-disaster-recovery
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-community
Originated from the community
X-blathers-triaged
blathers was able to find an owner
Describe the problem
I am evaluating crdb for our business needs. I was trying to migrate from an existing database into crdb. Import worked seamlessly, using a similar approach described below, for smaller tables but generated uuid clash when importing big tables.
To Reproduce
My existing table schema is:
create table T1
(
id bigint primary key not null default autoincrement,
...
<around 50 other fields>
);
This table has close to 10 million records. For performance reasons, crdb recommends using UUID or a combination of other fields as the primary key is used for partitioning. So I changed the schema to:
create table T1
(
id uuid primary key not null default gen_randon_uuid(),
oldid bigint unique not null,
...
);
I then used import statement to import the data from csv file (unloaded from my existing db) into cockroach. The import failed with duplicate primary index value. I tried splitting my csv file in 20 smaller chunks. I still got the collision.
To overcome this, I changed the schema to:
create table T1
(
id uuid null,
oldid bigint unique not null,
...
);
Now the import complete in 5 minutes. This is faster than I can load the data in my existing database. Bravo!
But now I had to fix id field so I ran:
UPDATE T1 SET id = gen_random_uuid();
This statement did not finish execution in 12 hours or so... I cancelled it.
My only workaround is to unload from existing database using a special select statement that generates the unique uuid for me. I have tried it and it works.
Expected behavior
I should be able to load the table and crdb generates unique primary keys for me. The workaround I am using is not feasible in all the scenarios that I need to support.
Additional context
What was the impact?
I have to import 200+ tables and ~20 tables are huge. I wonder how many custom unloads will I have to do.
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered: