refactor(migrate): speed up resource migration and add transaction control #243

Merged
merged 3 commits into from
Sep 7, 2023

Conversation


@yhilmare yhilmare commented Sep 7, 2023

What type of PR is this?

type-refactor

What this PR does / why we need it:

  1. The resource migration framework is inefficient, and migration takes a long time. This PR improves migration efficiency.
    Before refactor:
obclient> select type, sum(execution_millis) from migrate_schema_history group by type;
+----------+-----------------------+
| type     | sum(execution_millis) |
+----------+-----------------------+
| SQL      |                 36270 |
| JDBC     |                 77963 |
| RESOURCE |                 52505 |
+----------+-----------------------+
3 rows in set (0.00 sec)

After refactor:

obclient> select type, sum(execution_millis) from migrate_schema_history group by type;
+----------+-----------------------+
| type     | sum(execution_millis) |
+----------+-----------------------+
| SQL      |                 39402 |
| JDBC     |                 82199 |
| RESOURCE |                 19997 |
+----------+-----------------------+
3 rows in set (0.00 sec)

As the tables show, the time spent on RESOURCE migration drops from 52505 ms to 19997 ms, a reduction of about 62%.

  2. Adds transaction control to resource migration, so that an exception during migration does not leave the data in an inconsistent state.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

  1. Speed up resource migration. The resource migration framework previously executed three SQL statements to migrate each record:

    1. Query by unique keys to check whether the record already exists.
    2. Insert the record.
    3. Query by unique keys again to fill in the generated id.

    Steps 1 and 3 can be avoided.

    If a record allows duplicates, step 1 can be skipped: the record is inserted regardless of whether it already exists.

    The generated keys can be obtained directly through JDBC instead of being queried, which removes step 3.

  2. Added transaction control in ResourceMigrator
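
For illustration only, the two points above can be sketched with plain JDBC. This is not the actual `ResourceMigrator` code: the class name `ResourceInsertSketch`, the method `insertAll`, and the table `demo_resource(name)` are hypothetical, and it assumes the JDBC driver supports `Statement.RETURN_GENERATED_KEYS`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the ODC implementation.
final class ResourceInsertSketch {

    // Inserts each name into the assumed table demo_resource(name) inside a
    // single transaction and returns the ids generated by the database.
    static List<Long> insertAll(Connection conn, List<String> names) throws SQLException {
        String sql = "INSERT INTO demo_resource (name) VALUES (?)";
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // start of the transaction
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement ps =
                conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS)) {
            for (String name : names) {
                ps.setString(1, name);
                ps.executeUpdate();
                // Read the generated id from the driver instead of issuing
                // a second SELECT by unique keys.
                try (ResultSet keys = ps.getGeneratedKeys()) {
                    if (keys.next()) {
                        ids.add(keys.getLong(1));
                    }
                }
            }
            conn.commit();
            return ids;
        } catch (SQLException | RuntimeException e) {
            // Any failure undoes every insert made so far.
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

With this shape, a record that allows duplicates costs one statement instead of three, and a failure partway through the batch leaves no partially migrated rows behind.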

Additional documentation e.g., usage docs, etc.:


@yhilmare yhilmare added the type-refactor refactor code or rename variables label Sep 7, 2023
@yhilmare yhilmare added this to the ODC 4.2.1 milestone Sep 7, 2023
@yhilmare yhilmare self-assigned this Sep 7, 2023

@yizhouxw yizhouxw left a comment


👍👍
Nice improvement!

@yhilmare yhilmare merged commit ee12bf3 into dev/4.2.1 Sep 7, 2023
11 checks passed
@yhilmare yhilmare deleted the refactor/shanlu_speedup_migrate branch September 7, 2023 11:47
yhilmare added a commit that referenced this pull request Jan 15, 2024
…ntrol (#243)

* refactor(migrate): speed up resource migrator

* refactor(migrate): remove force allow duplicate