Migrate texera database from mysql to postgres #3254

Merged
merged 25 commits into master from shengquan-migrate-to-postgres
Feb 21, 2025

@shengquan-ni shengquan-ni commented Feb 10, 2025

Overall Direction

This pull request removes all MySQL dependencies and transitions the Texera platform to rely on PostgreSQL.

Key Code Changes

Type Changes

  1. org.jooq.types.UInteger was replaced with Integer (java.lang.Integer in Scala code), since PostgreSQL has no unsigned integer types.
  2. TINYINT(1) columns (often used as booleans in MySQL) were changed to proper BOOLEAN columns.
  3. Status fields switched from Byte to Short or enum-based values, depending on what jOOQ code generation produces for the new schema.
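
As a rough illustration of these type changes at the DDL level, the sketch below prints one table definition in a MySQL form and in a PostgreSQL form. The table and column names (`example_item`, `is_public`, `status`) are hypothetical and are not taken from texera_ddl.sql. The mapping shown follows from PostgreSQL having neither unsigned integers nor a TINYINT type; the actual schema in this PR may differ.

```shell
# Hypothetical example: table and column names are illustrative only.

# Print a MySQL-style definition (before the migration).
mysql_ddl() {
  cat <<'SQL'
CREATE TABLE example_item (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, -- unsigned, UInteger in jOOQ
  is_public TINYINT(1)   NOT NULL DEFAULT 0,                  -- boolean by convention
  status    TINYINT      NOT NULL DEFAULT 0                   -- small status code
);
SQL
}

# Print a PostgreSQL-style definition (after the migration).
postgres_ddl() {
  cat <<'SQL'
CREATE TABLE example_item (
  id        SERIAL   PRIMARY KEY,                -- signed 32-bit integer
  is_public BOOLEAN  NOT NULL DEFAULT FALSE,     -- real boolean type
  status    SMALLINT NOT NULL DEFAULT 0          -- smallest integer type available
);
SQL
}
```

Calling `mysql_ddl` and `postgres_ddl` side by side makes the three changes in the list easy to compare.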

Updated jOOQ

  1. Regenerated all classes under dao/jooq/generated to reflect the new PostgreSQL schema.
  2. Upgraded the jOOQ library from 3.14.16 to 3.16.6 to pick up bug fixes in generated SQL queries.
  3. Upgraded jOOQ codegen from 3.12.4 to 3.16.10 to pick up bug fixes in DAO file generation.

Query Adjustments

  1. Rewrote the full-text search queries that previously used MySQL-specific syntax.
  2. Converted a few raw SQL strings to jOOQ queries for better maintainability.
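
To make the full-text search rewrite concrete, here is a minimal sketch of what such a translation can look like. It is not the query used in this PR: the `workflow` table, its `wid`, `name`, and `description` columns, and the `texera_db` database name are assumptions for illustration. The MySQL form uses `MATCH ... AGAINST`, which PostgreSQL does not support; the PostgreSQL form uses the built-in `to_tsvector` and `plainto_tsquery` functions with the `@@` match operator.

```shell
# Hypothetical example: table, column, and database names are assumptions.
#
# MySQL-specific form (no longer valid after the migration):
#   SELECT wid, name FROM workflow
#   WHERE MATCH(name, description) AGAINST ('some term' IN BOOLEAN MODE);

# Print a PostgreSQL query that performs a comparable full-text match.
# :'term' is a psql variable, substituted as a safely quoted literal.
postgres_fulltext_sql() {
  cat <<'SQL'
SELECT wid, name
FROM workflow
WHERE to_tsvector('english', coalesce(name, '') || ' ' || coalesce(description, ''))
      @@ plainto_tsquery('english', :'term');
SQL
}

# Usage (requires a running PostgreSQL server):
#   postgres_fulltext_sql | psql -U postgres -d texera_db -v term='machine learning'
```

Passing the search term through `-v term=...` rather than pasting it into the SQL string lets psql handle the quoting.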

MockDB Changes

  1. MockDB now uses an embedded PostgreSQL database.

Migration Guide

  1. Before pulling the latest master, back up your workflows and datasets by downloading them if necessary.
  2. Pull the latest master and refresh the sbt project to load the updated SQL packages.
  3. Install PostgreSQL (version > 14) in your local dev environment.
  4. Use psql -U postgres to log in to PostgreSQL. If the psql command is not found and you are using macOS, follow this answer to fix the issue.
  5. Run \i /path/to/texera_ddl.sql to initialize the Texera DB.
  6. Enable the user system on both the backend and the frontend, then create a user for yourself.
  7. Re-upload your datasets and workflows if needed.
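
Steps 3 to 5 can also be scripted. The sketch below is one possible way to do it, not part of the PR: the helper names (`pg_major_version`, `check_pg_version`, `init_texera_db`) are made up here, the DDL path is the same placeholder as in step 5, and the version check takes "version > 14" literally.

```shell
# Sketch of steps 3-5; helper names are hypothetical.

# Extract the major version from `psql --version` output,
# e.g. "psql (PostgreSQL) 16.2" -> 16.
pg_major_version() {
  printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+).*$/\1/'
}

# Succeed only if the major version is greater than 14 (step 3, read literally).
check_pg_version() {
  [ "$(pg_major_version "$1")" -gt 14 ]
}

# Run the DDL file non-interactively; this mirrors `\i <file>` inside psql (step 5).
init_texera_db() {
  psql -U postgres -f "$1"
}

# Usage (replace the placeholder path with the real location of the DDL file):
#   check_pg_version "$(psql --version)" && init_texera_db /path/to/texera_ddl.sql
```

The functions only run when called, so sourcing the file has no side effects.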

@shengquan-ni shengquan-ni self-assigned this Feb 10, 2025
@shengquan-ni shengquan-ni added the enhancement and ddl-change (Changes to the TexeraDB DDL) labels Feb 10, 2025
@kunwp1 kunwp1 left a comment

LGTM! Left some minor comments.

@aglinxinyuan aglinxinyuan left a comment

LGTM!

Tested on my local machine.

@shengquan-ni shengquan-ni merged commit d7a163d into master Feb 21, 2025
8 checks passed
@shengquan-ni shengquan-ni deleted the shengquan-migrate-to-postgres branch February 21, 2025 05:17