@Xiao-zhen-Liu Xiao-zhen-Liu commented Sep 29, 2025

Purpose

This PR enables the user system by default in the configuration. Currently, this flag is disabled by default (a.k.a. the non-user mode). Since no one uses the non-user mode and all developers are required to enable the user system, we have decided to abandon the non-user mode.

Challenge & Design

The major blocker to enabling the flag by default is two e2e test suites that rely on the non-user mode. Each test case in these two suites executes a workflow in the Amber engine. Enabling the user mode requires texera_db in the test environment because, in user-system mode, executing a workflow requires an eid (and subsequently a vid, wid, and uid) in texera_db.

We could use MockTexeraDB, which is currently used by many unit tests. MockTexeraDB creates an embedded postgres instance per test suite, and the embedded db is destroyed at the end of each such test suite.

However, a complication is that both e2e test suites access a singleton resource, WorkflowExecutionsResource, which caches the DSL context from SqlServer (i.e., it is evaluated only once per JVM):

 final private lazy val context = SqlServer
    .getInstance()
    .createDSLContext()

In fact, most of the singleton resources in our current codebase cache the DSLContext / Dao, as the DSLContext never gets updated in the real Texera environment (i.e., the real texera_db's address never changes).

In the test environment, however, that assumption does not hold when working with MockTexeraDB: each instance of MockTexeraDB has a different address and is destroyed before the next test suite runs. Since all the test suites are executed in the same JVM during a CI run, using MockTexeraDB would cause the 2nd of the two e2e test suites to fail, because it would still use the DSL context from the 1st test suite's MockTexeraDB.

The diagrams below show what happens when using the embedded MockTexeraDB to run two e2e test suites that both need to access the same singleton resource during their execution.

The 1st test suite creates an embedded DB (DB1) and lets the singleton SqlServer object set its DSLContext to point to DB1. When the test cases first access WorkflowExecutionsResource (WER), WER grabs the DSLContext from SqlServer and caches it. WER then queries DB1 for all the test cases of test suite 1. When test suite 1 finishes, DB1 gets destroyed.
[Diagram: DB and CI - 1]

Later, in the same JVM, when test suite 2 starts, it also creates its own embedded DB (DB2) and lets SqlServer point to DB2. However, because the DSLContext in WER is cached, it is not updated when the test cases access WER, so WER still points to DB1, which is already destroyed, and the test cases fail.
[Diagram: DB and CI - 2]
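
The failure sequence described above can be reproduced with a minimal, self-contained sketch. Every name here (`FakeDb`, `FakeSqlServer`, `CachingResource`, `StaleContextDemo`) is a hypothetical stand-in, not a Texera class, and no real database is involved; a single shared `FakeSqlServer` and `CachingResource` instance stands in for the JVM-wide singletons.

```scala
import scala.util.Try

// Hypothetical stand-in for an embedded database owned by one test suite.
final class FakeDb(val name: String) {
  private var alive = true
  def destroy(): Unit = { alive = false }
  def query(): String = {
    if (!alive) throw new IllegalStateException(s"$name is already destroyed")
    s"ok from $name"
  }
}

// Plays the role of SqlServer: holds whichever database is current.
final class FakeSqlServer {
  private var current: Option[FakeDb] = None
  def pointTo(db: FakeDb): Unit = { current = Some(db) }
  def getInstance(): FakeDb =
    current.getOrElse(throw new IllegalStateException("no database configured"))
}

// Plays the role of WorkflowExecutionsResource: the handle is cached on first use.
final class CachingResource(server: FakeSqlServer) {
  private lazy val context: FakeDb = server.getInstance()
  def run(): String = context.query()
}

object StaleContextDemo {
  def runTwoSuites(): (String, Try[String]) = {
    // One server and one resource shared by both suites, as within a single JVM.
    val server = new FakeSqlServer
    val resource = new CachingResource(server)

    // Test suite 1: create DB1, use the resource (which caches DB1), destroy DB1.
    val db1 = new FakeDb("DB1")
    server.pointTo(db1)
    val first = resource.run()
    db1.destroy()

    // Test suite 2: create DB2 and repoint the server; the cached handle is unchanged.
    val db2 = new FakeDb("DB2")
    server.pointTo(db2)
    val second = Try(resource.run())

    (first, second)
  }
}
```

In this sketch the first call succeeds against DB1, while the second call fails even though the server already points to DB2, because the `lazy val` was evaluated during suite 1.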

To solve this problem, we could either:

  1. Avoid caching DSLContext/Dao in the codebase, or
  2. Let the two e2e test suites use the same real, external database (as in the production environment) instead of MockTexeraDB.

We choose the 2nd design, as these two are e2e tests which should emulate production behavior with a real database. To avoid polluting the developer's local texera_db, we use a separate test database with the same schema.
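
For comparison, the 1st option (which this PR does not adopt) would roughly amount to replacing the cached `lazy val` with a `def`, so the handle is looked up on every access. The following is only a sketch with hypothetical names (`EphemeralDb`, `DbRegistry`, `NonCachingResource`, `NoCacheDemo`); it does not reflect how any Texera resource is actually written.

```scala
// Hypothetical stand-in for an embedded database owned by one test suite.
final class EphemeralDb(val name: String) {
  private var alive = true
  def destroy(): Unit = { alive = false }
  def query(): String = {
    if (!alive) throw new IllegalStateException(s"$name is already destroyed")
    s"ok from $name"
  }
}

// Hypothetical holder of the current database, in the role of SqlServer.
final class DbRegistry {
  private var current: Option[EphemeralDb] = None
  def pointTo(db: EphemeralDb): Unit = { current = Some(db) }
  def getInstance(): EphemeralDb =
    current.getOrElse(throw new IllegalStateException("no database configured"))
}

final class NonCachingResource(registry: DbRegistry) {
  // `def` instead of `lazy val`: the handle is resolved again on every call.
  private def context: EphemeralDb = registry.getInstance()
  def run(): String = context.query()
}

object NoCacheDemo {
  def runTwoSuites(): (String, String) = {
    val registry = new DbRegistry
    val resource = new NonCachingResource(registry)

    // Test suite 1 uses DB1 and destroys it afterwards.
    val db1 = new EphemeralDb("DB1")
    registry.pointTo(db1)
    val first = resource.run()
    db1.destroy()

    // Test suite 2 repoints the registry; the resource follows it to DB2.
    val db2 = new EphemeralDb("DB2")
    registry.pointTo(db2)
    val second = resource.run()

    (first, second)
  }
}
```

The cost of this approach is that every caching singleton in the codebase would need the same change, which is one reason a shared external test database is the smaller intervention for two e2e suites.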

Changes

  • Sets user-sys to be enabled by default.
  • Introduces a texera_db_for_test_cases specifically for test cases and CIs. texera_ddl.sql is updated to allow creating the database with a name other than texera_db (and still defaults to texera_db), and CIs will automatically create texera_db_for_test_cases with the same schema as texera_db.
  • Updates DataProcessingSpec and PauseSpec to use texera_db_for_test_cases. The two test suites now populate and clean up this database during their run.
  • MockTexeraDB is updated to incorporate the changes to the DDL script.
  • SqlServer is also updated with clearInstance logic, so that other unit tests that use MockTexeraDB can properly clear their instance in SqlServer and do not interfere with the two e2e tests.
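
The clearInstance idea in the last bullet can be sketched as a singleton holder whose instance can be reset between test suites. This is a simplified illustration under assumptions: `SqlServerLike`, `ConnectionHandle`, `initConnection`, and the JDBC URLs are hypothetical, and the real SqlServer signatures may differ.

```scala
import scala.util.Try

// Hypothetical handle; a real implementation would wrap a data source.
final class ConnectionHandle(val url: String)

// Hypothetical singleton holder, loosely modeled on the SqlServer description.
object SqlServerLike {
  private var instance: Option[ConnectionHandle] = None

  def initConnection(url: String): Unit = synchronized {
    instance = Some(new ConnectionHandle(url))
  }

  def getInstance(): ConnectionHandle = synchronized {
    instance.getOrElse(throw new IllegalStateException("holder is not initialized"))
  }

  // Drops the current instance so a later suite cannot pick up a dead database.
  def clearInstance(): Unit = synchronized {
    instance = None
  }
}

object ClearInstanceDemo {
  def run(): (String, Boolean, String) = {
    // A unit test suite points the holder at its embedded database (made-up URL).
    SqlServerLike.initConnection("jdbc:postgresql://localhost:5433/embedded")
    val duringUnitSuite = SqlServerLike.getInstance().url

    // At the end of that suite, the instance is cleared.
    SqlServerLike.clearInstance()
    val failsWhenCleared = Try(SqlServerLike.getInstance()).isFailure

    // An e2e suite then initializes the holder for the shared test database.
    SqlServerLike.initConnection("jdbc:postgresql://localhost:5432/texera_db_for_test_cases")
    val duringE2eSuite = SqlServerLike.getInstance().url

    (duringUnitSuite, failsWhenCleared, duringE2eSuite)
  }
}
```

Failing loudly after a clear, rather than silently returning the old handle, makes a missing re-initialization visible at the point of use.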

Next Step

Remove the user-sys.enabled flag and its if-else handling logic completely.

@Xiao-zhen-Liu Xiao-zhen-Liu changed the title chore(config): enable user system by default wip: enable user system by default Sep 29, 2025
@github-actions github-actions bot added engine ddl-change Changes to the TexeraDB DDL ci changes related to CI build labels Oct 3, 2025
@Xiao-zhen-Liu Xiao-zhen-Liu changed the title wip: enable user system by default feat(amber): enable user system by default Oct 3, 2025
@Xiao-zhen-Liu Xiao-zhen-Liu self-assigned this Oct 3, 2025
@aglinxinyuan
Contributor

Do we want to remove the user-system flag directly? Does user-system = false still work after this PR? If it doesn't work, I suggest we just remove it.

@chenlica
Contributor

chenlica commented Oct 3, 2025

Is it possible to include a diagram to visualize the change?

@Xiao-zhen-Liu
Contributor Author

> Do we want to remove the user-system flag directly? Does user-system = false still work after this PR? If it doesn't work, I suggest we just remove it.

That would be the next PR. It would be too many changes for this PR.

@Xiao-zhen-Liu
Contributor Author

> Is it possible to include a diagram to visualize the change?

I will include a diagram about the behaviors of these test cases in relation to Singleton classes and the databases.

Contributor

@bobbai00 bobbai00 left a comment


LGTM. Left one minor comment

@aglinxinyuan
Contributor

I will review this PR after PR #3824 is merged.

Contributor

@aglinxinyuan aglinxinyuan left a comment


LGTM!

@Xiao-zhen-Liu Xiao-zhen-Liu merged commit 83076ac into main Oct 7, 2025
10 checks passed
@Xiao-zhen-Liu Xiao-zhen-Liu deleted the xiaozhen-default-user-sys branch October 7, 2025 23:26
Xiao-zhen-Liu added a commit that referenced this pull request Oct 10, 2025
# Purpose

This PR is a successor of #3782. As the non-user system mode is no
longer used or maintained, we can remove the flag to switch between
user-system being enabled/disabled, and keep only the mode of
user-system being enabled.

# Content

- Removed the `user-sys.enabled` flag, both in the frontend and backend.
- Removed all the if-else statements based on this flag in the codebase.
Only the cases of user system being enabled are kept.
- Removed `ExecutionResourceMapping` in the backend as it is no longer
needed.
- Removed `WorkflowCacheService` in the frontend as it is no longer
needed.

---------

Co-authored-by: Xinyuan Lin <xinyual3@uci.edu>
