Skip to content

Conversation

@anchovYu
Copy link
Contributor

@anchovYu anchovYu commented Jan 5, 2024

What changes were proposed in this pull request?

This PR fixes the undesired behavior that concurrent CREATE VIEW IF NOT EXISTS queries could throw TABLE_OR_VIEW_ALREADY_EXISTS exceptions. It's because the current implementation did not propagate the 'IF NOT EXISTS' when the detecting view/table does not exists.

Why are the changes needed?

Fix the above issue.

Does this PR introduce any user-facing change?

Yes in the sense that if fixes an issue in concurrent case.

How was this patch tested?

Without the fix the following test failed while with this PR if passed. But following the comment, I removed the test from this PR.

  test("CREATE VIEW IF NOT EXISTS never throws TABLE_OR_VIEW_ALREADY_EXISTS") {
    // Concurrently create a view with the same name, so that some of the queries may all
    // get that the view does not exist and try to create it. But with IF NOT EXISTS, the
    // queries should not fail.
    import ExecutionContext.Implicits.global
    val concurrency = 10
    val tableName = "table_name"
    val viewName = "view_name"
    withTable(tableName) {
      sql(s"CREATE TABLE $tableName (id int) USING parquet")
      withView("view_name") {
        val futures = (0 to concurrency).map { _ =>
          Future {
            Try {
              sql(s"CREATE VIEW IF NOT EXISTS $viewName AS SELECT * FROM $tableName")
            }
          }
        }
        futures.map { future =>
           val res = ThreadUtils.awaitResult(future, 5.seconds)
           assert(
             res.isSuccess,
             s"Failed to create view: ${if (res.isFailure) res.failed.get.getMessage}"
           )
        }
      }
    }
  }

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Jan 5, 2024
@anchovYu anchovYu changed the title [WIP][SQL] Propagate allowExisting in view creation when the view/table does not exists [SPARK-46602][SQL] Propagate allowExisting in view creation when the view/table does not exists Jan 5, 2024
withTable(tableName) {
sql(s"CREATE TABLE $tableName (id int) USING parquet")
withView("view_name") {
val futures = (0 to concurrency).map { _ =>
Copy link
Contributor

@cloud-fan cloud-fan Jan 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fix is trivial, we don't need to commit this fragile and heavy test. You can just run it offline and verify that the fix works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I have verified that without the fix this test failed.

@cloud-fan
Copy link
Contributor

The failed kafka test is unrelated, thanks, merging to master/3.5!

@cloud-fan cloud-fan closed this in 9b3c70f Jan 5, 2024
cloud-fan pushed a commit that referenced this pull request Jan 5, 2024
…e view/table does not exists

### What changes were proposed in this pull request?
This PR fixes the undesired behavior that concurrent `CREATE VIEW IF NOT EXISTS` queries could throw `TABLE_OR_VIEW_ALREADY_EXISTS` exceptions. It's because the current implementation did not propagate the 'IF NOT EXISTS' when the detecting view/table does not exists.

### Why are the changes needed?
Fix the above issue.

### Does this PR introduce _any_ user-facing change?
Yes in the sense that if fixes an issue in concurrent case.

### How was this patch tested?
Without the fix the following test failed while with this PR if passed. But following the [comment](#44603 (comment)), I removed the test from this PR.
```scala
  test("CREATE VIEW IF NOT EXISTS never throws TABLE_OR_VIEW_ALREADY_EXISTS") {
    // Concurrently create a view with the same name, so that some of the queries may all
    // get that the view does not exist and try to create it. But with IF NOT EXISTS, the
    // queries should not fail.
    import ExecutionContext.Implicits.global
    val concurrency = 10
    val tableName = "table_name"
    val viewName = "view_name"
    withTable(tableName) {
      sql(s"CREATE TABLE $tableName (id int) USING parquet")
      withView("view_name") {
        val futures = (0 to concurrency).map { _ =>
          Future {
            Try {
              sql(s"CREATE VIEW IF NOT EXISTS $viewName AS SELECT * FROM $tableName")
            }
          }
        }
        futures.map { future =>
           val res = ThreadUtils.awaitResult(future, 5.seconds)
           assert(
             res.isSuccess,
             s"Failed to create view: ${if (res.isFailure) res.failed.get.getMessage}"
           )
        }
      }
    }
  }
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #44603 from anchovYu/create-view-if-not-exist-fix.

Authored-by: Xinyi Yu <xinyi.yu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 9b3c70f)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
vinodkc pushed a commit to vinodkc/spark that referenced this pull request Jan 5, 2024
…e view/table does not exists

### What changes were proposed in this pull request?
This PR fixes the undesired behavior that concurrent `CREATE VIEW IF NOT EXISTS` queries could throw `TABLE_OR_VIEW_ALREADY_EXISTS` exceptions. It's because the current implementation did not propagate the 'IF NOT EXISTS' when the detecting view/table does not exists.

### Why are the changes needed?
Fix the above issue.

### Does this PR introduce _any_ user-facing change?
Yes in the sense that if fixes an issue in concurrent case.

### How was this patch tested?
Without the fix the following test failed while with this PR if passed. But following the [comment](apache#44603 (comment)), I removed the test from this PR.
```scala
  test("CREATE VIEW IF NOT EXISTS never throws TABLE_OR_VIEW_ALREADY_EXISTS") {
    // Concurrently create a view with the same name, so that some of the queries may all
    // get that the view does not exist and try to create it. But with IF NOT EXISTS, the
    // queries should not fail.
    import ExecutionContext.Implicits.global
    val concurrency = 10
    val tableName = "table_name"
    val viewName = "view_name"
    withTable(tableName) {
      sql(s"CREATE TABLE $tableName (id int) USING parquet")
      withView("view_name") {
        val futures = (0 to concurrency).map { _ =>
          Future {
            Try {
              sql(s"CREATE VIEW IF NOT EXISTS $viewName AS SELECT * FROM $tableName")
            }
          }
        }
        futures.map { future =>
           val res = ThreadUtils.awaitResult(future, 5.seconds)
           assert(
             res.isSuccess,
             s"Failed to create view: ${if (res.isFailure) res.failed.get.getMessage}"
           )
        }
      }
    }
  }
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#44603 from anchovYu/create-view-if-not-exist-fix.

Authored-by: Xinyi Yu <xinyi.yu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…e view/table does not exists

### What changes were proposed in this pull request?
This PR fixes the undesired behavior that concurrent `CREATE VIEW IF NOT EXISTS` queries could throw `TABLE_OR_VIEW_ALREADY_EXISTS` exceptions. It's because the current implementation did not propagate the 'IF NOT EXISTS' when the detecting view/table does not exists.

### Why are the changes needed?
Fix the above issue.

### Does this PR introduce _any_ user-facing change?
Yes in the sense that if fixes an issue in concurrent case.

### How was this patch tested?
Without the fix the following test failed while with this PR if passed. But following the [comment](apache#44603 (comment)), I removed the test from this PR.
```scala
  test("CREATE VIEW IF NOT EXISTS never throws TABLE_OR_VIEW_ALREADY_EXISTS") {
    // Concurrently create a view with the same name, so that some of the queries may all
    // get that the view does not exist and try to create it. But with IF NOT EXISTS, the
    // queries should not fail.
    import ExecutionContext.Implicits.global
    val concurrency = 10
    val tableName = "table_name"
    val viewName = "view_name"
    withTable(tableName) {
      sql(s"CREATE TABLE $tableName (id int) USING parquet")
      withView("view_name") {
        val futures = (0 to concurrency).map { _ =>
          Future {
            Try {
              sql(s"CREATE VIEW IF NOT EXISTS $viewName AS SELECT * FROM $tableName")
            }
          }
        }
        futures.map { future =>
           val res = ThreadUtils.awaitResult(future, 5.seconds)
           assert(
             res.isSuccess,
             s"Failed to create view: ${if (res.isFailure) res.failed.get.getMessage}"
           )
        }
      }
    }
  }
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#44603 from anchovYu/create-view-if-not-exist-fix.

Authored-by: Xinyi Yu <xinyi.yu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 9b3c70f)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants