forked from delta-io/delta
update with master #12
Merged
JassAbidi
merged 24 commits into
JassAbidi:set_the_right_isolation_level_in_the_CommitInfo
from
delta-io:master
Jul 18, 2021
… - 1 When there’s a checkpoint at version 10 and a delta file at version 11, the earliest version returned should be version 10. We don’t handle that case correctly right now. Tested with a unit test. Author: Li Zhang <li.zhang@databricks.com> GitOrigin-RevId: 8e70e6b2aae76a5043653d3b6fdee45b824a9c27
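The checkpoint case described in this commit can be illustrated with a small stand-alone sketch. It is not the Delta Lake implementation: the function name and its inputs are hypothetical, and it assumes every listed checkpoint is complete and usable on its own.

```python
from typing import Iterable, Optional


def earliest_reproducible_version(
    delta_versions: Iterable[int],
    checkpoint_versions: Iterable[int],
) -> Optional[int]:
    """Return the earliest table version that can be reconstructed.

    Hypothetical helper for illustration only. A version can be rebuilt
    either by replaying delta files starting from version 0, or by loading
    a checkpoint, which is self-sufficient for its own version.
    """
    deltas = set(delta_versions)
    if 0 in deltas:
        # The log still starts at the beginning, so version 0 is available.
        return 0
    checkpoints = sorted(checkpoint_versions)
    # Without delta file 0, the oldest checkpoint is the earliest usable state.
    return checkpoints[0] if checkpoints else None


# The case from the commit message: checkpoint at 10, delta file at 11.
print(earliest_reproducible_version([11], [10]))
```

With only a delta file at version 11 and no checkpoint, the sketch returns `None`, because nothing earlier than 11 exists to replay from.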
Minor change in sbt install script. Author: yaohua <yaohua.zhao@databricks.com> GitOrigin-RevId: 26b0fe9df735739332a482ced59bdd90bd7534ec
…a.`<path>`" name ## What changes were proposed in this pull request? Make DeltaTable.forName support "delta.`<path>`" name. Before this change, DeltaTable.forName(s"delta.`$dir`") would result in an error. ## This PR introduces the following *user-facing* changes Before this change, DeltaTable.forName(s"delta.`$dir`") would result in an error. After this change, DeltaTable.forName(s"delta.`$dir`") would be allowed for Delta Table directories, but still blocked for empty (non Delta Table) directories. ## How was this patch tested? Unit tested that DeltaTable.forName(s"delta.`$dir`") on a Delta Table directory is allowed, and that DeltaTable.forName(s"delta.`$dir`") on an empty directory is still blocked. Author: Yuhong Chen <yuhong.chen@databricks.com> Author: Yuhong Chen <mikechen212@gmail.com> #22994 is resolved by FX196/dpgget9c. GitOrigin-RevId: 7f0bd84e5d1064bbc2282d5330f2e8b74a45959d
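As a rough illustration of the name format this commit accepts, the sketch below separates a path-based identifier of the form "delta.`<path>`" from an ordinary table name. The helper name is hypothetical and the parsing is deliberately simplified: it ignores escaped backticks and does not check whether the directory actually holds a Delta table, which the real change does.

```python
import re
from typing import Optional

# Matches identifiers such as delta.`/tmp/events` (simplified: no escaped backticks).
_PATH_IDENTIFIER = re.compile(r"^delta\.`(?P<path>[^`]+)`$")


def path_from_identifier(name: str) -> Optional[str]:
    """Return the path inside a delta.`<path>` identifier, or None.

    Hypothetical helper for illustration; not part of the Delta Lake API.
    """
    match = _PATH_IDENTIFIER.match(name.strip())
    return match.group("path") if match else None


print(path_from_identifier("delta.`/tmp/events`"))
print(path_from_identifier("my_db.events"))
```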
…t - cleanup This PR cleans up an unused variable in the IBMCOSLogStore. The variable (`writeSize`) was a leftover from an older version of the LogStore implementation. No logic changes are introduced. Closes #692 Signed-off-by: Yijia Cui <yijia.cui@databricks.com> Author: Guy Khazma <33684427+guykhazma@users.noreply.github.com> #23256 is resolved by yijiacui-db/73jdbmso. GitOrigin-RevId: 0a1de46e6b55b7ebff76b187a4d24f5a826df385
Set `spark.databricks.delta.commitLock.enabled` to `true` on Azure, because removing the lock increases the chance of hitting a concurrent error when the `_last_checkpoint` file is overwritten concurrently. Tested with new unit tests. Regression: Azure users may hit a concurrent error when overwriting the `_last_checkpoint` file concurrently. Author: Shixiong Zhu <zsxwing@gmail.com> GitOrigin-RevId: df9d11f1982bb71563934d9d389e40a1e37b7add
Minor refactor of test names and comment Author: Zach Schuermann <zach.schuermann@databricks.com> GitOrigin-RevId: 7d7c15e13c4c0b3f41fa3421c91e6a5a02812efa
Minor refactor Author: Yuyuan Tang <yuyuan.tang@databricks.com> GitOrigin-RevId: 1ae361b37d749cac3d06fe4cae18fc172fa464a7
…tection Two improvements: - Every log line prints a unique identifier of the txn. This differentiates logs from concurrent txns to the same table in the same JVM (optimize does this all the time). The id is completely internal and used only for this log4j logging purpose. - Additional timing metrics show the breakdown of time between the different steps of conflict detection. No unit tests. Author: Tathagata Das <tathagata.das1565@gmail.com> GitOrigin-RevId: 3a0c424288660cbbcef5cb76cd66d75050b10828
Minor refactor style N/A Author: Lars Kroll <lars.kroll@databricks.com> GitOrigin-RevId: c9c06110075d32c749eb1afb24ea6f873bfece61
Add a new function getBinIndex in FileSizeHistogram, which returns the index of the bin to which the given fileSize belongs, or -1 if the given fileSize doesn't belong to any bin. Existing UTs. Author: Prakhar Jain <prakhar.jain@databricks.com> GitOrigin-RevId: 9a8bee48e60a4cf2b0e1207c9a7ddc3c31991c82
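The lookup described here can be sketched with a binary search. The layout is an assumption, not taken from the source: bins are given as sorted lower boundaries, bin `i` covers sizes from `boundaries[i]` up to (but not including) `boundaries[i + 1]`, and the last bin is open-ended. The function name mirrors the commit, but the code is only an illustration.

```python
from bisect import bisect_right
from typing import Sequence


def get_bin_index(sorted_bin_boundaries: Sequence[int], file_size: int) -> int:
    """Return the index of the bin containing file_size, or -1 if none does.

    Assumed layout (hypothetical): boundaries are sorted lower bounds and
    the last bin has no upper bound.
    """
    # bisect_right counts the boundaries that are <= file_size; the bin is
    # the last of those. A size below the first boundary yields 0 - 1 = -1.
    return bisect_right(sorted_bin_boundaries, file_size) - 1


boundaries = [0, 1024, 4096]
print(get_bin_index(boundaries, 2000))
print(get_bin_index(boundaries, -5))
```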
Minor refactor of Delta conf code Author: Prakhar Jain <prakhar.jain@databricks.com> GitOrigin-RevId: 5ebc318ed5a9ad34529f0bf49ca7bf4b9399bccf
Minor refactor Author: Lars Kroll <lars.kroll@databricks.com> GitOrigin-RevId: d9b49a4fa92dea967104a82fdbae69534c14436a
Minor refactor of EvolvabilitySuiteBase Test-only PR. Author: Zach Schuermann <zach.schuermann@databricks.com> GitOrigin-RevId: 73bc357d0634b1607ed77b3a4d709a39fe625b8b
## What changes were proposed in this pull request? When a snapshot was created as an `InitialSnapshot` for a Delta table and is cached as such (for example, a race condition due to unmounting and mounting paths), then all following reads on that Delta table would return a “This path is not a Delta table” error. This PR adds an `update()` call to the Dataframe read path to prevent this from happening (and give the valid table). This is done by forcing the computation of `snapshot` when we create a `BaseRelation` for `DeltaTableV2`. In short, this will call `deltaLog.update()` so we ensure that the check whether or not the table exists is accurate. This costs an additional RPC but is deemed necessary for correctness. ## How was this patch tested? Added a unit test to simulate reading from a table with cached `InitialSnapshot` and a valid DeltaLog. Author: Zach Schuermann <zach.schuermann@databricks.com> #23778 is resolved by schuermannator/sc-78050. GitOrigin-RevId: 8fd732bbf39788f92ea390f720aa9bb4246e8d12
Minor refactor of DeltaAnalysis code and update comments in DeltaInvariantCheckerExec existing UT Author: Linhong Liu <linhong.liu@databricks.com> GitOrigin-RevId: 4582218e20f7eae532d063cc8613e2d964ee35d9
Strip the full temp view plan for Delta DML commands. This allows us to reenable the test for merging into SQL temp views for MERGE - previously resolution would fail. new unit test Author: Jose Torres <joseph.torres@databricks.com> GitOrigin-RevId: b418f4bd194d6186390261cd8d32c4f2c9ed1048
A NullType column is not very useful because it does not contain any data. We used to drop NullType columns when creating a table from DataFrameReader, but we did not do the same on the SQL read path. This PR unifies the behavior: NullType columns are now always dropped in all read/table/SQL APIs. Unit tests covering the different read APIs. Author: Junyong Lee <junyong.lee@databricks.com> GitOrigin-RevId: 9b55e8fb5e51ffbfb86832a811668e9b920c225e
…acySuite Minor code style change. Author: Prakhar Jain <prakhar.jain@databricks.com> GitOrigin-RevId: 217f785ec1dbb111e6ff1aca88e680774ce68e90
## What changes were proposed in this pull request? This PR adds support for generated columns in the MERGE ... UPDATE case. Previously, if a generated column was not explicitly updated, the old value was copied over, which could break the generated column check constraint and fail the query. With this PR, the values of generated columns are computed correctly using the (potentially updated) referenced columns. This PR mostly reuses the utility functions for UPDATE to generate the correct update expressions, with some small changes needed to cover the schema evolution case. ## How was this patch tested? Added a new unit test. Author: Meng Tong <meng.tong@databricks.com> #23499 is resolved by mengtong-db/generated-column-merge. GitOrigin-RevId: 6245c07c323255eb4a0db88150e520ef24e02af8
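The behavior change can be pictured with a toy model in which a row is a dictionary and each generated column is a function of the other columns. This is only a sketch of the idea, not the Delta Lake code path: `apply_update`, the column names, and the generation expression are all hypothetical.

```python
from typing import Any, Callable, Dict

Row = Dict[str, Any]


def apply_update(
    row: Row,
    assignments: Row,
    generated: Dict[str, Callable[[Row], Any]],
) -> Row:
    """Apply UPDATE assignments, then recompute generated columns.

    Hypothetical helper for illustration. Generated columns that are not
    explicitly assigned are recomputed from the already updated row instead
    of keeping their old, possibly stale value.
    """
    updated = {**row, **assignments}
    for column, expression in generated.items():
        if column not in assignments:
            updated[column] = expression(updated)
    return updated


# "total" is generated as qty * unit_price (made-up example columns).
generated_columns = {"total": lambda r: r["qty"] * r["unit_price"]}
old_row = {"qty": 2, "unit_price": 5, "total": 10}
print(apply_update(old_row, {"qty": 3}, generated_columns))
```

Copying the old `total` of 10 after setting `qty` to 3 would violate the generation expression, which is the failure mode the commit describes.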
Add new testsuite OptimisticTransactionSuite UTs Author: Prakhar Jain <prakharjain09@gmail.com> Author: Prakhar Jain <prakhar.jain@databricks.com> GitOrigin-RevId: 6aa1b08ea56220e76b75807c4577d21b4547762c
…efactor DeltaMergeInto to Include Final Schema Move MergeSchema in SchemaUtils to a new file to report finalSchema in DeltaMergeInto. Refactor PreprocessTableMerge to report the fully analyzed DeltaMergeInto. Unit test. Author: Yijia Cui <yijia.cui@databricks.com> GitOrigin-RevId: 5fe7e0d2a2e899a384382d8caa7273be8408ee14
…path A minor refactor to call defaultTablePath only once Author: Yuchen Huo <yuchen.huo@databricks.com> GitOrigin-RevId: 1eafe477d0d2ad9d6980398f57dedea31c020d75
This PR refactors the conflict detection code flow into a separate class to: - Improve readability of the current code: the current code has a single `checkForConflict` method which does all the required checks. Existing UTs. GitOrigin-RevId: 54ad050e0967fa49a61f2677fe1510242d0916d5
The bintray url is not working now. Use the `repo.typesafe.com` link instead. Closes #711 Signed-off-by: Shixiong Zhu <zsxwing@gmail.com> Author: Shixiong Zhu <zsxwing@gmail.com> #24449 is resolved by zsxwing/den3b8hd. GitOrigin-RevId: 1f8fdb3bba694ff53001d13ebca9f84dfae0748e
JassAbidi
merged commit Jul 18, 2021
e9fd7b4
into
JassAbidi:set_the_right_isolation_level_in_the_CommitInfo
No description provided.