Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise error when dataType was not provided #714

Closed

Conversation

rajagurunath
Copy link
Contributor

throw Error explicitly, if dataType was not provided for DeltaColumnBuilder :

issue Raised and Discussed here: #698

Copy link
Member

@zsxwing zsxwing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. Left some style comments. Could you merge master into your branch so that it will have the fix for CI?

@@ -126,6 +127,10 @@ class DeltaColumnBuilder private[tables](
metadataBuilder.putString("comment", comment.get)
}
val fieldMetadata = metadataBuilder.build()
if (dataType == null) {
throw DeltaErrors.analysisException("dataType should not be null for a column," +
Copy link
Member

@zsxwing zsxwing Jul 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s"The data type of column $colName is not provided"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this new error message has the right amount of information 👍

@@ -228,6 +228,18 @@ class DeltaTableBuilderSuite extends QueryTest with SharedSparkSession with Delt
}
}

test("test addColumn using columnBuilder, without dataType") {
val e = intercept[AnalysisException] {
io.delta.tables.DeltaTable.create().addColumn(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can just run

DeltaTable.columnBuilder("value")
          .generatedAlwaysAs("true")
          .nullable(true)
          .build()

This would show that we will catch the error in build().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it , makes sense !

@rajagurunath
Copy link
Contributor Author

Thanks for reviewing the code and giving the helpful comments !!

val e = intercept[AnalysisException] {
DeltaTable.columnBuilder("value")
.generatedAlwaysAs("true")
. nullable(true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove the space after the dot

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, Thanks 👍

@@ -126,6 +127,9 @@ class DeltaColumnBuilder private[tables](
metadataBuilder.putString("comment", comment.get)
}
val fieldMetadata = metadataBuilder.build()
if (dataType == null) {
throw DeltaErrors.analysisException(s"The data type of column $colName is not provided")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: of the column

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 👍

. nullable(true)
.build()
}
assert(e.getMessage.equals(s"The data type of column value is not provided"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Use == for equals

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced equals with == and removed the brackets '()' is this okay ?

@rajagurunath
Copy link
Contributor Author

Hi @jaceklaskowski
Thanks a lot for taking the time to review the code and giving comments.

(btw your books are awesome ❤️ )

Copy link
Member

@zsxwing zsxwing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@tdas tdas added this to the 1.1.0 milestone Nov 8, 2021
scottsand-db pushed a commit to scottsand-db/delta that referenced this pull request Dec 6, 2021
throw Error explicitly,  if dataType was not provided for DeltaColumnBuilder :

issue Raised and Discussed here: delta-io#698

Closes delta-io#714

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: gurunath <gurunathrajagopal@gmail.com>
Author: Shixiong Zhu <zsxwing@gmail.com>

#24669 is resolved by zsxwing/cuj6lcz0.

GitOrigin-RevId: a9d92bc195693bb1ddca515a8069c38ce99d0497
allisonport-db pushed a commit to scottsand-db/delta that referenced this pull request Feb 4, 2022
throw Error explicitly,  if dataType was not provided for DeltaColumnBuilder :

issue Raised and Discussed here: delta-io#698

Closes delta-io#714

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: gurunath <gurunathrajagopal@gmail.com>
Author: Shixiong Zhu <zsxwing@gmail.com>

#24669 is resolved by zsxwing/cuj6lcz0.

GitOrigin-RevId: a9d92bc195693bb1ddca515a8069c38ce99d0497
(cherry picked from commit 104e2a4)
pradomota added a commit to Telefonica/delta that referenced this pull request Mar 14, 2022
* [SC-78072][Delta] Fix partitionedBy in DeltaTableBuilder Python API

Fix partitionedBy in DeltaTableBuilder Python API

Added more usages in the unit tests.

Author: Yijia Cui <yijia.cui@databricks.com>

GitOrigin-RevId: 99e2e89242c8fa41c960df25dcc382faae439ab1
(cherry picked from commit 59aa330)

* [DELTA-OSS-EXTERNAL]  Raise error when dataType was not provided

throw Error explicitly,  if dataType was not provided for DeltaColumnBuilder :

issue Raised and Discussed here: delta-io#698

Closes delta-io#714

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: gurunath <gurunathrajagopal@gmail.com>
Author: Shixiong Zhu <zsxwing@gmail.com>

#24669 is resolved by zsxwing/cuj6lcz0.

GitOrigin-RevId: a9d92bc195693bb1ddca515a8069c38ce99d0497
(cherry picked from commit 104e2a4)

* [DELTA-OSS-EXTERNAL] Move GCS write to a separate thread to workaround a GCS corruption issue

We found a potential GCS corruption due to an issue in GCS Hadoop connector: When writing a file on GCS, if the current thread is interrupted, GCS may close the partial write and upload the incomplete file. Here is a stack trace showing this behavior:

```
	  at java.io.PipedOutputStream.close(PipedOutputStream.java:175)
	  at java.nio.channels.Channels$WritableByteChannelImpl.implCloseChannel(Channels.java:469)
	  at java.nio.channels.spi.AbstractInterruptibleChannel$1.interrupt(AbstractInterruptibleChannel.java:165)
	  - locked <0xc2f> (a java.lang.Object)
	  at java.nio.channels.spi.AbstractInterruptibleChannel.begin(AbstractInterruptibleChannel.java:173)
	  at java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:457)
	  - locked <0xc3b> (a java.lang.Object)
	  at com.google.cloud.hadoop.util.BaseAbstractGoogleAsyncWriteChannel.write(BaseAbstractGoogleAsyncWriteChannel.java:136)
	  - locked <0xc3c> (a com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$2)
	  at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
	  at java.nio.channels.Channels.writeFully(Channels.java:101)
	  at java.nio.channels.Channels.access$000(Channels.java:61)
	  at java.nio.channels.Channels$1.write(Channels.java:174)
	  - locked <0xc3d> (a java.nio.channels.Channels$1)
	  at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
	  - locked <0xc3e> (a java.io.BufferedOutputStream)
	  at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.write(GoogleHadoopOutputStream.java:118)
	  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
	  at java.io.DataOutputStream.write(DataOutputStream.java:107)
	  - locked <0xc3f> (a org.apache.hadoop.fs.FSDataOutputStream)
	  at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
```

This PR moves GCS write in GCSLogStore to a separate new thread to workaround the issue.

Closes delta-io#782

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: Shixiong Zhu <zsxwing@gmail.com>

#28761 is resolved by zsxwing/mm7ity34.

GitOrigin-RevId: 5bfa936db1983a151da7d6e4f239a200f634690a
(cherry picked from commit 7a3f1e8)

* [DELTA-OSS-EXTERNAL] Set the correct Hadoop configuration for checkpoint write

Checkpoint write doesn't pass the correct Hadoop configuration. It may fail when users specify their storage credentials using Spark session configs.

This PR fixes it and adds a test for it.

Closes delta-io#784

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: Shixiong Zhu <zsxwing@gmail.com>

#28914 is resolved by zsxwing/66uz7rxl.

GitOrigin-RevId: cc42b868f2a3ecffca88424fb7465792dc22c7e6
(cherry picked from commit 43d1422)

* [DELTA-OSS-EXTERNAL][BREAKING]Python: fix convertToDelta to return a delta.tables.DeltaTable

Fix the return type of DeltaTable.convertToDelta in Python.

GitOrigin-RevId: 4cd9bd00708a5cdd3c442bf9715ce9047d2b4893
(cherry picked from commit c586f9a)

* [SC-86940][DELTA]Fix potential checkpoint corruption issue for GCS in OSS Delta

This PR moves checkpoint write to a separate thread for GCS to workaround the potential checkpoint corruption issue when a thread is interrupted.

GitOrigin-RevId: 418df2441923c85c6415f9baec157c923c4c3dca
(cherry picked from commit 95e9076)

* [DELTA-OSS-EXTERNAL] Convert build to GitHub Actions

This PR solves delta-io#792

Closes delta-io#798

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: gurunath <gurunathrajagopal@gmail.com>
Author: Shixiong Zhu <zsxwing@gmail.com>

GitOrigin-RevId: 1b35ce70185ddfcdfca9992d695ffbe1de397e40
(cherry picked from commit ee5d31f)

* [DELTA-OSS-EXTERNAL] Fix an issue in Merge/Update/Delete when the table path contains special chars

getTouchedFiles incorrectly calls toUri which will escape the table path when it contains special chars. This causes Merge/Update/Delete not able to find matches files in this case.

This PR removes toUri to fix the issue and adds a test to confirm the fix.

Fix delta-io#724

Closes delta-io#741

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: Jarred Parrett <jarredparrett.dev@gmail.com>

GitOrigin-RevId: c8e6563fc48e82b86a00b58910eba8983fc5f933
(cherry picked from commit 4359484)

* [ES-113602] Minor change in sbt install script

Minor change in sbt install script.

Author: yaohua <yaohua.zhao@databricks.com>

GitOrigin-RevId: 26b0fe9df735739332a482ced59bdd90bd7534ec
(cherry picked from commit 4bee7ae)

* [DELTA-OSS-EXTERNAL] Fix the sbt launch download url

The bintray url is not working now. Use the `repo.typesafe.com` link instead.

Closes delta-io#711

Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>

Author: Shixiong Zhu <zsxwing@gmail.com>

#24449 is resolved by zsxwing/den3b8hd.

GitOrigin-RevId: 1f8fdb3bba694ff53001d13ebca9f84dfae0748e
(cherry picked from commit 86bbe99)

* Setting version to 1.0.1

Co-authored-by: Yijia Cui <yijia.cui@databricks.com>
Co-authored-by: gurunath <gurunathrajagopal@gmail.com>
Co-authored-by: Shixiong Zhu <zsxwing@gmail.com>
Co-authored-by: Tom Lynch <lync0143@gmail.com>
Co-authored-by: Jarred Parrett <jarredparrett.dev@gmail.com>
Co-authored-by: yaohua <yaohua.zhao@databricks.com>
Co-authored-by: Allison Portis <allison.portis@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants