
Commit

Issue GoogleCloudDataproc#144: allow writing Spark String to BQ TIME type (GoogleCloudDataproc#1017)

Currently, we allow reading BQ's TIME datatype into Spark's long datatype. However, there's no way to write into BQ's TIME datatype.
This change adds that functionality through String.
vishalkarve15 authored Aug 8, 2023
1 parent e431869 commit bd33526
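For orientation, here is a minimal sketch of what this change enables, modeled on the integration test added below. The `SparkSession` named `spark`, the dataset/table names, and the column names are illustrative assumptions; per the test, only the direct write method supports writing strings to a TIME column.

```java
// Minimal sketch (assumes a live SparkSession `spark` and an existing BigQuery
// table `my_dataset.my_table` with schema: name STRING, wake_up_time TIME).
import java.util.Arrays;
import org.apache.spark.sql.*;
import org.apache.spark.sql.types.*;

StructType schema = new StructType()
    .add("name", DataTypes.StringType)
    .add("wake_up_time", DataTypes.StringType); // written as a String, stored as TIME

Dataset<Row> df = spark.createDataFrame(
    Arrays.asList(RowFactory.create("abc", "10:00:00")), schema);

df.write()
    .format("bigquery")
    .mode(SaveMode.Append)
    .option("writeMethod", "direct")   // indirect writes are not supported for this
    .option("dataset", "my_dataset")   // hypothetical dataset name
    .option("table", "my_table")       // hypothetical table name
    .save();
```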
Showing 4 changed files with 44 additions and 1 deletion.
1 change: 1 addition & 0 deletions CHANGES.md
@@ -2,6 +2,7 @@

## Next

* Issue #144: allow writing Spark String to BQ TIME type
* PR #1038: Logical plan now shows the BigQuery table of DirectBigQueryRelation. Thanks @idc101 !

## 0.32.2 - 2023-08-07
4 changes: 3 additions & 1 deletion README-template.md
@@ -952,11 +952,13 @@ With the exception of `DATETIME` and `TIME` all BigQuery data types directed map
<tr valign="top">
<td><strong><code>TIME</code></strong>
</td>
<td><strong><code>LongType</code></strong>
<td><strong><code>LongType</code></strong>, <strong><code>StringType</code>*</strong>
</td>
<td>Spark has no TIME type. The generated longs, which indicate <a href="https://avro.apache.org/docs/1.8.0/spec.html#Time+%2528microsecond+precision%2529">microseconds since midnight</a>, can be safely cast to TimestampType, but this causes the date to be inferred as the current day. Thus times are left as longs, and the user can cast them if they like.
<p>
When casting to Timestamp, TIME has the same TimeZone issues as DATETIME
<p>
* Spark string can be written to an existing BQ TIME column provided it is in the <a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#canonical_format_for_time_literals">format for BQ TIME literals</a>.
</td>
</tr>
<tr valign="top">
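As a worked example of the mapping described above (an illustrative sketch using plain `java.time`, not connector API): a string written as `"10:00:00"` reads back as the long `36000000000`, i.e. microseconds since midnight.

```java
import java.time.LocalTime;

// 10 hours * 3,600 s * 1,000,000 µs = 36,000,000,000 µs since midnight.
long micros = 36_000_000_000L;                           // value read from the TIME column
LocalTime time = LocalTime.ofNanoOfDay(micros * 1_000);  // 10:00
String literal = String.format("%02d:%02d:%02d",
    time.getHour(), time.getMinute(), time.getSecond()); // "10:00:00", a canonical TIME literal
```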
@@ -339,10 +339,13 @@ && typeWriteable(sourceField.getType(), destinationField.getType())
}

// allowing widening narrow numeric into bignumeric
// allowing writing string to time
@VisibleForTesting
static boolean typeWriteable(LegacySQLTypeName sourceType, LegacySQLTypeName destinationType) {
return (sourceType.equals(LegacySQLTypeName.NUMERIC)
&& destinationType.equals(LegacySQLTypeName.BIGNUMERIC))
|| (sourceType.equals(LegacySQLTypeName.STRING)
&& destinationType.equals(LegacySQLTypeName.TIME))
|| sourceType.equals(destinationType);
}
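To make the widened check concrete, here is a small sketch of what `typeWriteable` returns for a few source/destination pairs (assuming same-package access, since the method is package-private and only exposed via `@VisibleForTesting`):

```java
import com.google.cloud.bigquery.LegacySQLTypeName;

typeWriteable(LegacySQLTypeName.NUMERIC, LegacySQLTypeName.BIGNUMERIC); // true  (narrow numeric widens)
typeWriteable(LegacySQLTypeName.STRING,  LegacySQLTypeName.TIME);       // true  (new with this change)
typeWriteable(LegacySQLTypeName.TIME,    LegacySQLTypeName.STRING);     // false (only STRING -> TIME is allowed)
typeWriteable(LegacySQLTypeName.STRING,  LegacySQLTypeName.STRING);     // true  (identical types are always writeable)
```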

@@ -1356,6 +1356,43 @@ public void testWriteNumericsToWiderFields() throws Exception {
.isEqualTo(new BigDecimal("12345.123450000000000"));
}

@Test
public void testWriteStringToTimeField() throws Exception {
// not supported for indirect writes
assumeThat(writeMethod, equalTo(WriteMethod.DIRECT));
IntegrationTestUtils.runQuery(
String.format(
"CREATE TABLE `%s.%s` (name STRING, wake_up_time TIME)", testDataset, testTable));
String name = "abc";
String wakeUpTime = "10:00:00";
Dataset<Row> df =
spark.createDataFrame(
Arrays.asList(RowFactory.create(name, wakeUpTime)),
structType(
StructField.apply("name", DataTypes.StringType, true, Metadata.empty()),
StructField.apply("wake_up_time", DataTypes.StringType, true, Metadata.empty())));
df.write()
.format("bigquery")
.mode(SaveMode.Append)
.option("dataset", testDataset.toString())
.option("table", testTable)
.option("writeMethod", writeMethod.toString())
.save();

Dataset<Row> resultDF =
spark
.read()
.format("bigquery")
.option("dataset", testDataset.toString())
.option("table", testTable)
.load();
List<Row> result = resultDF.collectAsList();
assertThat(result).hasSize(1);
Row head = result.get(0);
assertThat(head.getString(head.fieldIndex("name"))).isEqualTo("abc");
assertThat(head.getLong(head.fieldIndex("wake_up_time"))).isEqualTo(36000000000L); // 10:00:00 as microseconds since midnight
}

public void testWriteSchemaSubset() throws Exception {
StructType initialSchema =
structType(
