Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 53 changed files with 798 additions and 379 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,3 +21,4 @@ metastore_db/ | |
# Scala-IDE specific | ||
.scala_dependencies | ||
.worksheet | ||
/.bsp/sbt.json |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# Changelog | ||
All notable changes to this project will be documented in this file. | ||
|
||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), | ||
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). | ||
|
||
## [6.0.0] - 2020-05 | ||
### Migrated | ||
- Spark 3.0 Migration | ||
* Migrate to Spark version 3.0.1, Hadoop 3.2.1 and Scala 2.12 | ||
* Spark 3 uses the Proleptic Gregorian calendar. | ||
If data sources contain dates before 1582 or other problematic formats, a quick fix is to set the | ||
following Spark parameters in the pipelines: | ||
``` | ||
"spark.sql.legacy.timeParserPolicy": "LEGACY", "spark.sql.legacy.parquet.datetimeRebaseModeInWrite": "LEGACY", "spark.sql.legacy.parquet.datetimeRebaseModeInRead": "LEGACY" | ||
``` | ||
An example of an exception related to parsing dates and timestamps looks like this: | ||
``` | ||
SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: Fail to parse '00/00/0000' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string. | ||
``` | ||
Note 1: there are also two other exceptions that we observed related to reading or writing Parquet files with old date/time formats. | ||
They look very similar to the Spark upgrade exception above, but point to the respective spark.sql.legacy.parquet.datetimeRebaseModeInRead or spark.sql.legacy.parquet.datetimeRebaseModeInWrite property. | ||
Note 2: the settings provided above should cover all the exceptions listed here for a given data source. | ||
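As an illustration of the quick fix, the three settings listed above could be applied when a Spark session is built. This is a minimal sketch, not the project's own pipeline configuration mechanism: the object name, application name, and local master are assumptions made for the example, and only the three property keys and their `LEGACY` values come from the changelog entry.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone example; the project may pass these settings
// through its own pipeline configuration instead.
object LegacyDateTimeSession {

  // The three legacy settings named in the changelog entry.
  val legacyDateTimeConf: Map[String, String] = Map(
    "spark.sql.legacy.timeParserPolicy" -> "LEGACY",
    "spark.sql.legacy.parquet.datetimeRebaseModeInWrite" -> "LEGACY",
    "spark.sql.legacy.parquet.datetimeRebaseModeInRead" -> "LEGACY"
  )

  def main(args: Array[String]): Unit = {
    val builder = SparkSession.builder()
      .appName("legacy-datetime-example") // assumed name
      .master("local[*]")                 // assumed: local run for illustration

    // Apply every setting to the builder before the session is created.
    val spark = legacyDateTimeConf
      .foldLeft(builder) { case (b, (key, value)) => b.config(key, value) }
      .getOrCreate()

    // Print the effective values to confirm they were picked up.
    legacyDateTimeConf.keys.foreach { key =>
      println(s"$key = ${spark.conf.get(key)}")
    }

    spark.stop()
  }
}
```

Keeping the settings in a single map makes it easy to remove them again once the affected data sources no longer contain pre-1582 or otherwise unparseable dates.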
## [5.8.0] - 2020-04 | ||
### Added | ||
- Fix reconciliation execution time by removing unneeded caching stage. | ||
## [5.7.5] - 2020-04 | ||
### Added | ||
- Enable multi-line option for append loads | ||
- Fix duplicate issues generated by the latest changes applied to CompetitorDataPreprocessor | ||
### [5.7.2] - 2021-02 | ||
#### Added | ||
- Make init condensation optional, but true by default. | ||
### [5.7.1] - 2020-02 | ||
#### Added | ||
- Modify append load to support more complex partitioning strategies without file_regex | ||
- Added support for configuring write load mode and num output files in append load | ||
- Support for specifying the quote and escape characters. More info on how to specify those here: https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/DataFrameReader.html | ||
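For reference, the quote and escape characters correspond to the `quote` and `escape` options of Spark's CSV reader. The sketch below shows those options on a plain `DataFrameReader`; the object name, the `readCsv` helper, and the `header` option are assumptions for illustration, and the way this project maps its own configuration onto these options is not shown in this diff.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper, not part of this repository.
object CsvQuoteEscapeExample {

  // Reads a CSV file with explicit quote and escape characters.
  // The defaults below match Spark's own defaults (double quote, backslash).
  def readCsv(
      spark: SparkSession,
      path: String,
      quote: String = "\"",
      escape: String = "\\"
  ): DataFrame =
    spark.read
      .option("header", "true") // assumed: the first line holds column names
      .option("quote", quote)
      .option("escape", escape)
      .csv(path)
}
```

A call such as `CsvQuoteEscapeExample.readCsv(spark, path, quote = "'")` would then treat single quotes as the quoting character.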
### [5.7.0] - 2020-01 | ||
#### Added | ||
- Support for multiple partition attributes (non date-derived) and single non date-derived partition attributes. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,4 @@ | ||
#!/bin/bash | ||
|
||
#!/bin/sh | ||
function array_contains() { | ||
local LOCAL_NEEDLE=$1 | ||
shift | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
sbt.version = 1.3.13 | ||
sbt.version = 1.5.0 |