-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Destination BigQuery: Accept Dataset ID field prefixed by Project ID #8383
Destination BigQuery: Accept Dataset ID field prefixed by Project ID #8383
Conversation
Hi, @koji-m could you please update the branch and resolve conflicts on PR, tnx |
Hi, @mkhokh-33 |
@@ -46,6 +49,7 @@ | |||
|
|||
private static final Logger LOGGER = LoggerFactory.getLogger(BigQueryUtils.class); | |||
private static final String BIG_QUERY_DATETIME_FORMAT = "yyyy-MM-dd HH:mm:ss.SSSSSS"; | |||
private static final Pattern datasetIdPattern = Pattern.compile("^(([a-z]([a-z0-9\\-]*[a-z0-9])?):)?([a-zA-Z0-9_]+)$"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems to be a constant
pls, rename it to DATASET_ID_PATTERN
.put(BigQueryConsts.CONFIG_DATASET_LOCATION, "US"); | ||
} | ||
|
||
public static Stream<Arguments> validBigQueryIdProvider() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change method definition to private and move below the public methods
assertEquals(expected, actual); | ||
} | ||
|
||
public static Stream<Arguments> invalidBigQueryIdProvider() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change method definition to private and move below the public methods
|
||
public static Stream<Arguments> invalidBigQueryIdProvider() { | ||
return Stream.of( | ||
Arguments.arguments("my-project", ":my_dataset", "BigQuery Dataset ID format must match '[project-id:]dataset_id': :my_dataset"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to check regexp pattern for dataset we could also add here some other incorrect datasets:
1."my-project:my-project:my_dataset".(to check that ':' is not allowed inside project name)
2."my-project-:my_dataset". (project name cannot end with a hyphen)
3."my-project:"
3."my-project: "
@@ -56,7 +56,7 @@ public BigQueryDestination() { | |||
@Override | |||
public AirbyteConnectionStatus check(final JsonNode config) { | |||
try { | |||
final String datasetId = config.get(BigQueryConsts.CONFIG_DATASET_ID).asText(); | |||
final String datasetId = BigQueryUtils.getDatasetId(config); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BigQueryUtils.getDatasetId(config) used in 2 cases
- BigQueryDestination.check()
- BigQueryDestination.getConsumer() invoke inside BigQueryUploaderFactory.getUploader then BigQueryUtils.getSchema then BigQueryUtils.getDatasetId(config)
Could you pls consider to add integration test to BigQueryDestinationTest to check that:
- we can create connection with provided dataset
- we can get consumer and write some data
hi @koji-m, thanks for your PR, I left some comments |
@mkhokh-33 Thank you for your review. I've fixed based on your review. |
LGTM, |
…te into parsing-dataset-names-in-bq
@edgao Thank you for your review. I've fixed based on your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! @mkhokh-33 can you take care of releasing this?
actually, lemme publish a separate PR first, will comment here once that finishes |
alright @mkhokh-33 we're good to go here! |
Hi @koji-m , LABEL io.airbyte.version=0.6.2 we also need to change version for destination-bigquery-denormalized cause it uses destination-bigquery implementation LABEL io.airbyte.version=0.2.3
"dockerImageTag": "0.2.3",
Thus I could proceed with merge. Thanks |
@mkhokh-33 Sorry for the late response. I've updated the docker image version numbers. (with docs)
|
@koji-m thank you, merged 👍 |
What
Closes: #1192
How
In the BigQuery destination connector's Dataset ID field, we could accept both syntax:
Make it error, if project-id in the Dataset ID field doesn't match the value in the Project ID field.
Recommended reading order
BigQueryDestinationTest.java
BigQueryDestination.java
Pre-merge Checklist
Expand the relevant checklist and delete the others.
Updating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing./publish
command described hereThis change is