Http connector: Normalization sync error #1296

Closed
awoehrl opened this issue Dec 11, 2020 · 8 comments
Labels: type/bug (Something isn't working)

awoehrl commented Dec 11, 2020

Expected Behavior

I've tried to normalize the JSON responses from two different HTTP sources. The sync, including the normalization step, should complete successfully.

Current Behavior

The sync fails at the normalization step:

```
2020-12-11 13:24:00 INFO (/tmp/workspace/65/1) DefaultSyncWorker(run):75 - configured sync modes: {data=full_refresh}
2020-12-11 13:24:00 INFO (/tmp/workspace/65/1) DefaultAirbyteDestination(start):67 - Running target...
2020-12-11 13:24:00 DEBUG (/tmp/workspace/65/1) DockerProcessBuilderFactory(create):103 - Preparing command: docker run --rm -i -v workspace:/data -v /tmp/airbyte_local:/local -w /data/65/1 --network host airbyte/destination-bigquery:0.1.10 write --config target_config.json --catalog catalog.json
2020-12-11 13:24:00 DEBUG (/tmp/workspace/65/1) DockerProcessBuilderFactory(create):103 - Preparing command: docker run --rm -i -v workspace:/data -v /tmp/airbyte_local:/local -w /data/65/1 --network host airbyte/source-http-request:0.1.0 read --config tap_config.json --catalog catalog.json
2020-12-11 13:24:02 DEBUG (/tmp/workspace/65/1) DefaultAirbyteSource(close):109 - Closing tap process
2020-12-11 13:24:02 DEBUG (/tmp/workspace/65/1) DefaultAirbyteDestination(close):102 - Closing target process
2020-12-11 13:24:02 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:02 INFO i.a.i.d.b.BigQueryDestination(main):391 - {} - starting destination: class io.airbyte.integrations.destination.bigquery.BigQueryDestination
2020-12-11 13:24:02 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:02 INFO i.a.i.b.IntegrationRunner(run):79 - {} - Running integration: io.airbyte.integrations.destination.bigquery.BigQueryDestination
2020-12-11 13:24:02 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:02 INFO i.a.i.b.IntegrationCliParser(parseOptions):135 - {} - integration args: {catalog=catalog.json, write=null, config=target_config.json}
2020-12-11 13:24:02 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:02 INFO i.a.i.b.IntegrationRunner(run):83 - {} - Command: WRITE
2020-12-11 13:24:02 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:02 INFO i.a.i.b.IntegrationRunner(run):84 - {} - Integration config: IntegrationConfig{command=WRITE, configPath='target_config.json', catalogPath='catalog.json', statePath='null'}
2020-12-11 13:24:05 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:05 INFO i.a.i.d.b.BigQueryDestination(createTable):270 - {} - Table created successfully
2020-12-11 13:24:06 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:06 INFO i.a.i.b.FailureTrackingConsumer(close):50 - {} - hasFailed: false.
2020-12-11 13:24:09 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:09 ERROR i.a.i.d.b.BigQueryDestination$RecordConsumer(close):345 - {} - executing on success close procedure.
2020-12-11 13:24:11 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:11 INFO i.a.i.b.IntegrationRunner(run):120 - {} - Completed integration: io.airbyte.integrations.destination.bigquery.BigQueryDestination
2020-12-11 13:24:11 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 2020-12-11 13:24:11 INFO i.a.i.d.b.BigQueryDestination(main):393 - {} - completed destination: class io.airbyte.integrations.destination.bigquery.BigQueryDestination
2020-12-11 13:24:11 INFO (/tmp/workspace/65/1) DefaultSyncWorker(run):105 - Running normalization.
2020-12-11 13:24:11 DEBUG (/tmp/workspace/65/1) DockerProcessBuilderFactory(create):103 - Preparing command: docker run --rm -i -v workspace:/data -v /tmp/airbyte_local:/local -w /data/65/1/normalize --network host airbyte/normalization:0.1.2 run --integration-type bigquery --config target_config.json --catalog catalog.json
2020-12-11 13:24:12 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Namespace(config='target_config.json', integration_type=<DestinationType.bigquery: 'bigquery'>, out='/data/65/1/normalize')
2020-12-11 13:24:12 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - transform_bigquery
2020-12-11 13:24:12 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Processing catalog.json...
2020-12-11 13:24:12 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Generating data.sql in /data/65/1/normalize/models/generated/
2020-12-11 13:24:13 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Running with dbt=0.18.1
2020-12-11 13:24:18 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Installing https://github.com/fishtown-analytics/dbt-utils.git@0.6.2
2020-12-11 13:24:19 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Installed from revision 0.6.2
2020-12-11 13:24:19 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 -
2020-12-11 13:24:21 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Running with dbt=0.18.1
2020-12-11 13:24:23 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Found 1 model, 0 tests, 0 snapshots, 0 analyses, 319 macros, 0 operations, 0 seed files, 1 source
2020-12-11 13:24:23 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 -
2020-12-11 13:24:24 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 13:24:24 | Concurrency: 32 threads (target='prod')
2020-12-11 13:24:24 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 13:24:24 |
2020-12-11 13:24:24 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 13:24:24 | 1 of 1 START table model linkpulse.data...................................................................... [RUN]
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 13:24:26 | 1 of 1 ERROR creating table model linkpulse.data............................................................. [ERROR in 1.93s]
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 13:24:26 |
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - 13:24:26 | Finished running 1 table model in 3.38s.
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 -
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Completed with 1 error and 0 warnings:
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 -
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Database Error in model data (models/generated/data.sql)
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - No matching signature for function CONCAT with no arguments. Supported signatures: CONCAT(STRING, [STRING, ...]); CONCAT(BYTES, [BYTES, ...]) at [24:21]
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - compiled SQL at ../build/run/airbyte_utils/models/generated/data.sql
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 -
2020-12-11 13:24:26 INFO (/tmp/workspace/65/1) LineGobbler(voidCall):69 - Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
2020-12-11 13:24:27 DEBUG (/tmp/workspace/65/1) DefaultNormalizationRunner(close):97 - Closing tap process
2020-12-11 13:24:27 ERROR (/tmp/workspace/65/1) DefaultSyncWorker(run):112 - Normalization Failed.
io.airbyte.workers.WorkerException: Normalization Failed.
at io.airbyte.workers.DefaultSyncWorker.run(DefaultSyncWorker.java:109) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.workers.DefaultSyncWorker.run(DefaultSyncWorker.java:48) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.workers.wrappers.OutputConvertingWorker.run(OutputConvertingWorker.java:44) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.workers.wrappers.JobOutputSyncWorker.run(JobOutputSyncWorker.java:32) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.scheduler.WorkerRun.lambda$new$0(WorkerRun.java:53) [io.airbyte-airbyte-scheduler-0.7.1-alpha.jar:?]
at io.airbyte.scheduler.WorkerRun.call(WorkerRun.java:61) [io.airbyte-airbyte-scheduler-0.7.1-alpha.jar:?]
at io.airbyte.scheduler.WorkerRun.call(WorkerRun.java:42) [io.airbyte-airbyte-scheduler-0.7.1-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.execute(LifecycledCallable.java:114) [io.airbyte-airbyte-commons-0.7.1-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.call(LifecycledCallable.java:98) [io.airbyte-airbyte-commons-0.7.1-alpha.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
at java.lang.Thread.run(Thread.java:832) [?:?]
Suppressed: io.airbyte.workers.WorkerException: Tap process wasn't successful
at io.airbyte.workers.normalization.DefaultNormalizationRunner.close(DefaultNormalizationRunner.java:100) ~[io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.workers.DefaultSyncWorker.run(DefaultSyncWorker.java:104) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.workers.DefaultSyncWorker.run(DefaultSyncWorker.java:48) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.workers.wrappers.OutputConvertingWorker.run(OutputConvertingWorker.java:44) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.workers.wrappers.JobOutputSyncWorker.run(JobOutputSyncWorker.java:32) [io.airbyte-airbyte-workers-0.7.1-alpha.jar:?]
at io.airbyte.scheduler.WorkerRun.lambda$new$0(WorkerRun.java:53) [io.airbyte-airbyte-scheduler-0.7.1-alpha.jar:?]
at io.airbyte.scheduler.WorkerRun.call(WorkerRun.java:61) [io.airbyte-airbyte-scheduler-0.7.1-alpha.jar:?]
at io.airbyte.scheduler.WorkerRun.call(WorkerRun.java:42) [io.airbyte-airbyte-scheduler-0.7.1-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.execute(LifecycledCallable.java:114) [io.airbyte-airbyte-commons-0.7.1-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.call(LifecycledCallable.java:98) [io.airbyte-airbyte-commons-0.7.1-alpha.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
at java.lang.Thread.run(Thread.java:832) [?:?]
```
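
For context on the BigQuery error above: dbt-based normalization typically builds a row-hash column by concatenating a stream's declared top-level columns, so if the catalog declares no columns at all, that expression degenerates into `CONCAT()` with zero arguments, which BigQuery rejects. The snippet below is only a minimal sketch of that pattern, not the actual compiled `data.sql`; the table and column names are invented.

```sql
-- Minimal sketch only; table/column names are hypothetical, not from this sync.
-- With at least one declared column (here `id`), a row-hash expression compiles fine:
SELECT
  TO_HEX(MD5(CONCAT(COALESCE(CAST(id AS STRING), '')))) AS _data_hashid,
  *
FROM `my-project.my_dataset.data_raw`;

-- If the stream schema declares no columns, the same template collapses to
--   TO_HEX(MD5(CONCAT()))
-- which BigQuery rejects with:
--   No matching signature for function CONCAT with no arguments
```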

awoehrl added the type/bug (Something isn't working) label on Dec 11, 2020

ChristopheDuong commented Dec 11, 2020

Could you please extract the following files?

  • catalog.json
  • models/generated/data.sql

You should be able to retrieve them by running the following commands (see docs):

docker cp airbyte-server:/tmp/workspace/65/1/normalize/catalog.json catalog.json
docker cp airbyte-server:/tmp/workspace/65/1/normalize/models/generated/data.sql data.sql

That would help us better understand what broke in the process, thanks!

(Or, if the HTTP source is public and you can share it, that would be even better so we can reproduce on our side.)


awoehrl commented Dec 22, 2020

Hi @ChristopheDuong, I generated the files for two different HTTP connectors that fail in normalization. They look quite empty, so I'm not sure how useful they are...

catalog_data.zip

The sources aren't public, unfortunately.


ChristopheDuong commented Dec 22, 2020

Thanks for your answer!

Even though they're empty, they're actually somewhat useful...

The conclusion is that normalization itself hasn't failed; it's the input catalog.json files that are not valid (so there is no chance that normalization can succeed from these inputs).

Can you share the original schema of the source table/API?

I suspect you are trying to replicate a nested/complex object, which isn't handled well in the UI yet, as reported here:

ChristopheDuong commented:

In the meantime, I opened a new issue so we can identify the use case you encountered more quickly from the logs: #1426


awoehrl commented Dec 23, 2020

Thanks @ChristopheDuong. Yes, both APIs I've tried return a nested schema. Here is a redacted and shortened example response:

{
    "meta": {
        "from": "2020-12-22T00:00:00+01:00",
        "to": "2020-12-22T23:59:59+01:00",
        "filter": {
            "url": ""
        },
        "field": {
            "toUrl": "one",
            "toPagetype": "one",
            "clicks": "sum"
        },
        "aggregate": [
            "toUrl",
            "toPagetype"
        ],
        "sort": [
            "-clicks"
        ],
        "page": {
            "offset": "0",
            "limit": "100"
        }
    },
    "links": {
        "self": "",
        "first": "",
        "prev": "",
        "next": ""
    },
    "data": [
        {
            "type": "analytics",
            "id": "",
            "attributes": {
                "toUrl": "",
                "toPagetype": "article",
                "clicks": 141
            }
        },
        {
            "type": "analytics",
            "id": "",
            "attributes": {
                "toUrl": "",
                "toPagetype": "frontpage",
                "clicks": 130
            }
        }
    ]

}
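
To illustrate why this shape trips up normalization, here is a rough, hand-written BigQuery sketch of the unnesting that the `data` array would need before it can land as flat rows. This is not Airbyte's generated SQL; the table name `my_dataset.linkpulse_raw` and the column `raw_json` are invented for the example.

```sql
-- Rough sketch only: manual flattening of the nested `data` array, assuming the
-- raw API response is stored as a JSON string in a hypothetical column `raw_json`
-- of a hypothetical table `my_dataset.linkpulse_raw`.
SELECT
  JSON_EXTRACT_SCALAR(item, '$.type')                              AS type,
  JSON_EXTRACT_SCALAR(item, '$.id')                                AS id,
  JSON_EXTRACT_SCALAR(item, '$.attributes.toUrl')                  AS to_url,
  JSON_EXTRACT_SCALAR(item, '$.attributes.toPagetype')             AS to_pagetype,
  CAST(JSON_EXTRACT_SCALAR(item, '$.attributes.clicks') AS INT64)  AS clicks
FROM `my_dataset.linkpulse_raw`,
  UNNEST(JSON_EXTRACT_ARRAY(raw_json, '$.data')) AS item;
```

In a sync like this one, that flattening has to be driven by the stream's JSON schema in catalog.json, which is why an empty or invalid schema leaves normalization with nothing to generate.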

cgardens modified the milestones: v0.9.0, v0.10.0, v0.11.0 on Jan 4, 2021
cgardens modified the milestones: v0.11.0, v0.12.0 on Jan 11, 2021
cgardens modified the milestones: v0.12.0, Launch 1.0 on Jan 19, 2021
ChristopheDuong removed this from the Beta Launch milestone on Jan 21, 2021

cgardens commented Mar 8, 2021

@ChristopheDuong this should be fixed now given all of the upgrades you did to normalization, right?

ChristopheDuong commented:

> @ChristopheDuong this should be fixed now given all of the upgrades you did to normalization, right?

indeed

ChristopheDuong commented:

Closing, as this is probably solved in more recent versions of normalization. You can re-open if it's still not successful with an Airbyte version greater than 0.17-2-alpha!
