🐛 Source mixpanel: basic normalization fails with duplicate field names #5355

Closed
Tracked by #12239
caiopedroso opened this issue Aug 11, 2021 · 8 comments · Fixed by #13372

Comments

@caiopedroso

caiopedroso commented Aug 11, 2021

Environment

  • Airbyte version: 0.29.4-alpha
  • OS Version / Instance: macOS
  • Deployment: Docker
  • Source Connector and version: Mixpanel (0.1.0)
  • Destination Connector and version: BigQuery (0.3.12)
  • Severity: High
  • Step where error happened: Sync job

Current Behavior

I’m trying to sync for the first time, and the process fails at the normalization step. Something appears to break the dbt process; the log is attached.
If I sync without normalization (raw data), it runs successfully.

Expected Behavior

The normalized data appears in BigQuery without any errors.

Logs

logs-10-0.txt

UPDATE: Apparently I have one field called username and another called userName. When I check destination_catalog.json, I can see both fields. Maybe this causes an ambiguity that BigQuery is not able to resolve?
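To check a catalog for this kind of clash, a minimal sketch like the one below may help. It assumes the property names of a stream have already been read out of destination_catalog.json into a plain list; the helper name `find_case_collisions` and the sample field names other than username and userName are hypothetical.

```python
from collections import defaultdict


def find_case_collisions(field_names):
    """Group field names that become identical once lowercased."""
    groups = defaultdict(list)
    for name in field_names:
        groups[name.lower()].append(name)
    # Keep only the lowercase keys shared by more than one original name.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical property names standing in for one stream's schema.
fields = ["distinct_id", "username", "userName", "time"]
print(find_case_collisions(fields))
# -> {'username': ['username', 'userName']}
```

Any non-empty result points at fields that would likely collide once column names are compared without regard to case.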

Steps to Reproduce

  1. Create a Mixpanel Source Connection and a BigQuery destination
  2. Try to sync with the Basic Normalization selected
@caiopedroso caiopedroso added the type/bug Something isn't working label Aug 11, 2021
@marcosmarxm marcosmarxm added the area/connectors Connector related issues label Aug 17, 2021
@sherifnada
Contributor

UPDATE: Apparently I have one field called username and another called userName. When I check destination_catalog.json, I can see both fields. Maybe this causes an ambiguity that BigQuery is not able to resolve?

@caiopedroso yup that would do it 😅 is renaming these an option at all?

@caiopedroso
Author

Sadly, Mixpanel doesn't support renaming properties/events. They only have a feature called merge, and I'm not sure it will hide one of the properties. I can try that and will keep you posted.

@caiopedroso
Author

No luck. I tried to merge the properties and hide the value, but in the export, username and userName keep coming through as different fields.

@marcosmarxm
Member

@caiopedroso the normalization module applies some standard conventions for the destination, such as lowercasing column names, among other functions. Maybe you can export the normalization generated by Airbyte and run it as a custom operator. In that case you can remove both username columns from the final table or (I don't have full BigQuery knowledge here) use a function to select the column, maybe using quotes.

@caiopedroso
Author

@marcosmarxm, got it, I will try that. I'm just waiting for the release of a fix for the M1 version, and I'll update here if I come up with a solution.

@misteryeo
Contributor

@caiopedroso is this still an issue?

@caiopedroso
Author

Yes @misteryeo, I was not able to put together a custom dbt model to solve this. If it's something that could work out of the box in Airbyte, I would appreciate it.

@lazebnyi lazebnyi linked a pull request Jun 1, 2022 that will close this issue
14 tasks
@roman-romanov-o
Contributor

@caiopedroso I've added handling for this case to the Export stream in the connector.

Try updating to the latest version of the connector; that will fix your problem.
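For readers wondering what such handling could look like, here is a rough sketch of one possible approach: renaming keys that clash case-insensitively before a record is emitted. This is an illustration only and is not taken from the linked pull request; the function name `dedupe_case_insensitive` and the `_1`, `_2` suffix scheme are assumptions.

```python
def dedupe_case_insensitive(record):
    """Return a copy of record in which no two keys match when lowercased."""
    seen = set()
    result = {}
    for key, value in record.items():
        candidate = key
        suffix = 1
        # Append a numeric suffix until the lowercased name is unused.
        while candidate.lower() in seen:
            candidate = f"{key}_{suffix}"
            suffix += 1
        seen.add(candidate.lower())
        result[candidate] = value
    return result


# Hypothetical exported record with the two clashing properties.
record = {"username": "a", "userName": "b"}
print(dedupe_case_insensitive(record))
# -> {'username': 'a', 'userName_1': 'b'}
```

The first occurrence keeps its original name, so only later clashing fields are renamed. The same renaming would have to be applied to the stream's schema so the catalog and the records stay in sync.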

9 participants