Source Marketo: handle chinese chars #4405

Open
marcosmarxm opened this issue Jun 29, 2021 · 5 comments

Comments

@marcosmarxm
Member

Tell us about the problem you're trying to solve

The Marketo connector needs to handle Chinese character encoding.
This request came from a user in a Slack conversation.
It looks like the current Singer-based Marketo tap (tap-marketo) doesn't handle this encoding.

Describe the solution you’d like

A clear and concise description of what you want to see happen, or the change you would like to see

Describe the alternative you’ve considered or used

A clear and concise description of any alternative solutions or features you've considered or are using today.

Additional context

Add any other context or screenshots about the feature request here.

Are you willing to submit a PR?

Your answer

@marcosmarxm marcosmarxm added the type/enhancement New feature or request label Jun 29, 2021
@abrittis

abrittis commented Jun 29, 2021

The issue is with Marketo API call:
https://.mktorest.com/bulk/v1//export/<job_id>/file.json
req. headers:

  • Authorization:
  • User-Agent: Singer.io/tap-marketo

The response.encoding value derived from Marketo's response headers is ISO-8859-1.
This causes Python's requests.models.Response.iter_content(decode_unicode=True) to decode the body with the wrong codec, so UTF-8 encoded Chinese characters come out garbled.
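
To illustrate the effect described above, here is a minimal stdlib-only sketch. It makes no call to Marketo or to requests; the sample string is an assumption chosen for illustration. It shows what happens when UTF-8 bytes are decoded as ISO-8859-1, and that the damage can be undone as long as no bytes were dropped.

```python
# Stdlib-only illustration: no network call and no requests dependency.
original = "中文测试"                    # sample value, chosen for illustration
payload = original.encode("utf-8")      # bytes as they would arrive on the wire

# ISO-8859-1 maps every byte to one character, so decoding never fails,
# it silently yields one wrong character per byte.
garbled = payload.decode("iso-8859-1")
correct = payload.decode("utf-8")

# Re-encoding with the wrong codec and decoding with the right one
# recovers the original text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))
print(correct)
```

Because the wrong decoder never raises an error, the problem only shows up downstream as unreadable field values.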

Possible fixes:

  • Have the Marketo API return encoding=utf-8
  • In tap-marketo.sync.stream_rows: override response.encoding from ISO-8859-1 to utf-8 before reading the content

tap-marketo.sync.py

`def stream_rows(client, stream_type, export_id):
    with tempfile.NamedTemporaryFile(mode="w+", encoding="utf8", delete=False) as csv_file:
        singer.log_info("Download starting.")
        resp = client.stream_export(stream_type, export_id)
        resp.encoding = 'utf-8'
        for chunk in resp.iter_content(chunk_size=CHUNK_SIZE_BYTES, decode_unicode=True):
            if chunk:
                # Replace CR
                chunk = chunk.replace('\r', '')
                csv_file.write(chunk)

        singer.log_info("Download completed. Begin streaming rows to file: " + csv_file.name)
        csv_file.seek(0)

        reader = csv.reader((line.replace('\0', '') for line in csv_file), delimiter=',', quotechar='"')
        headers = next(reader)
        for line in reader:
            yield dict(zip(headers, line))`
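
Since the function above decodes the download chunk by chunk, the choice of codec matters at every chunk boundary. The following is a rough stdlib-only sketch of chunk-wise decoding, not the tap's or requests' actual code: decode_chunks is a hypothetical helper, and the CSV sample and the 4-byte chunk size are assumptions picked so that multi-byte characters straddle chunk boundaries.

```python
import codecs


def decode_chunks(chunks, encoding):
    """Hypothetical helper: incrementally decode an iterable of byte chunks."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush any bytes the decoder is still holding back.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Sample CSV content (assumed for illustration), encoded as UTF-8.
expected = "名字,城市\r\n张伟,北京\r\n"
payload = expected.encode("utf-8")

# Each Chinese character is 3 bytes in UTF-8, so 4-byte chunks split characters.
chunks = [payload[i:i + 4] for i in range(0, len(payload), 4)]

as_utf8 = "".join(decode_chunks(chunks, "utf-8"))
as_latin1 = "".join(decode_chunks(chunks, "iso-8859-1"))

print(as_utf8 == expected)    # the UTF-8 decoder reassembles split characters
print(as_latin1 == expected)  # the ISO-8859-1 decoder does not
```

With the utf-8 decoder, a character split across two chunks is buffered until its remaining bytes arrive; with ISO-8859-1, each byte is turned into its own character immediately.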

@erameshbabu

singer-io/tap-marketo#74

The fix has been checked in to the main tap-marketo repository.

@davydov-d
Collaborator

Hey @YowanR, should we validate Singer-based connector issues against the CDK-based ones? If so, should I treat this issue as a bug and include it in the certification scope?

@YowanR
Contributor

YowanR commented Aug 10, 2022

This one is out of scope for the certification process. @davydov-d
We will look at this issue again if there are more requests for it.

@CyprienBarbault
Contributor

Duplicate of #20641
