
Ignore unsupported UTF-8 characters #127


Open
wants to merge 1 commit into
base: master
from

Conversation

DimitriFourny

If a file contains a byte sequence that is not valid UTF-8, the whole runner.py script fails.
Ignoring the undecodable bytes seems to be the best solution.
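
For illustration, here is a minimal sketch of what "ignoring" can look like in Python. It is not the actual change in this pull request: the helper name `read_text_lenient` and the sample file are assumptions made up for the example. It relies only on the standard `errors="ignore"` argument of `open()`.

```python
import os
import tempfile


def read_text_lenient(path):
    """Hypothetical helper: read a file as UTF-8 and drop undecodable bytes."""
    # errors="ignore" silently skips bytes that are not valid UTF-8
    # instead of raising UnicodeDecodeError.
    with open(path, encoding="utf-8", errors="ignore") as f:
        return f.read()


# Demo on a throwaway file that contains one invalid byte (0x84).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "wb") as f:
        f.write(b"abc\x84def")
    text = read_text_lenient(sample)

print(text)  # the invalid byte is dropped
```

The trade-off is that the dropped bytes disappear without a trace. Passing `errors="replace"` instead would keep a visible U+FFFD marker at each bad position.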

@Waqar144
Collaborator

What is the error?

@DimitriFourny
Author

I don't remember the exact byte value and position, but it was something like:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 747: invalid start byte
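
As a small illustration of where a message like this comes from: in UTF-8, 0x84 is a continuation byte and cannot start a character, so strict decoding fails on it. The byte string below is made up for the example, so the reported position differs from the one quoted above.

```python
# Two valid ASCII bytes followed by a lone continuation byte (0x84).
data = b"ok\x84"

try:
    data.decode("utf-8")  # strict decoding is the default
except UnicodeDecodeError as exc:
    bad_offset = exc.start   # index of the offending byte
    reason = exc.reason      # short description of the failure
    message = str(exc)

print(message)
# 'utf-8' codec can't decode byte 0x84 in position 2: invalid start byte
```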

@Waqar144
Collaborator

Do you have some minimal code with which I can reproduce the issue?

@DimitriFourny
Author

Unfortunately no, but I was running the codebrowser on the Chromium source code.
In fact, putting an invalid UTF-8 byte in one of the generated files is enough to reproduce the issue.
The affected file was in the /refs directory.
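
To narrow down a reproduction, a scan along these lines could locate the offending generated files. This is only a sketch: the function name `find_invalid_utf8` is hypothetical, and the directory to scan is whatever output folder the generator wrote to.

```python
import os


def find_invalid_utf8(root):
    """Yield (path, byte_offset) for each file under root that is not valid UTF-8."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            try:
                data.decode("utf-8")
            except UnicodeDecodeError as exc:
                # exc.start is the offset of the first undecodable byte.
                yield path, exc.start
```

Calling `list(find_invalid_utf8(output_dir))`, where `output_dir` is an assumed path to the generated output, would return the files to attach to a bug report along with the position of the first bad byte in each.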

2 participants