Setting prefetched content breaks after utf8 conversion #335

Open
kolaente opened this issue Jun 27, 2023 · 2 comments
Comments

@kolaente

Prefetching content and then handing it to Graby with setContentAsPrefetched breaks that content once it gets converted to UTF-8. I suspect this is because no response headers are available for prefetched content.

I observed this when parsing LinkedIn posts. For example, this one results in:

🔒� Apple has joined the chorus of voices warning about the potential risks posed by the #OnlineSafetyBill to end-to-end encryption. 💡Protecting our privacy and security is crucial.

(partial response for clarity)

The same part in the response I prefetched looks like this:

🔒️ Apple has joined the chorus of voices warning about the potential risks posed by the #OnlineSafetyBill to end-to-end encryption.\n\n💡Protecting our privacy and security is crucial.

Not prefetching the content and letting Graby fetch it instead does not mangle the emojis. Unfortunately, I need to use prefetched content because that is what lets me test my code: I fetch with Laravel's Http facade (Http::get), which is mockable, whereas Graby's internal response is not.

@j0k3r
Owner

j0k3r commented Aug 24, 2023

So, maybe we shouldn't convert the response to utf-8 if it comes from the prefetched content?

@kolaente
Author

kolaente commented Aug 25, 2023

Or provide a flag to either pass response headers to Graby or tell it not to convert the response.
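
To make the two options concrete, here is a minimal sketch of the idea in Python. It is not Graby's code or API: decode_body, charset_from_headers, the convert flag, and the ISO-8859-1 fallback are all hypothetical names and assumptions made for illustration. It only shows the general failure class (UTF-8 bytes decoded under a wrongly assumed charset) and how either passing headers or skipping the conversion avoids it. It does not claim to reproduce the exact mangled output quoted above.

```python
from typing import Mapping, Optional


def charset_from_headers(headers: Optional[Mapping[str, str]]) -> Optional[str]:
    """Return the charset declared in a Content-Type header, or None."""
    if not headers:
        return None
    for name, value in headers.items():
        if name.lower() != "content-type":
            continue
        # Parameters follow the media type, e.g. "text/html; charset=utf-8".
        for part in value.split(";")[1:]:
            key, _, val = part.strip().partition("=")
            if key.lower() == "charset" and val:
                return val.strip("\"' ")
    return None


def decode_body(
    body: bytes,
    headers: Optional[Mapping[str, str]] = None,
    convert: bool = True,
    fallback: str = "iso-8859-1",  # assumed fallback, for illustration only
) -> str:
    """Decode a response body, optionally skipping charset conversion."""
    if not convert:
        # Hypothetical "do not convert" flag: trust the caller that the
        # prefetched content is already UTF-8.
        return body.decode("utf-8")
    # Without headers there is no declared charset, so the fallback is used.
    charset = charset_from_headers(headers) or fallback
    return body.decode(charset)


original = "🔒️ Apple has joined the chorus"
prefetched = original.encode("utf-8")

mangled = decode_body(prefetched)  # no headers: wrong charset assumed
with_headers = decode_body(
    prefetched, headers={"Content-Type": "text/html; charset=utf-8"}
)
unconverted = decode_body(prefetched, convert=False)
```

With no headers the emoji bytes are read under the fallback charset and come out mangled, while both the header-aware call and the convert=False call return the original text unchanged.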
