@redstrate

Fixes #18444. Inside of UrlPreviewer, we need to combine two dicts (one from oEmbed, and one from HTML), and in Mastodon's case they were very different. The one from HTML is basically useless, because the site is a JavaScript application. The oEmbed one has the actual post content, yet the information from HTML was preferred.

So I flipped which dictionary overlays which, so that keys from oEmbed are preferred over the ones extracted from HTML. This seems to be the original intention, judging by the comment. I also updated to the newer Python syntax for merging dictionaries.
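
As a quick reference for the merge semantics discussed here, the sketch below shows both merge syntaxes side by side. The dict contents are hypothetical and only illustrate the behavior. In both forms the right-hand operand wins on conflicting keys, so which dict ends up preferred depends only on operand order, not on which syntax is used.

```python
# Hypothetical Open Graph dicts; keys and values are illustrative only.
og_from_html = {"og:title": "Example site", "og:site_name": "Example"}
og_from_oembed = {"og:title": "Post by @user", "og:description": "Post text"}

# In both syntaxes, the right-hand operand wins on conflicting keys.
html_then_oembed = {**og_from_html, **og_from_oembed}  # oEmbed values win
oembed_then_html = og_from_oembed | og_from_html       # HTML values win (Python 3.9+)

print(html_then_oembed["og:title"])  # Post by @user
print(oembed_then_html["og:title"])  # Example site

# Keys present in only one dict survive in either order.
print(sorted(html_then_oembed))  # ['og:description', 'og:site_name', 'og:title']
```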

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

@redstrate redstrate requested a review from a team as a code owner November 26, 2025 19:58
@redstrate
Author

For more context, Mastodon puts part of the post into og:description. Here's also a side-by-side of the difference in information:

[Before and after screenshots]

(The site image not loading is an unrelated issue.)
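
For context on where the HTML-derived values such as og:description come from, here is a minimal sketch of reading `og:*` meta tags from a page using only the Python standard library. It is not Synapse's implementation; `OpenGraphParser`, `extract_open_graph`, and the sample HTML are hypothetical names and data used for illustration.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.og: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property")
        content = attr.get("content")
        # Keep the first occurrence of each og:* property.
        if prop and prop.startswith("og:") and content is not None:
            self.og.setdefault(prop, content)


def extract_open_graph(html: str) -> dict[str, str]:
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.og


# Illustrative input only; real pages carry many more tags.
sample = (
    '<html><head>'
    '<meta property="og:title" content="Some title">'
    '<meta property="og:description" content="Part of the post" />'
    '<meta name="viewport" content="width=device-width">'
    '</head></html>'
)
print(extract_open_graph(sample))
# {'og:title': 'Some title', 'og:description': 'Part of the post'}
```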

@redstrate redstrate force-pushed the work/redstrate/fix-mastodon-embeds branch from 4badeb5 to 308ceff Compare November 26, 2025 20:16
@clokep
Contributor

clokep commented Nov 26, 2025

I'm pretty sure this regresses another site, maybe Twitter? I forgot why we started combining them like that.

@redstrate
Author

> I'm pretty sure this regresses another site, maybe Twitter? I forgot why we started combining them like that.

I'll try a few more sites and see what changes. I'm also worried about regressing some site I don't know about :/

@redstrate
Author

redstrate commented Nov 27, 2025

Here are some more test sites:

YouTube

The broken YouTube previews are fixed too, woo!

[Before and after screenshots]

Twitter

It looks like this hasn't changed at all, which is good.

[Before and after screenshots]

GitHub

This is also identical.

[Before and after screenshots]

Contributor

@reivilibre reivilibre left a comment

I don't personally use this feature, but your screenshots look like improvements to me.
I can see this being worse for some URLs but I don't know which ones and I'm not sure how we decide which ones are best to favour.

# information from the HTML and overlaying any information
# from the oEmbed response.
- og = {**og_from_html, **og_from_oembed}
+ og = og_from_oembed | og_from_html
Contributor

Maybe at this point it would be good to add a comment explaining why this order matters (otherwise I'm half worried someone will come and flip it back the other way in a couple of months when another site works better the other way around).

Author

Done. The comment now says why the order matters, covering not only the type of site but also the two examples shown in the PR.

Fixes element-hq#18444. Inside of UrlPreviewer, we need to combine two
dicts (one from oEmbed, and one from HTML), and in Mastodon's case
they were very different. The one from HTML is basically useless,
because the site is a JavaScript application. The oEmbed one has
the actual post content, yet the information from HTML was preferred.

So I flipped which dictionary overlays which, so that keys from oEmbed
are preferred over the ones extracted from HTML. This seems to be the
original intention, judging by the comment. I also updated to the
newer Python syntax for merging dictionaries.
@redstrate redstrate force-pushed the work/redstrate/fix-mastodon-embeds branch from 308ceff to cfa64f5 Compare November 28, 2025 15:17
@devonh
Member

devonh commented Nov 28, 2025

This seems related to #17462. Do you think it will help with that issue as well, or is that an additional problem?

@redstrate
Author

> This seems related to #17462. Do you think it will help with that issue as well, or is that an additional problem?

It looks like it does help with that issue; the buggy behavior described there matches what I currently see without the patch. With this patch it looks correct (see the YouTube section above).

@redstrate redstrate requested a review from reivilibre December 2, 2025 15:18

Development

Successfully merging this pull request may close these issues.

Incorrect preview URL for Mastodon posts

4 participants