Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: crawl URLs in <meta> tags #9900

Merged
merged 13 commits into from
May 17, 2023
Merged

feat: crawl URLs in <meta> tags #9900

merged 13 commits into from
May 17, 2023

Conversation

LorisSigrist
Copy link
Contributor

As described in #5228

Up until now, urls in <meta> tags have not been crawled. This has made programmatic social-images tricky.

This PR crawls urls in <meta> tags, depending on if the have a name or property attribute that is whitelisted.
Not all meta-tags can contain urls, so a whitelist is needed to only crawl the ones that can. I added all the usual social-media tags to the whitelist.

This did require some minor reorganisation of prepublish/crawl.js, since this is the first time that more than one attribute needed to be evaluated to crawl single tag.

Please don't delete this checklist! Before submitting the PR, please make sure you do the following:

  • It's really useful if your PR references an issue where it is discussed ahead of time. In many cases, features are absent for a reason. For large changes, please create an RFC: https://github.com/sveltejs/rfcs
  • This message body should clearly illustrate what problems it solves.
  • Ideally, include a test that fails without this PR but passes with it.

Tests

  • Run the tests with pnpm test and lint the project with pnpm lint and pnpm check

Changesets

  • If your PR makes a change that should be noted in one or more packages' changelogs, generate a changeset by running pnpm changeset and following the prompts. Changesets that add features should be minor and those that fix bugs should be patch. Please prefix changeset messages with feat:, fix:, or chore:.

@changeset-bot
Copy link

changeset-bot bot commented May 10, 2023

🦋 Changeset detected

Latest commit: 774bc04

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@sveltejs/kit Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@benmccann benmccann changed the title feat: Crawl urls in <meta> tags feat: crawl URLs in <meta> tags May 11, 2023
Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Copy link
Member

@Rich-Harris Rich-Harris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you! had a question about .trim().toLowerCase(), inline

Comment on lines 230 to 236
if (name && CRAWLABLE_META_NAME_ATTRS.has(name)) {
hrefs.push(resolve(base, content.trim().toLowerCase()));
}

if (property && CRAWLABLE_META_NAME_ATTRS.has(property.trim().toLowerCase())) {
hrefs.push(resolve(base, content));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we trimming/lowercasing the content attribute when name is present, but the property attribute when property is present?

in fact do we need to do any trimming or lowercasing at all? this doesn't happen with any other attribute, i'm curious why these would be different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops I left that in there accidentally 😅

At one point I was worried about the case where someone would leave some whitespace / wrongly capitalise some letters in an attribute, so during the refactor I trimmed and lowercased all attributes before the checks. After some experimenting, I then decided that it wasn't really required, and would make this PR dual-purpouse (addressing something that wasn't in the issue) so I removed it.

Must have missed that one, since it doesn't break anything. Sorry about that

@Rich-Harris Rich-Harris merged commit ab9f577 into sveltejs:master May 17, 2023
@github-actions github-actions bot mentioned this pull request May 17, 2023
leonardoadame pushed a commit to leonardoadame/Affiliate-tech that referenced this pull request May 17, 2023
* feat: Add a speedier script tag for prerendered redirects (sveltejs#9911)

* feat: Add a script redirect to prerendered pages

* changeset

* Update .changeset/hungry-rocks-hunt.md

Co-authored-by: Simon H <5968653+dummdidumm@users.noreply.github.com>

* fix test

* feat: Update string escaping

* Update .changeset/hungry-rocks-hunt.md

Co-authored-by: Conduitry <git@chor.date>

---------

Co-authored-by: Simon H <5968653+dummdidumm@users.noreply.github.com>
Co-authored-by: Conduitry <git@chor.date>

* fix: avoid inlining raw/url CSS imports (sveltejs#9925)

* fix: use transformRequest for CSS modules

* just avoid raw or url

* test and changeset

* use parsed query

* remove only

* Update packages/kit/test/apps/basics/test/cross-platform/test.js

* drive by test speed up

* feat: prerender & analyse in worker rather than subprocess to support Deno (sveltejs#9919)

* feat(fork): use workers

* Create hot-actors-hope.md

* Update packages/kit/src/utils/fork.js

Co-authored-by: Rich Harris <hello@rich-harris.dev>

---------

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Co-authored-by: Rich Harris <hello@rich-harris.dev>

* avoid using isMainThread, since it interacts poorly with vitest (sveltejs#9941)

Co-authored-by: Rich Harris <git@rich-harris.dev>

* chore: bump vite and devalue (sveltejs#9933)

* bump vite and devalue

* update templates

* merge master

* fix test

---------

Co-authored-by: Rich Harris <git@rich-harris.dev>

* chore: uvu -> vitest for create-svelte tests (sveltejs#9910)

* chore: uvu -> vitest for create-svelte tests

* format

* concurrency (sveltejs#9921)

* realised we werent typechecking this file, found some errors. fixed

* Wait for beforeAll hook to complete

* simplify

* exclude create-svelte/template files from prettier, so that we can emit correctly formatted templates

* remove unused file

* Revert "exclude create-svelte/template files from prettier, so that we can emit correctly formatted templates"

This reverts commit aa188d4.

---------

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Co-authored-by: Rich Harris <git@rich-harris.dev>

* feat: unshadow `form` and `data` in `enhance` (sveltejs#9902)

* feat: Un-shadow `data` and `form` in `enhance`, warn about future deprecation in dev

* changeset

* snek

* Update .changeset/odd-crews-own.md

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>

* Update packages/kit/test/apps/dev-only/package.json

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>

* am not smart

* still not smart

* oops

* oof

* add deprecation notice

---------

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Co-authored-by: Rich Harris <git@rich-harris.dev>

* fix: Set loader: { '.wasm': 'copy' } in esbuild config in `adapter-cloudflare-workers` (sveltejs#9940)

* fix: Set loader: { '.wasm': 'copy' } in esbuild config in `adapter-cloudflare-workers`

Copies WASM files in Cloudflare instead of trying to load them.

Related to sveltejs#9909

* Create brave-peaches-buy.md

* Update packages/adapter-cloudflare-workers/index.js

* format

---------

Co-authored-by: Rich Harris <git@rich-harris.dev>
Co-authored-by: Rich Harris <hello@rich-harris.dev>

* fix: Flaky test (sveltejs#9947)

* fix: Flaky test

* reuse locator

* add semi

* drive by test speed up

* another classic

* oh my god brain, y u so bad

---------

Co-authored-by: gtmnayan <50981692+gtm-nayan@users.noreply.github.com>
Co-authored-by: gtmnayan <gtmnayan@gmail.com>

* remove envVarsInUse (sveltejs#9942)

Co-authored-by: Rich Harris <git@rich-harris.dev>

* fix: type `vitePlugin` in config (sveltejs#9946)

* fix: type `vitePlugin` in config

* changeset

* fix: Set `loader: { '.wasm': 'copy' }` in esbuild config in `adapter-vercel` (sveltejs#9944)

* fix: Enable wasm copy in adapter-vercel

* Create olive-rings-eat.md

---------

Co-authored-by: Rich Harris <richard.a.harris@gmail.com>

* feat: crawl URLs in `<meta>` tags (sveltejs#9900)

* Crawl social-image urls during prerender

* Formatting & Linting

* Format changeset & added exhaustive list of crawlable urls

* Changed severity to minor as described in sveltejs#5228

* Added support for `property` attribute & limited valid names to just social tags

* More tests

* Better changeset message - I'm indecisive

* Update .changeset/thirty-garlics-tan.md

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>

* simplify

* simplify

* Removed redundant data-sanitation

* DRY out

---------

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Co-authored-by: Rich Harris <git@rich-harris.dev>

* Version Packages (sveltejs#9893)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: support AWS via SST in adapter-auto (sveltejs#9874)

* [feat] support AWS via SST in adapter-auto

* Sync

* Delete 95-adapter-aws-sst.md

* Update .changeset/rotten-ducks-tan.md

---------

Co-authored-by: Rich Harris <hello@rich-harris.dev>

* Version Packages (sveltejs#9953)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* only cache response if response has cache-control header (sveltejs#9885)

* only cache response if response has cache-control header

* add changeset

* Version Packages (sveltejs#9955)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: ensure styles are loaded in dev mode for routes containing special characters (sveltejs#9894)

* Fix loading styles for routes containing special characters in dev mode.

SvelteKit doesn't decode special characters in pathnames when loading CSS modules in dev mode, resulting in an error:

Internal server error: Failed to load url /src/routes/(special)/hinnap%C3%A4ring/+page.svelte?svelte=&type=style&lang.css=&inline= (resolved id: /src/routes/(special)/hinnap%C3%A4ring/+page.svelte?svelte&type=style&lang.css). Does the file exist?

Actual path:

/src/routes/(special)/hinnapäring/

Fix by using decodeURI on the url.pathname when loading CSS modules.

* Create stale-houses-yell.md

* decodeURL inside if

* add test

---------

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Co-authored-by: Rich Harris <git@rich-harris.dev>

* feat: Warn users when submitting forms with files but no `enctype="multipart/form-data"` (sveltejs#9888)

* fix: Package name keeps me from filtering with pnpm

* feat: Warn users when submitting a form containing a file without the correct enctype

* changeset

* Update packages/kit/src/runtime/app/forms.js

Co-authored-by: gtmnayan <50981692+gtm-nayan@users.noreply.github.com>

* style

* moar style tweaks

* better test skip

* Update .changeset/tasty-llamas-relate.md

Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>

* DRY out

* only warn once per submit

* Update .changeset/tasty-llamas-relate.md

Co-authored-by: Rich Harris <richard.a.harris@gmail.com>

---------

Co-authored-by: gtmnayan <50981692+gtm-nayan@users.noreply.github.com>
Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Co-authored-by: Rich Harris <git@rich-harris.dev>
Co-authored-by: Rich Harris <richard.a.harris@gmail.com>

* Version Packages (sveltejs#9957)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* security: Stop automatically adding URLs from server-side `load` `fetch` calls to dependencies (sveltejs#9945)

* feat: Add `dangerZone` config

* breaking: Don't implicitly track deps in server-side fetch

* changeset

* bein dumb

* fix: Server load invalidation

* fix: Write config for server

* fix: test

* unsurprisingly i am dumb

* docs: Clarify difference between server `fetch` and universal `fetch`

* tests

* rename to trackServerFetches

---------

Co-authored-by: Rich Harris <git@rich-harris.dev>

* Version Packages (sveltejs#9963)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: S. Elliott Johnson <sejohnson@torchcloudconsulting.com>
Co-authored-by: Simon H <5968653+dummdidumm@users.noreply.github.com>
Co-authored-by: Conduitry <git@chor.date>
Co-authored-by: gtmnayan <50981692+gtm-nayan@users.noreply.github.com>
Co-authored-by: Fernando López Guevara <fernando.lguevara@gmail.com>
Co-authored-by: Ben McCann <322311+benmccann@users.noreply.github.com>
Co-authored-by: Rich Harris <hello@rich-harris.dev>
Co-authored-by: Rich Harris <richard.a.harris@gmail.com>
Co-authored-by: Rich Harris <git@rich-harris.dev>
Co-authored-by: Conner <conner@semicognitive.com>
Co-authored-by: gtmnayan <gtmnayan@gmail.com>
Co-authored-by: Loris Sigrist <43482866+LorisSigrist@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Frank <frank@sst.dev>
Co-authored-by: Frank Dumont <fj.dumont@gmail.com>
Co-authored-by: Reio Remma <reio@mrstuudio.ee>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants