Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add canonical url in head #15984

Merged
merged 1 commit into from
Feb 20, 2024
Merged

Conversation

GrygrFlzr
Copy link
Member

Description

Fixes #15342 by adding canonical URL to the <head> of all pages.

To check:

pnpm install
pnpm run docs-build
pnpm run docs-serve

Additional context

  • Why not transformHead instead?
    • transformHead was a cleaner solution, but it only adds the canonical tag to the .html variants of URLs, not the main pages. Canonical URLs are meant to be added to all variants of a page including the canonical page itself.
  • Why ogUrl as the base?
    • This is the constant also used by the other-languages versions of the documentation, which point to https://ko.vitejs.dev etc. Language variants are not considered variants of the English documentation, and thus canonical URLs should point to their own subdomain.
    • Canonical URLs should be absolute URLs, not relative.
  • Why two .replace calls instead of .replace(/(?:\/index)?\.md$/, '/')
    • This is arguably less readable, the current version is clearer in what it tries to achieve.

What is the purpose of this pull request?

  • Bug fix
  • New Feature
  • Documentation update
  • Other

Before submitting the PR, please make sure you do the following

  • Read the Contributing Guidelines, especially the Pull Request Guidelines.
  • Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. fixes #123).
  • Update the corresponding documentation if needed.
  • Ideally, include relevant tests that fail without this PR but pass with it.

Copy link

stackblitz bot commented Feb 20, 2024

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

@patak-dev
Copy link
Member

One question before merging, is it ok that the canonical url is https://vitejs.dev/guide/dep-pre-bundling/ instead of https://vitejs.dev/guide/dep-pre-bundling. Isn't the url without the trailing slash more correct here?

@GrygrFlzr
Copy link
Member Author

Isn't the url without the trailing slash more correct here?

I thought so at first, but apparently https://vitejs.dev/guide has a 301 redirect to https://vitejs.dev/guide/ and so on. The only exception is the root page, where both https://vitejs.dev and https://vitejs.dev/ are valid, so I chose the latter to keep the replacement as minimum/consistent as possible.

@patak-dev patak-dev merged commit cc388d9 into vitejs:main Feb 20, 2024
10 checks passed
@patak-dev
Copy link
Member

Thanks for fixing this! 💜

@sapphi-red
Copy link
Member

Thanks!

https://vitejs.dev/guide redirects to https://vitejs.dev/guide/, but https://vitejs.dev/guide/dep-pre-bundling/ redirects to https://vitejs.dev/guide/dep-pre-bundling. I guess it depends on whether the filename is index.md or not.
For index.md files, the path with trailing slash seems to be correct. But for non-index.md files, I guess the path without trailing slash is correct. 🤔

@bluwy
Copy link
Member

bluwy commented Feb 21, 2024

@sapphi-red Does that mean changing .replace(/\.md$/, '/') to .replace(/\.md$/, '') should fix it? I agree with that though.

@patak-dev
Copy link
Member

Maybe we could detect if it is an index.md and add the trailing slash. So we have https://vitejs.dev/guide/ and https://vitejs.dev/guide/dep-prebundling as canonicals, both without the redirect. I don't like the look of that trailing slash, but it is maybe more consistent because https://vitejs.dev/guide/ is one of the guides? 🤔

Another inconsistency is that when you click on the sidebar, you get to https://vitejs.dev/guide/using-plugins.html and not https://vitejs.dev/guide/using-plugins.

@brc-dd maybe we are holding the hammer wrong here? Is there a way to only have the pages without the .html (as in always redirect https://vitejs.dev/guide/using-plugins.html to https://vitejs.dev/guide/using-plugins). It is confusing that people will copy the URL and sometimes it has the .html and sometimes it doesn't. We could add the redirect ourselves but it seems something that VitePress should support out of the box, no?

@brc-dd
Copy link
Contributor

brc-dd commented Feb 21, 2024

cleanUrls: true -- vite is deployed on Netlify right? No extra configuration should be needed. If a fork (translation repo) is deployed on Vercel, they will need configurations.

PS: The regex for canonical should be this though - .relativePath.replace(/(?:(^|\/)index)?\.md$/, '$1') (might be better to use url constructor to join base url to handle the home page case)

PPS: You don't need canonical for algolia to dedupe. Just enable cleanUrls and it should work. We are using that on VitePress docs.

@patak-dev
Copy link
Member

Thanks @brc-dd! @GrygrFlzr let us know if you'd like to follow up with another PR. Just in case, I haven't yet deployed the docs, so there is no rush here.

@GrygrFlzr
Copy link
Member Author

I can take a look in an hour. Is there a way to check on netlify if Algolia has properly deduplicated the index via canonization, at least?

@patak-dev
Copy link
Member

I don't think we can check that until we deploy 🤔
We can deploy after modifying the regex and adding cleanUrls and check then.

@bluwy
Copy link
Member

bluwy commented Feb 21, 2024

I think building the site and verifying that it matches the pattern Sapphi described should be good enough.

  • dist/guide/index.html should have the canonical url as https://vitejs.dev/guide/
  • dist/guide/features.html should have the canonical url as https://vitejs.dev/guide/features
  • dist/index.html should have the canonical url as https://vitejs.dev (special case where root index.html has no trailing slash)

That way we can avoid an additional redirect from Netlify. I think overall it's not urgent though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some pages are displayed twice in search results (with and without .html)
5 participants