[SEO Audits] Document has a valid rel=canonical #3178

rviscomi · 2017-08-29T18:46:06Z

Audit group: Content best practices
Description: Document has a valid rel=canonical
Failure description: Document does not have a valid rel=canonical ({value})
Help text: Canonical links suggest which URL to show in search results. Read more in Use canonical URLs.

Success conditions:

- Query selector head > link[rel=canonical] doesn’t match any elements; otherwise
  - href value of canonical link is not set to the root; otherwise
    - origins of href and current page are different
    - location.pathname is not the root.
  - href value of canonical link is absolute

Valid examples:

URL example.de/, <link rel="canonical" href="example.com/de/">
URL example.com/de/, <link rel="canonical" href="example.de/">

Invalid examples:

URL example.com/blog/, <link rel="canonical" href="example.com/">
URL example.de/, <link rel="canonical" href="example.com/">
URL example.com/de/, <link rel="canonical" href="/">


kdzwinel · 2017-12-11T00:29:26Z

For the record: we agreed that we should also support canonical Link headers.
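As a rough illustration of what supporting canonical Link headers involves, here is a minimal sketch that pulls canonical targets out of a Link header value. The helper name parseCanonicalLinkHeader is hypothetical (it is not a Lighthouse API), and the parsing is deliberately simplified: it assumes no commas appear inside quoted parameter values.

```javascript
// Simplified sketch: extract canonical URLs from the value of an HTTP Link header.
// parseCanonicalLinkHeader is a hypothetical helper, not part of Lighthouse.
function parseCanonicalLinkHeader(headerValue) {
  const canonicals = [];
  // Each entry looks like: <https://example.com/page>; rel="canonical"
  // Assumption: parameter values never contain a comma.
  const entryPattern = /<([^>]*)>([^,]*)/g;
  let match;
  while ((match = entryPattern.exec(headerValue)) !== null) {
    const [, target, params] = match;
    // rel may be quoted or unquoted and may hold several space-separated values.
    const rel = /;\s*rel\s*=\s*"?([^";]*)"?/i.exec(params);
    if (rel && rel[1].toLowerCase().split(/\s+/).includes('canonical')) {
      canonicals.push(target);
    }
  }
  return canonicals;
}
```

For example, a header value containing one canonical entry and one preload entry would yield only the canonical target.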

Questions:

How do we deal with multiple canonicals? There can be multiple headers and/or tags. ATM I'm processing only the first one found (headers before tags). But according to this if there are multiple canonicals pointing to different URLs, search engines will ignore all of them.
How do we deal with invalid URLs? ATM I'm failing the audit when new URL(href) fails.
Where is this part coming from? I couldn't find a rule that would explain this.

href value of canonical link is not set to the root; otherwise
origins of href and current page are different
location.pathname is not the root.

Why can't example.de/ have a canonical link to example.com/, when it can have one to example.com/de/?

Your examples indicate that a canonical can point to a different domain. The official docs seem to agree, saying "With content syndication, it's also easy for content to be distributed to different URLs and domains entirely.". However, all the examples given in the docs show the same domain. Also, according to Yahoo and Bing, different domains are not supported. Should we check that with John?

rviscomi · 2017-12-11T19:38:49Z

Good questions :)

How do we deal with multiple canonicals?

We should fail the audit and specify the failure reason being conflicting canonical links.

How do we deal with invalid URLs? ATM I'm failing the audit when new URL(href) fails.

Failing when invalid SGTM. Again, we should be clear about the failure reason.
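A minimal sketch of how the "invalid URL" and "relative URL" failure reasons could be told apart. It relies on new URL(href) throwing when no base is given. The helper name checkCanonicalHref and the dummy-base retry are assumptions for illustration, not the audit's actual code.

```javascript
// Hypothetical helper: classify a canonical href and report a failure reason.
function checkCanonicalHref(href) {
  try {
    // No base URL is passed, so relative values such as "/" throw here as well.
    return {valid: true, url: new URL(href)};
  } catch (err) {
    // Assumption: if the value parses against a dummy base, it was relative;
    // otherwise it is not a usable URL at all.
    try {
      new URL(href, 'https://example.com/');
      return {valid: false, reason: 'relative URL'};
    } catch (err2) {
      return {valid: false, reason: 'invalid URL'};
    }
  }
}
```

Returning the reason alongside the verdict keeps the failure message specific, which is the point made above.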

Where is this part coming from?

The advice came from an offline thread. Forwarding to you for context.

Your examples indicate that canonical can point to different domain.

Yeah worth checking to be sure. It does seem like the docs use different domains, for example with/without the www subdomain, although having different TLDs may not be valid.

kdzwinel · 2017-12-11T23:07:31Z

Thanks for a quick response!

The advice came from an offline thread. Forwarding to you for context.

Thanks! It does sound a bit GoogleBot specific, doesn't it?

It does seem like the docs use different domains, for example with/without the www subdomain, although having different TLDs may not be valid.

Right, I meant TLDs. Sorry for the confusion.

I absolutely agree that we should show the failure reason, especially now that we have a couple of them. Should we just use the displayValue (it's the one appended to the end of the failure description)? A table feels like overkill here, and debugString (red text) is rather used for audit failures.

ATM we have these failure reasons:

"multiple conflicting URLs"
"invalid URL"
"relative URL"
"points to a different TLD" (to be checked with John)
"???" (one with canonical link set to root)

How does that list look to you? I'd appreciate a bit of help with the copy for the last one :)

kdzwinel · 2017-12-22T00:58:05Z

To sum up our email/hangouts discussion, we decided to fail in these cases:

multiple conflicting URLs
invalid URL
relative URL
current URL and canonical URL have different domains
current URL is not a root but points to a root of the same origin
current URL is a hreflang and canonical URL is a different hreflang
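One of the cases in the list above, "current URL is not a root but points to a root of the same origin", can be sketched as follows. The function name pointsToSameOriginRoot is hypothetical, and the sketch assumes both inputs are already valid absolute URLs. It also ignores query strings and fragments, which is a simplification.

```javascript
// Hypothetical sketch of a single rule from the list; not the audit's actual code.
// Assumes both arguments are valid absolute URLs (new URL throws otherwise).
function pointsToSameOriginRoot(pageHref, canonicalHref) {
  const page = new URL(pageHref);
  const canonical = new URL(canonicalHref);
  return canonical.origin === page.origin && // same scheme, host and port
    canonical.pathname === '/' &&            // canonical is the root
    page.pathname !== '/';                   // current page is not the root
}
```

With this check, https://example.com/blog/ pointing at https://example.com/ is flagged, while a root page pointing at itself is not.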

While writing tests I found two more edge cases:

current URL and canonical URL have different domains - I assume that the domain is the last two parts of the hostname (e.g. example.com for test.example.com). Unfortunately, this assumption breaks for second-level domains (one.co.uk and two.co.uk will be considered to have the same domain). The only solution I can see here involves creating a safelist of all second-level domains, but this seems impractical. IMO we should keep the current solution.
This article suggests that we should fail not only when multiple conflicting canonical URLs are found but always when multiple canonical URLs are found. I'll double check that with John.
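The "last two parts of the hostname" assumption from the first edge case, and the way it breaks for second-level domains, can be shown in a few lines. The function name naiveDomain is hypothetical. This is a sketch of the heuristic as described, not the code that shipped.

```javascript
// Naive heuristic: treat the last two hostname labels as the domain.
// naiveDomain is a hypothetical name used only for this illustration.
function naiveDomain(hostname) {
  return hostname.split('.').slice(-2).join('.');
}

// Works as intended for a subdomain of a .com site.
const sameSite = naiveDomain('test.example.com') === naiveDomain('www.example.com');

// Breaks for second-level domains: both hostnames reduce to "co.uk".
const falseMatch = naiveDomain('one.co.uk') === naiveDomain('two.co.uk');
```

Here both sameSite and falseMatch evaluate to true, and the second is the false positive described above.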

kdzwinel · 2018-04-11T20:21:04Z

@rviscomi I got asked why we fail the canonical audit when someone has the same canonical in both a response header and the head of the page (this happens for e.g. https://www.12starsmedia.com/). At first it felt like a bug, but this quote tells me that we did it on purpose (?)

This article suggests that we should fail not only when multiple conflicting canonical URLs are found but always when multiple canonical URLs are found. I'll double check that with John.

It does feel counterintuitive, and the linked article doesn't really say, now that I reread it, what I claimed it was saying 🤔 Do you remember the discussion about it?

TimothyLoyer · 2018-04-11T20:28:33Z

Maybe this? I'm not certain if they mean identical in this case, but it could make sense. Admittedly our duplicate is due to our using the SEOmatic plugin for Craft. If it seems likely Google would penalize us for this, we're more than happy to address it with the plugin authors.

In cases of multiple declarations of rel=canonical, Google will likely ignore all the rel=canonical hints. Any benefit that a legitimate rel=canonical might have offered will be lost.

kdzwinel · 2018-04-11T20:31:58Z

I'm not certain if they mean identical in this case

Yeah, I'm not 100% sure about that either. Maybe Rick will remember, and if not, we will double check with John.

TimothyLoyer · 2018-04-11T20:34:19Z

Also, in the second point of the conclusions.

Check that rel=canonical is only specified once (if at all) and in the head of the page.

Thank you for all your help with this @kdzwinel. :)

rviscomi · 2018-04-11T21:50:24Z

I don't remember exactly, but rereading the doc, it does seem like we're doing the right thing. Please do reach out to John and confirm that having the same canonical URL in both a header and a link tag is invalid.

khalwat · 2018-04-12T18:24:51Z

This article suggests that we should fail not only when multiple conflicting canonical URLs are found but always when multiple canonical URLs are found. I'll double check that with John.

Hello everyone, I'm the author of the SEOmatic plugin for Craft CMS 2 that @TimothyLoyer has referenced.

What I'm doing is adding the exact same canonical URL both as a tag, and also as a header.

I also do this for the robots tag and X-Robots-Tag -- neither was done for any specific reason, other than "why not?"

From the linked article:

Another issue is when pages include multiple rel=canonical links to different URLs. This happens frequently in conjunction with SEO plugins that often insert a default rel=canonical link, possibly unbeknownst to the webmaster who installed the plugin. In cases of multiple declarations of rel=canonical, Google will likely ignore all the rel=canonical hints. Any benefit that a legitimate rel=canonical might have offered will be lost.

To me this implies that there is only an issue if there are multiple conflicting canonical URLs? If they are the same URL, whether appearing multiple times as a tag or one as a tag, another as a header... I'm not seeing anything stating this is an issue?

I'm happy to alter the plugin to do whatever best practices are, but on this topic, I wasn't able to find anything definitive one way or another?

c.f.: nystudio107/craft-seomatic#68

rviscomi · 2018-04-12T18:32:54Z

IMO the ambiguity comes from this sentence:

In cases of multiple declarations of rel=canonical, Google will likely ignore all the rel=canonical hints.

Out of context, it's not clear if "multiple" refers to any two canonical URLs. The previous sentence about different URLs could just be an example of a common cause of this type of error, or it could be the only case.

I reached out to my resident SEO expert and I'll update this thread with their guidance.

khalwat · 2018-04-12T18:45:03Z

Ultimately, what really matters is how Google, and to a lesser extent, other search engines handle this situation. I would think that as long as the canonical URLs are not in conflict, that it should be okay with it, but I have no knowledge of Google's internal workings on this front.

auralon · 2018-04-13T12:11:52Z

Given that it is highlighted as a problem by the Lighthouse tool (which is a Google product), I'd guess that it is recommended to only serve one canonical URL (be it via header or link tag). However, multiple robots tags are not highlighted as an issue by the Lighthouse tool.

khalwat · 2018-04-13T14:40:55Z

@auralon That definitely could be; but Google is a big place, and the team that works on Lighthouse may or may not overlap the team that works on GoogleBot.

auralon · 2018-04-13T14:52:28Z

@khalwat true, true!

rviscomi · 2018-04-13T15:59:59Z

I got confirmation from John Mueller himself (thanks John!) that this is a bug. We should only fail when there are multiple different canonical URLs.

Reopening the issue. @TimothyLoyer @khalwat @auralon would either of you like to implement the fix? @kdzwinel is working on a higher priority issue (#4359) so this may not be fixed as quickly.
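The rule confirmed above (fail only when there are multiple different canonical URLs) can be sketched by deduplicating the collected values before counting them. The function name findConflictingCanonicals is hypothetical, and the sketch assumes every input has already passed the invalid/relative URL checks, since new URL throws otherwise.

```javascript
// Hypothetical sketch: return the distinct canonical URLs if they conflict,
// or an empty array when all declarations agree (or there is only one).
function findConflictingCanonicals(hrefs) {
  // Normalize through URL so trivially different spellings compare equal,
  // e.g. "https://example.com" and "https://example.com/".
  const unique = new Set(hrefs.map(href => new URL(href).href));
  return unique.size > 1 ? [...unique] : [];
}
```

Under this rule, the same canonical declared once in a Link header and once in the head of the page yields an empty array, so the audit would pass.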

khalwat · 2018-04-13T16:14:37Z

Great, thanks for tracking this down!

TimothyLoyer · 2018-04-16T17:12:29Z

Thank you, all, for looking into this!

rviscomi mentioned this issue Aug 29, 2017

[SEO Audits] Content best practices group #3117

Closed

7 tasks

patrickhulce added this to the SEO health-check audits milestone Aug 29, 2017

ebidel assigned kdzwinel Sep 5, 2017

patrickhulce added the needs-priority label Sep 26, 2017

paulirish added the new_audit label Sep 27, 2017

rviscomi added P2 and removed needs-priority labels Sep 29, 2017

paulirish added the seo label Oct 2, 2017

rviscomi added P1 and removed P2 labels Nov 27, 2017

rviscomi modified the milestones: SEO health-check audits, Sprint Cinco: November 28 - Dec 9 Nov 27, 2017

vinamratasingal-zz modified the milestones: Sprint Cinco: November 28 - Dec 9, Sprint Seis: December 11 - 22 Dec 11, 2017

kdzwinel mentioned this issue Jan 3, 2018

new_audit(canonical): document has a valid rel=canonical #4163

Merged

rviscomi modified the milestones: Sprint Seis: December 11 - 22, Sprint Siete: January 2-14 Jan 3, 2018

paulirish closed this as completed in #4163 Jan 10, 2018

TimothyLoyer mentioned this issue Apr 12, 2018

Redundant Canonical URLs with Automatic Render enabled nystudio107/craft-seomatic#68

Closed

rviscomi reopened this Apr 13, 2018

kdzwinel mentioned this issue Apr 15, 2018

core(canonical): pass when there are multiple identical canonical links #4973

Merged

brendankenny closed this as completed in #4973 Apr 18, 2018

LauraMontgomery mentioned this issue May 4, 2018

Invalid canonical - Multiple URLs nystudio107/seomatic#336

Closed

khalwat mentioned this issue Jul 27, 2018

Duplicate canonical URL's nystudio107/craft-seomatic#180

Closed

brendankenny mentioned this issue Apr 5, 2021

Lighthouse flags valid canonical URLs as invalid #12149

Closed

paulirish mentioned this issue Nov 23, 2021

Should Lighthouse accept rel="canonical" on a different domain as valid given that Bing recommends rel="canonical"? #12572

Closed
