Recommend a redirect strategy for docs #1820
@dedemorton thank you for raising this issue. This is going to impact a major doc refactoring that is currently underway for the Cloud ECE docs.
Also see the related issue: #1357
Yes, this is a major issue for restructuring content. The current process (request redirects from the web team) has led to pain because we have no insight into what redirects already exist. We've ended up with circular redirects and assorted brokenness. The redirects file in the docs repo was Nik's attempt at bringing some order to the chaos, but it was never adopted by the web team & the folks at RAW who handle the actual infra/deployment of the site. Website redirects also only work at the page level, so if the chunking of content changes, they don't really solve the problem.

The redirects appendices feel like a very hacky solution, but they're one we can control pretty easily. On the ES side, @jrodewig and I discussed a strategy where we would clean up old entries when a new version was released, rather than keeping them around indefinitely. (Knowing that there are likely to be a number of necessary redirects between the last minor and a new major, it's not just a matter of deleting all of them and starting over.)

One motivation for the redirect appendices was to minimize surprise cross-doc links that caused chaos on release days. Now that we have the CI checks in place and have improved the process around releases, it's probably worth enforcing that if you move/remove a topic and break a link from somewhere else in the docs, you need to fix it, not just add an entry to the redirects appendix. The redirects appendix should be used to keep external links (like Google search results) from 404-ing.

Ideally, it's best to mark pages that are being removed with a noindex tag and request a reindex from Google before they disappear. I think we could basically accomplish that by adding a noindex tag to the redirect appendices and requesting a reindex when we roll out big changes or before we clean up old entries. @AnneB-SEO might have other insight into how to minimize the SEO disruption as we reorg the docs.
Also, simply keeping track of everything that has moved around is a chore. It would be really helpful to have a new & deleted anchor report generated for each PR.
This is exactly the type of feedback that helps me prioritize what I'm working on!
I hate to link to a private Slack conversation in a public repository, but I think it's necessary to illustrate how big of a problem this is: Slack 🧵. Whatever solution we choose to move forward with, we should remember to clean up the broken links that already exist.
From an SEO perspective, a manual redirect done using a redirect appendix is inferior to a server-side 301 or 302 redirect. Those manual redirect pages still respond with a 200 HTTP status code, which indicates to search engines that the old page is still alive. This means our new page is competing with the old (redirect) page. As older pages typically have more link juice, the redirect pages may be returned in SERPs before actual content pages.

The best case is a 301/302 that passes that link juice on to the new page. However, even a 404 would at least let the old page die. Right now, the redirect appendices are keeping old, zombie pages alive.

I also don't think the redirect appendix is the best experience for users. I would love to abolish redirect appendices entirely, except maybe in cases where there is no good redirect. Better control and visibility of server-side redirects would be my preferred path forward.
Agreed. The redirect appendices are a patch for a broken process. Beyond the issue of ending up with zombie pages that never go away, the manual process simply doesn't scale for major reorganization of existing content. We need to be able to automatically detect changes that require redirects, and manage the redirects in a way that doesn't require multiple spreadsheets and teams.
After spending several hours today updating links throughout all the docs, I had a thought about how we can approach linking with the tools and processes that we have now. My solution isn't ideal. Our tools should really maintain the link and link text for us. But having to manually go through a dozen repos to update links (even for a handful of topics) is a major PITA. What if we create one or more shared link files that all of the books can include? We might have something like:
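For illustration only, a shared link file might look roughly like this. The file name, attribute names, and target pages are placeholders I'm making up; {metricbeat-ref} is the existing attribute that resolves to the book URL for the right branch.

```asciidoc
// shared-links.asciidoc (hypothetical): one attribute per commonly linked page.
// The value includes default link text, so referencing the attribute in prose
// renders a complete link.
:metricbeat-install-link: {metricbeat-ref}/metricbeat-installation.html[Install Metricbeat]
:metricbeat-modules-link: {metricbeat-ref}/metricbeat-modules.html[Metricbeat modules]
```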
Writers could then use the shared attribute wherever they need the link. If we want to provide writers with more control over the link text, we could use two attributes:
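Something along these lines, again with made-up attribute names, splitting the URL and the default link text:

```asciidoc
// Hypothetical pair: one attribute for the URL, one for the default link text.
:metricbeat-install-url: {metricbeat-ref}/metricbeat-installation.html
:metricbeat-install-text: Install Metricbeat
```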
Then writers would resolve the link by using:
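Roughly like this, reusing the hypothetical attributes above; writers could also swap in their own text where the shared default doesn't read well:

```asciidoc
// With the shared default link text:
{metricbeat-install-url}[{metricbeat-install-text}]

// With custom link text for a specific context:
{metricbeat-install-url}[install the shipper]
```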
I know this is hacky, but I've had a long day of monkey work and feel like I'm stuck in 1985. (I guess I don't have to use carbon paper or leave enough space for footnotes, but seriously, all this manual monkey work is a time sink.) Hmm...but then we'd also need some kind of versioning, maybe similar to what we do for the versions file?
@dedemorton 💯 for this approach for any links that are used more than 2 or 3 times in a book. It's obviously not a complete solution, but hopefully it reduces some of the pain. If you're trying to link to the same version, could you just throw a …
I think this is definitely worth considering. One thing it makes me ponder is whether this would help with internal/external link issues. For example, I encounter pages that contain internal cross-references (<<my-link>>) that don't work when the page is reused in another context (because that context doesn't contain the linked page). Should we consider using external links at all times, or should we use citation maps for all links in each book/context (and define the URL attribute and whether it is an external or internal link appropriately for each book)? Either way, we need to make sure our new build tools have thorough link-checking.
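To make the internal/external distinction concrete, here's a hypothetical example of the same target linked both ways (the anchor and page names are invented):

```asciidoc
// Internal cross-reference: only resolves when the target is built as part of
// the same book, so it breaks when this page is reused in another context.
See <<monitoring-setup,Set up monitoring>>.

// External link built from a book URL attribute: resolves in any context, but
// the filename and link text have to be maintained by hand.
See {metricbeat-ref}/monitoring-setup.html[Set up monitoring].
```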
Hi @jrodewig - missed this post from a few months ago - so sorry. Great summary! Adding a couple notes.
Yes and no. A server-side redirect is always preferred, but only as a 301. 302s can still be a bit problematic for search engines and are really meant for temporary redirects, such as a login URL that performs language detection before assigning a destination URL.
Finally an explanation of how docs generates all those "soft 404s" (a 404 that returns a 200). Thank you!
Link juice only gets passed with a 301. Even if we were to redirect with a 302 and later change it to a 301, all the link authority would be lost.
Sounds like it's a poor experience for both users and search engines!
YES! Thanks again for the write-up and background on the soft 404s!
@gtback RE your comment:
The {metricbeat-ref} attribute would take care of resolving the correct branch. I'm thinking more about the situation where we change the HTML filename (maybe to improve SEO) but the change only applies to a specific version and later. The lack of branches in the …

@lcawl RE your comment:
I wouldn't want to use external links everywhere because we'd lose out on link validation in local builds, and that would make it harder to diagnose some build problems before we push to GitHub. Plus, we'd have to maintain all the link text manually.

Hmmm...it would be cool if we could somehow harness the logic that asciidoctor uses when it creates links and use it to generate a file that's populated with external links that other books can use. I guess we'd need logic so that once an attribute is defined in the link file, only the filename and link text would get updated. (Just trying to think of ways to automate the creation and maintenance of this file so that it doesn't become yet another time sink.)

EDITED: As a first step, we could manually create files that capture the high-traffic links (like getting started and installation topics).
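As a rough sketch of what such a generated file could look like (the format and entries here are purely my own invention), each published page would get an attribute whose filename and default link text the build keeps current:

```asciidoc
// generated-links.asciidoc (hypothetical): regenerated by the build from the
// same data asciidoctor uses to create links, so filenames and titles stay in
// sync. Versioning could perhaps follow the same pattern as the versions file.
:metricbeat-getting-started-link: {metricbeat-ref}/metricbeat-getting-started.html[Metricbeat quick start]
:metricbeat-installation-link: {metricbeat-ref}/metricbeat-installation.html[Install Metricbeat]
```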
💯 for this. I'm looking forward to hosting the docs ourselves, and a big part of that is exactly for this reason.
@dedemorton That makes sense, thanks. I'll have to think more about it. @benskelker was asking me a similar question this morning.
It would be nice to get this fixed for Next Docs, but it's probably not worth changing the process in the current doc system...so I'm closing this.
We will be refactoring a lot of content in the coming months (Beats, Cloud, etc.). Right now, our strategy for handling moved content (changed URLs) is not clear.
In the past, we've requested that the website team create redirects. Here's a random example of a request.
We also have a legacy redirects page that we planned to use in the future to manage redirects, but I don't think that page is being updated, and I don't think it's actually used by the build.
Some teams maintain a Deleted pages appendix and use that to redirect users manually to a page that's moved.
We need a clear strategy going forward, and I'm not sure whether redirects are the right way to go.
According to our internal wiki (copied from there):