-
There is useful content in GitHub wikis. It's unfortunate that external search engines are banned. People have put time into building their wikis, and it's sad that the information is almost impossible to discover - unless you think to try GitHub search.
Replies: 16 comments 52 replies
-
For reference, here's an old issue from the old isaacs repo: isaacs/github#1683
-
It also blocks archive.org.
-
They are not banned; robots.txt does nothing to limit their ability to crawl. It only signals that they should not crawl, and they comply.
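To illustrate the point that robots.txt is advisory rather than enforced: a polite crawler simply reads the Disallow rules and skips matching paths on its own. Below is a minimal sketch of that check. The helper name `is_disallowed` is made up for illustration, and the matching is a simplified glob-prefix approximation, not a full implementation of the Robots Exclusion Protocol (RFC 9309).

```python
from fnmatch import fnmatch

# Simplified sketch of how a well-behaved crawler honors robots.txt
# Disallow rules. Each rule is treated as a glob-style prefix, so a
# rule like "/*/wiki*" blocks any path whose start matches it.
# Real crawlers implement the full Robots Exclusion Protocol.
def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    return any(fnmatch(path, rule + "*") for rule in disallow_rules)

# The rule GitHub's robots.txt used to contain for wikis.
rules = ["/*/wiki*"]

print(is_disallowed("/ocornut/imgui/wiki", rules))    # True: skipped
print(is_disallowed("/ocornut/imgui/issues", rules))  # False: crawlable
```

Nothing stops an impolite crawler from fetching the blocked path anyway - which is exactly why "banned" is the wrong word here.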
-
Actually, wiki is no longer mentioned at https://github.com/robots.txt.
-
Wikis are not the only problem: Google simply ignores ANY page on GitHub, even README.md! I can never find any suitable result on Google coming from GitHub. The only way to search GitHub is its own search engine. But non-developers don't know GitHub, so why would they search there?
-
Fortunately, there are some search engines that ignore the robots exclusion rules. They in turn allow DuckDuckGo, Bing, and other bots to crawl their cached versions of GitHub wiki pages - allowing us to find stuff on GitHub. It's nice that someone cares more about finding content on the Internet. Now that the disallowing of robots has been rendered moot, let's do the right thing and remove any such exclusions.
-
Not sure what the conclusion of this discussion is, but we're facing this problem: around a year ago, I renamed my org from github.com/Otykier to github.com/TabularEditor. Now it seems that none of the repo issue pages are being crawled. For example, I can take a literal quote from an issue created more than a year ago and get no hits on Google. This goes for all issues - none of them seem to be crawled, making it very hard for users of our software to find solutions to problems they encounter. We can still get to the main page by searching for "github tabular editor 3", for example, but no search results come from the issue pages. Is there some org or repo setting we're missing? I am 90% sure that it used to work before we renamed the org - could that have something to do with it? Thanks!
-
I am thinking of moving away from GitHub because of this robots.txt policy :-(
-
alternatives? |
-
I am sure GitHub could easily arrange with at least one of the big search engines to be well crawled and indexed. And this would give said engine a considerable boost in popularity.
-
We are seeing GitHub wiki pages now being indexed - has this changed permanently, or is it temporary?
-
I asked OpenAI: https://chat.openai.com/c/308cf4b9-2759-41ca-a243-740279bbe1a7
-
I'm getting fairly desperate about various GitHub behaviors, but this is the cherry on the cake :( My repo has 54K stars, and in spite of a comment above suggesting that the wikis of repos with 500+ stars would be indexed, it doesn't seem to be. EDIT: I would need to make my wiki uneditable by users to do so?!
-
I have a wiki in my repository containing its documentation and coding standards.
-
(reposting some details, not as a threaded reply) @dipree: This is really harmful and disappointing, actively discouraging people from investing in wikis. I can't comprehend why it hasn't been solved. I understand that abuse is a problem, but a single "Allow only to collaborators" option is very limiting; other mitigating options should be possible (see below). I have 56K+ stars and a growing wiki, https://github.com/ocornut/imgui/wiki, and I would like people other than me to work on it. But right now its contents are not indexed, in spite of my best efforts to even link to https://github-wiki-see.page etc., meaning people have difficulty finding the information they need and keep asking us the same things, lowering the quality of their software experience and hogging support/dev resources. :( Can't GitHub implement some alternative options? Suggestions in order of simplicity:
If even the first option were added, I would cave in, lock the wiki tomorrow, and add likely contributors + info on how to request access. I suspect that part of your underlying thought might be that requiring developers to grant full repository access before people can edit a wiki ensures we don't add too many people? Can't exceptions be made for well-behaved projects?
Hi everyone 👋 we have intentionally excluded `Disallow: /*/wiki*` from the robots.txt. However, we have also introduced an `x-robots-tag: none` in the HTTP response header of wiki pages. As a result, wikis are still not visible to search engine crawlers. Why have we done this? Abusive behavior in wikis had a negative impact on our search engine ranking, and therefore we had to exclude wikis from getting crawled to mitigate the effects. We are now investigating options for how we can open the gates again so that everyone can benefit from the great information documented in wikis. At this point, we have not decided whether or when we will allow wikis to be crawled again, but we are actively revie…