
Option to generate offline static HTML files usable without server #3825

Closed
tomchen opened this issue Nov 26, 2020 · 83 comments · Fixed by #9859
Labels
apprentice Issues that are good candidates to be handled by a Docusaurus apprentice / trainee proposal This issue is a proposal, usually non-trivial change
Comments

@tomchen

tomchen commented Nov 26, 2020

🚀 Feature

docusaurus build produces a local production build that has to be docusaurus serve'd to be usable. Can we add an option to build offline static HTML files that are usable completely without any server, so a user can just open index.html in a browser to read the whole documentation?

It's mostly about calculating relative (instead of absolute) URLs and appending "index.html" to the end of URLs. Algolia search would have to be removed, and any online cloud assets would have to be put in local folders.

Have you read the Contributing Guidelines on issues?

Yes

Comment, Motivation, Pitch

What about other static site generators and libraries?

Gatsby, React, etc. builds all do a similar thing: they all need a server.

Gatsby has a feature request for an option to build such an offline static HTML site: gatsbyjs/gatsby#4610, which was closed without being solved. Users keep asking for the feature and for the issue to be reopened. According to one comment, Gatsby v1 could actually generate such a static site; it is in v2 that it stopped working.

React is general-purpose and Gatsby is made for any kind of website. But Docusaurus is made primarily for documentation, so it may need offline-version generation more than React and Gatsby do.

PDF and ebook formats

There is already a feature request, #969, that asks for an option to create an offline version in PDF format. It would obviously be brilliant to be able to make PDF and maybe also EPUB, MOBI, and AZW. PDF and these ebook formats may raise fewer security concerns than HTML. The downsides: the PDF feature may be somewhat time-consuming to implement, and the interactive navs, TOCs, and the colorful website design and layout would have to be dropped in PDF and other ebook formats. Offline static HTML is easier to produce. If the PDF feature is in the long-term plan, then offline static HTML could be on a shorter-term to-do list.

Compressed web file format

The offline static web files usable without a server could simply be compressed as a zip or another common archive format. Users would uncompress the file and click index.html in the root folder to use it.

They could also be compiled into CHM (Microsoft Compiled HTML Help); the problem is that CHM is a bit old and has no native support outside Windows. It's a little surprising that there's no standard, universally accepted file format similar to CHM. Perhaps that's due to security concerns.

@tomchen tomchen added feature This is not a bug or issue with Docusausus, per se. It is a feature request for the future. status: needs triage This issue has not been triaged by maintainers labels Nov 26, 2020
@RDIL
Contributor

RDIL commented Jan 6, 2021

You can use Electron to freeze it, I believe

@tomchen
Author

tomchen commented Jan 6, 2021

You can use Electron to freeze it, I believe

That'd be super overkill. Even if you have just one webpage, Electron will make it 80-100 MB by bundling the whole browser rendering and scripting engine.

@kinzlp

This comment was marked as duplicate.

1 similar comment
@vrabota

This comment was marked as duplicate.

@slorber
Collaborator

slorber commented Aug 31, 2021

@ohkimur mentioned he built a postprocessing step to enable local browsing using the file:// protocol.

#448 (comment)

Building this as a postprocessing step doesn't look like a bad idea.

A Docusaurus plugin could be built in userland to solve this problem. Plugins have a postBuild lifecycle that can be used for that.

Note: such a plugin should take into account the config trailingSlash option, because output files are not always /path/index.html anymore; they can also be /path.html

@slorber
Collaborator

slorber commented Aug 31, 2021

Note: for some Docusaurus features (particularly SEO metas such as social image, i18n hreflang...), URLs in the HTML files MUST be fully qualified absolute URLs (domain + absolute path).

Building a site for local offline usage does not prevent you from setting the site url and baseUrl in the config file; otherwise, the build output would not be suitable for online hosting.

For these reasons, it's very unlikely we'll add support for a "relative baseUrl" in Docusaurus, such as baseUrl: '': it would lead to output that is only correct for local usage, and users would likely deploy sites with broken metadata online without noticing the SEO problems.

@roguexz

roguexz commented Sep 1, 2021

Moving my conversation from #448 to this thread

@ohkimur - your suggestion works for the most part, but the Webpack configuration is still proving difficult to resolve

@slorber - my use case isn't for offline usage. I am trying to put together a simplistic developer workflow which involves publishing documentation to GitHub pages. At my workplace, we are using GitHub enterprise.

The use case is as follows,

  • Developer forks a repository and works on it
  • In the PR they are also contributing the generated site
  • Before merging the generated site, I would like to view the site on their fork

Given that baseUrl needs to be defined as a fixed path (e.g., /pages/_GH_ORG_/_GH_REPO_NAME_/), it becomes difficult to view the incoming changes until they have been merged.
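One workaround sketch for this kind of setup: derive the baseUrl from environment variables in CI, so each fork's Pages deployment builds with its own prefix. GH_ORG and GH_REPO are placeholder variable names invented for illustration, not Docusaurus options:

```javascript
// docusaurus.config.js (sketch): compute baseUrl per fork from env vars
// set by the CI job. GH_ORG / GH_REPO are hypothetical variable names.
const org = process.env.GH_ORG || "my-org";
const repo = process.env.GH_REPO || "my-repo";

export default {
  url: "https://pages.example.com", // assumed GH Enterprise Pages host
  baseUrl: `/pages/${org}/${repo}/`,
};
```

The same build pipeline can then run unchanged on the upstream repo and on every fork.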

I understand that this is not how most people work. This is part of an exercise where I am trying to encourage my team to get into the habit of documenting their software. The pandemic has made things worse because reviewing the UI / UX now requires a meeting instead of being able to just view the documentation against their repositories.

Any ideas that you might have to improve this workflow / process are most welcome.

I'm more of a Java/JVM guy .. which isn't helping and making the hacking process that much more challenging. Any help is greatly appreciated.

@ohkimur
Contributor

ohkimur commented Sep 2, 2021

@slorber I created the docusaurus-plugin-relative-paths to solve the issue. I used the same post-processing approach using Docusaurus postBuild lifecycle. 🦖😎

@slorber
Collaborator

slorber commented Sep 2, 2021

@roguexz , if you used a modern Jamstack tool like Netlify or Vercel (both much better than GH pages), you'd get a much better experience and all PRs would have a "deploy preview" link that includes the changes from the PR, ensuring they are valid (docusaurus can build and you can check the end result a merge would lead to before doing that merge).

See this Docusaurus PR, the Netlify bot added a link so that the PR can be reviewed more easily: #5462 (comment)

This is very easy to set up.


@ohkimur thanks! hope people will like this solution.

One interesting idea could be to have 2 modes:

  • modify the build files in place
  • keep the build unchanged, create a copy of it, modify the copy, and generate a build/site.zip archive that users could download?

@ohkimur
Contributor

ohkimur commented Sep 2, 2021

@ohkimur thanks! hope people will like this solution.

One interesting idea could be to have 2 modes:

  • modify the build files in place
  • keep the build unchanged, create a copy of it, modify the copy, and generate a build/site.zip archive that users could download?

@slorber I think this is a great idea. If you want, you can open an issue here and I will work on it. 🐱‍👤

@Josh-Cena Josh-Cena added proposal This issue is a proposal, usually non-trivial change and removed feature This is not a bug or issue with Docusausus, per se. It is a feature request for the future. status: needs triage This issue has not been triaged by maintainers labels Oct 30, 2021
@larissa-n

Picking up on @RDIL's comment: I added the build output files to an Electron app and encountered a few issues. After specifying each of the files in package.json and with docusaurus-plugin-relative-paths (thank you, @ohkimur!), the HTML content renders fine with all images, but Electron is still looking for scripts in file:///assets/ based on a reference in runtime~main.xxx.js. Any idea how this could be fixed?

@larissa-n

A very rough way to fix script references is baseUrl: './'. However, this also messes with routes, so a somewhat more correct approach is to change only o.p= in the compiled runtime~main.xxx.js (not sure if there's a more elegant way; unless there is, one idea might be to make this part of the docusaurus-plugin-relative-paths postprocess script). There are also references in main.xxx.js that point to an absolute directory. With that, most scripts load, but they re-render all pages as the 404 page / NotFound component. Getting rid of parts.push('exact: true'); in @docusaurus/core/lib/server/routes.js doesn't exactly fix the problem either, since sub-routes won't load. Why does it have to check the route match? Is that just for the prefetching? It seems odd that content is switched to NotFound once scripts load, since the static content looks fine and everything is in the right place while scripts fail to load.

Also, not sure if this is documented anywhere, but to develop with npm run start, I had to deactivate the docusaurus-plugin-relative-paths plugin.

Docusaurus creates quite a few JS files to keep track of if you work in an environment that requires you to list every single file. I'm used to react-static, whose builds consist of far fewer files.

@ohkimur
Contributor

ohkimur commented Jan 6, 2022

@larissa-n Thank you for your observations. I know about the issue you mentioned, but I haven't fixed it since I haven't found an elegant approach. If you already have a potential solution (even a messy one), I invite you to make a pull request in the plugin's repo. I can extend it later if necessary.

Also, can you describe the problem you had when you tried npm run start? Isn't the plugin called only when a build is triggered? If not, then this is a bug and it might be a good idea to fix it.

@sigwinch28
Contributor

Note: for some Docusaurus features (particularly SEO metas such as social image, i18n hreflang...), URLs in the HTML files MUST be fully qualified absolute URLs (domain + absolute path).

Why must they be fully qualified, @slorber?

I've just started to use Docusaurus and I find the baseUrl rather limiting, because I expected a copy-the-HTML-files-anywhere experience for deployment anywhere, without extra configuration. I don't understand why there's tight coupling to baseUrl. Your comment hints at why it exists.

Is this necessity documented?

@Josh-Cena
Collaborator

@sigwinch28 Yes, see https://docusaurus.io/docs/advanced/routing#routes-become-html-files

@slorber
Collaborator

slorber commented Jun 24, 2022

Why must they be fully qualified, @slorber?

Is this necessity documented?

It's not just coupling to a /baseUrl/; it is coupling to your domain as well.

There are multiple things in Docusaurus relying on that, in particular SEO metadata like the canonical URL:

<link data-rh="true" rel="canonical" href="https://docusaurus.io/docs/myDoc">

What Google says: https://developers.google.com/search/docs/advanced/crawling/consolidate-duplicate-urls


Although relative URLs seem to be supported (maybe only by Google?), they are not recommended.

Similarly, meta hreflang headers for i18n sites:


https://developers.google.com/search/docs/advanced/crawling/localized-versions


(including the transport method means you also can't switch from HTTP to HTTPS without a Docusaurus config change)

Similarly for the og:image metadata responsible for the social card preview when your site is shared on social networks:

<meta property="og:image" content="https://docusaurus.io/img/socialcard.png"/>

Using a relative URL can lead to failures to display the card and does not respect the spec: https://ogp.me/#data_types



It's not a Docusaurus-side constraint, it's a constraint that comes from outside.

You really have to build your site for a specific protocol/domain/baseUrl.

Now I understand in some cases you don't care about the features above and prefer to have more "deployment flexibility", but for now we don't support that.

@sigwinch28
Contributor

Fantastic answers. Thank you.

@jeacott1

@ohkimur it looks like you completely deleted your docusaurus-plugin-relative-paths project?
What happened?

@ohkimur
Contributor

ohkimur commented Nov 18, 2022

@jeacott1 Yeah, I did. I want to invest my time into something different.

@justsml

justsml commented Nov 19, 2022

@ohkimur I appreciate your OSS work!

Could you please put your docusaurus repos up as 'Archived'? Even temporarily?
Maybe even email me? 😅

(I'm trying to help a former student make sense of some README notes left by a previous dev. They contain permalinks to your docusaurus-plugin-relative-paths, and all I see is 404s. 💔 And the mystery deepens...)

If you'd rather not deal with it at all, I do understand. I hope your new focus is rewarding.

Best wishes, Dan.

@ColinLondon

This comment was marked as off-topic.

@ilg-ul

This comment was marked as off-topic.

@slorber
Collaborator

slorber commented Feb 16, 2024

Hey, I'm sorry, but all these discussions are off-topic and I'm hiding them. This is not the place to ask what Docusaurus is.

I'm still going to answer briefly.


Docusaurus is a React-based static site generator. It generates static HTML pages using React, and then hydrates React client-side. When navigating, we don't navigate to another HTML document, but we render the next page with client-side JavaScript using soft navigations and the history.pushState() API to update the URL.

This kind of navigation is what permits Docusaurus to feel fast when clicking on a link, and also preserves the state of UI elements on the current page (for example the collapsible state of sidebar categories).

This is a very different model from Jekyll, Hugo, Eleventy, MkDocs, Sphinx, and many other SSG tools that do not use client-side navigation and take a more traditional/old-school approach, but are usually less "interactive". Docusaurus v1 also worked that way, using React only during the build process and not loading React on the client side.

If you open the Chrome DevTools network tab on v1.docusaurus.io vs docusaurus.io, you will notice a big difference when navigating: v1 requests a new HTML page, while v2+ requests JS to render the new page locally.


If you don't understand what Docusaurus, React, hydration, SPA, history API, and all these things are, then it is unlikely that you will be able to help us solve this issue.

@slorber
Collaborator

slorber commented Feb 17, 2024

Investigations

I've investigated 2 approaches so far:

I have also investigated using external tooling such as wget to crawl the site and download it for offline usage:

mkdir wget-test
cd wget-test
cp -R ../projects/docusaurus/website/build/ .
wget -mpEknH http://localhost:3000/

This kind of works, but it is essentially the same as the first (SSG) approach, in which each page has a dedicated static HTML file.


SSG

#9857

To me, the SSG approach is quite challenging. Notably, I'm not even sure dependencies such as React Router can do routing over the file:// protocol. That remains to be investigated.

However, it could work decently if you are OK with opting out of Docusaurus's SPA mode and with not hydrating React on the client. This means that things we implement with interactive React code will not work (tabs, theme switch, category collapse button, mobile drawer...). We try to make things work without any JS (#3030), but there are still a few things that require JS and/or React. This also makes it impossible to include interactive React inside docs (through MDX); however, non-interactive React elements (such as admonitions) are perfectly fine.

If you want to give this mode a try, I'd suggest running this command on your computer. It is roughly equivalent to the HTML post-processing scripts that people shared earlier: links and assets will use relative paths.

wget -mpEk https://tutorial.docusaurus.io/

(there are JS loading console errors, but that's somewhat on purpose: if the JS succeeded in loading, you'd get a 404 page rendered after React hydration, because React Router does not know what to render for file://. Still, this mode might be a decent fallback if you want a good-enough/almost-working experience)


Hash Router

#9859

The Hash Router solution looks easier to implement, and I've almost been able to make it work on our website, apart from a few linking edge cases to investigate.

However, I'm not sure it's the solution the community is looking for, considering there would be a single HTML file emitted, and that file would initially be empty.

Here's the deploy preview online demo of the Hash Router based app:
https://deploy-preview-9859--docusaurus-2.netlify.app/#/

The local app using file:// would behave the same, and you'll always open it through the index.html entry point. Even though there's a single entry point file, you can still have deep linking and bookmark URLs.

You will notice a "loading..." screen before the content appears. This is because the initial HTML file is empty and the whole app is rendered with JS.

@lebalz
Contributor

lebalz commented Feb 18, 2024

+1 for the Hash Router.

We use Docusaurus for interactive teaching websites at a high school, and we'd like to give our students the ability to get a snapshot of the website when they complete their grade and leave the school.

Important points for our use case:

  • an easy way to start - a single index.html file inside the .zip-folder would just be perfect
  • react on the client-side - support for all interactive parts and common state management libraries
  • optional: local search
  • optional: bookmarks

Thanks a lot - this really would be a huge thing for our school department :)

@jeacott1

@slorber the hash router sounds like an excellent solution to me. Definitely preferable to SSG imo, and it leaves more features available over the long term.

@slorber
Collaborator

slorber commented Feb 19, 2024

Thanks for your feedback.

I'll focus on implementing proper support for the Hash Router then.

It doesn't mean that we won't eventually support other "modes" later, but at least this one is a good starting point.


Other alternatives to be considered:

  • Wrapping the static deployment inside an Electron/Tauri WebView
  • Package the web server as an executable (see vercel/pkg)

Of course, I'm not a fan of those approaches (using a bazooka to kill a fly), but afaik you could implement them in userland today if you really need to solve this problem right now and be able to package/distribute your docs for offline usage.

@ilg-ul
Contributor

ilg-ul commented Feb 19, 2024

I'll focus on implementing proper support for the Hash Router then.

Could you confirm that plain static html files for each page will continue to be supported for the foreseeable future?

@slorber
Collaborator

slorber commented Feb 19, 2024

I'll focus on implementing proper support for the Hash Router then.

Could you confirm that plain static html files for each page will continue to be supported for the foreseeable future?

This would be a new build mode you enable through a CLI option.

So yes, everything else will remain backward-compatible and Docusaurus will remain a static site generator.

@tonyeggers

tonyeggers commented Mar 27, 2024


@slorber ... thank you for working on this. Just to confirm another use case: I'm hosting documentation this way within an ERP platform (which I won't mention) due to abysmal support for doing anything remotely useful or flexible for this purpose. It's basically a static resource defined within the ERP environment, which gives me authentication by default for users already authenticated into the ERP. So now I can have a flexible git-controlled documentation process and keep my docs private and secure. I could even build a CI/CD process to load updates into the ERP if I wanted. Right now, I'm using the post-process solution created by @andrigamerita. I have to wrap it using a supported technique, but as long as it works completely offline, it works. Thanks!

@lebalz
Contributor

lebalz commented May 20, 2024

Wow, looking forward to trying it! Thanks for implementing this option 😍🥳

@slorber
Collaborator

slorber commented May 20, 2024

Hey 👋

The Hash Router PR has been merged: #9859

The hash router is useful in rare cases, and will:

  • use browser URLs starting with a /#/ prefix
  • opt out of static site generation
  • only emit a single index.html file
  • only do client-side routing and rendering
  • allow browsing offline, without a web server, through the file:// protocol

You can try this new experimental site option:

export default {
  future: {
    experimental_router: 'hash', // default to "browser"
  }
}

If you need to switch conditionally between the normal/browser router and the hash router, you can use a Node env variable. We don't provide a --router CLI option, but you can easily run ROUTER=hash docusaurus build instead and read process.env.ROUTER in your config file.
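As a sketch, that conditional switch could look like this in the config file (ROUTER is just the env variable name chosen in the command above, not a built-in):

```javascript
// docusaurus.config.js (sketch): pick the router from an env variable,
// e.g. `ROUTER=hash docusaurus build`. "ROUTER" is our own chosen name.
export default {
  future: {
    // Falls back to the default browser router when ROUTER is unset.
    experimental_router: process.env.ROUTER === "hash" ? "hash" : "browser",
  },
};
```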

To dogfood this feature, make it easier to review, and ensure it keeps working over time, we build our own website with the hash router.

An example artifact you can download is available here: https://github.com/facebook/docusaurus/actions/runs/9159577535


This will download a website-hash-router-archive.zip file.

Unzipping it gives you a static deployment. You can open it and browse locally without a web server by simply clicking the index.html file.



EXPERIMENTAL FEATURE:

The hash router is experimental.

It will be released in Docusaurus v3.4, but can already be tried in canary releases.

We strongly discourage you from using a baseUrl with it. If you have a use case for a hash-router baseUrl, please share it, because we might forbid it in the future. It is likely that useBaseUrl and useBaseUrlUtils have some edge cases with hash routing, because these abstractions were not designed to handle a hash router in the first place.

Otherwise, there may be unhandled edge cases we missed, so please report any issue here with a repro. Remember that third-party plugin authors may also need to adjust their code to support this new router. Although it should work out of the box for most plugins, we can't guarantee that it will.

Thanks and please let us know if this feature works well for you.

@pfdgithub

pfdgithub commented May 30, 2024

In my case, I use the browser router and the docs.routeBasePath: "/" configuration.
The docs are deployed behind multiple gateways (Traefik and Nginx), and the customer network is a private network.

Traefik uses the stripprefix middleware to strip the xxx prefix.
Nginx uses the try_files $uri $uri/ /index.html directive to find files.
When a URL path triggers the gateway fallback, the response contains an incorrect redirect path.

# url with filename
browser (/xxx/category/index.html) -> traefik (/category/index.html) -> nginx(/category/index.html) -> nginx status 200 -> traefik -> browser

# url with trailing slash
browser (/xxx/category/) -> traefik(/category/) -> nginx(/category/) -> nginx status 200 (rewrite to `/category/index.html`) -> traefik -> browser

# url without trailing slash
browser (/xxx/category) -> traefik(/category) -> nginx(/category) -> nginx status 301 (redirect to `/category/`) -> traefik -> browser (/category/) -> traefik status 404 -> browser

Because the /category directory exists, nginx attempts to redirect to that directory.
However, nginx is unaware of the xxx prefix, so it responds with an incorrect redirect to /category/.
If the /category directory didn't exist, nginx would respond with the correct rewrite to /index.html.

# url without trailing slash (`/category` directory doesn't exist)
browser (/xxx/category) -> traefik(/category) -> nginx(/category) -> nginx status 200 (rewrite to `/index.html`) -> traefik -> browser

I can't modify the Traefik and Nginx configurations, and trailingSlash: true is not a perfect solution, so I can only clean up redundant directories after building.
Would it be possible to allow disabling static site generation (SSG), as well as the SEO metadata that is useless on a private network, for example by allowing url: "/"?

@slorber
Collaborator

slorber commented May 30, 2024

@pfdgithub your comment is quite hard for me to understand. I am not even sure it is related to the current issue, because none of the URLs you share have a hash, and the hash part of a URL shouldn't affect routing and redirects in any way. If you want help, make sure your report is relevant to the current issue, create a smaller repro, and explain more clearly, including fully qualified URLs, because the way you share URLs right now does not even make clear which router config you use.

@pfdgithub

Sorry, this comment is not about the hash router. It is a further discussion of the following comments.

#448
#3825 (comment)
#3825 (comment)
#3825 (comment)
#3825 (comment)

@dingbo8128

Awesome work! But it's a pity that some local search plugins are not compatible with this feature yet, for example https://github.com/easyops-cn/docusaurus-search-local

@slorber
Collaborator

slorber commented Jul 10, 2024

@dingbo8128 unfortunately, all the search plugins I know of crawl the static HTML files. Since we now emit a single empty HTML file and use client-side JS to display the actual content, it's no longer possible to crawl the HTML files to index your content.

The community will have to provide a different implementation for this new hash router mode. Since we can't read the HTML files directly, it will likely require using a headless browser to run the HTML pages and extract the rendered content from them.

Maybe external search engines like Algolia would keep working, considering they run an external crawler. I don't know; if someone gives it a try, I'm curious. It's not ideal, though, since it would require network access to get the search results.

Note that our sitemap plugin does not emit a sitemap.xml when using the hash router. This is a limitation we could probably lift if it would help implement a local search plugin for the hash router. However, I'm not sure it's good practice to include URLs with # in a sitemap file; in this case sitemap.xml is probably useless for search engines 🤷‍♂️

@dspatoulas

I'm running into an issue with the generated links returning a 404. Here's an example from the deployed Docusaurus site after navigating to Blog > Docusaurus 3.4 > Hash Router - Experimental (from right side menu).

https://facebook.github.io/docusaurus/#/blog/releases/3.4%23hash-router---experimental

The link should direct users to the #hash-router---experimental section of the 3.4 release page, but it instead returns Page Not Found if copied and pasted into another browser tab.

@slorber
Collaborator

slorber commented Jul 26, 2024

@dspatoulas how did you obtain that link?

The GitHub UI doesn't show it this way, but your link is:

https://facebook.github.io/docusaurus/#/blog/releases/3.4%23hash-router---experimental

/3.4%23hash-router---experimental => it's using %23 (the URL-encoded form) instead of #

The same link using # works:

https://facebook.github.io/docusaurus/#/blog/releases/3.4#hash-router---experimental

And afaik nowhere in our UI do we use %23, so I wonder how you got that link in the first place.

To be honest, I'm surprised it doesn't work with %23; it should, IMHO, but it's likely a bug in the React Router v5 hash router implementation, and I suspect it won't be fixed considering they are working toward a stable v7 now.

@hellfiremaga

hellfiremaga commented Oct 23, 2024

We strongly discourage you from using a baseUrl with it. If you have a use-case for a hash baseUrl, please share it, because we might forbid that in the future

Hello

I have might have a useCase + solution :
In one of my site developped with Docusaurus, I have a documentation which need to be on multiple networks and one among it if really user-right limited (no web server allowed, no third party app installation allowed, etc...). I need to use a baseURL because I have some other documentation on some networks which are separate from my docusaurus project.

So I tried your solution to produce a static website and this solution seems to works in the user-right limited network. If I don't use baseURL, it works perfectly. I tried to use baseURL and here is what seems to change in my opinion :

  • the main page does not load properly (but if you add the baseURL to the address, everything works fine). Every other page and link works
  • images can't be found, but if you move them from the build root into a new folder named after the baseURL (so "buildroot/img" moves to "buildroot/baseURL/img"), it seems to work

I didn't notice anything else in the static website. Mine is a simple one, so I must have missed some problems, but the ones I had weren't hard ones.

PS: English is not my native language, sorry if my English is a bit rusty

@slorber
Collaborator

slorber commented Oct 23, 2024

@hellfiremaga thanks for sharing your use case but I can't do anything with this unfortunately

I need to use a baseURL because I have some other documentation on some networks which are separate from my docusaurus project.

This is very vague and I'm not sure you "need" to use a baseURL. Please prove it.

Show me concrete details of your setup, including very concrete examples of the other docs in your "network", how to do open those, what are the concrete browser URLs of each docs sites, what are the exact file system locations of these docs sites.

I'm not even sure what you mean by "network". If there's no web server allowed, there's no network, you open the site locally using the file:// protocol.

For these reasons, in its current state, I cannot take your feedback into consideration and decide it's worth it to add support for hash router baseUrl.

@hellfiremaga

...
For these reasons, in its current state, I cannot take your feedback into consideration and decide it's worth it to add support for hash router baseUrl.

@slorber Actually, you're right, I agree with you: I don't really need this feature. I was able to make a working static website without it, only by making a few changes in my Docusaurus configuration (thank you for that, by the way, a really cool feature!). But, in my opinion, it doesn't seem hard to support the base URL either. I'll explain my use case a bit more; maybe I'm just using Docusaurus wrong, or I explained it badly.

Actually, for professional reasons, I'm using Docusaurus for the documentation of internally developed software (let's call it Msoft). I was inspired by colleagues who already use Docusaurus for another piece of software (let's call it Osoft) that I work with but don't develop. I'm sure we will have more projects like this in the future, but let's focus on the present.

In my company, we have multiple networks (I'm not sure that's the right term, we call them "information systems"), and many of them are disconnected from the internet. Here are some examples:

  • I have a laptop computer without any connection to any network, where I develop my Docusaurus project for the Msoft documentation without any other Docusaurus project, and where I can run a local web server to host it. A baseURL is not necessary, nor is the static website. But the documentation will basically be for my personal use.

  • I work in a lab where we have multiple PCs on the same network (no internet) and where a limited number of users have access. We already have a Docusaurus project hosted on a NAS server with a GitLab and a web server: the Osoft documentation. This is the project I'm drawing inspiration from. Everyone has access to the Osoft documentation, it's easy to use, etc. Since I'm not a developer of Osoft, I don't need access to the development of their project, but I still want to provide the Msoft documentation to every user. So we have two Docusaurus projects that every user needs to access but that only a few people develop on (and different people for each project). We decided to use the baseURL to differentiate the two documentations, with only limited interactions between them.
    Note that the Msoft documentation is the same as the one on my laptop, so even though I don't need a baseURL on the laptop, I've added one to make it work in the lab without conflicts.

  • I also have a few computers on the company network with internet access, but with very limited user rights: no web server allowed, no security modifications, no software installation, etc. My need is to provide the Msoft documentation from my lab to every user everywhere (and my colleagues from the Osoft doc should do the same). With your feature, I was able to build a static website which seems to work really well and allows every user to access our documentation. But it made me change the baseURL of my laptop project for the build, then change it back to stay compatible with the lab version.

Sorry for the length of my explanation, I want you to have enough details about my use case.

This is why I don't need a baseURL, but I'm pretty sure I will forget to revert the baseURL after a static build and lose time finding the problem. The interaction between the docs is not mandatory; if we have two static websites for the two documentations, that's OK.
For my use case, resetting the baseURL to an empty one at the beginning of the static website build could even do the job, if that's an easy modification for you.
So again, I totally agree with you, I don't need to use a baseURL; it's only simpler for me if I don't have to make modifications whenever I want to build a static website :)

@slorber
Collaborator

slorber commented Oct 24, 2024

@hellfiremaga thanks for the explanation, but this is still not concrete enough. I don't see any concrete file path or URLs being shared here, as I asked.

even if I don't need a baseURL on the laptop, I've added one to make it work in the lab without conflicting

When you use the hash router, you don't need to use a baseUrl to avoid conflicts. The conflicts are already avoided by opening different files on the file system:

These won't conflict:

  • file:///network1/msoft/index.html/#/
  • file:///network1/osoft/index.html/#/
  • file:///network2/msoft/index.html/#/
  • file:///network2/osoft/index.html/#/

Using a baseUrl will only lead to a longer URL:

  • file:///network1/msoft/index.html/#/msoft/
  • file:///network1/osoft/index.html/#/osoft/
  • file:///network2/msoft/index.html/#/msoft/
  • file:///network2/osoft/index.html/#/osoft/

There's no real benefit to using hash router + baseUrl, unless proven otherwise.
Please tell me the advantage you get by using file:///network1/msoft/index.html/#/msoft/ instead of file:///network1/msoft/index.html/#/

This is why I don't need a baseURL, but I'm pretty sure I will forget to revert the baseURL after a static build and lose time finding the problem.

You don't need to "hardcode" the baseUrl in your config file; you can pass it dynamically with an env variable, for example baseUrl: process.env.BASE_URL, and then BASE_URL='/msoft/' yarn build. It's OK to build your project in different variants, and you can also use the --config flag for larger config variations.

You could have scripts in package.json that enable you to conveniently choose the variant you want to build:

"build": "docusaurus build",
"build:withBaseURL": "BASE_URL='/baseUrl' docusaurus build"
"build:withDifferentConfig": "docusaurus build --config docusaurus.config.different.ts"

@hellfiremaga

@slorber I guess you found the solution I was looking for 👍

I agree about the non-benefit of the baseURL in a static Docusaurus website, for all the reasons you gave. My point was more about the build and how using a baseURL "breaks" the main page.

From a simple user's POV: they double-click the index, get "Page Not Found", and think it's broken. Even if I give them the "good" URL (like "file:///network2/osoft/index.html/#/osoft/"), at some point they will try to get back to the main page via the top-left logo, get "Page Not Found", and think it's broken.

But your solution of using the environment variable is clever. I have to admit I didn't even know we could do such a thing. Thanks a lot for that!

I still think it could be a good idea to ignore the base URL for static sites. Or, as you proposed, treat it as an error. Other users will run into the same problem and, if someone finds a real use case, the decision will still be reversible :)

Oh and thanks for your quick answer and help, highly appreciated :)

@slorber
Collaborator

slorber commented Oct 24, 2024

Thanks

Let's wait for a bit more feedback. If nobody has a use case for hash router + baseUrl, we can force it to / automatically and print a warning in that case.

@jeacott1

@hellfiremaga why not differentiate based on file path instead of hash route?
file:///network1/msoft/index.html/#/
file:///network1/osoft/index.html/#/

Or, since you said there was a web server and a GitLab, so either multiple ports or a reverse proxy I assume, why not just host your Docusaurus as a website?
