Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index pages appear to be invalid according to web standards (i.e. Apache MultiViews and others) #6947

Closed
voidmain opened this issue Apr 29, 2023 · 5 comments
Labels
- P2: nice to have Not breaking anything but nice to have (priority)

Comments

@voidmain
Copy link

voidmain commented Apr 29, 2023

What version of astro are you using?

2.0.15

Are you using an SSR adapter? If so, which one?

None

What package manager are you using?

nom

What operating system are you using?

Mac

What browser are you using?

All

Describe the Bug

When the build format is set to file, it looks like Astro is not generating index.html files that would mimic the source. Additionally, this confuses web servers like Apache which does not really love when there is a file with the same name as a directory.

Generally, index.html represent the directory listing for the path segment of the URL. This is consistent with the last 30 years of the web or so. I would expect that Astro would generate the appropriate index.html file from the corresponding src/pages/foo/index.astro source. However, it instead creates build/foo.html.

Here's an example that causes some confusion:

src/pages/foo.astro -> build/dist/foo.html
src/pages/foo/index.astro -> build/dist/foo.html
src/pages/foo/index.html -> build/dist/foo.html
src/pages/foo/some-page.astro -> build/dist/foo/some-page.html

This is a little surprising since I've defined a file named index.astro or index.html but both get compiled to foo.html. This behavior also clobbers other files such as the foo.astro in the parent directory. If you run this in most default web servers, the URL /foo/ won't resolve to anything even though the page src/pages/foo/index.astro does exist.

The Stackblitz I created won't illustrate the issue exactly because it looks like Stackblitz is translating URLs. It does illustrate that the foo.astro is clobbered though. However, if you clone that project and build it locally, you'll see the full symptom and the issue with directory slashes

A change of this build behavior is likely a breaking change, so I would recommend adding a new build format such as fileIndex or something like that.

Link to Minimal Reproducible Example

https://stackblitz.com/edit/github-ztkedn

Edits

Sunday April 30, 2023 - I added some additional information and update the Stackblitz to include the foo.astro file that gets clobbered.

@ematipico
Copy link
Member

@voidmain what would you expect from this "format" that you're suggesting? Like, it seems that the issue comes from Apache not liking the structure and names of the files.

Given your example, what should be the expected behaviour of this format?

@ematipico ematipico added the needs response Issue needs response from OP label May 4, 2023
@voidmain
Copy link
Author

voidmain commented May 4, 2023

I don't think this is just Apache. Most web servers (Caddy, Nginx, S3, CloudFront, CloudFlare, Google, etc) and web standards treat directories differently than files. More specifically, these URLs technically point to different resources according to the URL specification:

  • https://foo.com/directory
  • https://foo.com/directory/

Here's what I would assume would be the way that Astro would have built the output:

src/pages/foo.astro -> build/dist/foo.html
src/pages/foo/index.astro -> build/dist/foo/index.html
src/pages/foo/index.html -> build/dist/foo/index.html
src/pages/foo/some-page.astro -> build/dist/foo/some-page.html

This doesn't mangle anything and holds true to the file names and locations. By default, every web server I've ever encountered will serve these files like this:

src/pages/foo.astro -> https://example.com/foo
src/pages/foo/index.astro -> https://example.com/foo/
src/pages/foo/index.html -> https://example.com/foo/
src/pages/foo/some-page.astro -> https://example.com/foo/some-page

This works because all web servers treat index.html as a special file that is returned when directories (URLs ending in /) are requested.

@matthewp
Copy link
Contributor

matthewp commented May 8, 2023

Hey @voidmain I think I agree with your assessment here. One thing to keep in mind though, is that not all web servers treat index.html as a special file. S3 is one that does not. You have to type index.html into the URL. But I would think that if you choose to use that server you'd probably avoid using index.astro as source files.

I don't think we should add another option here, as that would make the code significantly more complex. Instead I think we can treat this as a bug fix. It's doubtful to me that anyone is expecting the current behavior here. But to be safe, we'd probably want to fix this in a minor just in case some people do expect the behavior we are less likely to break them.

@ematipico ematipico added - P2: nice to have Not breaking anything but nice to have (priority) and removed needs response Issue needs response from OP labels May 8, 2023
@voidmain
Copy link
Author

That works for me. It sounds like this might not be in a release for a bit, so in the meantime, here's a workaround for anyone that stumbles on this issue:

  1. Add a new integration (mine is local to the project under src/integrations/astro-index-pages)
  2. Update all components that need to understand paths (like navigation links that need active states) to figure out when index pages are being translated inproperly

Here's my integration code:

import fs from "fs";

function recurse(dir) {
  fs.readdirSync(dir, { withFileTypes: true }).forEach(file => {
    const subDir = file.name.replace(/\.html/, "");
    if (file.isDirectory()) {
      recurse(dir + file.name + "/");
    } else if (file.isFile() && file.name.endsWith(".html") && fs.existsSync(dir + subDir)) {
      console.log(`Moving ${dir + file.name} to ${dir + subDir + "/index.html"}`);
      fs.renameSync(dir + file.name, dir + subDir + "/index.html");
    }

  });
}

export default function AstroIndexPages() {
  return {
    name: "astro-index-pages",
    hooks: {
      "astro:build:done": async ({ dir, routes }) => {
        recurse(dir.pathname);
      }
    }
  };
}

And here's how my navigation component is handling index pages:

<li class:list={["group", Astro.url.pathname.startsWith(item.path.substring(0, item.path.length - 1)) ? "active " : ""]}>

This snippet strips off / from the item.path, which is the correct location, using a substring with -1. For example, if the item.path is something like /foo/bar/, it will ensure that the Astro.url.pathname starts with /foo/bar.

@matthewp
Copy link
Contributor

Hello! This was (finally) merged with the new build: { format: 'preserve' } option, that makes it so the output matches the source. That will be released in 4.3 tomorrow. Thanks a lot for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
- P2: nice to have Not breaking anything but nice to have (priority)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants