Double-hyphens in post URLs reduce to single-hyphens; regression introduced in v0.50 causes links to break #7288

Closed
kaushalmodi opened this issue May 18, 2020 · 7 comments · Fixed by #9403


@kaushalmodi
Contributor

kaushalmodi commented May 18, 2020

What version of Hugo are you using (hugo version)?

$ hugo version
Hugo Static Site Generator v0.70.0-7F47B99E linux/amd64 BuildDate: 2020-05-06T11:18:50Z

Does this issue reproduce with the latest release?

Yes


Expected behavior: If the content file is named content/foo--bar.md, the page URL is expected to be <site>/foo--bar/.

Observed behavior: If the content file is named content/foo--bar.md, the page URL becomes <site>/foo-bar/ (note that the double-hyphen becomes single-hyphen).

I believe this regression was caused after the switch to Goldmark. Update: it was introduced in e421696d.

Recipe to reproduce this issue

  1. git clone https://gitlab.com/hugo-mwe/double-hyphens-in-post-urls (Content file: https://gitlab.com/hugo-mwe/double-hyphens-in-post-urls/-/blob/master/content/post--with--double--hyphens.md )
  2. cd double-hyphens-in-post-urls
  3. hugo

The demo site above is also deployed at https://determined-leakey-137470.netlify.app/ .

Observed output

In public/index.html, we will see:

    <body lang="en">
        
        
            <p>Page link = /post-with-double-hyphens/</p>

Expected output

    <body lang="en">
        
        
            <p>Page link = /post--with--double--hyphens/</p>

Same issue reported by other users

Seriousness of this regression

I have quite a few pages with double hyphens in their file names. I use a double hyphen as an idea separator in URLs. Here are a few examples:

posts/deeply-nested-org-todo-headings--h1.md
posts/deeply-nested-org-todo-headings--h2.md
posts/deeply-nested-org-todo-headings--h5.md
posts/deeply-nested-org-todo-headings--h6.md
posts/deeply-nested-org-todo-headings--default-h.md
posts/filling-not-preserved-for-chinese-characters--preserve-filling-off.md
posts/filling-not-preserved-for-chinese-characters--preserve-filling-on.md

I link to these pages as references, and those links now break because double hyphens are automatically reduced to single hyphens.

Can the number of hyphens specified by the user in the content file name be preserved?
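
To make the impact concrete, here is a minimal Go sketch that models only the observed behavior (runs of hyphens collapsing to one) and lists which content files would end up with a URL that no longer matches the file name. It is not Hugo's code: the helper names collapseHyphens, baseName and changedSlugs are made up for illustration, and posts/no-double-hyphens.md is a hypothetical file added as a control.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// collapseHyphens models the observed behavior only: every run of
// consecutive hyphens is reduced to a single hyphen.
// Illustration only; this is not Hugo's implementation.
func collapseHyphens(s string) string {
	var b strings.Builder
	prevHyphen := false
	for _, r := range s {
		if r == '-' && prevHyphen {
			continue // drop the repeated hyphen
		}
		prevHyphen = r == '-'
		b.WriteRune(r)
	}
	return b.String()
}

// baseName strips the directory and the extension from a
// slash-separated content path.
func baseName(file string) string {
	base := path.Base(file)
	return strings.TrimSuffix(base, path.Ext(base))
}

// changedSlugs maps each file whose name contains consecutive hyphens
// to the name it would have after collapsing.
func changedSlugs(files []string) map[string]string {
	out := make(map[string]string)
	for _, f := range files {
		name := baseName(f)
		if collapsed := collapseHyphens(name); collapsed != name {
			out[f] = collapsed
		}
	}
	return out
}

func main() {
	files := []string{
		"content/foo--bar.md",
		"posts/deeply-nested-org-todo-headings--h1.md",
		"posts/no-double-hyphens.md", // hypothetical control file
	}
	for f, slug := range changedSlugs(files) {
		fmt.Printf("%s -> %s\n", f, slug)
	}
}
```

Running something like this over a content directory listing would show which existing inbound links are at risk.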

@bep
Member

bep commented May 18, 2020

I believe this regression was caused after the switch to Goldmark.

No, I think this is how it has been behaving for several years.

@kaushalmodi kaushalmodi changed the title from "Double-hyphens in post URLs reduce to single-hyphens; regression causes links to break" to "Double-hyphens in post URLs reduce to single-hyphens; regression introduced in v0.50 causes links to break" May 18, 2020
@kaushalmodi
Contributor Author

Indeed, this change happened about 1.5 years ago, in https://github.com/gohugoio/hugo/releases/tag/v0.50 .

My sincere request is to let users keep their hyphens where they need them.

I only came across this today while tracing through one of the old ox-hugo tests linked in old issues.

@kaushalmodi
Contributor Author

kaushalmodi commented May 18, 2020

My best guess from the changelog is that this commit caused it: fae48d7 /cc @moorereason

Update: It's actually this: e421696d

  • Move the "double hyphen and space" logic into UnicodeSanitize

The last bullet may be slightly breaking for some that now does not get the "--" in some URLs, but we need to reduce the amount of URL logic.

@kaushalmodi
Contributor Author

@bep This can be made a non-breaking fix if the double-hyphens are retained at least when the user has explicitly set the permalinks option to use the :filename. Example:

[permalinks]
  "/" = "/:filename/"

Right now, even with this option set, when the file name is post--with--double--hyphens.md, the permalink is /post-with-double-hyphens/. Can the permalink stay faithful to the file name when :filename is used in the permalinks option?

Thanks.

@stale

stale bot commented Sep 20, 2020

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

@stale stale bot added the Stale label Sep 20, 2020
@kaushalmodi
Contributor Author

@bep Can you please look at this issue? This one issue is still bugging me.

@stale stale bot removed the Stale label Sep 20, 2020
moorereason added a commit to moorereason/hugo that referenced this issue Nov 28, 2020
Improve handling of pre-existing hyphens in input to UnicodeSanitize.
This commit accomplishes three things:

1. Explicitly allow hyphens
2. Avoid appending a hyphen if a preceding hyphen is found
3. Avoid prepending a hyphen if a trailing hyphen is found

Fixes gohugoio#7288
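
The three points in the commit message can be pictured with a small Go sketch. This is a simplified stand-in written for illustration, not the actual UnicodeSanitize from Hugo: the function name sanitizeSketch, the helper isAllowed, and the reduced set of allowed characters are all assumptions. It keeps hyphens that are already in the input, and only turns whitespace into a hyphen when there is no explicit hyphen right before or right after it.

```go
package main

import (
	"fmt"
	"unicode"
)

// isAllowed is a reduced, assumed set of characters that pass through
// unchanged. The hyphen is explicitly part of it (point 1).
func isAllowed(r rune) bool {
	return r == '-' || r == '_' || r == '.' || r == '/' ||
		unicode.IsLetter(r) || unicode.IsDigit(r)
}

// sanitizeSketch is a hypothetical, simplified sanitizer. Whitespace
// becomes a single hyphen unless an explicit hyphen sits next to it;
// other disallowed characters are dropped.
func sanitizeSketch(s string) string {
	out := make([]rune, 0, len(s))
	pendingHyphen := false // whitespace seen, hyphen not yet written
	lastWasHyphen := false // last kept rune was an explicit hyphen
	for _, r := range s {
		switch {
		case isAllowed(r):
			lastWasHyphen = r == '-'
			// Point 3: skip the generated hyphen when the next kept
			// rune is already a hyphen.
			if pendingHyphen && !lastWasHyphen {
				out = append(out, '-')
			}
			pendingHyphen = false
			out = append(out, r)
		case unicode.IsSpace(r) && len(out) > 0 && !lastWasHyphen:
			// Point 2: only schedule a hyphen when the previous kept
			// rune was not a hyphen. Leading whitespace is ignored.
			pendingHyphen = true
		}
	}
	return string(out)
}

func main() {
	for _, in := range []string{
		"post--with--double--hyphens",
		"foo bar",
		"foo - bar",
	} {
		fmt.Printf("%q -> %q\n", in, sanitizeSketch(in))
	}
}
```

With this approach a name such as post--with--double--hyphens passes through unchanged, while whitespace around an explicit hyphen does not add extra hyphens.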
kaushalmodi added a commit to kaushalmodi/ox-hugo that referenced this issue Jan 16, 2022
Hugo does not allow 2 consecutive hyphens in slugs derived from
Markdown file names and auto-converts 2-hyphens to single hyphens. Ref
gohugoio/hugo#7288.
kaushalmodi added a commit to kaushalmodi/ox-hugo that referenced this issue Jan 16, 2022
The documentation ( https://ox-hugo.scripter.co/doc/org-capture-setup/
) suggests using `org-hugo-slug` to auto-generate the file name from
the title.

This change is so that the auto-generated file name doesn't have
consecutive hyphens in its name. Otherwise the Hugo-generated URL will
not match with the file name exactly. See
gohugoio/hugo#7288.

This change does not affect the generation of anchor names within a
post as double-hyphens are OK there.

This commit mainly affects the people using `org-hugo-slug` outside of
ox-hugo, like in their Org Capture templates. Earlier the generated
file name could have been "foo--bar.md". Now it would be "foo-bar.md"
instead.
moorereason added a commit to moorereason/hugo that referenced this issue Jan 17, 2022
Improve handling of existing hyphens in input to UnicodeSanitize.
This commit accomplishes three things:

1. Explicitly allow hyphens
2. Avoid appending a hyphen if a preceding hyphen is found
3. Avoid prepending a hyphen if a trailing hyphen is found

Fixes gohugoio#7288
@bep bep closed this as completed in #9403 Feb 23, 2022
bep pushed a commit that referenced this issue Feb 23, 2022
Improve handling of existing hyphens in input to UnicodeSanitize.
This commit accomplishes three things:

1. Explicitly allow hyphens
2. Avoid appending a hyphen if a preceding hyphen is found
3. Avoid prepending a hyphen if a trailing hyphen is found

Fixes #7288
@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 17, 2022