Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of taxonomy terms containing slashes #4388

Merged
merged 8 commits into from
Aug 22, 2018

Conversation

moorereason
Copy link
Contributor

@moorereason moorereason commented Feb 8, 2018

This PR attempts to resolve #4090. The first commit contains tests that demonstrate the problem.

Special thanks to @tsuereth for his initial work on #4092, which laid the groundwork for this PR.

This PR may fix the title issues in #3577, but I haven't tested it yet.

@moorereason
Copy link
Contributor Author

@bep,
Can you take a look at this PR? I'm getting random test results when the race detector is enabled (on linux), which matches what Travis is showing (plus tip failed on darwin). Without the race detector, I can't get it to fail. For the life of me, I don't see what could be causing the failures.

I've confirmed the failures with Go 1.9.3, 1.9.4, 1.10rc1, and 1.10rc2.

The following will bypass the test cache in Go 1.10:

sh$ cd gohugoio/hugo/hugolib
sh$ while :; do echo `date` && go1.10rc2 test -race -run=TestTaxonomiesWithAndWithoutContentFile  ; done
Thu Feb 8 15:52:42 CST 2018
PASS
ok      github.com/gohugoio/hugo/hugolib        1.480s
Thu Feb 8 15:52:49 CST 2018
--- FAIL: TestTaxonomiesWithAndWithoutContentFile (0.00s)
    --- FAIL: TestTaxonomiesWithAndWithoutContentFile/uglyURLs=false,preserveTaxonomyNames=true (0.10s)
        Error Trace:    testhelpers_test.go:41
                        taxonomy_test.go:136
                        taxonomy_test.go:60
        Error:          Should be true
        Test:           TestTaxonomiesWithAndWithoutContentFile/uglyURLs=false,preserveTaxonomyNames=true
        Messages:       File no match for
                        "Cat1" in
                        "public/categories/cat1/index.html":
                        List|CAt1|
FAIL
exit status 1
FAIL    github.com/gohugoio/hugo/hugolib        0.551s
Thu Feb 8 15:52:51 CST 2018
--- FAIL: TestTaxonomiesWithAndWithoutContentFile (0.00s)
    --- FAIL: TestTaxonomiesWithAndWithoutContentFile/uglyURLs=true,preserveTaxonomyNames=true (0.10s)
        Error Trace:    testhelpers_test.go:41
                        taxonomy_test.go:136
                        taxonomy_test.go:60
        Error:          Should be true
        Test:           TestTaxonomiesWithAndWithoutContentFile/uglyURLs=true,preserveTaxonomyNames=true
        Messages:       File no match for
                        "Cat1" in
                        "public/categories/cat1.html":
                        List|CAt1|
    --- FAIL: TestTaxonomiesWithAndWithoutContentFile/uglyURLs=false,preserveTaxonomyNames=true (0.10s)
        Error Trace:    testhelpers_test.go:41
                        taxonomy_test.go:136
                        taxonomy_test.go:60
        Error:          Should be true
        Test:           TestTaxonomiesWithAndWithoutContentFile/uglyURLs=false,preserveTaxonomyNames=true
        Messages:       File no match for
                        "Cat1" in
                        "public/categories/cat1/index.html":
                        List|CAt1|
FAIL
exit status 1
FAIL    github.com/gohugoio/hugo/hugolib        0.468s

@moorereason
Copy link
Contributor Author

Correction: I'm able to reproduce the anomaly in Go 1.10rc2 without the race detector.

@moorereason
Copy link
Contributor Author

Fixed the issue. PTAL

Copy link
Member

@bep bep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks solid, just a couple of minore comments.

@@ -75,6 +75,35 @@ func (filepathBridge) Separator() string {

var fpb filepathBridge

// segmentReplacer replaces some URI-reserved characters in a path segments.
var segmentReplacer = strings.NewReplacer("/", "-", "#", "-")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you replace "-" with "-"? (the same character).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewReplacer uses "old-new" pairs. This replaces '/' and '#' with '-'.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there is a ? character? Is it possible to use a common url encoding here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the tests below. The ? is removed. URL encoding is beyond the scope of this PR.

var last byte
b := make([]byte, len(s))

for i := 0; i < len(s); i++ {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read this correctly this removes double-slashes. I assume that come from "My // Taxonomy" cases. Which is fine as long as this isn't used in the "general case" (which would break some URLs and make people grumpy). But you should add a test variant for it below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we would replace both slashes with - using segmentReplacer (other interesting characters like a space and ? are replaced by MakePathSanitized used above), and this loop removes consecutive dashes.

So, My // Taxonomy would become my-taxonomy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@stale
Copy link

stale bot commented Jun 24, 2018

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

@stale stale bot added the Stale label Jun 24, 2018
@moorereason moorereason added Keep and removed Stale labels Jun 25, 2018
@moorereason
Copy link
Contributor Author

@bep,
It's been a while since I submitted this PR, but it's held up by your requested changes that I believe I've made. Can you add this to a future milestone for review? Thanks

@onedrawingperday
Copy link
Contributor

onedrawingperday commented Aug 20, 2018

@moorereason
Does your PR also apply to Taxonomies with custom permalinks configuration?

Let's say that I have a taxonomy called color and its permalinks are configured like so: color = "/:slug/"

Then the taxonomy is assigned to content files with: color = "handbags/backpacks/gold"

With the above Hugo is able to nest a taxonomy under a Section or another Taxonomy.

Also here is another use case with the above technique to generate Chronological Blog Archives.

As you understand in both use cases the forward slashes are desirable.

@moorereason
Copy link
Contributor Author

@onedrawingperday, I think it will work as you expect, but I'd be more than happy to test a sample site if you have one available.

@onedrawingperday
Copy link
Contributor

onedrawingperday commented Aug 21, 2018

@moorereason Here is a sample repo with Nested Taxonomies for you to test.
https://github.com/onedrawingperday/hugo-test

I have included both of the use cases that I described above.
If you have a question let me know.

Thanks!

Add tests with slashes in terms to illuminate failing code paths.

Updates gohugoio#4090
MakePathSegment properly sanitizes path segments (like taxonomy terms) with
slashes (and pound signs).

Updates gohugoio#4090
This commit attempts to sanitize term keys and path segments to fix the
handling of taxonomy terms containing slashes.

Based upon gohugoio#4092 by @tsuereth.

Fixes gohugoio#4090
Don't mix casing in the "cat1" taxonomy term test because Hugo doesn't
support multiple terms whose lowercase strings collide.

Related to gohugoio#4238
MakeSegment will replace slashes in a path segment, so make sure we
include tests for that.
Case changed since last git rebase.
More cleanups from rebasing with master.
@moorereason
Copy link
Contributor Author

Thanks, @onedrawingperday. I tested your site with my original PR branch and also rebased with the latest master. In both cases, your test site comes out exactly the same. 👍

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 14, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Slashes in a taxonomy term break path resolution
4 participants