Non-ascii tag creates empty slug, exception django.urls.exceptions.NoReverseMatch #3721
Comments
This looks to be a simple missing dependency of TagBase on "unidecode"; without it, slugify() in TagBase just returns the "tag" without any cleanup. Will ponder how best to deal with this. |
+1 just adding unidecode as a required package. That one is simplest. |
Firstly, I think there should be a validation constraint on the tags model and/or table to forbid empty slugs. That would at least prevent getting the database into a broken state.
Next, I note that this problem goes away if tags had to be created before using them (separate feature request #3703). In that case there would be an "Add Tag" page, with a field for the slug, and empty slugs could be rejected before saving them - just like "Add Device Role" currently. Users would then be forced to create a non-empty slug which is meaningful to them.
(I've also checked what happens when you do a CSV-import of device roles: the "slug" column is mandatory. So there's no problem here either: Netbox does not auto-assign slugs from names)
Now, I have to say unidecode is quite impressive, although it's nearly 2MB of dependency:
It could be exposed as an AJAX completion feature. However, once you've decided to create tags in a form, you'd probably be better off using a Javascript unidecode library in the browser instead. |
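The validation constraint suggested in this comment could look roughly like the following. This is a minimal sketch in plain Python, not NetBox's actual code: `validate_tag_slug` and `SLUG_RE` are hypothetical names, and the pattern is assumed to be the usual ASCII slug rule (letters, digits, hyphens, underscores, at least one character).

```python
import re

# Assumed ASCII slug rule: one or more of letters, digits, "-" or "_".
SLUG_RE = re.compile(r'^[-a-zA-Z0-9_]+\Z')


def validate_tag_slug(slug: str) -> str:
    """Return the slug unchanged, or raise ValueError if it is empty or invalid."""
    if not SLUG_RE.match(slug):
        raise ValueError(f"Invalid or empty slug: {slug!r}")
    return slug


if __name__ == "__main__":
    print(validate_tag_slug("core-router"))  # accepted
    try:
        validate_tag_slug("")                # the broken state from this issue
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting the value before it is saved keeps an empty slug out of the database, whichever way slugs end up being generated.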
After digging into this, I see that there are two separate issues:
As an immediate workaround, I can confirm that installing
(This works because I'll work on preventing Tags from being created with blank slugs. But as far as the slugification piece goes, I can't speak to whether the behavior provided by |
Clearly some slug is required, so a default of |
I think using unidecode as a slugifier would be the better way to go, IMO, rather than going with a standard incremental index (which we already have on the pk and if we can semi-properly decode unicode, why not?) I will do some more testing with it, but I think it will be a "good enough" effort on our part to close it out. It will not handle Japanese; however, that is just the nature of the language (it uses the same characters as Chinese)
Again though, I think this is a good enough approach for something like this. |
Did you mean a different issue? That one is already closed. |
Sorry, yes #3703. It would mean that tags would be handled in exactly the same way as device roles, platforms, device types, manufacturers etc: the user would be required to enter a non-empty slug (of their choice) at time of creation of the tag. |
To be clear, #3703 is a feature request separate from the issue being addressed here. This needs to be resolved regardless of the response to #3703. It seems we have two options:
    if allow_unicode:
        value = unicodedata.normalize('NFKC', value)
    else:
        value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
An example:
    >>> slugify('What does 台灣 mean?', allow_unicode=True)
    'what-does-台灣-mean'
I don't have a strong opinion either way. |
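To see why the ASCII branch produces an empty slug for a tag made up only of non-ASCII characters, here is a standalone sketch of the two normalization branches quoted above, using only the standard library. `slugify_sketch` is a hypothetical name; the two cleanup steps after normalization are an approximation of what a slugifier typically does, not a copy of any particular implementation.

```python
import re
import unicodedata


def slugify_sketch(value: str, allow_unicode: bool = False) -> str:
    """Approximate slugifier built around the two branches quoted in the thread."""
    if allow_unicode:
        # Keep non-ASCII characters, normalized to a composed form.
        value = unicodedata.normalize('NFKC', value)
    else:
        # Decompose, then silently drop everything that is not ASCII.
        value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
    # Assumed cleanup: drop punctuation, then collapse whitespace and hyphens.
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(r'[-\s]+', '-', value)


if __name__ == "__main__":
    print(repr(slugify_sketch('台灣')))                      # ASCII branch
    print(repr(slugify_sketch('台灣', allow_unicode=True)))  # unicode branch
```

With the ASCII branch, every character of `台灣` is discarded by `encode('ascii', 'ignore')`, which is how a tag can end up with an empty slug.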
Yes...
...I disagree, because if #3703 is implemented, then it solves this issue at the same time. Consider for example Device Roles. You can already create a Device Role whose full name consists entirely of high unicode code points. But there is no need to apply unidecode on them to synthesise a slug; you simply cannot save a Device Role with an empty slug. This forces the user to assign a non-empty, ASCII-compatible slug themselves. #3703 would make tags handled the same. Tags would be created explicitly before use; tags could not spring into existence any other way. |
Right, this is a problem. Slugs should be generated automatically regardless. (This is not limited to tags.) |
Sure, but that's a separate enhancement which ought to have a new ticket. Here you go: #3741 Regarding unicode slugs: they need special encoding for use in URLs. It can be done - but since the main point of slugs is to be embedded in URLs, that's something users will need to be aware of. |
Another option would be to use urllib.parse.quote() (urllib.quote() in Python 2); however, the slug this would generate would not be user friendly. |
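For reference, this is roughly what percent-encoding a unicode slug looks like with the standard library. It is only an illustration of the point about URL encoding; the sample slug is taken from the example earlier in the thread, and the variable names are arbitrary.

```python
from urllib.parse import quote, unquote

slug = '台灣'

# Percent-encode the UTF-8 bytes of the slug for use in a URL path.
encoded = quote(slug)
# Decoding restores the original text.
decoded = unquote(encoded)

if __name__ == "__main__":
    print(encoded)  # %E5%8F%B0%E7%81%A3
    print(decoded)
```

The encoded form is valid in a URL but hard to read, which is the usability concern raised above.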
It looks like hiragana works, but when it is intermixed with kanji you get partial Chinese and partial Japanese. Any ideas @kobayashi, have you dealt with unicode in your work at all? |
From PyPI:
So it looks like unidecode will only transliterate the standard Chinese meaning. I would like to hear from people with experience in other languages, and see if there is a reasonable approximation for most other languages. |
Slugify with Here is a sample in Japanese. Modern browsers can resolve the escaped URL characters.
If there are no objections, I would like to implement it. |
Environment
Steps to Reproduce
Organization > Tags (/extras/tags/) in the GUI
Expected Behavior
Tag to be functional
Observed Behavior
I get an exception:
If I query the database, I find that a tag has been created with an empty slug (hence cannot match the regexp pattern which requires at least one character)
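The mismatch can be reproduced in isolation. The sketch below assumes the tag route uses a slug pattern along the lines of `[\w-]+` (one or more characters); the exact pattern in NetBox's URL configuration is not shown in this report, and `TAG_ROUTE` and `route_accepts` are hypothetical names.

```python
import re

# Assumed shape of the tag URL pattern: the slug group needs at least one character.
TAG_ROUTE = re.compile(r'^tags/(?P<slug>[\w-]+)/$')


def route_accepts(slug: str) -> bool:
    """Return True if a URL built from this slug would match the route."""
    return TAG_ROUTE.match(f"tags/{slug}/") is not None


if __name__ == "__main__":
    print(route_accepts("my-tag"))  # True
    print(route_accepts(""))        # False: an empty slug cannot be reversed
```

Because `+` requires at least one character, no URL can be built for a tag whose slug is empty, which is consistent with the NoReverseMatch exception.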
I also find I get this exception at the home page (/), but not at /dcim/devices/.
Cleanup
The exception persists for 15 minutes or until you invalidate the cacheops cache:
Links
Reported previously as #2853, #3079, #3717 (insufficient detail to reproduce).
Reported on the google group here, here, here, here.