Non-latin symbols in slug does not work anymore? #288
Hello there. This is an intended behaviour change. It was made in the last release (2024.03) and was mentioned in the release notes. Slug rules for posts and pages were always intended to be a-z, 0-9, and hyphen only, but for a long time they were not being enforced properly. As mentioned in the release notes, a workaround when editing existing posts and pages is to empty the slug field, which preserves the existing slug on update. If you want the slug rules to be relaxed globally for all posts, pages, categories, and tags, you can change a constant in includes/common.php. But think carefully before doing this, because it has side-effects.
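For illustration, the strict rule described above (a-z, 0-9, and hyphen only) could be sketched in Python as follows. This is not Chyrp Lite's actual code, which is written in PHP; the name `is_strict_slug` and the exact pattern are assumptions made for this example.

```python
import re

# Hypothetical pattern for illustration only: lowercase a-z, digits 0-9, and hyphen.
STRICT_SLUG = re.compile(r"[a-z0-9-]+")

def is_strict_slug(slug: str) -> bool:
    """Return True if the whole slug uses only a-z, 0-9, and hyphen."""
    return STRICT_SLUG.fullmatch(slug) is not None

print(is_strict_slug("hello-world-2024"))  # a plain ASCII slug passes
print(is_strict_slug("привет-мир"))        # a Cyrillic slug is rejected
```

Under a rule like this, a slug containing Cyrillic or any other non-ASCII character fails validation, which matches the behaviour reported in this issue.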
Ugh, sorry, I've skipped this release.
Are there any reasons behind this decision? I've seen #287 and it's a good start (that PR adds only the Russian subset of Cyrillic, but there are many more characters in Ukrainian, Serbian, Kazakh, and many other languages). It will work fine for many languages… but what if someone adds Chinese, Vietnamese and/or other Asian scripts, which have a lot of (multibyte!) symbols? @cuixiping has already suggested using URL encoding: it's a much cheaper operation, and browsers have long been able to convert it back to human-readable text (though I'm not sure whether it works in the opposite direction).
Chyrp Lite has followed the philosophy that URLs should be able to survive transit through multiple potentially misbehaving systems, by eliminating all chars that a system might naively try to escape or convert. URL encoding breaks that, because a naive system might try to escape the percent signs and you'll end up with everything double-encoded. This philosophy has always applied to categories and tags. The help documentation for posts and pages states that it applies to them too, but I realised last year that it was not being properly enforced, hence the change to align the behaviour with the docs and to standardise the behaviour across posts, pages, tags and categories. Certainly I will never accept a pull request that, for example, attempts to transliterate the 8000+ multibyte chars of Simplified Chinese into Latin, firstly because this would be unwieldy and secondly because the transliteration of Chinese characters to Latin is acknowledged to be highly imperfect. You can get exactly the behaviour you are requesting by setting the constant SLUG_STRICT to false in common.php. As explained in that post, this change could cause side-effects for tags that contain multibyte chars, requiring administrator action. But in all other respects, it is safe to do and will continue to be supported in future. In summary, I've done things the way I think they should be done, but I've retained the option for you to do things differently if you disagree. I hope that makes sense!
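The double-encoding problem mentioned above can be reproduced with a small Python sketch. It uses only the standard library's `urllib.parse`; the sample slug and the idea of a "naive system" that encodes an already-encoded string are illustrative assumptions, not a description of any particular software.

```python
from urllib.parse import quote, unquote

slug = "привет"  # sample non-Latin slug, chosen for illustration

# First pass: a correct percent-encoding of the UTF-8 bytes.
once = quote(slug)

# Second pass: a naive system encodes the already-encoded string,
# so every "%" becomes "%25".
twice = quote(once)

print(once)
print(twice)

# Decoding once no longer recovers the original slug.
print(unquote(twice) == slug)  # False
print(unquote(twice) == once)  # True
```

A slug restricted to a-z, 0-9, and hyphen is unchanged by `quote`, so it passes through any number of encoding steps intact.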
I don't disagree, I'm just curious. ☺
↑ This was my question, and everything below it is just my thoughts about a problem I didn't understand at the time.
It makes sense. Thanks for the detailed answer.
Thank you for asking! I'm always happy to explain the reasoning for my choices. ^_^
I'm not sure if this is a bug or feature…
Some time ago it worked fine, and I have a lot of articles with non-Latin symbols in their slugs, but today I edited an article and Chyrp Lite wouldn't let me save or publish it.
Is this the intended behaviour?