[3.0] Implements URL routing (i.e. better queryless/pretty URLs) #8421
base: release-3.0
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
Are these permanent redirects? I read somewhere that some browsers can silently update the bookmark if it targets a permanent redirect.
Slugs cannot be arbitrary, because users with a vendetta against the site will seed bad words into the address bar. I remember, over a decade ago, a story broke about users doing this to a news site; it became a big story because of the f-bombs in the address bar that led to the article. Don't let |
4b7cd68 to 7c6fe9f
To be clear, I would very much like feedback from @dragomano, @live627, @jdarwood007, @sbulen, @BrickOzp, @LexArma, @Kindred-999, @MissAllSunday, @Oldiesmann, @MissAllSunday, @Arantor, or anyone else who wants to comment on this matter. I'm open to being persuaded to transliterate all slugs to ASCII, if given good arguments for it. I'm also open to considering other ideas about this, if someone wants to suggest something else. |
The only issue I can see is that browsers will paste the URLs with percent-encoding even if they are displayed normally. See for instance the main page of the Russian version of Wikipedia, where "Заглавная_страница"
I'm not sure how that will matter for international users though. |
Co-authored-by: John Rayes <live627@gmail.com>
Users prefer to transliterate URLs rather than use national characters in URLs for several reasons:
Yes, there are “.рф” domains, those addresses are completely in Cyrillic. But many still hate such URLs and prefer to use solutions like the Pretty URLs mod. Here’s an example with Japanese characters. Do you want such URLs on forums?
|
That seems to depend on the browser. In Safari, for example, copying the URL directly from the address bar copies the Unicode characters, whereas in Firefox the percent-encoded values are copied. Based on Wikipedia's example, though, I'm now wondering whether I should actually be less aggressive in stripping out diacritical marks (accents and such). They leave all the diacritics in their URLs. |
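To make the two behaviours discussed here concrete, the following is a minimal Python sketch (not SMF's PHP code) of what percent-encoding does to a Unicode path segment and of one common way to strip diacritical marks. The function names `percent_encode_segment` and `strip_diacritics` are made up for this illustration.

```python
import unicodedata
from urllib.parse import quote, unquote


def percent_encode_segment(segment: str) -> str:
    """Percent-encode one path segment as UTF-8 bytes.

    This is the form some browsers put on the clipboard, even though the
    address bar displays the readable Unicode characters.
    """
    return quote(segment, safe="")


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD).

    Illustrative only: this also alters letters whose decomposition
    contains a combining mark, which is one reason to be cautious about
    stripping aggressively.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


title = "Заглавная_страница"
encoded = percent_encode_segment(title)

# Decoding restores the original characters, so both forms name the same page.
assert unquote(encoded) == title
```

Both forms identify the same resource; the difference is only in how readable the copied link is.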
I do like the idea of optional built-in pretty-URLs functionality. I'm not sure whether localization is worth the effort, though, so I'll keep an eye on this discussion. I'm neither for nor against. |
I use |
Yes, it's a good library too. |
How is the URL consumed by the app? It looks like we just dump it into the query string and then parse it back out later? I'm just checking because, while we can write .htaccess, I want to understand how we would support other web servers like nginx. If we just dump the request URI into the query string, that can be done easily for other web servers. It just won't work out of the box, since those typically require editing a server config, not a local .htaccess |
Yes, that's exactly what it does. So extending support to other web servers is certainly possible, as you say. |
I click a link to view a topic and I get error
EDIT: This is redirected from the old format. |
Yeah, I believe that both @dragomano's error and @live627's error were introduced by the redirection code I added for dealing with non-canonical slugs. I'll try to fix it in the next couple of days. |
I believe I have fixed those two bugs now. I am still looking into options for ASCII slugs. I am marking this as a draft until that part has been dealt with. |
The fact that users can link to topics using any phrases results in the canonical address matching those phrases, leading to different links to the same topic appearing in search results:
/topics/some-title-28/
/topics/other-title-28/
/topics/title-28/
/topics/28/
Am I right? |
No. If the slug is incorrect, the redirection logic will always force a redirect to the correct canonical URL. |
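The redirect behaviour described in this reply can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the PR's PHP code: `CANONICAL_SLUGS` is a hypothetical stand-in for a database lookup, `redirect_target` is a made-up name, and trailing slashes are ignored when comparing because the thread does not say how they are handled.

```python
import re
from typing import Optional

# Hypothetical stand-in for looking up a topic's current slug in the database.
CANONICAL_SLUGS = {28: "some-title"}

# The slug part is optional; only the trailing numeric ID identifies the topic.
TOPIC_ROUTE = re.compile(r"^/topics/(?:[^/]*-)?(\d+)/?$")


def redirect_target(path: str) -> Optional[str]:
    """Return the canonical path to redirect to, or None if no redirect is needed."""
    match = TOPIC_ROUTE.match(path)
    if match is None:
        return None  # Not a topic route; nothing to canonicalize here.

    topic_id = int(match.group(1))
    slug = CANONICAL_SLUGS.get(topic_id)
    if slug is None:
        return None  # Unknown topic; left to normal error handling.

    canonical = f"/topics/{slug}-{topic_id}"
    # Assumption: a trailing slash alone does not trigger a redirect.
    if path.rstrip("/") == canonical:
        return None
    return canonical
```

Under this scheme every variant in the question resolves to a single address, so search engines only ever see one URL for topic 28.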
b43adef
to
4592a3b
Compare
4592a3b
to
a919038
Compare
I'm just going to point out the obvious. Do with it what you will. Actions should not be aware of or take part in their own routing. I cannot imagine a way to couple them more tightly than that. |
Having them tightly coupled was an intentional choice. My first draft had a separate route map, but I quickly realized that it would be extremely easy for that to get out of sync with the expected parameters of the actual actions. Instead, my thinking was that each action knows best what its own expected parameters are, and therefore what its route should look like. That's why QueryString::buildRoute() and QueryString::parseRoute() ask the appropriate action how its route should be built/parsed. But if you have a different way of thinking about it, please do explain. I'm entirely open to revamping this system. |
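For readers following the design debate, here is a small Python sketch of the "each action describes its own route" idea, using the calendar example from the PR description. It is only an illustration: the real code is PHP, and `CalendarAction`, `ACTIONS`, `build_route`, and `parse_route` are hypothetical names that only loosely mirror QueryString::buildRoute() and QueryString::parseRoute().

```python
from typing import Dict, List, Protocol


class Routable(Protocol):
    """What the dispatcher expects from any action that supports routes."""

    def build_route(self, params: Dict[str, str]) -> List[str]: ...
    def parse_route(self, segments: List[str]) -> Dict[str, str]: ...


class CalendarAction:
    """The action itself knows which parameters appear in its route."""

    keys = ("year", "month", "day")

    def build_route(self, params: Dict[str, str]) -> List[str]:
        route = ["calendar"]
        for key in self.keys:
            if key not in params:
                break  # Stop at the first missing element.
            value = params[key]
            # Month and day are zero-padded, as in /calendar/2025/01/23/.
            route.append(value if key == "year" else value.zfill(2))
        return route

    def parse_route(self, segments: List[str]) -> Dict[str, str]:
        params = {"action": "calendar"}
        for key, value in zip(self.keys, segments[1:]):
            params[key] = str(int(value))  # Drops padding; non-numeric input raises ValueError.
        return params


# Hypothetical registry; a real system would discover actions differently.
ACTIONS: Dict[str, Routable] = {"calendar": CalendarAction()}


def build_route(params: Dict[str, str]) -> str:
    """Ask the action named in params how its route should look."""
    return "/" + "/".join(ACTIONS[params["action"]].build_route(params))


def parse_route(path: str) -> Dict[str, str]:
    """Ask the action named by the first segment to interpret the rest."""
    segments = [s for s in path.split("/") if s]
    return ACTIONS[segments[0]].parse_route(segments)
```

The trade-off under discussion is visible here: the route format lives next to the parameters it depends on, at the cost of the action knowing about routing. A separate route map would invert that.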
Will post back my thoughts on it. I am currently working on something related, for SMF. |
The main thing this does is to enable SMF native support for URLs to be rewritten as routes (i.e. virtual paths that are interpreted as well-structured queries). In other words, it implements a robust form of pretty URLs.
Examples:
1. https://example.com/index.php?board=1.0 becomes https://example.com/boards/general-discussion-1
2. https://example.com/index.php?board=1.20 becomes https://example.com/boards/general-discussion-1/20
3. https://example.com/index.php?topic=123.15 becomes https://example.com/topics/my-great-topic-123/15
4. https://example.com/index.php?action=moderate;area=modlog becomes https://example.com/moderate/modlog
5. https://example.com/index.php?action=calendar;year=2025;month=1;day=23 becomes https://example.com/calendar/2025/01/23/
6. https://example.com/index.php?action=post;board=1.0 becomes https://example.com/boards/general-discussion-1/post
7. https://example.com/index.php?action=markasread;sa=board;board=1.0 becomes https://example.com/boards/general-discussion-1/markasread/
8. https://example.com/index.php?action=profile;area=account;u=1 becomes https://example.com/members/sesquipedalian-1/account
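The board and topic examples above can be read as a mapping from a route back to the classic query string. The following Python sketch shows that mapping for those two forms only. It is an illustration, not the PR's PHP parser; the pattern and the name `route_to_query` are assumptions made for this example.

```python
import re
from typing import Optional

# Assumed pattern: /boards/<slug>-<id> or /topics/<slug>-<id>, optional /<start>.
# The slug is optional and ignored; only the trailing numeric ID matters.
BOARD_OR_TOPIC = re.compile(r"^/(boards|topics)/(?:[^/]*-)?(\d+)(?:/(\d+))?/?$")


def route_to_query(path: str) -> Optional[str]:
    """Translate a board or topic route into its query-string form."""
    match = BOARD_OR_TOPIC.match(path)
    if match is None:
        return None  # Other route types are out of scope for this sketch.

    kind, item_id, start = match.groups()
    key = "board" if kind == "boards" else "topic"
    # A missing <start> means the first page, i.e. ".0".
    return f"{key}={item_id}.{start or 0}"
```

For instance, `route_to_query("/topics/my-great-topic-123/15")` yields `topic=123.15`, matching the third example pair.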
There are four types of routes:
1. /boards/<slug>-<id> or /topics/<slug>-<id>, with an optional /<start> value appended for pagination purposes. See examples 1 through 3 above. There are also routes to individual posts using /msgs/<id>, but these redirect to the canonical URL just like ?msg=<id> does.
2. /<action>/<area>/<sa> (with the area and sub-action being optional). Example 4 above shows this. Some actions (see example 5) provide additional routing elements for commonly used query parameters that are specific to those particular actions.
3. /boards/<slug>-<id>/<action>. See examples 6 and 7 above.
4. /members/<slug>-<id>/<area>/<sa> (with the area and sub-action being optional). See example 8 above.
Internally, routes are translated into query strings very early in the process (specifically, during QueryString::cleanRequest()), so that all other code works with the query string just like it always has. Similarly, forum URLs are rewritten as routes only near the very end of the process, during Utils::obExit(), Utils::redirectexit(), or Mail::send().
There are two settings that control this behaviour:
- The queryless_urls setting. This is the primary setting. As has always been the case, it is only supported on the Apache, LiteSpeed, and lighttpd web servers.
- The hide_index_php setting. When this is enabled, /index.php will be removed from URLs pointing to pages within the forum. This setting is supported on any web server.
These two settings can operate independently. If queryless URLs are enabled but the option to hide index.php is disabled, then https://example.com/index.php?board=1.0 will become https://example.com/index.php/boards/general-discussion-1. If the option to hide index.php is enabled but queryless URLs are disabled, then https://example.com/index.php?board=1.0 will become https://example.com/?board=1.0.
If the admin enables both of these settings together, SMF writes a small mod_rewrite rule to the .htaccess file in order to make everything work correctly. This write operation performs safety checks; if the write operation fails, SMF will show an error message and will refuse to enable the settings.
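The four combinations of the two settings can be summarized in a short Python sketch. It is illustrative only and is not taken from the PR: `rewrite_url` is a made-up name, and it assumes the route for the page has already been computed elsewhere.

```python
def rewrite_url(
    base_url: str,
    query: str,
    route: str,
    queryless_urls: bool,
    hide_index_php: bool,
) -> str:
    """Build the outgoing URL for one page under the two settings.

    base_url: forum root without a trailing slash, e.g. "https://example.com"
    query:    classic query string, e.g. "board=1.0"
    route:    equivalent route with a leading slash, e.g. "/boards/general-discussion-1"
    """
    script = "" if hide_index_php else "/index.php"

    if queryless_urls:
        # The route replaces the query string entirely.
        return f"{base_url}{script}{route}"

    # Query string kept; with index.php hidden, the path collapses to "/".
    return f"{base_url}{script or '/'}?{query}"
```

With both settings enabled, the result has no index.php in the path. That is the case for which the description says a rewrite rule is written to .htaccess.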
Backward compatibility for the old-school queryless URLs has been maintained. Those forms of queryless URLs will no longer be generated, but they will still be recognized and parsed. This ensures that existing links from external sites, browser bookmarks, etc., will continue to work.
Slugs are generated on the fly for boards, topics, and members. Strictly speaking, these slugs are just fluff and don't matter for the functioning of the system. (Indeed, the slug can be left out or changed to any random string; all we need is the ID value that appears at the end). However, including a memorable slug in the URL allows the user to start typing the part of the URL that they remember into their browser's address bar and then have autocomplete suggest the remainder of the URL (including the harder-to-remember ID value) for them.
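As a rough illustration of on-the-fly slug generation, here is a Python sketch that builds the `<slug>-<id>` segment from a title. It is not the PR's PHP implementation. The name `make_slug` is hypothetical, and keeping Unicode letters rather than transliterating them to ASCII is an assumption, since that choice is still under discussion in this thread.

```python
import re
import unicodedata


def make_slug(title: str, item_id: int) -> str:
    """Build a '<slug>-<id>' path segment from a human-readable title."""
    # Normalize compatibility forms and lowercase for a stable result.
    text = unicodedata.normalize("NFKC", title).lower()

    # Keep runs of letters and digits (Unicode-aware); drop punctuation,
    # whitespace, and underscores.
    words = re.findall(r"[^\W_]+", text)
    slug = "-".join(words)

    # The ID is always last, so the slug can be empty without breaking the URL.
    return f"{slug}-{item_id}" if slug else str(item_id)
```

Because the ID is always the final component, a title that produces an empty slug still yields a usable route.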
EDIT: Based on @live627's feedback, we now redirect to the correct URL if an incorrect slug is given.
@VBGAMER45 will likely be particularly interested in the new integrate_rewrite_as_queryless and integrate_parse_route hooks. With these hooks, you should be able to create a hooks-only version of your Pretty URLs mod in order to support people who have used your mod with SMF 2.1 and below and/or who just prefer the format of the URLs that your mod produces.