Unnecessary preg_replace? #459
Comments
Maybe Mastodon has changed something in the meantime, but in the early days they had a |
That may still be the case. But I think this regex would only remove the first newline it encountered. |
It will replace it in the complete content: https://onlinephp.io/c/3e435 |
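For anyone who would rather check this locally than in an online sandbox, here is a minimal sketch. The pattern is an assumption (a character class covering newlines, carriage returns and tabs), since the actual line is not quoted in this thread, and the sample string is made up. It shows that `trim()` only touches the ends of the string, that `preg_replace()` replaces every match when `$limit` is left at its default of -1, and that only an explicit `$limit` of 1 gives the "first match only" behaviour.

```php
<?php
// Assumption: a stand-in pattern; the real one is not quoted in this thread.
$pattern = '/[\n\r\t]/';

// Made-up sample content with whitespace at both ends and between elements.
$content = "  <p>First</p>\n\n<p>Second</p>\r\n\t<p>Third</p>\n  ";

// trim() strips whitespace from the two ends of the string only.
$trimmed = trim( $content );

// Default $limit is -1, so every match in the string is replaced.
$all = preg_replace( $pattern, '', $trimmed );

// An explicit $limit of 1 replaces only the first match.
$first_only = preg_replace( $pattern, '', $trimmed, 1 );

var_dump( $all );
var_dump( $first_only );
```

So the two calls are not redundant in general: `trim()` leaves interior newlines alone, and the regex removes all of them unless a limit is passed.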
I must stop doing regex in my head.... 😆 That said, I'm not sure it changes the presentation on Mastodon which, in my experience, tends to eat whitespace. But perhaps others aren't so forgiving? |
I also remember it being an issue with Mastodon specifically. Used a callback to overwrite, well, pretty much the entire post (or object) content to better suit my custom post types ... and I ran into the same thing. |
Shouldn't there be exceptions inside |
Yes, I've run into this a couple times already :-) Update: ran a quick experiment using the following ...
That works, when I print So I added |
Yes, trying to parse HTML using regex might be a bit optimistic. Newer versions of the Gutenberg editor seem to replace Glitch seems to replace newlines with It would be nice to preserve the tabs in code blocks. |
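One way to keep tabs and newlines in code blocks could look like the sketch below. It is not what the plugin does, and the function name is hypothetical. It assumes code blocks are wrapped in `<pre>` elements, splits the content on those elements, and strips newlines, carriage returns and tabs only from the parts in between.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): removes newlines, carriage
 * returns and tabs outside <pre> elements and leaves <pre> content as-is.
 */
function strip_whitespace_outside_pre( string $html ): string {
    // Split on whole <pre>...</pre> elements and keep them in the result.
    $parts = preg_split(
        '~(<pre\b[^>]*>.*?</pre>)~is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    if ( false === $parts ) {
        return $html; // Leave the input alone if the regex engine fails.
    }

    foreach ( $parts as $i => $part ) {
        // With a single capture group, even indexes hold the text between
        // <pre> elements and odd indexes hold the <pre> elements themselves.
        if ( 0 === $i % 2 ) {
            $parts[ $i ] = preg_replace( '/[\n\r\t]/', '', $part ) ?? $part;
        }
    }

    return implode( '', $parts );
}

// Made-up sample: whitespace between blocks, meaningful whitespace in <pre>.
$html   = "<p>Intro</p>\n\n<pre><code>if (x) {\n\treturn;\n}</code></pre>\n<p>Outro</p>\n";
$result = strip_whitespace_outside_pre( $html );

var_dump( $result );
```

This is still regex-on-HTML, so it shares the weakness mentioned above: nested or unclosed `<pre>` tags are not handled, and a proper HTML parser would be more robust.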
I guess one needs to change/remove the following:
Considering that Mastodon and others now support |
I'm seeing additional newlines in a lot of (other people's) federated WordPress posts, so I'm thinking: Yes. My solution -- to the necessary Works wonders. I've just now updated it to not strip these characters from inside Except there's something stripping them after all, I'm guessing on the Mastodon side? (Which doesn't make a whole lot of sense, seeing as it introduces newlines elsewhere. Although ... clients are inconsistent here, too.) My one remaining issue is that comments don't get filtered the same way. |
Still makes me think the |
Haha, I'm now pretty sure I have a second, older callback cleaning up the remaining newline characters.
As someone who's written a couple of feed readers (and thus has had to parse and sanitize others' markup): it's a pain. What's worked for me, for "cleaning up" random bits of HTML, is to strip as many whitespace characters as possible and then run the result through WordPress' |
OK, so this seems to work for me (i.e., this is essentially what's currently in my custom
And then I return that. (I completely discard what the plugin generates based on the Still gotta test in Tusky, but looks good in Mastodon's web UI. My custom "ActivityPub add-on plugin": https://gist.github.com/janboddez/58a2f3d2c86717cd799048af651fa6b4#file-activitypub-php (Note: there's a few things in there that are no longer needed, gotta clean those up sometime.) |
Feel free to open a PR if you have a better solution than the current one! |
Does WordPress have a built-in function to do something similar to:
I suspect one can't rely on https://www.php.net/manual/en/book.tidy.php being available? And the new HTML API can't yet do something similar? Maybe later. In the meantime, replacing |
Tusky still rendering an extra empty "paragraph" at the start of the blockquote in https://indieweb.social/@ochtendgrijs@ochtendgrijs.be/111772889651276747. Fairly sure it's a Tusky issue; the culprit seems to be a space between the opening |
I'll have another look at this, I think we should (?) at least try to sanitize/filter comments and posts more or less the same way. Additional filtering is probably best left up to site authors/developers. |
Case anyone's still interested (or even following), I moved my Not saying y'all should start using it, quite the contrary! But: if anyone wanted to implement a custom content filter that strips all superfluous whitespace (and thereby having Mastodon, Tusky, and, hopefully, all other clients, render posts just like they would "native" Mastodon posts) while respecting text inside Now if only I can also filter comments this way, I'm going to be very happy. 8-) |
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
I'm not sure that this line is needed:
wordpress-activitypub/includes/class-shortcodes.php
Line 218 in db0f9c1
If I've understood correctly, the trim() removes the leading & trailing whitespace. Then the preg_replace removes a single character inside the remainder of the text.
I see the need for the trim, but not the preg_replace.
In my tests, removing the line didn't make any difference to the content of the ActivityPub JSON.