Add PropertyString, cleanHtml helper and escapeHtmlExt helper. #4004
base: dev
Thanks, @EreMaijala, looks like a great start! A few minor thoughts/suggestions...
themes/bootstrap5/templates/RecordDriver/DefaultRecord/data-summary.phtml
 * @license  http://opensource.org/licenses/gpl-2.0.php GNU General Public License
 * @link     https://vufind.org/wiki/development Wiki
 */
class EscapeHtmlExt extends \Laminas\View\Helper\Escaper\AbstractHelper
I don't love this name, but I also can't think of a better one... :-)
Me neither. I wanted to extend EscapeHtml, but the silly @final thing prevents that. And while we could substitute our own EscapeHtml, the Laminas class is used in so many places, that it's not quite straightforward. We could of course change all the references to an interface, but that would require more wide-ranging changes. If you think that'd be a better way forward, I'd be happy to work on that. What would support that is the fact that EscapeHtmlAttr needs a substitute as well (to be able to control the IE-compatibility of the escaping process), so doing both at the same time would probably make sense.
I don't have really strong feelings on this -- no solution feels obviously like the best. Maybe @maccabeelevine will have some thoughts to share... ;-)
> And while we could substitute our own EscapeHtml, the Laminas class is used in so many places, that it's not quite straightforward.

This was certainly my original idea with #3998, to make the template changes as simple as possible, and it's safer here than in my PoC implementation since it would do nothing unless you had a PropertyStringInterface value and passed `allowHtml`. But I'm sure there are complications I'm not seeing.
A month later, a fresh thought on the name: would `escapeOrCleanHtml` be more descriptive than `escapeHtmlExt`, since that's what this is doing -- escaping the HTML unless told to clean it instead?
If Ere agrees, I like that naming!
I like it a lot better than `escapeHtmlExt`!
I love the idea here, and it's so much better than the original #3998 PoC. I added some specific questions, but I guess my primary question (and maybe the hardest to answer) is how we deal with generic templates that may or may not have HTML in certain fields for some backends.
So, would it be appropriate in the default record driver's core.html to change
<h1<?=$this->schemaOrg()->getAttributes(['property' => 'name'])?>><?=$this->escapeHtml($this->driver->getShortTitle() . ' ' . $this->driver->getSubtitle() . ' ' . $this->driver->getTitleSection())?></h1>
to
<h1<?=$this->schemaOrg()->getAttributes(['property' => 'name'])?>><?=$this->escapeHtmlExt($this->driver->getShortTitle(), allowHtml: true)?> <?=$this->escapeHtml($this->driver->getSubtitle() . ' ' . $this->driver->getTitleSection())?></h1>
Performance-wise, this should be ok, since for any other record driver the title would not be a PropertyString and so the allowHtml would be ignored. But is it ok from a template complexity standpoint?
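To make the fall-through behavior concrete, here is a minimal sketch. Everything in it is hypothetical: `DemoHtmlValue`, `demoClean()` and `demoEscapeOrClean()` are illustrative stand-ins rather than the classes in this PR, and `strip_tags()` is only a placeholder for a real sanitizer. It assumes PHP 8.0 or later.

```php
<?php

/** Hypothetical stand-in for a PropertyStringInterface value carrying HTML. */
interface DemoHtmlValue
{
    public function getHtml(): ?string;

    public function __toString(): string;
}

/** Placeholder sanitizer: strip_tags() keeps attributes, so it is NOT safe for real use. */
function demoClean(string $html): string
{
    return strip_tags($html, ['b', 'i', 'em', 'strong']);
}

/** Escape by default; return cleaned HTML only for HTML-carrying values when asked. */
function demoEscapeOrClean(string|DemoHtmlValue $value, bool $allowHtml = false): string
{
    if ($allowHtml && $value instanceof DemoHtmlValue && $value->getHtml() !== null) {
        return demoClean($value->getHtml());
    }
    return htmlspecialchars((string)$value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// A driver that returns a plain string: allowHtml has no effect, the value is escaped.
$plain = 'War & <em>Peace</em>';
echo demoEscapeOrClean($plain, allowHtml: true), PHP_EOL;

// A driver that returns an HTML-carrying value: the cleaned HTML is used instead.
$rich = new class implements DemoHtmlValue {
    public function getHtml(): ?string
    {
        return 'War &amp; <em>Peace</em><u>!</u>';
    }

    public function __toString(): string
    {
        return 'War & Peace!';
    }
};
echo demoEscapeOrClean($rich, allowHtml: true), PHP_EOL;
```

In this sketch the template can pass `allowHtml: true` unconditionally, and only values that actually carry HTML take the cleaning path.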
Regarding @maccabeelevine's question about changing the default driver's core.html to support HTML, I don't think I would have a problem with that in the interest of greater flexibility...
@demiankatz Do we want to support HTML for all fields or for longer textual fields like summary? Different choices will lead to different complications, e.g. with search term highlighting.
Co-authored-by: Demian Katz <demian.katz@villanova.edu>
If we wanted to start small, I would think the bare minimum would be longer textual fields and titles. But I'm not opposed to rolling out the support more widely if it's easier to do it all at once than piecemeal. I tend to favor the most pragmatic strategy, whatever that turns out to be. :-)
Here are a couple of thoughts:
I think the downsides probably outweigh the benefits here, since I can't think of any scenario where this doesn't potentially lead to unexpected side effects or confusion. I think it's better to be explicit.
...and this would be a good way to be explicit in a number of contexts.
Possible lazy solution: truncate based on a tag-stripped version. If the stripped version is under the limit, display the HTML as-is. If the stripped version is too long, show the truncated, stripped version, and the user will have to "see more" to get the rich version. Not ideal, but maybe a quick place to start.
Maybe we need the ability to define multiple named tag allow-lists. Then we could have a "default" list and a "heading" list and the configurable settings could refer to an allow-list, or "none". This would empower users to create their own more granular lists as needed, but we could use these two obvious ones as a starting point.
Given the way Laminas seems to be gradually cutting off the ability to extend anything, I agree that building tools that meet our needs and wrap around Laminas' public interfaces is better than trying to build upon those interfaces directly.
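Two of the suggestions above, named tag allow-lists and truncation based on a tag-stripped version, could be sketched roughly as follows. The list names, tag choices and function names are all made up for illustration, `strip_tags()` stands in for a real sanitizer, and the code assumes PHP 8.0 or later with the mbstring extension.

```php
<?php

/** Hypothetical named allow-lists; the names and tag choices are illustrative only. */
const DEMO_ALLOW_LISTS = [
    'default' => ['p', 'br', 'b', 'i', 'em', 'strong', 'ul', 'ol', 'li'],
    'heading' => ['b', 'i', 'em', 'strong', 'sub', 'sup'],
    'none' => [],
];

/**
 * Reduce HTML to the tags in a named allow-list.
 * strip_tags() is a placeholder: it keeps attributes on allowed tags,
 * so a real implementation would need a proper sanitizer.
 */
function demoCleanWithList(string $html, string $listName = 'default'): string
{
    if (!array_key_exists($listName, DEMO_ALLOW_LISTS)) {
        throw new InvalidArgumentException("Unknown allow-list: $listName");
    }
    return strip_tags($html, DEMO_ALLOW_LISTS[$listName]);
}

/**
 * "Lazy" truncation: measure the tag-stripped text. If it fits within the
 * limit, keep the cleaned HTML; otherwise fall back to truncated, escaped
 * plain text and leave the rich version for a "see more" view.
 */
function demoTruncateHtml(string $html, int $limit, string $listName = 'default'): string
{
    $plain = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    if (mb_strlen($plain) <= $limit) {
        return demoCleanWithList($html, $listName);
    }
    $short = rtrim(mb_substr($plain, 0, $limit)) . '...';
    return htmlspecialchars($short, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo demoCleanWithList('<h2>A <b>bold</b> title</h2>', 'heading'), PHP_EOL;
echo demoTruncateHtml('<p>Hello <b>world</b></p>', 20), PHP_EOL;
echo demoTruncateHtml('<p>Hello <b>world</b></p>', 5), PHP_EOL;
```

A configurable setting could then refer to a list by name, with "none" meaning that all markup is dropped.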
This PR was accidentally closed by the deletion of the dev-11.0 branch; I have restored and reopened it. Sorry for the inconvenience!
I took another look at this and had just a few more minor thoughts and questions...
'<br>',
array_map(
function ($summary) {
$htmlContent = str_starts_with($summary, '<') && str_ends_with($summary, '>');
I wonder if there's a way to eliminate this hacky check for angle brackets. Should we do the check in the record driver and wrap the string in a PropertyString? Should the new helper have a switch to enable this check, or even make this behavior part of the standard "allowHtml"? I'm not totally sure, but it feels like we can do better here, especially since as written, there's no safety on the summary in the special case.
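As a quick illustration of why the check is fragile, the sketch below pulls the condition from the diff into a function (the name `demoLooksLikeHtml()` and the sample strings are invented for this example) and runs it against a few inputs. It assumes PHP 8.0 or later for `str_starts_with()` and `str_ends_with()`.

```php
<?php

/** The heuristic from the diff above, extracted into a function for illustration. */
function demoLooksLikeHtml(string $summary): bool
{
    return str_starts_with($summary, '<') && str_ends_with($summary, '>');
}

// Made-up sample summaries covering the interesting cases.
$samples = [
    '<p>A normal HTML summary.</p>',
    'Plain text with <b>inline</b> markup.',
    '<script>alert(1)</script>',
    '<3 this book, see <https://example.org>',
];
foreach ($samples as $sample) {
    printf("%-45s %s\n", $sample, demoLooksLikeHtml($sample) ? 'treated as HTML' : 'escaped');
}
```

The check accepts script content and non-HTML text that happens to start and end with angle brackets, and it rejects HTML that begins with plain text, which is why an explicit marker on the value seems more robust than guessing from the string.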
// which ensures that the libxml2 options (namely keepBlanks) are set up
// properly, and whitespace nodes are preserved. This should not be an
// issue from libxml2 version 2.9.5, but during testing the issue was
// still intermittently present. Regardless of that, CentOS 7.x have an
Is this note about CentOS 7 still relevant now that it's so thoroughly EOL'ed?
Related to #3998, this is a draft for PropertyString that can carry additional information along with a plain text string. It now has explicit support for HTML content, and there's also a cleanHtml helper to handle sanitization. escapeHtmlExt is a replacement for escapeHtml allowing the HTML version to be returned if desired.
@maccabeelevine What do you think? Obviously this needs some template changes where HTML is desirable, but should otherwise be fairly easy to use.
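For readers who have not opened the diff, the general idea of a string that carries extra information can be sketched as below. This is not the PropertyString class from this PR; `DemoPropertyString` and its `with()`, `get()` and `fromHtml()` methods are hypothetical names chosen for illustration, and the code assumes PHP 8.0 or later.

```php
<?php

/**
 * Hypothetical value object: a plain-text string plus optional extras.
 * This is NOT the PR's PropertyString, only an illustration of the idea.
 */
final class DemoPropertyString implements Stringable
{
    /** @var array<string, mixed> Additional named properties. */
    private array $properties = [];

    public function __construct(private string $text)
    {
    }

    /** Create an instance from HTML, deriving the plain text by stripping tags. */
    public static function fromHtml(string $html): self
    {
        $plain = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
        return (new self($plain))->with('html', $html);
    }

    /** Return a copy with one additional property set. */
    public function with(string $name, mixed $value): self
    {
        $copy = clone $this;
        $copy->properties[$name] = $value;
        return $copy;
    }

    /** Fetch a property, or the default when it is not set. */
    public function get(string $name, mixed $default = null): mixed
    {
        return $this->properties[$name] ?? $default;
    }

    public function __toString(): string
    {
        return $this->text;
    }
}

$summary = DemoPropertyString::fromHtml('<p>Tom &amp; <em>Jerry</em></p>');
echo $summary, PHP_EOL;              // plain text for code that expects a string
echo $summary->get('html'), PHP_EOL; // original HTML, still to be sanitized before output
```

Because the object converts to its plain text when cast to a string, existing code that escapes and prints the value keeps working, while templates that want the HTML can ask for it explicitly.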
TODO: