Fix automatic space insertion between entity references in ConstructingParser. #140
(sorry to be annoying and nitpick, as this looks like a nice improvement, but can you make sure the code you've added is indented correctly?)
Yeah sure, no worries!
The tests might be failing because they need to go in the Would you be willing to try moving the tests and see if they pass?
Yeah, I was trying to figure out the problem and found out that
It might not be necessary to introduce Would that work instead?
Yeah, it is not necessary and does indeed work. Thanks!
// Checks if node when converted to string is an entity ref
def checkNodeForEntityRef(n: Node): Boolean = {
  val st = n.toString
  st(0) == '&' && st(st.length - 1) == ';'
do we know that `n.toString` can never be empty?
also, trivial nitpick, but `st.last` is clearer than `st(st.length - 1)` imo
@SethTisue I agree. I'm hoping we can just get rid of this definition in favor of `isAtom`. If not, then I was going to suggest using `startsWith` and `endsWith` to avoid the inevitable out of bounds errors here.
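The `startsWith`/`endsWith` suggestion can be sketched roughly as follows. This is only an illustration, not the code from the PR: `EntityRefCheck` and `looksLikeEntityRef` are hypothetical names, and the helper takes a plain `String` (standing in for `n.toString`) so the example does not depend on scala-xml.

```scala
object EntityRefCheck {
  // Hypothetical helper: checks the rendered string rather than a scala.xml.Node.
  // Unlike st(0) and st(st.length - 1), startsWith/endsWith return false on an
  // empty string instead of throwing an index-out-of-bounds exception.
  def looksLikeEntityRef(st: String): Boolean =
    st.startsWith("&") && st.endsWith(";")
}
```

With this shape, `EntityRefCheck.looksLikeEntityRef("")` is simply `false`, so the emptiness question raised above no longer matters for safety. It is still a purely textual check, so a degenerate string such as `"&;"` would also pass.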
// If 'txt' is just made up of one or more spaces
if (TextBuffer.fromString(txt).toText == Nil) {
  // Check if the last node in 'ts' was an 'Atom' and the next node to be parsed is an entity or character ref
  if(ts(ts.length - 1).isAtom && ch == '&')
suggest: ts.last
do we know that `ts` can never be empty here?
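One way to sidestep the emptiness question is `lastOption`, which yields `None` for an empty collection where `last` (or `ts(ts.length - 1)`) would throw. The sketch below is an assumption-laden illustration, not the PR's code: `Item` is a hypothetical stand-in for the parser's node type, and `lastIsAtomBeforeRef` is a made-up name.

```scala
object LastNodeCheck {
  // Hypothetical stand-in for a parsed node; only the isAtom flag matters here.
  final case class Item(isAtom: Boolean)

  // True only when the buffer is non-empty, its last element is an atom,
  // and the next character starts an entity or character reference.
  def lastIsAtomBeforeRef(ts: Seq[Item], ch: Char): Boolean =
    ts.lastOption.exists(_.isAtom) && ch == '&'
}
```

An empty `ts` makes the condition `false` rather than raising an exception, which may or may not be the behavior the parser wants at that point.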
Hi, it's been some time since my last activity. I took a look at #77 and found it somewhat related, as it also involves whitespace around text and entity refs. Hence, I've tried to fix the issue. Waiting for your review. Thanks!
Ok, bummer. Thanks for looking into that. I really appreciate it. I guess I'll need to come up with some more test cases to better understand how the ConstructingParser parses entities. I kind of wonder if #73 and #77 and #107 are all related problems. Optimistically, we can choose a path that fixes all of them; if we can't, then we choose which one(s) we can fix. I'll try to study @piyush-jaiswal's fix for #77. Thanks for looking into that.
This PR addresses the issue pointed out in #107. The tests are the ones which were provided by @ashawley here. It would be great if this could be reviewed. Thanks!