-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Android Invalid Span Crashes #2279
Conversation
👋 @hypest ! I'm pretty confident that you'll have some thoughts on this, so wanted to give you a heads up that you may want to give it a look, particularly since I think it may make sense to do this as a hotfix to include it in the 14.9 release. Also, I would appreciate if if you could confirm that you don't see any problems with sending "unknown" HTML tags to Sentry (additional context in the "But We Don't Want to Just Ignore This" section of my PR |
53d1f4d
to
0cc01d6
Compare
2b798a2
to
29dd24a
Compare
29dd24a
to
63af455
Compare
I just added an "Android Native Unit Tests" job to circle ci to run the HTML parsing unit test I added (hopefully we'll add more tests in the future). My intention is to make this a Required check once this gets merged to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great! I think that adding that exception will help us a lot to debug issues.
Cool to see more tests added for this 👏
One thing I noticed by testing some of the cases:
Three spaces
The list should create a new list item and the spaces in the item above it will be trimmed
Spaces aren't trimmed in the first line, they're kept. I think that wouldn't be an issue but figured I'd report it.
Letter + Three Spaces
The list should create a new list item and the spaces in the item above it will be trimmed
Same as above
I also did some writing flow test cases and all good 👍
I'd still wait for @hypest thoughts on this PR since I might miss some things.
...roid/src/main/java/org/wordpress/mobile/ReactNativeAztec/IllegalSelectionIndexException.java
Show resolved
Hide resolved
Thanks @geriux ! Regarding the spaces not being trimmed: that was actually a mistake on my part in writing the test cases. Those spaces used to be trimmed, but a few weeks ago a change was made to no longer trim whitespace. I copied those test cases from before that change, and I failed to notice the references to trimming whitespace needed to be removed. Nice job catching that! I have now updated the test cases to not state that those spaces would be trimmed. 🙂 |
Awesome! Thanks for clarifying, all working then 🎉 |
Thank you Matt! Here are my thoughts just by reading the PR description. I'll have a look at the code afterwards.
I agree that it's time to be more defensive at this point since the issue has been around for long and happening a lot.
Yes, that was always my concern with "swallowing/avoid" the crash: Instead of a single place where the mismatch issue would appear, swallowing would spread it around and manifest it as random and seemingly unrelated writing flow issues. I'm still not 100% sure to be honest. We'll have to keep an eye open to see if we start to have more "random writing flow issues" and assess.
Let's remember though that even with that logging, we will probably not be able to connect the dots when a writing flow issue happens. The issue might be correlated to the logged entry, or not.
I don't think there's a concern here. The logging is quite generic and as you mention too, it can help prioritize adding better support to the tags that matter the most.
+1
+1 |
...roid/src/main/java/org/wordpress/mobile/ReactNativeAztec/IllegalSelectionIndexException.java
Outdated
Show resolved
Hide resolved
...ive-aztec/android/src/main/java/org/wordpress/mobile/ReactNativeAztec/ReactAztecManager.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished a pass Matt and this looks good. I've left a few non-major comments.
I wonder though, can you check if the swallowed exception will leave any Sentry breadcrumbs? It might be useful 🤞 if it does so we can connect the dots between seemingly unrelated issues.
Good thought @hypest . I assumed that having Sentry capture the exception would be enough for Sentry to also use it as a breadcrumb if there was a subsequent crash. I tested this, and I was... what's the right word... oh yeah... wrong. (Relevant discussion: p1590156889259100-slack-platform9) Updated to also send the exception message as a breadcrumb in 999672f. Verified that it is showing in subsequent crashes like this("Illegal selection index for text..."): |
.../src/test/java/org/wordpress/mobile/ReactNativeAztec/IllegalSelectionIndexExceptionTest.java
Outdated
Show resolved
Hide resolved
473e373
to
5065bce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me @mchowning from the code side of thing. By the way, thanks for adding tests!
I'll leave it to @geriux 's testing to sign off the testing part.
It looks like we're going to have 2 fixes in 1.28.1 (maybe 3), so I created a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All working after the latest changes! LGTM!
This addresses wordpress-mobile/WordPress-Android#9832.
Related PRs:
wordpress-mobile/WordPress-Android#11965
WordPress/gutenberg#22500
Description
Android has a too-long history of Span-based crashes
Android has had a number of crashes due to the attempt to apply invalid selection indices. These can arise for different reasons, and we have written a fair bit of JS code trying to predict the specific circumstances where these errors will occur so we can avoid them. We have not been as successful as we need to be because over the last 90 days this crash has caused almost 140,000 crashes affecting 20,000 users. 😬
One of the causes of this crash is the fact that AztecAndroid removes whitespace, which would cause the JS to think the text was longer than it was on the native side. As a result, JS would sometimes try to set an index that was too large. I believe the code we had written to avoid this issue was generally avoiding the crashes here, but this specific cause is no longer an issue because a change was recently merged that stops the native side from removing the whitespace. For that reason, I actually had a PR removing those whitespace checks, but I have closed that in preference for this PR (which also removes those whitespace checks). You can read that PR description for more detail on that specific issue though.
A second cause of these crashes is that Android renders unknown HTML tags as an image span, and does not use the text in the span in the length of the text in the view (for example,
<mark>some text</mark>
is treated kind of like a single character on the native side, but is treated as multiple characters on the javascript side).Approach Taken In This PR
One approach to addressing this would be to continue to try to anticipate the circumstances leading to the crash based on the text itself (i.e., check if there's an unknown tag) and calculate the proper selection indices based on the differences in the text on the JS and javascript side. This would be a tricky task with any mistakes likely resulting in many crashes for our users (as shown by the fact that we're still seeing many of these crashes despite our repeated previous attempts to fix this). A safer approach is to just check the selection index on the native side and avoid using any selection index that is outside the bounds of the text on the native side. I believe this will completely avoid any crashes due to an invalid selection index.
One concern with this approach is that it could leave the user in a bad state because of the mismatch between what JS thinks the state of the view is in and the actual state of the view on the native side. I do not think this is a concern that should stop us from using this approach, however. First, we're currently crashing the app anytime this happens, so it will be pretty difficult for us to get the user into a worse state by failing to update the selection indices. Second, not updating the selection indices when they are invalid is actually the approach that iOS uses, and it seems to have not caused any issues for them.
But We Don't Want to Just Ignore This
This is still an issue that we want to keep track of and address more robustly though. Among other things, because of the mismatched text between native and JS, it is possible to have blocks not split in the correct place (i.e., pressing enter splits the block somewhere other than where the cursor is located). So we want to still track this issue and work on addressing it more robustly even once we get the crashes to stop.
For that reason, I am also sending off an exception to Sentry that includes the information about text size, and the selection indices. In addition, I am also including the tags in any
UnknownHtmlSpan
classes on the text view (so for example, text with<mark>abc</mark>alksdj
would send "mark"). This will both (1) let us know if this is happening in cases where there are notUnknownHtmlSpans
(i.e., maybe there's another cause that I haven't uncovered) and (2) help us track the usage of different tags so we can see which tags have high usage and we may want to support in the future.I do not believe there is any privacy concern with sending the html tag without any of the related content contained inside that tag to Sentry, but I want to call that out specifically so it does not get overlooked.
Here is an example of the exception in Sentry.
Proposing This as a Hotfix
Because I believe this is a pretty safe fix, we're relatively early in the 14.9 release cycle, and the high frequency of this crash, I think it makes sense to do this as a 1.28.1 hotfix so we can include it in the 14.9 release. These PRs are currently set up with that in mind, so if we decide to not make this a hotfix, they'll need to be retargeted and updated.
E2E Tests
I have added E2E tests for the scenarios where this was causing crashes. On one hand, these seem like a perfect case for E2E tests because they are edge cases that will not be covered much by regular testing. On the other hand, these are edge cases and if we covered every edge case with an E2E test, our test suite would take forever. I decided to go ahead and add these for now since our test suite still has a long way to go before it gets too large and if/when that happens we can always remove these tests. Definitely open to other thoughts though.
Testing
In testing this, note that even though the exception is no longer causing a crash, we are still printing the stacktrace, so you can still identify when the exception occurred. In WPAndroid release builds, the exception is also sent to Sentry.
Unknown HTML tags
Based on wordpress-mobile/WordPress-Android#9832 (comment)
Covered in a new E2E test
Create a post with the following HTML:
E
Multiline Paragraph Blocks From Web
From #1507
Covered by a new E2E test
Handling spaces with Lists
From #885
Two spaces + Letter
Covered by a new E2E test
Three spaces
Letter + Three Spaces
PR submission checklist:
RELEASE-NOTES.txt
if necessary.