Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb/test] Fix for flakiness in TestNSDictionarySynthetic #1262

Merged
merged 1 commit into from
May 26, 2020
Merged

[lldb/test] Fix for flakiness in TestNSDictionarySynthetic #1262

merged 1 commit into from
May 26, 2020

Conversation

JDevlieghere
Copy link

Summary:
TestNSDictionarySynthetic sets up an NSURL which does not initialize its
_baseURL member. When the test runs and we print out the NSURL, we print
out some garbage memory pointed-to by the _baseURL member, like:

_baseURL = 0x0800010020004029 @"d��qX"

and this can cause a python unicode decoding error like:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position
10309: invalid start byte

There's a discrepancy here because lldb's StringPrinter facility tries
to only print out "printable" sequences (see: isprint32()), whereas python
rejects the StringPrinter output as invalid utf8. For the specific error
seen above, lldb's isprint32(0xa0) = true, even though 0xa0 is not
really "printable" in the usual sense.

The problem is that lldb and python disagree on what exactly is
"printable". Both have dismayingly hand-rolled utf8 validation code
(c.f. _Py_DecodeUTF8Ex), and I can't really tell which one is more
correct.

I tried replacing lldb's isprint32() with a call to libc's iswprint():
this satisfied python, but broke emoji printing :|.

Now, I believe that lldb (and python too) ought to just call into some
battle-tested utf library, and that we shouldn't aim for compatibility
with python's strict unicode decoding mode until then.

FWIW I ran this test under an ASanified lldb hundreds of times but
didn't turn up any other issues.

rdar://62941711

Reviewers: JDevlieghere, jingham, shafik

Subscribers: lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D79645

(cherry picked from commit f807d0b)

Summary:
TestNSDictionarySynthetic sets up an NSURL which does not initialize its
_baseURL member. When the test runs and we print out the NSURL, we print
out some garbage memory pointed-to by the _baseURL member, like:

```
_baseURL = 0x0800010020004029 @"d��qX"
```

and this can cause a python unicode decoding error like:

```
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position
10309: invalid start byte
```

There's a discrepancy here because lldb's StringPrinter facility tries
to only print out "printable" sequences (see: isprint32()), whereas python
rejects the StringPrinter output as invalid utf8. For the specific error
seen above, lldb's `isprint32(0xa0) = true`, even though 0xa0 is not
really "printable" in the usual sense.

The problem is that lldb and python disagree on what exactly is
"printable". Both have dismayingly hand-rolled utf8 validation code
(c.f. _Py_DecodeUTF8Ex), and I can't really tell which one is more
correct.

I tried replacing lldb's isprint32() with a call to libc's iswprint():
this satisfied python, but broke emoji printing :|.

Now, I believe that lldb (and python too) ought to just call into some
battle-tested utf library, and that we shouldn't aim for compatibility
with python's strict unicode decoding mode until then.

FWIW I ran this test under an ASanified lldb hundreds of times but
didn't turn up any other issues.

rdar://62941711

Reviewers: JDevlieghere, jingham, shafik

Subscribers: lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D79645

(cherry picked from commit f807d0b)
@JDevlieghere
Copy link
Author

@swift-ci please test macos

@JDevlieghere
Copy link
Author

@adrian-prantl
Copy link

👍

@vedantk
Copy link

vedantk commented May 26, 2020

LGTM. I've been trying to merge this for 10+ days, but kept hitting Linux PR testing failures..

@fredriss fredriss merged commit 1f9c5cd into swiftlang:apple/stable/20200108 May 26, 2020
@JDevlieghere JDevlieghere deleted the python-decode-TestNSDictionarySynthetic branch May 26, 2020 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants