tld: incorrectly parsed Wikipedia table #1945

Closed
dgw opened this issue Sep 24, 2020 · 1 comment

dgw commented Sep 24, 2020

One table in the Wikipedia article that tld uses for TLD details gets parsed incorrectly, which produces output like this:

<dgw> ,tld xn--q7ce6a
<SopelTest> [tld] : Lao | Bulgaria: .ລາວ | Bulgarian: Laos | Cyrillic: Lao | bg: Lao | .bg: Not in use | No: .la
<dgw> ,tld ລາວ
<SopelTest> [tld] : Lao | Bulgaria: .ລາວ | Bulgarian: Laos | Cyrillic: Lao | bg: Lao | .bg: Not in use | No: .la

A quick look at the HTML didn't reveal any obvious structural differences between this table and the correctly-parsed others, but something is obviously tripping up my rudimentary HTMLParser-derived class. I'll probably need to spend some quality time with pdb, trying to figure out where in the parsing routine the data gets mangled.
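
For readers following along, here is a minimal sketch of the kind of check that could narrow this down. It is not the parser tld actually uses: `TableRowCollector`, `misaligned_rows`, and the `SAMPLE` markup (including its column names and the `xn--example` row) are hypothetical and made up for illustration. It assumes the mangling would show up as a row whose cell count differs from the header row, which may not be the real cause.

```python
from html.parser import HTMLParser


class TableRowCollector(HTMLParser):
    """Collect the text of every cell in every <tr>, one list per row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being parsed, or None
        self._cell = None  # text fragments of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def misaligned_rows(rows):
    """Return (index, row) pairs whose cell count differs from the header row."""
    if not rows:
        return []
    width = len(rows[0])
    return [(i, row) for i, row in enumerate(rows[1:], start=1)
            if len(row) != width]


# Hypothetical markup, NOT copied from the Wikipedia article.
SAMPLE = """
<table>
<tr><th>DNS name</th><th>IDN TLD</th><th>Entity</th></tr>
<tr><td>xn--q7ce6a</td><td>.ລາວ</td><td>Laos</td></tr>
<tr><td>xn--example</td><td>Missing a cell</td></tr>
</table>
"""

parser = TableRowCollector()
parser.feed(SAMPLE)
parser.close()
bad = misaligned_rows(parser.rows)
```

Any row reported in `bad` is one where pairing header names with cell values by position would shift the labels, which is roughly what the bot output above looks like. A `rowspan` or `colspan` attribute in the real table would be one plausible way to end up there, though that is a guess.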

Follow-up to #1939 (comment)

dgw added the Bug label (Things to squish; generally used for issues) Sep 24, 2020
dgw added this to the 7.1.0 milestone Sep 24, 2020
dgw self-assigned this Sep 24, 2020
dgw added the Declined label (Requests that will not be implemented for technical or project direction reasons) Oct 23, 2020

dgw commented Oct 23, 2020

Peppered debug logging through the parser, spun up my test bot, issued a bunch of TLD commands, and… nothing. Can't reproduce this anymore. Will leave it in the 7.1 milestone for historical purposes, but it seems this likely wasn't a problem on our end.
