url: UnicodeEncodeError #1021

Closed
maxpowa opened this issue Feb 8, 2016 · 5 comments · Fixed by #1028

maxpowa commented Feb 8, 2016

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 352: ordinal not in range(128) (file "/usr/local/lib/python2.7/dist-packages/sopel/modules/url.py", line 191, in find_title)

https://github.com/sopel-irc/sopel/blob/master/sopel/modules/url.py#L191
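
For context, this is the class of error Python raises when text containing a non-ASCII character is encoded with the ASCII codec. The snippet below is a minimal reproduction that is independent of Sopel; the `title` value is made up for illustration and is not taken from `url.py`.

```python
# A title containing an em dash (U+2014), like the one in the traceback.
title = u'Dash \u2014 here'

try:
    title.encode('ascii')          # the ASCII codec cannot represent U+2014
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    reason = exc.reason            # the codec's explanation of the failure

# Encoding explicitly as UTF-8 succeeds.
safe = title.encode('utf-8')
```

On Python 2 the same failure can be triggered implicitly, for example when a `unicode` object is passed to `str()`.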

maxpowa added the Bug and Easyfix labels Feb 8, 2016
maxpowa self-assigned this Feb 8, 2016
@jordan-rosenfeld

I also encountered this but with \u2019 (right single quotation mark).

@elad661
Contributor

elad661 commented Feb 9, 2016

I guess I didn't test this enough on python 2.7

People should really be switching to 3 though!

elad661 changed the title from "web: UnicodeEncodeError" to "url: UnicodeEncodeError" Feb 9, 2016
br0ziliy added a commit to br0ziliy/sopel that referenced this issue Feb 16, 2016
This fixes sopel-irc#1021

Following the existing code, `errors='ignore'` causes decoding errors to be
silently ignored.  Also we catch LookupError exception, which is raised if the
encoding is not known and unicode() is not able to decode a byte.
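
A minimal sketch of what this commit message describes, assuming a hypothetical helper named `decode_chunk` (this is not the code from the pull request): bytes are decoded with `errors='ignore'`, and `LookupError` is caught for the case where the encoding name is unknown. The UTF-8 fallback is an assumption made for the example.

```python
def decode_chunk(raw, encoding):
    """Decode bytes to text, silently dropping undecodable bytes.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode(encoding, errors='ignore')
    except LookupError:
        # Raised when Python does not know the codec name at all.
        # Falling back to UTF-8 is an assumption of this sketch.
        return raw.decode('utf-8', errors='ignore')
```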
br0ziliy added a commit to br0ziliy/sopel that referenced this issue Feb 16, 2016
This fixes sopel-irc#1021.
Encode a byte to UTF8, ignoring errors.
br0ziliy added a commit to br0ziliy/sopel that referenced this issue Feb 16, 2016
Depending on the URL, response.iter_content() in Python 2/3 will return either:
- `type 'unicode'`/`class 'str'` (for "plain" HTML)
- `type 'str'`/`class 'bytes'` (for binary streams, like file URLs)

To distinguish between the two situations we're checking if we got string or
bytes, and proceed accordingly.

This also fixes sopel-irc#1021
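
The type check described in this commit message could look roughly like the sketch below. `to_text` is a hypothetical name, and the UTF-8 default is an assumption; the actual change in `url.py` may differ.

```python
def to_text(chunk, encoding='utf-8'):
    """Return text whether `chunk` arrived as bytes or as decoded text.

    On Python 2, `bytes` is an alias of `str`, so the same check covers
    both interpreter versions.
    """
    if isinstance(chunk, bytes):
        return chunk.decode(encoding, errors='ignore')
    return chunk
```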
@br0ziliy
Contributor

@elad661 Elad, I fixed it, works on both 2.7 and 3 now.

@fgsch

fgsch commented Mar 4, 2016

Any plan to include this on a release? It's a bit annoying

@elad661
Contributor

elad661 commented Mar 9, 2016

Sorry for the delay - real life takes priority over software projects.

I'll review this pull request now.

kwaaak added a commit to kwaaak/sopel that referenced this issue Mar 22, 2018
* update movie.py to use omdbapi as imdbapi is non-functional

* minor cosmetic change

* Resolve sopel-irc#926

Also cleaned up dict value retrieval a little, the .get() calls were a bit unnecessary.

* Remove feedparser dep from requirements

* Remove feedparser from RPM spec

* Make CAD currency code case-insensitive

* Resolve sopel-irc#929

Ensures that FilenameAttribute's parse and serialize always have the parameters they need.

* [isup] fix bad indexes preventing bot from recognizing http protocols

`'http://'` is 7 characters long, so `site[:6]` can't ever match it. ditto for `'https://'` and `site[:7]`
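
The off-by-one described here can be seen directly. In the sketch below, `has_http_scheme` is a hypothetical helper (not the module's code) showing how `str.startswith` avoids counting characters by hand.

```python
site = 'http://example.com'

old_check = site[:6] == 'http://'   # always False: the slice is only 6 characters
new_check = site[:7] == 'http://'   # True: the slice length matches the prefix

def has_http_scheme(url):
    """Hypothetical helper: startswith() accepts a tuple of prefixes."""
    return url.startswith(('http://', 'https://'))
```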

* Switch back to get() calls

There's some places where there were already try catch blocks for KeyError, so I left those ones.

* core: Fix issues with reloading folder modules

Closes sopel-irc#899, obviates and closes sopel-irc#932.

* Switch from *args to named args.

* Properly fetch the xml before passing it to xmltodict

I'm a moron.

* Release 6.1.0

* Use flake8 for future checking, and add missing ones

* Fix coding declaration in a few places it was still wrong

Again, fuck windows. sopel-irc#821

* README: Update sopel package location in Arch

* Fix crash during configure

Reported by aam in IRC, http://pastebin.com/0rAQb7Kp

* Fix TypeError: 'NoneType' object is not iterable

Happened when an invalid language hint abbreviation was given.

* Require a valid phrase to translate

* [meetbot] fix crash when starting meeting

Resolves sopel-irc#942

* Release 6.1.1

* Require chanmsg for all adminchannel commands

* Fix sopel-irc#945

We should only get `socket.gaierror` if the user passes an invalid IP or hostname, but it could be in the event that we don't have DNS configured on the machine or a multitude of other things, so we can be somewhat vague in the error message.
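
A rough sketch of the kind of handling this message describes, not the actual patch. `describe_lookup` and its `resolver` parameter are hypothetical; the parameter exists only so the failure path can be exercised without a network.

```python
import socket

def describe_lookup(host, resolver=socket.getaddrinfo):
    """Resolve `host`, returning a deliberately vague message on failure."""
    try:
        resolver(host, None)
    except socket.gaierror:
        # Could be a bad hostname, missing DNS configuration, or
        # something else entirely, so the wording stays general.
        return "Unable to resolve {0}; check the address or your DNS setup.".format(host)
    return "{0} resolved".format(host)
```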

* Fix sopel-irc#860

Should resolve the inconsistencies between the implementation of unquote in python 2 and python 3.

* [Formatting] Gray is the same as Grey. Which spelling is correct is a grey area.

* Updated link for help command

was a bad link, updated to correct one

* [countdown] fix remaining time calculation

* [docs] Rename

* Add tracking of users and their accounts

This uses some of the code that @maxpowa wrote for sopel-irc#941, but gives a
somewhat more intuitive API. It also paves the way for potentially
adding direct support for away-notify and metadata-notify.

* rename references to willie in systemd service file

* update project url in comments

* update project url in comment

* update project name in CONTRIBUTING.md (includes log files location, issue tracking url)

* Update github URL to organization, Sopel-IRC, rather than Embolalia

* [dice] Handle negative numbers and uppercase letters.

First, add case-insensitivity to the "d" and "v" letters in the input
regex by capturing [dD] and [vV] in regexes.

Next, make sure we can handle negative numbers properly by capturing
a possible "-" in front of any numbers. If dice_num or dice_type ends
up negative, abort _roll_dice early (dice_type already had a negative
check that wasn't being hit because "-" wasn't captured).
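
To illustrate the idea, here is a simplified sketch. The pattern and the `parse_dice` helper are hypothetical and much smaller than the real dice module; they only show capturing an optional "-" and accepting either letter case, then aborting early on negative values.

```python
import re

# Hypothetical, simplified pattern: optional "-" before each number,
# and [dD] / [vV] so the letters match in either case.
DICE_RE = re.compile(r'^(-?\d+)?[dD](-?\d+)(?:[vV](\d+))?$')

def parse_dice(expr):
    """Return (dice_num, dice_type, drop_lowest), or None if rejected."""
    match = DICE_RE.match(expr)
    if not match:
        return None
    dice_num = int(match.group(1) or 1)
    dice_type = int(match.group(2))
    drop_lowest = int(match.group(3) or 0)
    if dice_num < 0 or dice_type < 0:
        return None  # abort early on negative values
    return dice_num, dice_type, drop_lowest
```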

* [dice] Handle negative numbers and uppercase letters.

First, add case-insensitivity to the "d" and "v" letters in the input
regex by capturing [dD] and [vV] in regexes.

Next, make sure we can handle negative numbers properly by capturing
a possible "-" in front of any numbers. If dice_num or dice_type ends
up negative, abort _roll_dice early (dice_type already had a negative
check that wasn't being hit because "-" wasn't captured).

* [dice] Also handle negative drop_lowest values.

* Make BTC currency code case-insensitive

* Address comments on PR sopel-irc#961

This includes being more consistent about using pop rather than del to
prevent key errors, and adding some locking around the privilege related
things

* Add user tracking for RFC WHO replies

* Add away tracking

* Add support for account-tag

* Enable the account-tag capability

* Replace channels list with channels dictionary

Hopefully, nobody else is taking advantage of channels being a list,
rather than a dict. If they are, well, oops.

* Add enumeration of IRC events

Closes sopel-irc#960

* Add cap-notify support

See sopel-irc#971

* coretasks: Replace numeric events with their enums

Also add the missing RPL_WHOSPCRPL to tools.events

* [contrib] rename & edit willie out of contrib

Fixes sopel-irc#963

* Huge cleanup of copyright headers and docstrings

They're still super inconsistent and probably a lot are out of date, but
at least there won't be random copyright info showing up in the docs
anymore. Oh, and my domain and name are correct now, too…

* [currency] Make arguments case-insensitive (close sopel-irc#979)

* Fix URL excludes loading (sopel-irc#959)

setup uses it as a list, and in previous versions of sopel it was a list, but in the UrlSection it's defined as a ValidatedAttribute. This was causing each character in the excludes list to be parsed as a regex exclude. Switching to ListAttribute fixes the issue.
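
The symptom follows from how Python iterates over strings. The sketch below uses a hypothetical `compile_excludes` helper rather than Sopel's config classes, just to show the difference between a single string value and a list value.

```python
import re

def compile_excludes(excludes):
    """Compile each entry of `excludes` as a regex (hypothetical helper)."""
    return [re.compile(pattern) for pattern in excludes]

# A plain string is iterated character by character...
as_string = compile_excludes('example.com')
# ...whereas a list yields one pattern per entry.
as_list = compile_excludes(['example.com'])
```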

* [find_updates] Fix missing RPL_ prefix

* [bot] Fix misleading message

Coretasks is only one module, so if you loaded it and only one other, you'd get the "Couldn't load any modules" warning, even though there was a module loaded.

* [trigger] IRCv3 server-time support

* [tests] Create more trigger test cases

* [tests] Create module tests

* [tests] DB test cleanup

* [tests] Add formatting tests

* [tests] Fix coding declaration

* [tests] Update .travis.yml

* [translate] Fix bad test

* Support using account name for auth

* [tests] Remove pep8 dev-requirement

Also fixed the critical error where py.test thought sopel.py (the entry script) was the sopel package.

* [pep8] Minor clean up to conform with pep8

* [web] Ensure ca_certs is defined

When running tests it's not defined at all, because the only place defining it was the config loader code. Now it's defined, but will still fail out since it's not a valid CA certificate file.

* [pep8] Final pep8 run

Everything should now conform to pep8 and pass flake8.

* [tests] Improve trigger coverage

* [reddit] Prevent specific commands in PM

Resolves sopel-irc#789

* Documentation for target types

* docs/core: Shift around functions to make autodoccing easier

* docs: Start on a major cleanup of documentation

* Make _ssl_recv always return bytes

_ssl_recv returned empty strings instead of the empty bytes object if
the socket was closed or upon ENOENT.

This led to exceptions when running sopel with python3 because
asynchat.handle_read expects byte objects.

This commit fixes sopel-irc#937.

* docs: Majorly overhaul organization and format

* Release 6.2.0

* Release 6.2.1

* dice: Allow comma delimiter

Closes sopel-irc#998

* adminchannel: Remove totally useless commands

Also add error messages to the somewhat useful ones

* Make it a bit harder to run into the LC_ALL thing

This behavior is stupid. Respecting LC_ALL, or anything else for that
matter, over the encoding fucking noted in the fucking file is a bad
decision, and someone should feel bad. I don't know why it makes things
break in the specific and bizarre way it does, but it does, and there's
no possible good reason for it.

Closes sopel-irc#984. Fuck.

* Escape nick before replacing it in regex

Resolves sopel-irc#1004
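
As an illustration of why escaping matters, here is a small sketch. The `nick_pattern` helper and the rule template are assumptions made for the example, not the code touched by this commit.

```python
import re

def nick_pattern(nick, template=r'^$nickname[:,]\s+(.*)'):
    """Substitute an escaped nick into a rule template (hypothetical helper)."""
    return re.compile(template.replace('$nickname', re.escape(nick)))

# A nick with regex metacharacters: unescaped, "[1]" would be read as a
# character class and the literal nick would never match.
escaped = nick_pattern('bot[1]')
unescaped = re.compile(r'^bot[1][:,]\s+(.*)')
```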

* [weather] Fix YQL woeid lookup

Handles an edge case that neither of the PRs handled

Fixes sopel-irc#1006
Closes sopel-irc#1007, sopel-irc#1012

* CONTRIBUTING: Let's drop the [brackets] thing and do what everyone else does

No point in being different from any other FOSS project out there.

* contrib/rpm: willie->sopel

* CONTRIBUTING: Update coding/future import guidelines

`# coding=utf-8` is now the standard in Sopel & supports windows. The future import now conforms with the flake8 future imports (also conforms with @embolalia's formatting passes a bit back)

* web: make web.py into a requests compatibility layer

Since web is deprecated and everyone should switch to requests,
the first step is to make web.py a requests compatibility layer.

When web.py was new, requests was not ripe enough to use in Sopel.

But now it's time to switch to requests like the rest of the python
ecosystem. web.py is no more.

* find_updates: switch from web.py to requests

* web: Fix typo

This is why we have flake8 ;)

* url: port to requests

Now that web.py is deprecated, we can port url.py to requests.

Originally from pull request sopel-irc#988, committed here with minor bugfixes
and modified commit message.

* translate: port to requests

* movie: port to requests

* xkcd: port to requests

* Track the channel topic

* Improve locale stupidity checking

Thanks to @elad661's comment on b73fc6a

* url: handle capitalized URLs

Trigger rules are case-insensitive regexes, so the auto title responder
will be triggered even for capitalized URLs such as "Http://google.com"
(which can happen, for example, when a mobile device attempts to
auto-capitalize the beginning of a sentence).  Match URLs case
insensitively for title lookup purposes and add error handling in case
no URLs could be extracted from the match.
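
A minimal sketch of case-insensitive matching with graceful handling of the no-match case. The pattern is a simplified, hypothetical stand-in for the module's real URL regex, and `extract_urls` is an illustrative name.

```python
import re

# Simplified, hypothetical URL pattern; the real one is more involved.
URL_RE = re.compile(r'https?://\S+', re.IGNORECASE)

def extract_urls(text):
    """Return all URLs in `text`, or an empty list when none are found."""
    return URL_RE.findall(text)
```

Callers can then check for an empty list instead of assuming at least one URL was extracted.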

* core: Tweak rate limiting to be more effective

This doesn't solve the issue, but it should make it slightly less
critical. sopel-irc#952

* reddit: fetch posts by submission_id

Previously, the reddit module fetched posts by the full URL of the post.
This led to RedirectExceptions in some cases, for example when someone
links a naked reddit.com URL instead of www.reddit.com.  Instead, match
only the post ID and pass it to get_submission.
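
Extracting only the post ID might look like the sketch below. The regex and the `submission_id` helper are assumptions for illustration; the commit's actual pattern is not shown in this thread.

```python
import re

# Hypothetical pattern: capture only the post ID, with or without "www.".
POST_RE = re.compile(
    r'https?://(?:www\.)?reddit\.com/r/[^/\s]+/comments/([\w-]+)'
)

def submission_id(url):
    """Return the post ID from a reddit comments URL, or None."""
    match = POST_RE.search(url)
    return match.group(1) if match else None
```

The ID alone is the same for the naked and `www.` forms of a link, which is what makes it a more stable lookup key than the full URL.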

* Release 6.3.0

* Don't warn about non-UTF8 locales when running on Python 2

Python 2 doesn't change string behavior according to the locale env,
that's a py3 specific weirdness.

Also, reword the error message to better explain the issue to the user.

* Fix apparent typo in host_blocks initialization

* Added missing import to xkcd.py

* core: Fix print in handle_error when reaching the exception limit

Fixes issue sopel-irc#1025

* trigger: Fix target for QUIT events

This was fun to debug! Basically, Sopel encountered an exception
when removing unknown users when they QUIT. While this shouldn't
happen, it should still be handled gracefully.

Since it was an exception, Sopel's response was to try and send
the exception line to the channel (sender) the message came from, but
QUIT events don't come from channels (or users by PRIVMSG)!

Since QUIT was not special-cased, the naive assumption that
the first argument is the "sender" was used, and when Sopel
tried to send the exception line to the "sender", and the sender
had a space in it, this would lead to spam if a user exists with a nick
that is identical to the first word in the QUIT message. Ouch.

The fix special-cases QUIT in pretrigger to never have a "sender".
I also added a test to make sure we parse QUIT correctly.

This solves issue sopel-irc#1026.

* core: Never try to send an exception line when sender is None

Just in case.

* Make sure we're working with UTF8 string

Depending on the URL, response.iter_content() in Python 2/3 will return either:
- `type 'unicode'`/`class 'str'` (for "plain" HTML)
- `type 'str'`/`class 'bytes'` (for binary streams, like file URLs)

To distinguish between the two situations we're checking if we got string or
bytes, and proceed accordingly.

This also fixes sopel-irc#1021

* [doc] unify grammatical number of `@commands` example

It seems there is no `@command` decorator. 

At any rate, examples for `@commands` should use the same (and correct) grammatical number.

* Release 6.3.1

* FIX: Private BZ's - AttributeError: 'NoneType'

* [calc] Remove .wa, as API now requires a key

Will be moved to an external module that supports the new API

* search: Remove ad URL results

DDG changed their HTML output slightly and that threw us off, this *should* fix the r.search.yahoo.com URLs that .g was returning.

* Fix issue sopel-irc#1048

* fix .set command for non filename attributes

* Fix loading/reloading modules that share the name of the bot owner

* Typo correction

deamon -> daemon

Squashed into a single commit.

* Fix config loading in some edge cases

Fixes sopel-irc#999

Usually where try/except wouldn't catch NoOptionError, happens when running tests in specific environments.

* add groupdict function to triggers (sopel-irc#1061)

* Add IRCv3 extended-join tests

* Add regular join test

* Replace e.message with str(e), e.message has been deprecated since python 2.6
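
For reference, the portable form looks like this. The exception and message are made up for the example.

```python
try:
    raise ValueError('bad input')
except ValueError as e:
    # str(e) works on both Python 2 and Python 3, whereas the
    # `message` attribute no longer exists on Python 3 exceptions.
    detail = str(e)
```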

* Fix nickname examples

help_prefix shouldn't replace the first character if it's a nickname example.

* Fix syntax error

* weather: fix location yql query

Resolves sopel-irc#1050 and sopel-irc#1029

* Implement proper extended-join support

* search: tweak ad result blocking

A slight regex change to avoid yahoo ad results from duck duck go if it ends up using the HTML search

* irc: toggle error replies (sopel-irc#1071)

Adds config option to toggle Sopel replying directly to the error source.

* irc: always send exceptions to logger

We don't really need to check if `trigger.sender` is `None` when we're sending to the logger -- as long as `trigger` is defined, we'll be fine.

This just ensures that the `logging_channel` will always get the exception messages. Also pre-formats the message using format because it's more clear what's going on this way.

* Create suppress-warnings.py

Can be dropped into ~/.ipython/profile_default/startup/ to suppress the DeprecationWarnings you get when starting Sopel with iPython enabled

* run_script: if argv is specified, use it

* [announce] Confirm when all announces have been sent (sopel-irc#1044)

* Add global and channel rate limits (sopel-irc#1065)

* Add global and channel rate limits

* Default user rate and compatibility with jenni modules

* Fix critical keyerror bug in rate limiting

* Simplify syntax for `@rate()` decorator and update docs

* Don't reset function timer during cooldown

* fix channel time diff variable

* fix indentation in bot.py

* weather: catch empty forecast results (sopel-irc#1077)

e.g. when the user enters a continent for the location.

* irc: treat error in connect as a disconnect (sopel-irc#845)

* irc: test suite enhancement

Comes with some tweaks to support tests
Daemonizes the ping and timeout threads (they should have been in the first place)

* coretasks: prevent KeyError when untracked user leaves

Fixes sopel-irc#1005

* web: fix header bleed (sopel-irc#1092)

Resolves sopel-irc#1091

* seen: be a smart-ass if people ask the bot about itself (sopel-irc#1086)

* module: ignore privilege requirement in privmsg (sopel-irc#1093)

Resolves sopel-irc#1087

* run_script: fix PID file checking logic when the file is empty

This fixes issue sopel-irc#1075

I don't know why the elif explicitly negated the previous condition, it's
obviously not needed because else if already implies the previous
condition is False.

Also, whoever added the parentheses there messed up the logic even further,
before they were there, it worked okay, even if the condition was a bit
more verbose than logically needed. Well, that's what you get when you
blindly try to make code conform to PEP8 without actually reading it.

* unicode_info: fallback if input is None (sopel-irc#958)

Resolves sopel-irc#957

* db: raise ValueError in unalias_nick to match documentation (sopel-irc#1102)

Documentation says that a ValueError should be raised if there is not at least one other nick in the group.

Resolves sopel-irc#1101

* Update .gitignore (sopel-irc#1110)

Renamed willie references to sopel
Added .DS_Store ignore

* coretasks: tweak topic tracking (sopel-irc#1111)

Support different implementation of topic update, RPL_TOPIC appears to only be sent to the user who actually updated the topic.

Resolves sopel-irc#1107

* meetbot/url: fix SSLError

The core.verify_ssl was not passed to url.find_title(), resulting in SSL errors on sites with invalid certs when `verify_ssl = False`

Slight refactor of @psachin's original code for backwards compatibility. Resolves sopel-irc#1113

* coretasks: add support for authentication on Quakenet (sopel-irc#1122)

Added the necessary lines for authenticating Sopel with Q. The implementation is almost exactly like AuthServ's. Added Q to core_section along with the other authentication methods, since it is now supported.

* coretasks: remove .lower() on auth_method

auth_method may be None if it's unset, forgot about that case when merging.

Resolves sopel-irc#1124

* setup: tweak requirements

Remove unsupported requires statement in setup.py
Pin requests dependency to 2.10.0 as 2.11.0 introduced a breaking change against the url.py module

* sopel/trigger.py: fix intent_regex

* url: make find_title more robust

Previously, each 512-byte chunk is prone to decoding mishap when a UTF-8 sequence is incomplete. Now we decode all of content at once, ignoring errors.

The old problem appears reliably for pages with many high codepoints:

~~~
<user> http://www3.nhk.or.jp/news/easy/k10010665021000/k10010665021000.html
<bot-old> [ NEWS WEB EASY|������������人���� ] - www3.nhk.or.jp
~~~
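
The failure mode can be reproduced without any network access. In this sketch the sample title and the 16-byte chunk size are chosen for illustration (the commit mentions 512-byte chunks), and both function names are hypothetical.

```python
# A title containing multi-byte UTF-8 characters (3 bytes each).
text = u'NEWS WEB EASY|\u65e5\u672c\u8a9e'
data = text.encode('utf-8')

def decode_per_chunk(data, size):
    """Old approach: decode each fixed-size chunk independently."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return u''.join(c.decode('utf-8', errors='ignore') for c in chunks)

def decode_whole(data):
    """New approach: decode all of the content at once."""
    return data.decode('utf-8', errors='ignore')

# With 16-byte chunks the first 3-byte character straddles a chunk
# boundary, so its bytes are dropped from both neighbouring chunks.
damaged = decode_per_chunk(data, 16)
intact = decode_whole(data)
```

With `errors='ignore'` the split character simply disappears; with `errors='replace'` it would show up as replacement characters instead, similar to the output quoted above.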

* [reddit] Change NSFW tag to SPOILERS for some subs

Hard-coded rather than configured, since in theory the same list should
apply to everyone, and we should merge in new ones. That and effort.

* setup: Be more flexible about requests version

* Release 6.4.0

* Notify if Bugzilla is private (sopel-irc#1115)

Although the primary error no longer exists, the bot shows nothing if
the Bugzilla bug has an invalid alias or an invalid ID, or if the bot has no
valid permission to access the bug. The logs should show warnings such as,

  WARNING:sopel.modules.bugzilla:Bugzilla error: NotFound <- (Invalid ID)
  WARNING:sopel.modules.bugzilla:Bugzilla error: NotPermitted <- (No permission)

This patch should notify about those errors.

Closes-Bug: sopel-irc#1112

Signed-off-by: Sachin Patil <psachin@redhat.com>

* Lint imports (sopel-irc#1085)

After realizing I'd left a dead import in calc.py after removing the .wa command,
I decided to go through and clean up any other imports that didn't appear to be in
use any more.

* Fixed missing `verify_ssl` param

- `verify_ssl` param was missing in few function calls

Closes-Bug: sopel-irc#1118

Signed-off-by: Sachin Patil <psachin@redhat.com>

* Add a decorator for url handling

Closes sopel-irc#761. Also add xkcd url handling as a demo.

* Add docstring for url decorator

* Create a gist with the command list

Closes sopel-irc#1105
Closes sopel-irc#1080

* Cache help gist location

* Use custom user agent for title requests

* Add Travis badge

* Increase timeout for DB locked error

This doesn't fix sopel-irc#736, but should at least make it less common

* Fix CI

* Release 6.5.0

* [weather] Use help_prefix in hint text when no location given

* Add a pronouns module

If witch-house/pronoun.is#40 gets merged, it's
probably worth porting to use that, since there are a *lot* of pronoun
sets.

Yes, this should probably support other languages. Sopel's i18n is
horrible and I know it.

* Fix asking for another user's pronouns

* Be a bit less snarky when asked for the bot's pronouns

But only a little

* [etymology] unescape all known HTML entities

Replace bespoke implementation of unescape() with stdlib tools; fix sopel-irc#1153.

* Fixes for pronouns.py

Fixes setpronouns error on lack of trigger.group(2), fixes autocomplete of nicks with a space so that it's stripped out automatically, fixes that it will say the wrong username if you request someone else's pronouns.

* Fixes ConnectionError

url.find_title() throws ConnectionError when the hostname/IP address is not
reachable, and therefore fails to read the title

Sample error
```
15:11:05 psachin:     https://10.65.177.15
15:11:09 BB-8:        requests.exceptions.ConnectionError: HTTPSConnectionPool(host='10.65.177.15', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7feb41b6a2e8>: Failed to establish a new connection: [Errno 113] No route to host',)) (file "/home/tss/virtualenvs/sopel/lib64/python3.5/site-packages/requests/adapters.py", line 487, in send)
```

* Update web.py

fix blank User-Agent, if a custom user-agent is set for web.get()

* Use a common user-agent to get the proper results from DDG

* Update search.py

Hmm maybe single quotes would be better.

* fix some missed stuff

* Missed one more header copy.  Hopefully last one.

* Fix typo

* Remove duplicate item in triggerable tupe check

* Exclude File links from regex matching

Fix sopel-irc#1182

* Added default value to numbered_result

Added missing default value of "True" for "verify_ssl" parameter on "number_result".

* IP example

Fixed broken IP module example

* Fix API urls for Bank of Canada and BitcoinAverage

* Upper/lowercase shouldn't matter for tell module

* Release 6.5.1

* weather: update from deprecated sopel.web to requests

* safety module - catch exception on urllib/parse

* Fix reddit module

* Actually fix reddit

* [ip] Fix example/test (Google Inc. => Google LLC)

Google changed to an LLC, and updated its AS information, which broke
the test assertion.

Changes cherry-picked from sopel-irc#1250 and reworded.

* Update ignored files for tests

Ignore movie.py module because it requires an API key (and will probably
be moved out of core anyway).

Fix ignores for entry script and ipython module (which were still using
the old "willie" name and therefore weren't ignored). This also allows
removing the command-line ignore from the Travis build script.
kwaaak added a commit to kwaaak/sopel that referenced this issue Mar 23, 2018
Delete ip.py

ip.py

update2 (#2)

* update movie.py to use omdbapi as imdbapi is non-functional

* minor cosmetic change

* Resolve sopel-irc#926

Also cleaned up dict value retrieval a little, the .get() calls were a bit unnecessary.

* Remove feedparser dep from requirements

* Remove feedparser from RPM spec

* Make CAD currency code case-insensitive

* Resolve sopel-irc#929

Ensures that FilenameAttribute's parse and serialize always have the parameters they need.

* [isup] fix bad indexes preventing bot from recognizing http protocols

`'http://'` is 7 characters long, so `site[:6]` can't ever match it. ditto for `'https://'` and `site[:7]`

* Switch back to get() calls

There's some places where there were already try catch blocks for KeyError, so I left those ones.

* core: Fix issues with reloading folder modules

Closes sopel-irc#899, obviates and closes sopel-irc#932.

* Switch from *args to named args.

* Properly fetch the xml before passing it to xmltodict

I'm a moron.

* Release 6.1.0

* Use flake8 for future checking, and add missing ones

* Fix coding declaration in a few places it was still wrong

Again, fuck windows. sopel-irc#821

* README: Update sopel package location in Arch

* Fix crash during configure

Reported by aam in IRC, http://pastebin.com/0rAQb7Kp

* Fix TypeError: 'NoneType' object is not iterable

Happened when an invalid language hint abbreviation was given.

* Require a valid phrase to translate

* [meetbot] fix crash when starting meeting

Resolves sopel-irc#942

* Release 6.1.1

* Require chanmsg for all adminchannel commands

* Fix sopel-irc#945

We should only get `socket.gaierror` if the user passes an invalid IP or hostname, but it could be in the event that we don't have DNS configured on the machine or a multitude of other things, so we can be somewhat vague in the error message.

* Fix sopel-irc#860

Should resolve the inconsistencies between the implementation of unquote in python 2 and python 3.

* [Formatting]Gray is the same as Grey. Which spelling is correct is a grey area.

* Updated link for help command

was a bad link, updated to correct one

* [countdown] fix remaining time calculation

* [docs] Rename

* Add tracking of users and their accounts

This uses some of the code that @maxpowa wrote for sopel-irc#941, but gives a
somewhat more intuitive API. It also paves the way for potentially
adding direct support for away-notify and metadata-notify.

* rename references to willie in systemd service file

* update project url in comments

* update project url in comment

* update project name in CONTRIBUTING.md (includes log files location, issue tracking url)

* Update github URL to organization, Sopel-IRC, rather than Embolalia

* [dice] Handle negative numbers and uppercase letters.

First, add case-insensitivity to the "d" and "v" letters in the input
regex by capturing [dD] and [vV] in regexes.

Next, make sure we can handle negative numbers properly by capturing
a possible "-" in front of any numbers. If dice_num or dice_type ends
up negative, abort _roll_dice early (dice_type already had a negative
check that wasn't being hit because "-" wasn't captured).

* [dice] Handle negative numbers and uppercase letters.

First, add case-insensitivity to the "d" and "v" letters in the input
regex by capturing [dD] and [vV] in regexes.

Next, make sure we can handle negative numbers properly by capturing
a possible "-" in front of any numbers. If dice_num or dice_type ends
up negative, abort _roll_dice early (dice_type already had a negative
check that wasn't being hit because "-" wasn't captured).

* [dice] Also handle negative drop_lowest values.

* Make BTC currency code case-insensitive

* Address comments on PR sopel-irc#961

This includes being more consistent about using pop rather than del to
prevent key errors, and adding some locking around the privilege related
things

* Add user tracking for RFC WHO replies

* Add away tracking

* Add support for account-tag

* Enable the account-tag capability

* Replace channels list with channels dictionary

Hopefully, nobody else is taking advantage of channels being a list,
rather than a dict. If they are, well, oops.

* Add enumeration of IRC events

Closes sopel-irc#960

* Add cap-notify support

See sopel-irc#971

* coretasks: Replace numeric events with their enums

Also add the missing RPL_WHOSPCRPL to tools.events

* [contrib] rename & edit willie out of contrib

Fixes sopel-irc#963

* Huge cleanup of copyright headers and docstrings

They're still super inconsistent and probably a lot are out of date, but
at least there won't be random copyright info showing up in the docs
anymore. Oh, and my domain and name are correct now, too…

* [currency] Make arguments case-insensitive (close sopel-irc#979)

* Fix URL excludes loading (sopel-irc#959)

setup uses it as a list, and in previous versions of sopel it was a list, but in the UrlSection it's defined as a ValidatedAttribute. This was causing each character in the excludes list to be parsed as a regex exclude. Switching to ListAttribute fixes the issue.

* [find_updates] Fix missing RPL_ prefix

* [bot] Fix misleading message

Coretasks is only one module, so if you loaded it and only one other, you'd get the "Couldn't load any modules" warning, even though there was a module loaded.

* [trigger] IRCv3 server-time support

* [tests] Create more trigger test cases

* [tests] Create module tests

* [tests] DB test cleanup

* [tests] Add formatting tests

* [tests] Fix coding declaration

* [tests] Update .travis.yml

* [translate] Fix bad test

* Support using account name for auth

* [tests] Remove pep8 dev-requirement

Also fixed the critical error where py.test thought sopel.py (the entry script) was the sopel package.

* [pep8] Minor clean up to conform with pep8

* [web] Ensure ca_certs is defined

When running tests it's not defined at all, because the only place defining it was the config loader code. Now it's defined, but will still fail out since it's not a valid CA certificate file.

* [pep8] Final pep8 run

Everything should now conform to pep8 and pass flake8.

* [tests] Improve trigger coverage

* [reddit] Prevent specific commands in PM

Resolves sopel-irc#789

* Documentation for target types

* docs/core: Shift around functions to make autodoccing easier

* docs: Start on a major cleanup of documentation

* Make _ssl_recv always return bytes

_ssl_recv returned empty strings instead of the empty bytes object if
the socket was closed or upon ENOENT.

This led to exceptions when running sopel with python3 because
asynchat.handle_read expects byte objects.

This commit fixes sopel-irc#937.

* docs: Majorly overhaul organization and format

* Release 6.2.0

* Release 6.2.1

* dice: Allow comma delimiter

Closes sopel-irc#998

* adminchannel: Remove totally useless commands

Also add error messages to the somewhat useful ones

* Make it a bit harder to run into the LC_ALL thing

This behavior is stupid. Respecting LC_ALL, or anything else for that
matter, over the encoding fucking noted in the fucking file is a bad
decision, and someone should feel bad. I don't know why it makes things
break in the specific and bizarre way it does, but it does, and there's
no possible good reason for it.

Closes sopel-irc#984. Fuck.

* Escape nick before replacing it in regex

Resolves sopel-irc#1004
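
A small sketch of why the escape matters, assuming a rule template with a `$nickname` placeholder; `nick_pattern` is a hypothetical helper, not the function changed in this commit. Without `re.escape`, a nick such as `bot[away]` would be read as a character class.

```python
import re


def nick_pattern(template, nick):
    """Hypothetical helper: substitute an escaped nick into a rule template."""
    # re.escape() neutralises regex metacharacters such as [ ] | ^ in the nick.
    return re.compile(template.replace('$nickname', re.escape(nick)))


rule = nick_pattern(r'^$nickname[:,]\s+hello', 'bot[away]')
matched = rule.match('bot[away]: hello') is not None
```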

* [weather] Fix YQL woeid lookup

Handles an edge case that neither of the PRs handled

Fixes sopel-irc#1006
Closes sopel-irc#1007, sopel-irc#1012

* CONTRIBUTING: Let's drop the [brackets] thing and do what everyone else does

No point in being different from any other FOSS project out there.

* contrib/rpm: willie->sopel

* CONTRIBUTING: Update coding/future import guidelines

`# coding=utf-8` is now the standard in Sopel & supports windows. The future import now conforms with the flake8 future imports (also conforms with @embolalia's formatting passes a bit back)

* web: make web.py into a requests compatibility layer

Since web is deprecated and everyone should switch to requests,
the first step is to make web.py a requests compatibility layer.

When web.py was new, requests was not ripe enough to use in Sopel.

But now it's time to switch to requests like the rest of the python
ecosystem. web.py is no more.

* find_updates: switch from web.py to requests

* web: Fix typo

This is why we have flake8 ;)

* url: port to requests

Now that web.py is deprecated, we can port url.py to requests.

Originally from pull request sopel-irc#988, committed here with minor bugfixes
and modified commit message.

* translate: port to requests

* movie: port to requests

* xkcd: port to requests

* Track the channel topic

* Improve locale stupidity checking

Thanks to @elad661's comment on b73fc6a

* url: handle capitalized URLs

Trigger rules are case-insensitive regexes, so the auto title responder
will be triggered even for capitalized URLs such as "Http://google.com"
(which can happen, for example, when a mobile device attempts to
auto-capitalize the beginning of a sentence).  Match URLs case
insensitively for title lookup purposes and add error handling in case
no URLs could be extracted from the match.
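
Roughly, the behaviour described above can be sketched as follows. `URL_RE` is a deliberately simplified, hypothetical pattern and `extract_urls` is an illustrative name; the module's real expression is more involved.

```python
import re

# Hypothetical, simplified URL pattern; IGNORECASE lets "Http://" match too.
URL_RE = re.compile(r'(https?://\S+)', re.IGNORECASE)


def extract_urls(text):
    """Return every URL-looking token, or an empty list when none match."""
    return URL_RE.findall(text)


urls = extract_urls('Http://google.com is what my phone typed')
if not urls:
    # Error handling path: nothing to look up, so bail out quietly.
    urls = []
```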

* core: Tweak rate limiting to be more effective

This doesn't solve the issue, but it should make it slightly less
critical. sopel-irc#952

* reddit: fetch posts by submission_id

Previously, the reddit module fetched posts by the full URL of the post.
This led to RedirectExceptions in some cases, for example when someone
links a naked reddit.com URL instead of www.reddit.com.  Instead, match
only the post ID and pass it to get_submission.

* Release 6.3.0

* Don't warn about non-UTF8 locales when running on Python 2

Python 2 doesn't change string behavior according to the locale env,
that's a py3 specific weirdness.

Also, reword the error message to better explain the issue to the user.

* Fix apparent typo in host_blocks initialization

* Added missing import to xkcd.py

* core: Fix print in handle_error when reaching the exception limit

Fixes issue sopel-irc#1025

* trigger: Fix target for QUIT events

This was fun to debug! Basically, Sopel encountered an exception
when removing unknown users when they QUIT. While this shouldn't
happen, it should still be handled gracefully.

Since it was an exception, Sopel's response was to try and send
the exception line to the channel (sender) the message came from, but
QUIT events don't come from channels (or users by PRIVMSG)!

Since QUIT was not special-cased, the naive assumption that
the first argument is the "sender" was used. When Sopel tried to send
the exception line to that "sender" and it had a space in it, this
would lead to spam if a user exists with a nick that is identical to
the first word in the QUIT message. Ouch.

The fix special-cases QUIT in pretrigger to never have a "sender".
I also added a test to make sure we parse QUIT correctly.

This solves issue sopel-irc#1026.
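
The special case can be sketched like this. `guess_sender` is a hypothetical helper written for illustration; it is not the pretrigger code itself.

```python
def guess_sender(event, args):
    """Hypothetical helper: pick the reply target for an IRC event."""
    if event == 'QUIT':
        # QUIT carries only a quit message, never a channel or nick target.
        return None
    # Naive default: the first argument is the target.
    return args[0] if args else None


quit_sender = guess_sender('QUIT', ['Ping timeout: 240 seconds'])
msg_sender = guess_sender('PRIVMSG', ['#sopel', 'hello'])
```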

* core: Never try to send an exception line when sender is None

Just in case.

* Make sure we're working with UTF8 string

Depending on the URL, response.iter_content() in Python 2/3 will return either:
- `type 'unicode'`/`class 'str'` (for "plain" HTML)
- `type 'str'`/`class 'bytes'` (for binary streams, like file URLs)

To distinguish between the two situations we're checking if we got string or
bytes, and proceed accordingly.

This also fixes sopel-irc#1021
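
A minimal sketch of the check, assuming Python 3 semantics; `to_text` is a hypothetical name and the UTF-8 fallback is an assumption, not necessarily what the patch does.

```python
def to_text(chunk, encoding='utf-8'):
    """Hypothetical helper: return text whether given bytes or text."""
    if isinstance(chunk, bytes):
        try:
            # errors='ignore' silently drops undecodable bytes.
            return chunk.decode(encoding, errors='ignore')
        except LookupError:
            # Unknown encoding name: fall back to UTF-8 (an assumption).
            return chunk.decode('utf-8', errors='ignore')
    # Already text, nothing to decode.
    return chunk
```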

* [doc] unify grammatical number of `@commands` example

It seems there is no `@command` decorator.

At any rate, examples for `@commands` should use the same (and correct) grammatical number.

* Release 6.3.1

* FIX: Private BZ's - AttributeError: 'NoneType'

* [calc] Remove .wa, as API now requires a key

Will be moved to an external module that supports the new API

* search: Remove ad URL results

DDG changed their HTML output slightly and that threw us off, this *should* fix the r.search.yahoo.com URLs that .g was returning.

* Fix issue sopel-irc#1048

* fix .set command for non filename attributes

* Fix loading/reloading modules that share the name of the bot owner

* Typo correction

deamon -> daemon

Squashed into a single commit.

* Fix config loading in some edge cases

Fixes sopel-irc#999

Usually where try/except wouldn't catch NoOptionError, happens when running tests in specific environments.

* add groupdict function to triggers (sopel-irc#1061)

* Add IRCv3 extended-join tests

* Add regular join test

* Replace e.message with str(e), e.message has been deprecated since python 2.6
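
For illustration only (the `int()` call is just a convenient way to raise an exception):

```python
try:
    int('not a number')
except ValueError as e:
    # str(e) works on Python 2 and 3; e.message does not exist on Python 3.
    message = str(e)
```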

* Fix nickname examples

help_prefix shouldn't replace the first character if it's a nickname example.

* Fix syntax error

* weather: fix location yql query

Resolves sopel-irc#1050 and sopel-irc#1029

* Implement proper extended-join support

* search: tweak ad result blocking

A slight regex change to avoid yahoo ad results from duck duck go if it ends up using the HTML search

* irc: toggle error replies (sopel-irc#1071)

Adds config option to toggle Sopel replying directly to the error source.

* irc: always send exceptions to logger

We don't really need to check if `trigger.sender` is `None` when we're sending to the logger -- as long as `trigger` is defined, we'll be fine.

This just ensures that the `logging_channel` will always get the exception messages. Also pre-formats the message using format because it's more clear what's going on this way.

* Create suppress-warnings.py

Can be dropped into ~/.ipython/profile_default/startup/ to suppress the DeprecationWarnings you get when starting Sopel with iPython enabled

* run_script: if argv is specified, use it

* [announce] Confirm when all announces have been sent (sopel-irc#1044)

* Add global and channel rate limits (sopel-irc#1065)

* Add global and channel rate limits

* Default user rate and compatibility with jenni modules

* Fix critical keyerror bug in rate limiting

* Simplify syntax for @rate() decorator and update docs

* Don't reset function timer during cooldown

* fix channel time diff variable

* fix indentation in bot.py

* weather: catch empty forecast results (sopel-irc#1077)

e.g. when the user enters a continent for the location.

* irc: treat error in connect as a disconnect (sopel-irc#845)

* irc: test suite enhancement

Comes with some tweaks to support tests
Daemonizes the ping and timeout threads (they should have been in the first place)

* coretasks: prevent KeyError when untracked user leaves

Fixes sopel-irc#1005

* web: fix header bleed (sopel-irc#1092)

Resolves sopel-irc#1091

* seen: be a smart-ass if people ask the bot about itself (sopel-irc#1086)

* module: ignore privilege requirement in privmsg (sopel-irc#1093)

Resolves sopel-irc#1087

* run_script: fix PID file checking logic when the file is empty

This fixes issue sopel-irc#1075

I don't know why the elif explicitly negated the previous condition; it's
obviously not needed because else if already implies the previous
condition is False.

Also, whoever added the parentheses there messed up the logic even further.
Before they were there, it worked okay, even if the condition was a bit
more verbose than logically needed. Well, that's what you get when you
blindly try to make code conform to PEP8 without actually reading it.

* unicode_info: fallback if input is None (sopel-irc#958)

Resolves sopel-irc#957

* db: raise ValueError in unalias_nick to match documentation (sopel-irc#1102)

Documentation says that a ValueError should be raised if there is not at least one other nick in the group.

Resolves sopel-irc#1101

* Update .gitignore (sopel-irc#1110)

Renamed willie references to sopel
Added .DS_Store ignore

* coretasks: tweak topic tracking (sopel-irc#1111)

Support different implementation of topic update, RPL_TOPIC appears to only be sent to the user who actually updated the topic.

Resolves sopel-irc#1107

* meetbot/url: fix SSLError

The core.verify_ssl was not passed to url.find_title(), resulting in SSL errors on sites with invalid certs when `verify_ssl = False`

Slight refactor of @psachin's original code for backwards compatibility. Resolves sopel-irc#1113

* coretasks: add support for authentication on Quakenet (sopel-irc#1122)

Added the necessary lines for authenticating Sopel with Q. The implementation is almost exactly like AuthServ's. Added Q to core_section along with the other authentication methods, since it is now supported.

* coretasks: remove .lower() on auth_method

auth_method may be None if it's unset, forgot about that case when merging.

Resolves sopel-irc#1124

* setup: tweak requirements

Remove unsupported requires statement in setup.py
Pin requests dependency to 2.10.0 as 2.11.0 introduced a breaking change against the url.py module

* sopel/trigger.py: fix intent_regex

* url: make find_title more robust

Previously, each 512-byte chunk was prone to decoding mishaps when a UTF-8 sequence was split across a chunk boundary. Now we decode all of the content at once, ignoring errors.

The old problem appears reliably for pages with many high codepoints:

~~~
<user> http://www3.nhk.or.jp/news/easy/k10010665021000/k10010665021000.html
<bot-old> [ NEWS WEB EASY|������������人���� ] - www3.nhk.or.jp
~~~
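
The difference can be reproduced with a short sketch. The sample text and both function names are hypothetical; only the 512-byte chunk size comes from the description above.

```python
# 8 characters of 3 bytes each, repeated: chunk boundaries fall mid-character.
text = u'日本語のニュース' * 100
raw = text.encode('utf-8')


def decode_per_chunk(data, size=512):
    """Old approach: decode each chunk separately, so split sequences break."""
    pieces = []
    for start in range(0, len(data), size):
        pieces.append(data[start:start + size].decode('utf-8', errors='replace'))
    return u''.join(pieces)


def decode_whole(data):
    """New approach: decode everything at once, ignoring errors."""
    return data.decode('utf-8', errors='ignore')


broken = decode_per_chunk(raw)
fixed = decode_whole(raw)
```

Here `broken` contains U+FFFD replacement characters wherever a character straddled two chunks, while `fixed` round-trips to the original text.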

* [reddit] Change NSFW tag to SPOILERS for some subs

Hard-coded rather than configured, since in theory the same list should
apply to everyone, and we should merge in new ones. That and effort.

* setup: Be more flexible about requests version

* Release 6.4.0

* Notify if Bugzilla is private (sopel-irc#1115)

Although the primary error no longer exists, the bot shows nothing if
the bug has an invalid alias or invalid ID, or if the bot has no
permission to access the bug. The logs should show warnings such as:

  WARNING:sopel.modules.bugzilla:Bugzilla error: NotFound <- (Invalid ID)
  WARNING:sopel.modules.bugzilla:Bugzilla error: NotPermitted <- (No permission)

This patch should notify about those errors.

Closes-Bug: sopel-irc#1112

Signed-off-by: Sachin Patil <psachin@redhat.com>

* Lint imports (sopel-irc#1085)

After realizing I'd left a dead import in calc.py after removing the .wa command,
I decided to go through and clean up any other imports that didn't appear to be in
use any more.

* Fixed missing `verify_ssl` param

- `verify_ssl` param was missing in a few function calls

Closes-Bug: sopel-irc#1118

Signed-off-by: Sachin Patil <psachin@redhat.com>

* Add a decorator for url handling

Closes sopel-irc#761. Also add xkcd url handling as a demo.

* Add docstring for url decorator

* Create a gist with the command list

Closes sopel-irc#1105
Closes sopel-irc#1080

* Cache help gist location

* Use custom user agent for title requests

* Add Travis badge

* Increase timeout for DB locked error

This doesn't fix sopel-irc#736, but should at least make it less common

* Fix CI

* Release 6.5.0

* [weather] Use help_prefix in hint text when no location given

* Add a pronouns module

If witch-house/pronoun.is#40 gets merged, it's
probably worth porting to use that, since there are a *lot* of pronoun
sets.

Yes, this should probably support other languages. Sopel's i18n is
horrible and I know it.

* Fix asking for another user's pronouns

* Be a bit less snarky when asked for the bot's pronouns

But only a little

* [etymology] unescape all known HTML entities

Replace bespoke implementation of unescape() with stdlib tools; fix sopel-irc#1153.
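
As an illustration of what the stdlib offers on Python 3 (which exact stdlib call the module ends up using is an assumption here):

```python
from html import unescape

# Named, numeric, and quote entities are all handled by one call.
decoded = unescape('caf&eacute; &amp; &#8212; &quot;word&quot;')
```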

* Fixes for pronouns.py

Fixes the setpronouns error on a missing trigger.group(2), handles autocompleted nicks with a space by stripping it automatically, and fixes the wrong username being reported when you request someone else's pronouns.

* Fixes ConnectionError

url.find_title() throws ConnectionError when the hostname/IP address is not
reachable, and therefore fails to read the title.

Sample error:
```
15:11:05 psachin:     https://10.65.177.15
15:11:09 BB-8:        requests.exceptions.ConnectionError: HTTPSConnectionPool(host='10.65.177.15', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7feb41b6a2e8>: Failed to establish a new connection: [Errno 113] No route to host',)) (file "/home/tss/virtualenvs/sopel/lib64/python3.5/site-packages/requests/adapters.py", line 487, in send)
```

* Update web.py

fix blank User-Agent, if a custom user-agent is set for web.get()

* Use a common user-agent to get the proper results from DDG

* Update search.py

Hmm maybe single quotes would be better.

* fix some missed stuff

* Missed one more header copy.  Hopefully last one.

* Fix typo

* Remove duplicate item in triggerable tupe check

* Exclude File links from regex matching

Fix sopel-irc#1182

* Added default value to numbered_result

Added missing default value of "True" for "verify_ssl" parameter on "number_result".

* IP example

Fixed broken IP module example

* Fix API urls for Bank of Canada and BitcoinAverage

* Upper/lowercase shouldn't matter for tell module

* Release 6.5.1

* weather: update from deprecated sopel.web to requests

* safety module - catch exception on urllib/parse

* Fix reddit module

* Actually fix reddit

* [ip] Fix example/test (Google Inc. => Google LLC)

Google changed to an LLC, and updated its AS information, which broke
the test assertion.

Changes cherry-picked from sopel-irc#1250 and reworded.

* Update ignored files for tests

Ignore movie.py module because it requires an API key (and will probably
be moved out of core anyway).

Fix ignores for entry script and ipython module (which were still using
the old "willie" name and therefore weren't ignored). This also allows
removing the command-line ignore from the Travis build script.