Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asyncio Client improvements #146

Open
1 of 12 tasks
zinsserzh opened this issue Jun 26, 2018 · 9 comments
Open
1 of 12 tasks

Asyncio Client improvements #146

zinsserzh opened this issue Jun 26, 2018 · 9 comments

Comments

@zinsserzh
Copy link

zinsserzh commented Jun 26, 2018

After thoroughly examining the module, I found multiple improvements could/should be done to asynchronous code. I'd like to open a discussion before actually implementing the code because some changes will break the current API.

  • (breaks API) AioSimpleIRCClient currently cannot be created and started once event loop has started and running asynchronous code, because AioConnection.connect calls loop.run_until_complete. In the situation described above, a RuntimeError: This event loop is already running exception is raised. AioConnection.connect should be a coroutine. Waiting for it's finishing should be caller's responsibility. Changing this function to a coroutine definitely breaks the current API.
  • Connection reset by peer should be handled.
  • SSL should be supported. Current async code does not support ssl. connection.Factory should be adapted to async code as well.
  • Circular reference between AioConnection and its underlying protocol should be removed.
  • Async version of bot should be added.
  • Debug usage of print should be removed. print is used 3 times in client_aio.py. I assume the developer forgot to remove them.
  • Make the use of custom event loops easier. If one wants to use AioSimpleIRCClient with a custom event loop, subclassing both AioReactor and AioSimpleIRCClient is required. It is supposed to be as simple as calling AioSimpleIRCClient(loop=loop)
  • loop should not be accepted as a positional argument. For example, AioReactor.__init__ should be declared as

def __init__(self, on_connect=__do_nothing, on_disconnect=__do_nothing, *, loop=None)

  • asyncio.Lock should be used over threading.Rlock.
  • Async throttler
  • Required version check
  • Better tests.
@jaraco
Copy link
Owner

jaraco commented Jun 28, 2018

Those all seem like reasonable ideas. The AsyncIO support is fairly new, so as long as @MattBroach is comfortable with the changes, I'm happy to accept them (even if backward-breaking).

@zinsserzh
Copy link
Author

Sounds great. I'll go ahead and start the implementation. My own project needs it anyway :D

@zinsserzh
Copy link
Author

Do you mind if I also clean some code for the non-async part along the way if I happen to find any? I won't make big changes. If so I'd make a new PR. E.g. in connection.py, identity function could be replaced by lambda x: x.

@MattBroach
Copy link
Contributor

Just popping in to say this all sounds great to me. As @jaraco sad, the asyncio stuff was just added, so now is the best time for breaking changes. Thanks for the thorough review.

@zinsserzh
Copy link
Author

Guys, I start to feel that it's hard to maintain two versions in a single module. A lot of non-async code depends on raw sockets where async code uses protocol and transport extensively. Plus that non-async code needs to support both py2 and py3, which makes it harder to separate pieces and bits to implement py3-only code. I think it's easier to work on a fork and make another package that only supports async irc code and release separately. What do you think? I am willing to maintain the fork.

@jaraco
Copy link
Owner

jaraco commented Jul 1, 2018

Do you mind if I also clean some code for the non-async part along the way if I happen to find any?

In general, this sort of contribution should be handled as a separate PR to be accepted first. You can still implement your asyncio improvements on top of these incidental changes, but then they'll be dependent on the acceptance of those changes and might require resolving of merge conflicts if not accepted exactly as presented. But this project is happy to accept all sorts of changes as long as the concerns can be kept separate. If the changes are small and non-controversial, it's fine to implement those in the same PR, just use separate commits (same guidance, just more lightweight).

E.g. in connection.py, identity function could be replaced by lambda x: x.

You'll find the linter won't accept identity = lambda x: x (although that would have been my stylistic preference as well). You could move the change inline (i.e. wrapper=lambda x: x), but that conveys less information.

Honestly, if I were you, I'd try to look past these stylistic changes and focus on the semantic or structural changes. But again, happy to review any proposed change.

I think it's easier to work on a fork and make another package that only supports async irc code and release separately. What do you think?

A fork, especially in a different package/module, is just asking for trouble. It means creating two lines of development on a project that already has very little development.

I do see the appeal - that maintaining simple socket compatibility is hard - hence why we have something of a fork already with two implementations. I'd rather not increase the divergence with yet another package. It seems reasonable at first until we get the next bug report that affects the common parts of the two libraries - then we have to expend extra effort to ensure that both implementations get that functionality (or otherwise manage the increasing cognitive distance between the two projects).

Ideally, we could refactor the code in such a way that any event loop / IO handler could be chosen, and the rest of the functionality lies atop that, although I suspect that's easier said than done (if possible).

If a proper loop abstraction isn't possible or achievable, I'd be more inclined to drop support for Python 2 and a select loop, and rely entirely on async IO and Python 3 for the library (a backward-incompatible release).

So in that sense, I do support a fork, one in which the select-based functionality is left behind in a maintenance branch that's unlikely to get backports of functionality from the master. If you can bring the asyncio functionality to feature parity with the select loop, then I'm happy to accept that as the sole event loop for the library.

To the extent that you can, try to implement the changes with many minimal steps with stable commits.

Thanks!

@zinsserzh
Copy link
Author

zinsserzh commented Jul 2, 2018

How about this, I'll put all async code in a subfolder, i.e. irc/aio/ which has a similar directory structure as irc and start to refactor the code, separate shared code into base classes (BaseServerConnection, etc). For things that are hard to split, I'll copy some code from irc to irc/aio/ and modify it.

Btw some of the dependencies (jaraco/*) does not support async code as well (e.g. Throttler). How do you want me to deal with it? Add more things to your modules (which appear not to be open source) or implement it in another file in this particular repo?

Btw, I do think wrapper=lambda x: x makes things super clear as it tells the reader wrapper is supposed to be a function/method/closure and its default value is returning the socket untouched (with that in mind maybe wrapper=lambda sock: sock is even better. wrapper=identity requires me to check what is identity and what it does by searching the name in the file.

@jaraco
Copy link
Owner

jaraco commented Jul 2, 2018

I'll put all async code in a subfolder

That seems reasonable. I'd especially like it if there were symmetry with another package that served for non-async mode. For example, there's irc.aio and irc.select and each implements equivalent functionality. But that's not a prerequisite.

some dependencies do not support async code

Ugh. Well, I guess that's an unfortunate situation, but not surprising. Probably best would be to make those features compatible with async as well. I'd be happy to accept those improvements as well. I would like them to be contributed back to the canonical project, all of which are open source (if they're not, that's a bug).

I see Throttler is imported from jaraco.functools, found at GitHub.

If you wish to test with unreleased versions of jaraco.functools, it's easy to:

irc $ .tox/python/bin/pip install -e ../jaraco.functools

Then run tests or exercise the IRC library against the in-development version of jaraco.functools before committing/pushing/pull-requesting.

Btw, I do think wrapper=lambda x: x makes things super clear... wrapper=lambda sock: sock is even better.

Yes, I agree. I think I was planning to move identity to another library (as I've used that pattern in more than a dozen places). It's the kind of thing that I feel should be in the stdlib, like None only for functions. But since it's not, it has to be explicit throughout the various implementations.

That said, there is a tension with lambda. Many Python developers, especially those unfamiliar with functional programming paradigms, find lambda objectionable. I'm perfectly comfortable with it and my sensibilities align with yours here. Just be aware that others may come along later and wish to change it (back).

In any case, sounds like you're on the right track.

@jaraco
Copy link
Owner

jaraco commented Mar 29, 2019

Since this discussion began, this project has dropped support for Python 2 and the async implementation has expanded, so progress is being made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants