-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async versions of web3 apis #1055
Comments
Note previous conversations in #657 and #856 I wanted to think through alternatives for co-locating the logic for sync and async inputs. I was wondering if we could still do something functional. This isn't a working example, just a rough sketch: class Async_Eth(Module):
getUncleByBlock = async_caller(uncle_block_input_formatter, identity)
...
class Eth(Module):
getUncleByBlock = blocking_caller(uncle_block_input_formatter, identity)
...
#################
@curry
async def async_caller(input_formatter, output_formatter, module, *args, **kwargs):
method, params = input_formatter(module, *args, **kwargs)
result = await module.web3.manager.request_async(method, params)
return output_formatter(result)
@curry
def blocking_caller(input_formatter, output_formatter, module, *args, **kwargs):
method, params = input_formatter(module, *args, **kwargs)
result = module.web3.manager.request_blocking(method, params)
return output_formatter(result)
#################
def uncle_block_input_formatter(eth, block_identifier, uncle_index):
method = select_method_for_block_identifier(
block_identifier,
if_predefined='eth_getUncleByBlockNumberAndIndex',
if_hash='eth_getUncleByBlockHashAndIndex',
if_number='eth_getUncleByBlockNumberAndIndex',
)
return (method, [block_identifier, uncle_index]) So it might be possible to do something like that without introducing a lot of new classes or lines of code. |
Just realized we still need to figure out how to do middlewares.... |
@carver In your example, could we co-locate the modules if we allow the caller function to be configured? Is there anything else that needs to vary between the async and blocking modules? I have been wondering about the manager request function, the middlewares and the provider request functions.
|
I don't think we do this now so in theory I think it's not an async specific problem. |
Yeah, this should be deprecated and removed. Might be able to accelerate the removal as it's possible nobody is using it but I don't know how you'd verify that. |
Yeah, this part is likely to suck a bit. Best idea I have so far is to start by reducing the number of middlewares by moving most of the core middleware functionality into the Looking at the default list it appears we'll be able to remove all but For the remaining middlewares we can likely decompose them into sync/async versions striking a balance between readability and re-usable code. |
Something we should be careful about is maintaining the ability to support things like |
I suspect the ENS middleware would need to be duplicated async & sync, too. |
👍 for enforcing formatters and not making them optional. I've already described the problem with making the formatters optional in #1057. |
This may seem out of the blue, but I thought some suggestions and/or tooling might help out a bit in building the async stuff. Sorry I do not have time to write examples or do full technical summary. I will try to give more details as soon as possible if any of this looks promising to you gents. First, please feel free to cannibalize anything that may be useful from this repo: That project is essentially some backend code we use in house for various projects. It is a limited replication of web3py in async await syntax with some other tools baked in. I am not looking to maintain a repo of the size/scope of web3py. I am just putting it out for you guys to see how we handled some of this in the backend of an async system. Sorry for the lack of docs. I will work on that as soon as possible. Until then I am happy to help decode any of the confusing bits. If you guys are struggling with linking the middlewares, take a look at Paco and Janus. Janus is a sync/async thread safe Queue and Paco is an async based functional framework that may help stitch the awaitable pipeline together. Specifically check out paco.compose.
On this point, just put a Queue in front of the transport object. Use the queue as a Fifo and ordering is maintained. This also helps with graceful reconnect since messages may be re-pushed on connection failure and handled once connection is stable again. If transaction handling vs call handling is the concern, callbacks may be used as one option. The other option is to carry a request Q for eth_call(s) and a transaction Q for state modifying methods. In this way you get fast read operations with the ability to ensure ordering of transaction nonce(s). |
Hello, Im starting on the async web3 api again. After I worked through several different approaches in #1088 Ive honed in on one. It would be great to get input before I start fleshing it out too far. I will open a new PR and close #1088, which took several right turns while I worked through this stuff. Here is an outline of the design I would like to implement, i.e. turtles all the way up.
Ive altered the strategy of how to collocate the async/sync module methods. Im favoring replacing them outright with coroutines as it is cleaner than my implementation of @carver's proposed API (#1055 (comment)). Synchronous variants of the coroutines are generated and added to the class. As far as the web3 module api, Im in favor of collocating the sync and async methods in the same module namespace, with couroutines prefixed with This will be a necessarily large pull request. The sync() utility does not support nested calls to sync(). This means internally web3 can only call it's own coroutines when needing to make a client request. A downside to this is it will not be possible to introduce the async changes piece by piece. Converting the manager, middlewares and providers to async also requires async web3 modules. There is a limitation to taking the sync() utility approach. Users writing async code will not be able to use the synchronous web3 api. For example, the following will raise an exception:
Asyncio by design does not allow nested event loops (loop reentrency). See: https://bugs.python.org/issue22239. I don't know how big of a concern this is. It might be a breaking change for some people. Those people could install https://github.com/erdewit/nest_asyncio if they dont want to update their code. After this initial push, there can be refactoring to take advantage of the async internals. This initial design makes a minimum amount of changes to get to an async api. |
Yeah, python asyncio has an annoying habit of being very viral: introduce it to one part of the code, and it spreads everywhere. It's annoying to me because a lot of the things I love about python (eg~ open up a repl to run a few lines of code) don't seem to work well anymore. I don't have a good solution other than never supporting asyncio in the first place. Although the API I mentioned before was partially designed with that goal in mind: how do we keep
Maybe we can take a look together at the async-minimizing API to see if it's possible to clean up? Other than that, the approach sounds okay. Just a few other comments:
Could you maybe run a test to see how much overhead
I don't think this is a blocker. If they are already using async, they shouldn't have potentially slow blocking calls inside, anyway. |
sync() does have overhead. ~ 3 microseconds vs 500 microseconds. import time
from web3._utils.async_tools import sync
class MyTimer():
def __init__(self, label=None):
self.start = time.time()
self.label = label
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
end = time.time()
runtime = end - self.start
msg = 'The test labeled {label} took {time} seconds to complete'
print(msg.format(label=self.label, time=runtime))
async def co_dummy(*args):
return None
def dummy(*args):
return None
if __name__ == '__main__':
with MyTimer('Synchronous'):
dummy()
with MyTimer('sync(Asynchronous)'):
sync(co_dummy())
|
So when using this in a non-async context, it would put a hard cap at ~1700 requests per second (or some 1/n of that when we make multiple under-the-hood requests like |
I just had a crazy idea. Instead of sync() using an asyncio event loop, why not run it in a stripped down loop? When running inside sync, the web3 async code doesnt (as of now) need to yield control to the event loop. *EDIT: after running tests, there are a few failures where asyncio.sleep() is used. I replaced def run_coro(coro):
future = None
while True:
try:
future = coro.send(future)
except StopIteration as exc:
return exc.value Re-running the same benchmark test gets us down to 270 microseconds. Not as good as I hoped, but better. |
Closing, tracking async in Issue #1413 |
What is it?
Implement
w3.async.<something>
which is an async mirror of APIs available onw3.<something>
.This will be exploratory work to figure out how we introduce an
async/await
friendly API into web3.The text was updated successfully, but these errors were encountered: