-
Notifications
You must be signed in to change notification settings - Fork 1.8k
node independent filter middleware #732
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4599ba4
to
191584a
Compare
web3/middleware/filter.py
Outdated
def unload_latest_blocks(web3, fromBlock, toBlock): | ||
_fromBlock = fromBlock | ||
|
||
current_block = int(web3.eth.blockNumber) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is something returning a non-integer for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I misread an error and added this and didnt remember to take them out.
web3/middleware/filter.py
Outdated
return _start, _end | ||
|
||
|
||
def unload_latest_blocks(web3, fromBlock, toBlock): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me what this function is used for (still ingesting the PR). Maybe add a docstring to explain the intent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a docstring. I was having trouble naming this function.
It returns block ranges starting from fromBlock
to the latest mined block, until reaching toBlock
. e.g.:
>>> blocks_to_filter = unload_latest_blocks(w3, 0, 50)
>>> next(blocks_to_filter) # latest block number = 11
(0, 11)
>>> next(blocks_to_filter) # latest block number = 45
(12, 45)
>>> next(blocks_to_filter) # latest block number = 50
(46, 50)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_block_ranges
get_latest_block_ranges
iter_block_rangest
iter_latest_block_ranges
Don't particularly love any of those. The iter_
prefixed ones seems a bit more descriptive/semantically correct.
web3/middleware/filter.py
Outdated
# no new blocks | ||
if current_block < _fromBlock: | ||
yield (None, None) | ||
# reach stop_block after being passed by current_block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no stop_block
in the context here. Is this comment outdated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
web3/middleware/filter.py
Outdated
return _start, _end | ||
|
||
|
||
def unload_latest_blocks(web3, fromBlock, toBlock): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've maintained using fromBlock
instead of from_block
in most public APIs, but for internal APIs I'd prefer we stick with from_block
style casing.
web3/middleware/filter.py
Outdated
address=None, | ||
topics=None,): | ||
|
||
if not fromBlock or fromBlock == 'latest': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not fromBlock
will result in fromBlock == 0
not iterating from the genesis block as expected, but rather just starting at the current block which looks like a bug.
web3/middleware/filter.py
Outdated
_fromBlock = fromBlock | ||
|
||
if _fromBlock > int(web3.eth.blockNumber): | ||
raise ValueError("fromBlock ({0}) has not yet been mined".format(fromBlock)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this needs to be an error case. Why should we forbid someone setting up a filter in advance? If you want to punt on implementation can you just change this to a NotImplementedError
and change the error messaging to indicate that this just hasn't been implemented yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can implement future block ranges.
web3/middleware/filter.py
Outdated
filter_id_counter = map(to_hex, itertools.count()) | ||
|
||
def middleware(method, params): | ||
if method in ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be good to both use a set
in places where you are only doing membership checking and to elevate this to a module level constant so we don't have to allocate it every time the filter is called.
web3/middleware/filter.py
Outdated
|
||
filters[filter_id] = _filter | ||
return {'result': filter_id} | ||
elif method == 'eth_getFilterChanges' or method == 'eth_getFilterLogs': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like eth_getFilterLogs
is supposed to have it's own separate implementation? (based on your PR checklist and the differences in their respective functionality)
5cf5c8c
to
cafbbe9
Compare
Im starting to suspect that a pending transaction filter is not possible. Some of the reasons:
Maybe we could add a recipe to the docs for an alternate way from node filters to get information on pending transactions, in lieu of having the filtering middleware handle it. |
@dylanjw re pending transactions.
I still think I'd recommend separating this functionality from this PR and doing it as an independent PR to try and limit the scope of this feature. |
If you rebase this on master, some of these errors will be fixed, because master is upgraded to eth-tester devnum 22 @dylanjw |
cafbbe9
to
b7cf985
Compare
@pipermerriam I implemented the block filter in the PR because it was easy to do; most of the work is done with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments thus far.
with pytest.raises(expected): | ||
w3.manager.request_blocking(method, args) | ||
else: | ||
assert w3.manager.request_blocking(method, args) == expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you are reaching down into the w3.manager
instead of using w3.eth.getFilterchanges
, w3.eth.getFilterlogs
, ...?
('eth_newFilter', [{}], '0x1'), | ||
('eth_newPendingBlockFilter', [{}], NotImplementedError), | ||
('eth_newBlockFilter', [{}], '0x2'), | ||
('eth_getFilterChanges', ['0x0'], []), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this entry is requesting the filter which is setup here: https://github.com/ethereum/web3.py/pull/732/files#diff-ddc63b66c3f0acb70cca05f91281a51dR56
Assuming I'm reading that correct, it seems bad that this test is tightly coupled with state which was setup in that fixture, as well as that the test itself is stateful at the module level, which allows for state to leak across test runs. I think it'd be better if this could be refactored to:
- not have shared state across test runs
- setup the necessary state within this test.
web3/middleware/filter.py
Outdated
yield web3.eth.getBlock(block_number).hash | ||
|
||
|
||
def filter_middleware(make_request, web3): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming nitpick:
log_filter_middleware
(my preference)event_filter_middleware
local_filter_middleware
Just the term filter_
doesn't quite seem specific enough.
web3/middleware/filter.py
Outdated
"Incompatible start and stop arguments.", | ||
"Start must be less than or equal to stop.") | ||
|
||
def range_counter(start, stop, step): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why there is an outer and inner function call here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The wrapper is so the exception is raised when the generator is created. e.g.
without the wrapper:
>>> s = segment_count(5, 0)
>>> next(s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "web3/middleware/filter.py", line 45, in segment_count
"Start must be less than or equal to stop.")
TypeError: ('Incompatible start and stop arguments.', 'Start must be less than or equal to stop.')
>>>
With the wrapper:
>>> s = segment_count(5, 0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "web3/middleware/filter.py", line 45, in segment_count
"Start must be less than or equal to stop.")
TypeError: ('Incompatible start and stop arguments.', 'Start must be less than or equal to stop.')
>>>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I wasn't aware that generators were lazy in that way. Neat and also kind of awkward. This deserves a comment imho as it's not clear just by reading the code.
web3/middleware/filter.py
Outdated
) | ||
|
||
|
||
def range_counter(start, stop=None, step=5): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this whole function can be made significantly simpler using cytoolz.sliding_window
def range_counter(start, stop=None, step=5):
if stop == None:
base_iter = itertools.count(start, step)
else:
# todo: if `stop` won't be included in the `range` output, do something
# like `cytoolz.concatv(range(start, stop, step), [stop]) to add `stop` to
# the list of values that will be generated.
base_iter = range(start, stop, step)
for left, right in sliding_window(2, base_iter):
yield left, right
Haven't tested this code but I think it at least does close to what you are trying to do here with less mental overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you can accomplish the generator with validation on creation with the following pattern.
def my_gen():
if thing:
raise Exception
return (v for v in things)
This is assuming that you can fit what you need within a comprehension and still maintain readability.
web3/middleware/filter.py
Outdated
""" | ||
# Return full range if all blocks exist | ||
if to_block is not None and to_block <= web3.eth.blockNumber: | ||
return from_block, to_block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this does what you want.
In [14]: def f(v):
...: if v:
...: return 1, 2
...: yield 3, 4
...: yield 5, 6
...:
In [15]: tuple(f(True))
Out[15]: ()
In [16]: tuple(f(False))
Out[16]: ((3, 4), (5, 6))
I think this needs to be a yield
statement followed by a blank return
statement if you want to generate the single value and then terminate.
web3/middleware/filter.py
Outdated
else: | ||
yield from_block, current_block | ||
|
||
yield from get_block_range(current_block + 1, to_block) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this pattern isn't going to work as it'll hit the recursion depth limit quite quickly since every step of the iteration requires a new stack frame. This needs to be refactored such that it isn't recursive.
web3/middleware/filter.py
Outdated
def range_counter(start, stop, step): | ||
def iter_ranges(start, step): | ||
yield start, start + step - 1 | ||
yield from iter_ranges(start + step, step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue here. This solution is recursive and will easily hit the recursion limit. Needs to be refactored to something non-recursive.
f40114b
to
cc14385
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- missing a test for calling
web3.eth.getFilterChanges
with an unknownfilter_id
. - would be nice to actually test that the filters work as expected. We could potentially fuzz test this against an
eth-tester
backed filter via two separate web3 instances (oneeth-tester
and one middleware backed) to verify that they return the correct log entries.
web3/middleware/filter.py
Outdated
def finite(start, stop, step): | ||
# If the initial range is less than the step | ||
# just return (start, stop) | ||
if stop < step: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be if start + step >= stop
?
web3/middleware/filter.py
Outdated
if stop < step: | ||
yield (start, stop) | ||
return | ||
for i in zip( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming nitpick
for segment in zip(...):
...
web3/middleware/filter.py
Outdated
yield (stop - remainder, stop) | ||
|
||
def infinite(start, step): | ||
for i in zip( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same naming nitpick s/i/segment
web3/middleware/filter.py
Outdated
if remainder: | ||
yield (stop - remainder, stop) | ||
|
||
def infinite(start, step): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be elevated to a module level function.
web3/middleware/filter.py
Outdated
>>> next(segment_counter) # Remainder is also returned | ||
(9, 10) | ||
""" | ||
def finite(start, stop, step): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be elevated to a module level function.
web3/middleware/filter.py
Outdated
@to_list | ||
def block_hashes_in_range(web3, block_range): | ||
for block_number in range(block_range[0], block_range[1] + 1): | ||
yield web3.eth.getBlock(block_number).hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that web3.eth.getBlock
returns None
if block_number
is greater than the current HEAD block number, which means this will raise an AttributeError
if that case is met.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true, I'll handle that case.
web3/middleware/filter.py
Outdated
if method == 'eth_newFilter': | ||
_filter = RequestLogs(web3, **apply_key_map(FILTER_PARAMS_KEY_MAP, params[0])) | ||
|
||
if method == 'eth_newBlockFilter': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/if/elif
and probably good to add an else
that raises NotImplimented
.
)) | ||
|
||
|
||
@pytest.fixture(scope='function') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not highly opposed to this but "function"
is the default scope
for pytest.fixture
iter_block_number.send(1) | ||
|
||
with pytest.raises(NotImplementedError): | ||
w3.eth.filter('pending') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move this to it's own test?
|
||
def test_local_filter_middleware(w3, iter_block_number): | ||
block_filter = w3.eth.filter('latest') | ||
assert block_filter.filter_id == '0x0' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe instead of asserting the exact id
since this is somewhat testing the implementation, we could have a test that spams a number of different filter creations and asserts that the set of filter_id
values that are returned are all strings and are unique.
b08c278
to
e5ff02b
Compare
@pipermerriam I added a parametrized web3 fixture to tests/core/filtering/conftest.py that runs all the normal filtering tests against the middleware filter and the node filter. Is that what you mean? Or do you want tests with automated inputs? Im not too experienced with fuzz testing. |
e5ff02b
to
b77418e
Compare
Looking at the tests that live under Ideally, I think we need a test or set of tests that cover the following.
|
b0d02e1
to
1897f12
Compare
I am including tests for the issues mentioned in #836, as well as longer and open ended block ranges. |
bf6fde1
to
0e2df37
Compare
6e061e2
to
aba52df
Compare
aba52df
to
b7b4567
Compare
Im getting a heisenbug, where in some cases all tests pass, and in others there are missing log results. Im guessing this could be due to skipping blocks in the ranged block getLogs calls. |
051723a
to
93ccd48
Compare
I added a test that emits an event to 100 blocks in a row, and show that the a "local" filter finds every one. Doesnt add up with my theory that there are breaks in the eth_getLogs block ranges. |
b2441fc
to
cd581d6
Compare
Not ready to merge. Test results are still flaky. |
Function scoped pytest fixtures are not respected by Hypothesis, so the failures have to do with logs in the overlapping blocks between runs, where the chain state is not reset. Im surprised this hasnt been a bigger issue with flaky tests. |
Wow, I can confirm that only import pytest
import itertools
from hypothesis import given, strategies as st
counter = itertools.count()
hypothesis_already_seen = set()
parametrized_already_seen = set()
@pytest.fixture
def next_int():
return next(counter)
@pytest.mark.parametrize('ignored_text', ('', 'a'))
def test_parametrize_and_fixtures(next_int, ignored_text):
assert next_int not in parametrized_already_seen
parametrized_already_seen.add(next_int)
@given(st.text())
def test_hypothesis_and_fixtures(next_int, ignored_text):
assert next_int not in hypothesis_already_seen
hypothesis_already_seen.add(next_int) I would consider this a hypothesis bug. Do you want to open it? If not, I'm happy to. |
Nevermind, it's already there, and old... 😢 HypothesisWorks/hypothesis#377 |
There is a workaround, so at least you can check locally if the scoping issue is the only problem you're seeing: https://github.com/untitaker/pytest-subtesthack/ |
I have gotten around the issue by taking a snapshot and restoring at the beginning of the test function. The issue with hypothesis actually uncovered a bug, where the local filters were picking up events from the last block of the previous hypothesis run. |
Also added snapshot/revert to deal with hypothesis issue
Im not getting flaky test runs anymore. @carver @pipermerriam Ready for final review. |
@carver assigned to you, since this was a concept I came up with I think it's valuable for you to do primary review (cause bias...), I can do another pass when you are done if you think it will help. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few things, but then it looks good to go!
web3/middleware/filter.py
Outdated
range(start, stop - step, step), | ||
range(start + step, stop, step)): | ||
range(start, stop - step + 1, step), | ||
range(start + step, stop + 1, step)): | ||
yield segment | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't need to be in an else
clause right? Seems like it should always run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a for/else loop, so it runs at the end of the for loop. It would work the same to remove the else clause, but probably safer to keep it in case anything is added to this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for/else
is typically reserved for loops that have a break
somewhere in them, in order to only run the else
in whatever case the break
does not get triggered (or inversely, having the ability to break
, thus skipping the else
code.
with pytest.raises(expected): | ||
block_ranges(start, stop) | ||
else: | ||
for actual, expected in zip(block_ranges(start, stop), expected): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause problems if block_ranges
returns a different number of pairs than expected. zip()
will stop short, and could give a green test on a broken implementation.
One option is:
missing_flag = object()
comparing_pairs = itertools.zip_longest(block_ranges(start, stop), expected, fillvalue=missing_flag)
for actual, expected in comparing_pairs:
assert actual == expected
Alternatively, you could capture them into a tuple first and check their length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-check on length gives an early exit and is probably preferrable to complicating the comparison loop with an abnormal exit condition which is that implicitely, nothing will evaluate equal to missing_flag
.
actual_tuple = next(latest_block_ranges) | ||
except StopIteration as e: | ||
if e.value: | ||
actual_tuple = e.value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iter_latest_block_ranges
doesn't seem to ever return anything. What is the goal here? Since StopIteration
only seems to be raised in an error case, maybe it shouldn't be caught at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some bad naming here. iter_block_number
is a fixture that allows me to set what web3.eth.blockNumber
returns. iter_latest_block_ranges
is what is being tested, and sets up a generator latest_block_ranges
that has its output assigned on line 115 and checked on line 120. The StopIteration catch is leftover from an older version of iter_latest_block_ranges when I was incorrectly returning the final value, instead of yield. I will take that out.
return ( | ||
(from_block, to_block - 1) | ||
for from_block, to_block | ||
in segment_count(start_block, last_block + 1, step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value last_block
could be None
which blows up here.
Looking at the invocation, it looks like it's always defined, and never None
. So maybe just drop the None
default value in the signature, and you can drop the last_block is not None
check above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took out the None
handling from this function and segment_count
elif self._to_block == 'latest': | ||
to_block = self.w3.eth.blockNumber | ||
|
||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit I'm not generally a fan of this gap before the else or elif statement. Up to you, though.
web3/middleware/filter.py
Outdated
self.to_block, | ||
self.address, | ||
self.topics, | ||
max_blocks=50))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably belongs in a constant with the one from line 235 also.
9b4cb26
to
03452cc
Compare
What was wrong?
See #551 (tldr; node filters suck)
How was it fixed?
Added a middleware that stores a "filtering generator" keyed by filter id. Calls to
getFilterChanges
will retrieve the stored generator and retrieve logs viaeth_getLogs
.Features implemented:
toBlock = 'pending' (need to test)eth_newFilter
eth_getFilterChanges
eth_getFilterLogs
[ ] pending transaction filtersTODO:
Cute Animal Picture