Parsing the log topics for events (filter_log) may fail depending on the "indexed" inputs in the ABI #335

Open
apehex opened this issue Aug 27, 2023 · 2 comments

apehex commented Aug 27, 2023

Hey there :)

I just noticed that filter_log in the Python SDK sometimes misses events.

If the ABI used to parse a log doesn't exactly match the ABI of the contract that emitted the event, filter_log fails to decode it.
In particular, developers are free to choose which event arguments are indexed.

For example the following event:

event Transfer(address indexed from, address to, uint256 value);

Will not be matched by an ABI built for:

event Transfer(address indexed from, address indexed to, uint256 value);

This can be detected by catching the LogTopicError exceptions raised inside filter_log, for example:
Expected 2 log topics. Got 1
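
To make the mismatch concrete, here is a minimal sketch that counts the indexed inputs of both Transfer variants. It assumes event ABIs are represented as plain Python dicts in the usual JSON ABI layout; `indexed_input_count` and `make_transfer_abi` are hypothetical helpers written for this illustration, not part of the SDK.

```python
def indexed_input_count(event_abi: dict) -> int:
    """Count the inputs flagged as indexed in an event ABI (hypothetical helper)."""
    return sum(1 for _input in event_abi.get("inputs", ()) if _input.get("indexed", False))


def make_transfer_abi(to_indexed: bool) -> dict:
    """Build a Transfer event ABI where only the "indexed" flag of `to` varies."""
    return {
        "type": "event",
        "name": "Transfer",
        "anonymous": False,
        "inputs": [
            {"name": "from", "type": "address", "indexed": True},
            {"name": "to", "type": "address", "indexed": to_indexed},
            {"name": "value", "type": "uint256", "indexed": False},
        ],
    }


emitted = make_transfer_abi(to_indexed=False)  # ABI of the contract that emitted the log
parsing = make_transfer_abi(to_indexed=True)   # ABI handed to the parser

expected = indexed_input_count(parsing)
got = indexed_input_count(emitted)
print(f"parser expects {expected} indexed topics, the log carries {got}")
```

The two counts differ, which is the situation the error message above describes.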

apehex commented Sep 8, 2023

So far I'm working around this issue locally with manual, custom parsing.
I have a few proposals to handle it.

Solution 1: passing all the ABI variants as inputs

First, generate all the variants of the ABI with each input's "indexed" flag switched on and off:

import itertools

def generate_all_abi_indexation_variants(abi: ABIEvent) -> dict:
    """Generate all the variants of the ABI by switching each "indexed" field true / false for the inputs."""
    _count = len(abi.get('inputs', ()))
    _indexed = tuple(itertools.product((True, False), repeat=_count)) # every on / off combination of the flags
    _abis = {_c: [] for _c in range(_count + 1)} # order by number of indexed inputs
    for _i in _indexed: # each indexation variant
        _abis[sum(_i)].append(_apply_indexation_mask(abi=abi, mask=_i))
    return _abis

Then it could be used directly with the current filter_log:

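_abis = generate_all_abi_indexation_variants(abi=abi)
_variants = [_a for _group in _abis.values() for _a in _group] # flatten the per-count lists into a single list of ABIs
tx.filter_log(_variants) # tx is a TransactionEvent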

Pros

This solution will always work!

It doesn't require any modification to filter_log since it already handles lists of ABIs.

Cons

It comes at the cost of performance.
With only 3 input arguments in the ABI, there would be 2^3 = 8 variants and parsing could be up to 8x slower.

Solution 2: Use only the relevant ABI variants

In the snippet above, the variants are sorted by number of indexed inputs.

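This makes it possible to cut the computation time by using only the variants whose number of indexed inputs matches the number of topics in the log (minus one, since the first topic is the event signature).
Inside filter_log, select only the matching ABIs: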

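_abi = abi.get(len(log['topics']) - 1, None) # key: number of indexed inputs carried by the log
contract = web3Provider.eth.contract("0x0000000000000000000000000000000000000000", abi=_abi)
for event_name in event_names:
    try:
        results.append(
            contract.events[event_name]().processLog(log))
    except Exception: # the selected ABI does not match this log
        continue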

filter_log could be modified to handle dict, list and a single ABI value.
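
One possible way to accept all three input shapes is to normalize them into the same mapping before parsing. The sketch below assumes event ABIs are plain dicts and that the dict form is keyed by the integer count of indexed inputs; `group_by_indexed_count` is a hypothetical name, not an existing SDK function.

```python
from collections import defaultdict


def _indexed_count(event_abi: dict) -> int:
    """Number of inputs flagged as indexed in a single event ABI."""
    return sum(1 for _input in event_abi.get("inputs", ()) if _input.get("indexed", False))


def group_by_indexed_count(abi) -> dict:
    """Normalize a single ABI, a list of ABIs or a {count: ABIs} dict into {count: [ABIs]}."""
    if isinstance(abi, dict) and all(isinstance(_k, int) for _k in abi):
        # Already keyed by count: only make sure every value is a list.
        return {_k: (list(_v) if isinstance(_v, (list, tuple)) else [_v]) for _k, _v in abi.items()}
    _abis = abi if isinstance(abi, (list, tuple)) else [abi]  # wrap a single ABI
    _groups = defaultdict(list)
    for _a in _abis:
        _groups[_indexed_count(_a)].append(_a)
    return dict(_groups)


one_indexed = {"type": "event", "name": "Transfer", "inputs": [
    {"name": "from", "type": "address", "indexed": True},
    {"name": "to", "type": "address", "indexed": False},
]}
two_indexed = {"type": "event", "name": "Transfer", "inputs": [
    {"name": "from", "type": "address", "indexed": True},
    {"name": "to", "type": "address", "indexed": True},
]}

print(sorted(group_by_indexed_count([one_indexed, two_indexed]).keys()))
```

A single event ABI has string keys, so it is told apart from the count-keyed mapping by the type of its keys.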

Pros

Less processing than solution 1: instead of growing exponentially with the number of inputs (2^n variants), each log is only checked against the C(n, k) variants that have exactly k indexed inputs.
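
As a quick check of the arithmetic, the number of variants per count of indexed inputs can be computed with `math.comb`; `variant_counts` is a hypothetical helper used only for this illustration.

```python
from math import comb


def variant_counts(input_count: int) -> dict:
    """Map each number of indexed inputs k to the number of ABI variants C(n, k)."""
    return {_k: comb(input_count, _k) for _k in range(input_count + 1)}


counts = variant_counts(3)
total = sum(counts.values())  # all the variants, as tried by solution 1
worst = max(counts.values())  # largest group a single log is checked against in solution 2
print(counts, total, worst)
```

For 3 inputs the groups hold 1, 3, 3 and 1 variants, so a log is checked against at most 3 ABIs instead of 8.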

Cons

filter_log would have to be reworked.

Solution 3: using the most probable ABI

Here, the idea would be to generate only one ABI per number of indexed inputs.

For example, the ABI for the ERC20 Transfer event with 1 indexed input would be:

event Transfer(address indexed from, address to, uint256 value); 

And we ignore the other 2 variants with 1 indexed input:

event Transfer(address from, address indexed to, uint256 value);
event Transfer(address from, address to, uint256 indexed value); 

It would be up to the user to generate the mapping from the number of indexed inputs to the ABI.
The naive way, giving priority to the arguments from left to right, would be:

def generate_the_most_probable_abi_indexation_variants(abi: ABIEvent) -> dict:
    """Generate the most probable variant of the ABI for each count of indexed inputs."""
    _count = len(abi.get('inputs', ()))
    _indexed = tuple((_i * [True] + (_count - _i) * [False]) for _i in range(_count + 1)) # index from left to right, without gaps
    return {sum(_i): _apply_indexation_mask(abi=abi, mask=_i) for _i in _indexed} # order by number of indexed inputs
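
To visualize the left-to-right rule, the masks alone can be listed for an event with 3 inputs. This is a small standalone sketch; `left_to_right_masks` is a hypothetical name that isolates the mask expression used above.

```python
def left_to_right_masks(count: int) -> list:
    """One mask per number of indexed inputs, flagging the leftmost inputs first."""
    return [tuple(_i * [True] + (count - _i) * [False]) for _i in range(count + 1)]


masks = left_to_right_masks(3)
for _mask in masks:
    print(sum(_mask), _mask)  # number of indexed inputs, then the mask itself
```

Each count of indexed inputs maps to exactly one mask, so the resulting dict holds n + 1 ABIs for n inputs.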

And then select the relevant ABI in filter_log:

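_abi = abi.get(len(log['topics']) - 1, None) # key: number of indexed inputs carried by the log
contract = web3Provider.eth.contract("0x0000000000000000000000000000000000000000", abi=_abi)
for event_name in event_names:
    try:
        results.append(
            contract.events[event_name]().processLog(log))
    except Exception: # the selected ABI does not match this log
        continue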

Pros

This method would have almost no impact on performance: only one ABI is processed per log.

Cons

filter_log would have to be modified too.

It will still miss events when the most probable ABI doesn't match the actual ABI of the emitted event.

Using the function to its fullest would also require the user to be aware of the issue.
Still, the generation of the variants could be handled directly by filter_log to keep its usage transparent.

@christian-forta

@haseebrabbani, what are your thoughts on this?
