
Enrichment Improvements #3538

Closed
wants to merge 2 commits into from

Conversation


@hyphenized hyphenized commented Jul 10, 2023

At the moment, during enrichment, we have to dispatch two RPC calls for name resolution with ENS alone: one to get the resolver and another to perform the actual lookup. Additionally, we also need to run a query for balances and retrieve the address's code to determine whether we're dealing with a contract address.

For a simple ETH transfer this is fine, but anything involving a contract, like an ERC-20 transfer or a swap, can touch many addresses, which in turn results in a significant number of requests. This is made worse by the fact that during the initial activity load we're also querying block, transaction, and receipt data.

This PR implements batch processing for address enrichment and on-chain data resolution for enrichment requests, reducing RPC calls by as much as 87.5% during account onboarding, account switching, and extension startup. This could be improved further by caching certain results by block height, since we're not performing special enrichment lookups for transactions that are already mined.

The deployed contract's source is here: https://gist.github.com/hyphenized/decfe5f079fc9c07d1f87939311bc16f
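As a rough illustration of the aggregation step (the types and function below are hypothetical sketches, not the PR's actual code), per-address enrichment lookups can be deduplicated into a single call description before being dispatched through the helper contract:

```typescript
// Hedged sketch: collapsing per-address enrichment lookups (name, code,
// balance) into one aggregated call description. The shapes here are
// illustrative; the PR's real types and batching entry point may differ.

type LookupKind = "name" | "code" | "balance"

interface EnrichmentLookup {
  address: string
  kind: LookupKind
}

interface AggregatedCall {
  addresses: string[]
  kinds: LookupKind[]
}

// Deduplicate addresses (case-insensitively) and collect the lookup kinds
// requested, so N per-address RPC calls become one batched call payload.
function buildAggregatedCall(lookups: EnrichmentLookup[]): AggregatedCall {
  const byAddress = new Map<string, Set<LookupKind>>()
  for (const { address, kind } of lookups) {
    const key = address.toLowerCase()
    if (!byAddress.has(key)) byAddress.set(key, new Set())
    byAddress.get(key)!.add(kind)
  }
  return {
    addresses: [...byAddress.keys()],
    kinds: [...new Set(lookups.map((l) => l.kind))],
  }
}
```

A payload like this is what the deployed helper contract would consume in a single `eth_call`, instead of the wallet issuing one request per address and per lookup kind.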

To Test

  • Switch to main, open the network inspector on the background page, then load an account and wait a couple of minutes.
  • Switch to this branch, reinstall the extension, and go through the same process.
  • Test extension reload (before and after activity load), onboarding, and account switching; check performance and the requests being made.

Latest build: extension-builds-3538 (as of Mon, 10 Jul 2023 02:34:37 GMT).

Implemented a new batch processing system for address enrichment. This
aggregates enrichment lookup requests and switches the data source to
on-chain read-only calls through multicall and a custom smart contract,
significantly reducing the number of dispatched requests and improving the
speed and resource usage of the enrichment process.
A few changes to better accommodate the new batch processing system:

- Restructured data flow to optimize for batch processing of address enrichment.
- Cleaned up unnecessary/duplicate enrichment requests, further improving performance.
- Removed balance checks for non-pending transactions, reducing processing overhead.
- Refactored transaction log parsing to prevent additional redundant requests.
@Shadowfiend
Contributor

Just a quick glance here, but is this... Removing the name service from the flow entirely, but only doing that for Ethereum?

@hyphenized
Contributor Author

Just a quick glance here, but is this... Removing the name service from the flow entirely, but only doing that for Ethereum?

Yes, this switches the source of enrichment name resolution for remote resolvers (ENS/UNS) to read-only calls. In this initial implementation I did not factor in local name resolution, but it can be added without much effort. The contract is only deployed on Ethereum since I wanted to gather the team's input on this first, and I think it would have been difficult to test without a deployment.

Considering enrichment is mostly a concern for chains where we're able to retrieve asset transfers, and these are limited by Alchemy, we could deploy some variant of it on those other networks if this is the direction we want to take. While enrichment requires data from multiple sources (or most likely will), I think it's worth considering the retrieval of certain information for enrichment as its own separate thing.

This solves an issue we currently have: because we're not storing enrichment data, we're constantly retrieving new transactions to enrich. Another thing to keep in mind is that we can still benefit from batching even if we decide not to use a contract for enrichment.
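For reference, one contract-free way to batch is the standard JSON-RPC 2.0 batch envelope, where several request objects share one HTTP round trip. The sketch below is hypothetical and only builds the payload, it does not send it:

```typescript
// Hedged sketch: batching lookups without a helper contract by assembling a
// JSON-RPC 2.0 batch (an array of request objects sent in one HTTP request).
// The request shapes follow the JSON-RPC 2.0 spec; the lookup list is illustrative.

interface JsonRpcRequest {
  jsonrpc: "2.0"
  id: number
  method: string
  params: unknown[]
}

// Expand every (address, method) pair into a request object with a unique id,
// so one HTTP round trip carries what would otherwise be many requests.
function buildJsonRpcBatch(
  addresses: string[],
  methods: { method: string; params: (address: string) => unknown[] }[],
): JsonRpcRequest[] {
  let id = 0
  const batch: JsonRpcRequest[] = []
  for (const address of addresses) {
    for (const { method, params } of methods) {
      id += 1
      batch.push({ jsonrpc: "2.0", id, method, params: params(address) })
    }
  }
  return batch
}
```

Responses come back as an array keyed by `id`, so each pending lookup can still be matched to its result even though only one request was dispatched.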

@Shadowfiend
Contributor

Short version here:

  • Directionally glorious, using a helper contract to bundle up calls is fantastic and could unlock some huge wins. An absolutely killer spike.
  • Architecturally unworkable in the current state. We're duplicating a whole stack of things that already happen elsewhere in the codebase, and any changes in one place (to add support for more name systems, to support chains differently, etc) will either lead to divergence, uneven support, or explosion in complexity. We're also removing the ability to benefit from this for other cases, e.g. batching periodic balance lookups.

The next step here IMO is to try to get the bulk of this gain across the whole wallet by implementing bundling in the relevant services for their specific calls, leveraging that helper contract. Fleshing out thoughts in no particular order about this path:

  • The helper contract can be deployed at a stable address using CREATE2, so that we can just probe on any given network “does this address exist”.
  • We can do timed batch calls similar to what we're doing for diffing in the redux store, except that once we clear the timeout instead of creating a diff we issue a single call for all of the (names | balances | whatever we're batching).
  • We can probably create a generic promise queue-type helper for the above in lib/.
  • This doesn't let us compress N -> 1 calls, but it does let us compress N -> M calls where M is in the number of services. Since all of these lookups are async, we have the option to batch over a small window (3ms, 5ms, etc) and fulfill all the lookups at once with a single RPC call—we just have to choose the right place to batch (e.g. https://github.com/tahowallet/extension/blob/main/background/services/chain/index.ts#L909 for balances).
  • It seems quite possible we can create an aggregator for all of these functions that batches them no matter where they're coming from, letting us achieve the full win in this PR but across the whole extension. That will probably take some more elbow grease and create more code for review though, so it seems like a second step rather than a first one.
  • We'll have to consider what happens if there are so many calls that we need more than one contract call due to calldata/gas limitations in eth_call.
  • We'll also need the contract under version control. It can be in the still-private contracts repo, or we can put it in a subdir here—I don't have a strong opinion right now (eventually we probably want it in the contracts repo though).
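The timed-window promise queue suggested above could look roughly like this. It is a minimal sketch with illustrative names, not the proposed lib/ helper:

```typescript
// Hedged sketch: a generic promise batcher. Callers request a lookup for one
// key; requests arriving within a short window (e.g. 3-5ms) are flushed
// together through a single batch function, mirroring the redux diffing timer.

type BatchFn<K, V> = (keys: K[]) => Promise<V[]>

function createBatcher<K, V>(batchFn: BatchFn<K, V>, windowMs = 5) {
  let pending: {
    key: K
    resolve: (v: V) => void
    reject: (e: unknown) => void
  }[] = []
  let timer: ReturnType<typeof setTimeout> | undefined

  // Issue one call for everything queued during the window, then settle each
  // caller's promise with its positional result.
  async function flush() {
    const batch = pending
    pending = []
    timer = undefined
    try {
      const results = await batchFn(batch.map((p) => p.key))
      batch.forEach((p, i) => p.resolve(results[i]))
    } catch (err) {
      batch.forEach((p) => p.reject(err))
    }
  }

  return (key: K): Promise<V> =>
    new Promise((resolve, reject) => {
      pending.push({ key, resolve, reject })
      if (timer === undefined) timer = setTimeout(flush, windowMs)
    })
}
```

Each service (names, balances, etc.) would wrap its own batch function with this helper, which is how N scattered lookups compress into M calls, one per service.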

I suggest we close this PR for now, but continue to push on this direction.

@hyphenized
Contributor Author

I suggest we close this PR for now, but continue to push on this direction.

Thank you for the insightful analysis. I agree: the helper contract does seem promising, but the complexity and code duplication it adds make it less appealing in its current form.

We can still make use of the bundling mechanism and the other fixes included in this PR to improve our existing implementation. I'd suggest we revisit this when we're able to decide if we want to include and/or prioritize it appropriately within our roadmap.

@hyphenized hyphenized closed this Jul 23, 2023