feat: implement token lists management with fetcher and parser support #20
base: master
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##           master      #20      +/-   ##
===========================================
+ Coverage   37.46%   51.72%   +14.26%
===========================================
  Files          35       54       +19
  Lines        1906     2948     +1042
===========================================
+ Hits          714     1525      +811
- Misses       1166     1369      +203
- Partials       26       54       +28
```
Force-pushed from `b315711` to `0086282`.
```go
	return ErrNoSymbol
}

// even theoretically the limit is 256, in practice we should not let users use more than 18
```
Where does this come from? Is any other implementation out there imposing this limitation? Does it have any practical purpose?
Officially there is no such limitation; it can even be 255. But in practice, decimals are almost always set to 18 or less. That's a convention (not a maximum), based on:
- https://ethereum.stackexchange.com/questions/106278/how-to-choose-number-of-decimal-places-for-token
- https://docs.openzeppelin.com/contracts/5.x/erc20#a-note-on-decimals

If we ever need more, it's easy to change.
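A minimal sketch of the convention-based check being discussed (the constant, error value, and function name here are illustrative, not the PR's actual code):

```go
package main

import (
	"errors"
	"fmt"
)

// maxDecimals follows the convention discussed above: ERC-20 stores
// decimals as a uint8 (so up to 255), but 18 is the practical ceiling.
const maxDecimals = 18

var ErrTooManyDecimals = errors.New("decimals exceed supported maximum")

// validateDecimals rejects values above the conventional limit.
func validateDecimals(decimals uint8) error {
	if decimals > maxDecimals {
		return ErrTooManyDecimals
	}
	return nil
}

func main() {
	fmt.Println(validateDecimals(18) == nil) // true
	fmt.Println(validateDecimals(24) == nil) // false
}
```

Raising the limit later would only mean changing a single constant.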
```go
	return false
}

func isValidLogoURI(logoURI string) bool {
```
Does the standard say anything about this? Why wouldn't the logo be able to be served by, for example, FTP, a torrent magnet link, or any other protocol?
If this is about which protocols Status can fetch from, then this check belongs in status-go.
No, there is no such official limitation. In fact, officially there is no such thing as a "standard token list" either; it's something Uniswap defined, with a schema at https://raw.githubusercontent.com/Uniswap/token-lists/main/src/tokenlist.schema.json, that the community simply adopted, but it's not a formal standard.
I created that logo validator based on what's said in that schema, that's it.
Oh yeah, by "standard" I mean what Uniswap defined as a "standard token list".
I can't find any part of the schema that specifies the protocol for the logoURI; could you point me to it?
I created that validator based on the practical usage I've seen in the most popular lists, and those will work in browsers (`magnet:` cannot be displayed, and browsers removed support for `ftp:`). I agree that the schema doesn't say anything about `logoURI` for tokens, but that's why the validator allows an empty `logoURI`.
I would like to keep it here, but if you agree, I can make the validator less restrictive, something like this:
```go
func isValidLogoURI(logoURI string) bool {
	if logoURI == "" {
		return true
	}
	_, err := url.Parse(logoURI)
	return err == nil
}
```
This accepts any provided scheme.
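For completeness, here is a middle-ground variant that encodes the "browser-displayable schemes" reasoning from this thread. The allowed-scheme set is an assumption drawn from the discussion, not from the Uniswap schema:

```go
package main

import (
	"fmt"
	"net/url"
)

// isValidLogoURI accepts an empty logoURI, or any URI whose scheme a
// browser can typically render inline. The allowed-scheme set below is
// an assumption based on this discussion, not part of any schema.
func isValidLogoURI(logoURI string) bool {
	if logoURI == "" {
		return true // the schema does not require a logoURI
	}
	u, err := url.Parse(logoURI)
	if err != nil {
		return false
	}
	switch u.Scheme {
	case "http", "https", "ipfs", "data":
		return true
	}
	return false // e.g. magnet: or ftp:, which browsers cannot display
}

func main() {
	fmt.Println(isValidLogoURI("https://example.com/logo.png")) // true
	fmt.Println(isValidLogoURI("magnet:?xt=urn:btih:abc"))      // false
}
```

Either way, keeping the empty-string escape hatch preserves compatibility with lists that omit logos.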
```go
// Token represents a token with cross-chain identification.
type Token struct {
	CrossChainID string `json:"crossChainId"`
```
I've avoided adding JSON tags to my structs so far. If/how to serialize/deserialize depends on the consumer (they could use protobuf/XML instead of JSON, might want different field names even if they wanted JSON, or their use case might not require serialization at all).
I think the good practice here would be to explicitly define serialization in status-go for whatever types are exposed through the API.
Searching a bit on this topic, there's unfortunately no official Go design guideline.
I don't see having tags as an obstacle at all, more of a benefit: without tags, we force every consumer who wants JSON serialization to redefine the type. With tags they can use the type directly; if they want different field names, they can use their own wrapper types, and if they want protobuf/XML, those have their own tags (protobuf, xml).
Most Go consumers expect JSON as the default, and those who need different serialization can still wrap the types. go-ethereum also provides JSON tags, as do many others.
```go
const (
	// #nosec G101
	ListOfTokenListsSchema = `{
```
Probably cleaner to leave this in a .schema.json file and load it using https://pkg.go.dev/embed.
This schema is exposed to consumers via this constant, so they can use it in the token manager configuration.
You can still make it accessible in Go:

```go
//go:embed list_of_token_lists.schema.json
var ListOfTokenListsSchema string
```

The variable `ListOfTokenListsSchema` will be filled with the contents of `list_of_token_lists.schema.json` at compile time.
I generally favor putting the tests in a separate directory (`test`) and a separate package `fetcher_test`. This way we ensure we only have access to what the package under test exports (test behavior, not implementation).
I try to only use same-package tests for small tests of internal utils.
I agree it would be cleaner if I could move all the tests there, but if I move the current tests to a subfolder, we'll end up with some tests there and some next to the files, because the current set of tests also covers some unexported functions.
```go
)

// #nosec G101
const wrongSchemaResponse = `{
```
Again, I normally try to put all this stuff in .json files (inside a `test` directory) and import them with go:embed.
Those are not final JSON forms, but constants used for building appropriate responses for the tests.
pkg/tokens/fetcher/fetcher.go (outdated)
```go
}

// fetchDetails fetches a resource from the URL specified in the details.
func (t *fetcher) fetchDetails(ctx context.Context, details FetchDetails, ch chan<- FetchedData) {
```
Unless I'm missing something, this looks basically identical to `Fetch`, just returning the value through a channel instead of as a return value. `FetchConcurrent` could just get the result using `Fetch` and handle the channel inside the goroutine.
Yes, that's true.
```go
}

// FetchConcurrent fetches multiple resources concurrently from the URLs specified in the details.
func (t *fetcher) FetchConcurrent(ctx context.Context, details []FetchDetails) ([]FetchedData, error) {
```
this would probably cover more use cases if you just returned the channel and let the user decide if they want to handle the result values as they come or just accumulate them and process them in bulk. Closing the channel would be the indication that all fetches are finished.
But I didn't want that. This function fetches files concurrently from multiple locations to speed things up; from the consumer's point of view it is not concurrent and blocks until the process is done. If the consumer wants control over the process, they can run their own goroutine and use `Fetch`, which is why we expose two functions.
```go
)

const (
	defaultRequestTimeout = 5 * time.Second
```
It doesn't look like there's any way to customize these parameters.
Correct, but yes, I can add a config option for that.
Force-pushed from `5970769` to `f5ed371`.
```go
	ID        string            `json:"id"`
	Symbol    string            `json:"symbol"`
	Name      string            `json:"name"`
	Platforms map[string]string `json:"platforms"`
```
shouldn't we be able to group tokens together from this list? It doesn't seem we're setting any crossChainID from the coingecko list
Yes, this list provides a cross-chain ID, but we don't use this parser; I kept it anyway.
```go
)

// StandardTokenList represents the TokenLists standard format.
type StandardTokenList struct {
```
Same here, we should be able to form token groups with standard token lists
https://github.com/Uniswap/token-list-bridge-utils
Yes, that list provides the information, but we don't use it because we agreed to maintain our own token list, which will have cross-chain IDs, and to always refer to it for grouping. It will contain all the info the current Uniswap list (https://ipfs.io/ipns/tokens.uniswap.org) contains, and more.
`go mod` is smart enough to only download files for the packages you use and to ignore files with the `_test` suffix.
See how this file made it into the vendor directory in status-go: https://github.com/status-im/status-go/pull/6912/files#diff-11fc6b045f4c34983a0be1b4eae4c61cbf0abadb2002559dc123651a6ff60434
Please rename it to have `_test` at the end, or put it in a `parsers_test` package. Sorry for insisting on this; it's just much cleaner to keep test files separate from the real code. (For tests that need unexported functions it's fine to stay in the same package, but hopefully the bulk of the tests exercise the exported API rather than internal methods, so that should be the exception, not the rule.)
```go
)

// autofetcher handles the background fetch of token lists (thread-safe for concurrent access).
type autofetcher struct {
```
unexported? 🤔
Yes, because there is an `AutoFetcher` interface in `types.go` that should be used everywhere. This way we don't allow direct initialization like `f := &fetcher.Fetcher{}`; consumers must use the constructors, which validate the provided `Config`, so there's no way to end up with a nil pointer or panics later.
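The constructor-plus-interface pattern described above, sketched with simplified names (the `Config` field and validation rule are illustrative):

```go
package main

import (
	"errors"
	"fmt"
)

// AutoFetcher is the exported interface; the implementation stays
// unexported so consumers must go through the validating constructor.
type AutoFetcher interface {
	Name() string
}

type Config struct{ IntervalMinutes int }

type autofetcher struct{ cfg Config }

func (a *autofetcher) Name() string { return "autofetcher" }

// New validates the config up front, ruling out nil pointers and
// panics later in the lifecycle.
func New(cfg Config) (AutoFetcher, error) {
	if cfg.IntervalMinutes <= 0 {
		return nil, errors.New("interval must be positive")
	}
	return &autofetcher{cfg: cfg}, nil
}

func main() {
	if _, err := New(Config{}); err != nil {
		fmt.Println("invalid config rejected at construction")
	}
	a, _ := New(Config{IntervalMinutes: 30})
	fmt.Println(a.Name()) // autofetcher
}
```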
```go
}

// Start starts the background autofetcher process.
func (a *autofetcher) Start(ctx context.Context) (refreshCh chan error) {
```
We've had several discussions in the Guild times and settled on this approach:
- Long-lived things (services/controllers) have their lifetime managed via a closeCh (created in a Start function call, closed in a Stop function call)
- Short-lived things (workers/fetchers/runners) have their lifetime managed via a ctx passed as an argument to their "run" method

There are several examples of this in the newer code. Having both a ctx passed here and a Stop function gives you two ways of stopping things, and they don't even have the exact same effect.
Yes, I was debating whether to go with a context or a channel. I can change that.
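A sketch of the closeCh lifecycle the reviewer describes, with Stop as the single, unambiguous way to end the goroutine (names and the ticker body are illustrative):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// service follows the long-lived pattern: Start opens closeCh and the
// goroutine, Stop closes closeCh and waits for the goroutine to exit.
type service struct {
	closeCh chan struct{}
	wg      sync.WaitGroup
}

func (s *service) Start() {
	s.closeCh = make(chan struct{})
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		ticker := time.NewTicker(10 * time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-s.closeCh:
				return // Stop was called
			case <-ticker.C:
				// periodic refresh work would go here
			}
		}
	}()
}

func (s *service) Stop() {
	close(s.closeCh)
	s.wg.Wait() // goroutine has fully exited when Stop returns
}

func main() {
	s := &service{}
	s.Start()
	time.Sleep(25 * time.Millisecond)
	s.Stop()
	fmt.Println("stopped")
}
```

With this shape there is exactly one stop signal, so the ambiguity between ctx cancellation and Stop disappears.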
```go
	"sync"
)

type mockContentStore struct {
```
Please avoid manually writing the mocks; use mockgen.
```go
	Start(ctx context.Context) (refreshCh chan error)

	// Stop stops the background autofetcher process.
	// Blocks until the background goroutine has finished.
```
Ha! Alright, this relates to what you commented in my PR. I think whether to wait for the operations to finish before returning depends on how critical the operation being performed is. In this case we wouldn't want to, for example, delay the client's exit just so it can finish downloading some token lists. If we did, we'd better introduce a timeout in case someone with a super slow connection would otherwise wait an unreasonable amount of time.
I agree this is not a super critical operation, but "delay the client's exit for this to finish downloading some token lists" is not what happens here: the fetch is canceled via the context, and returning happens in the next step. Also, since the goroutine sends to the channel, closing the channel without waiting for the goroutine to finish would panic. So I don't see any issue with it.
pkg/tokens/autofetcher/fetcher.go (outdated)
```go
}

return &autofetcher{
	fetcher: fetcher.New(fetcher.DefaultConfig()),
```
The fetcher object should be passed as a constructor parameter; that way consumers can use whatever config they want for it.
I agree.
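A sketch of the suggested constructor injection; the `Fetcher` interface and all names here are simplified stand-ins for the real packages:

```go
package main

import "fmt"

// Fetcher is a stand-in for the fetcher package's interface.
type Fetcher interface {
	Fetch(url string) ([]byte, error)
}

// stubFetcher shows a side benefit: tests can inject a fake.
type stubFetcher struct{}

func (stubFetcher) Fetch(url string) ([]byte, error) { return []byte("data"), nil }

type autofetcher struct {
	fetcher Fetcher
}

// New takes the fetcher as a parameter instead of constructing it with
// a fixed default config, so consumers control its configuration.
func New(f Fetcher) *autofetcher {
	return &autofetcher{fetcher: f}
}

func main() {
	a := New(stubFetcher{})
	body, _ := a.fetcher.Fetch("https://example.com/list.json")
	fmt.Println(string(body)) // data
}
```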
**Example timing**:
- `AutoRefreshInterval: 30 * time.Minute` - refresh every 30 minutes
- `AutoRefreshCheckInterval: time.Minute` - check every minute whether 30 minutes have passed
this parameter description sounds very weird. Why check every minute if 30 minutes have passed instead of... waiting 30 minutes?
I don't even fully understand the purpose of the two separate parameters
Since the client can turn auto-refresh on/off, with `AutoRefreshCheckInterval` the effect becomes visible very shortly after enabling auto-refresh. An example:
- The client has auto-refresh on, and the last auto-refresh was done at 10:00.
- The client turns off auto-refresh at 10:01, so the last refresh remains 10:00.
- Then the client turns auto-refresh back on at 10:29:
  - If there were no `AutoRefreshCheckInterval`, the change would take effect at the next `AutoRefreshInterval` check, at 10:59.
  - Since we have `AutoRefreshCheckInterval`, the change takes effect at 10:30 (definitely before 10:31).

I agree that even if we only checked every `AutoRefreshInterval`, waiting up to 2 x `AutoRefreshInterval` in the example above is not a big deal for this operation, but it's a better user experience for actions to take effect as soon as possible.
Also, those params are configurable by the consumer, so they can be set arbitrarily; `AutoRefreshCheckInterval` just must be less than or equal to `AutoRefreshInterval`.
```go
	return b.tokenLists
}

func getNativeToken(chainID uint64) *types.Token {
```
It seems very random to be adding native tokens for these two specific chains here. If Status needs these two tokens at startup for whatever reason, it should insert the appropriate list using the `AddTokenList` method.
That's part of the `AddNativeTokenList` function, so if the consumer doesn't need native tokens, they simply don't call it. If they need different tokens, they can use `AddTokenList`.
```go
}

// AddRawTokenList adds a raw token list to the builder using the provided parser and adds the tokens to the list of unique tokens.
func (b *Builder) AddRawTokenList(tokenListID string, raw []byte, sourceURL string, fetchedAt time.Time, parser parsers.TokenListParser) error {
```
This method is not very useful... You could just parse the list outside and use the normal `AddTokenList` method. There's little reason for the class in charge of accumulating the tokens to deal with parsing as well.
It's here just for convenience; whoever needs it (as we do) will use it.
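For illustration, the convenience wrapper amounts to parse-then-add; every type below is a minimal stand-in for the real `parsers`/`types` packages, not the PR's actual code:

```go
package main

import "fmt"

type TokenList struct{ Tokens []string }

// Parser is a stand-in for parsers.TokenListParser.
type Parser interface {
	Parse(raw []byte) (TokenList, error)
}

type singleTokenParser struct{}

func (singleTokenParser) Parse(raw []byte) (TokenList, error) {
	return TokenList{Tokens: []string{string(raw)}}, nil
}

type Builder struct{ lists []TokenList }

func (b *Builder) AddTokenList(l TokenList) { b.lists = append(b.lists, l) }

// AddRawTokenList is just parse-then-add, kept as a convenience wrapper
// over the normal AddTokenList path.
func (b *Builder) AddRawTokenList(raw []byte, p Parser) error {
	l, err := p.Parse(raw)
	if err != nil {
		return err
	}
	b.AddTokenList(l)
	return nil
}

func main() {
	b := &Builder{}
	_ = b.AddRawTokenList([]byte("DAI"), singleTokenParser{})
	fmt.Println(len(b.lists)) // 1
}
```

Since the wrapper only composes two public calls, removing it later would cost consumers a single extra line.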
Force-pushed from `312d1ce` to `900467d`.
The `fetcher` package provides functionality for fetching token lists and related data from remote sources, with support for HTTP caching, JSON schema validation, and concurrent operations. It is designed to:
- Fetch individual token lists and token list metadata
- Support HTTP caching with ETags to minimize network traffic
- Validate JSON data against schemas
- Handle concurrent fetching operations safely
- Provide robust error handling with context support
The `parsers` package provides implementations for parsing token lists from various formats and sources. It supports multiple token list standards and converts them into a unified internal format for consistent processing. It provides two main types of parsers:
1. **Token List Parsers**: parse individual token lists from various providers
2. **List of Token Lists Parsers**: parse metadata about collections of token lists
The `autofetcher` package provides automated background fetching and caching of token lists with configurable refresh intervals. It supports both direct token list fetching and remote list-of-token-lists discovery patterns. It is designed to:
- **Automatically fetch token lists** in the background at configurable intervals
- **Support two modes**: direct token lists and remote list-of-token-lists
- **Provide thread-safe operations** for concurrent usage
- **Handle HTTP caching** with ETags to minimize network traffic
- **Store fetched content** using a pluggable ContentStore interface
- **Manage its lifecycle gracefully** with Start/Stop operations
The `builder` package provides functionality for building token collections by progressively adding multiple token lists from various sources, creating a unified collection of unique tokens across different blockchain networks. It is designed to:
- Build token collections incrementally by adding token lists from various sources
- Ensure token uniqueness across all added lists through automatic deduplication
- Generate native tokens for supported blockchain networks
- Parse and add raw token lists using configurable parsers
- Maintain both individual token lists and a unified token collection
- Follow the Builder pattern for stateful construction
Force-pushed from `900467d` to `bdb47c3`.
The `manager` package provides a high-level, thread-safe interface for managing token collections from multiple sources, with automatic refresh capabilities, state management, and comprehensive token operations. It is designed to:
- **Centralize token management** across multiple blockchain networks
- **Merge tokens from various sources** (native, remote lists, local lists, custom tokens)
- **Provide thread-safe access** to token collections with optimized read performance
- **Support automatic refresh** of remote token lists with background fetching
- **Maintain deterministic ordering** for consistent token resolution
- **Handle errors gracefully** with fallback mechanisms
Force-pushed from `bdb47c3` to `b4c8e24`.
Added several standalone packages under `pkg/tokens`:
- `autofetcher`
- `builder`
- `fetcher`
- `manager`
- `parsers`
- `types`