feat: support multiple hosts files #998
Conversation
Codecov Report

Patch coverage:

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #998      +/-   ##
==========================================
+ Coverage   93.55%   93.78%   +0.22%
==========================================
  Files          63       65       +2
  Lines        5323     5373      +50
==========================================
+ Hits         4980     5039      +59
+ Misses        268      260       -8
+ Partials       75       74       -1
```

☔ View full report in Codecov by Sentry.
```go
consumersGrp, ctx := jobgroup.WithContext(ctx)
defer consumersGrp.Close()

producersGrp := jobgroup.WithMaxConcurrency(consumersGrp, r.cfg.Loading.Concurrency)
defer producersGrp.Close()

producers := parcour.NewProducersWithBuffer[*HostsFileEntry](producersGrp, consumersGrp, producersBuffCap)
defer producers.Close()
```
Basically, a `JobGroup` is a scope for goroutines: the `defer grp.Close()` ensures no goroutine continues running after the function returns. Groups can be nested: here `producersGrp` is a child of `consumersGrp`.

These are ideas from structured concurrency (highly recommend that post; it's one of the most influential CS articles I've read).
Having a clear scope for goroutines also allows for clear failure propagation (sketched in the code right after this list):

- when explicitly checking for goroutine errors with `Wait`:
  - if any goroutines returned errors, it returns them (just like `errgroup.Group`)
  - if any goroutines panicked, it propagates the panic to the current goroutine
- when we leave the function without explicitly waiting for goroutines, `Close` handles the failures:
  - if any goroutines returned errors, it propagates them to the parent `JobGroup` (or panics if there is no parent)
  - if any goroutines panicked, it propagates the panic to the current goroutine
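To make those semantics concrete, here is a minimal, self-contained sketch of the `Wait` half (the `group` type below is illustrative only, not `parcour`'s actual implementation; parent propagation on `Close` is omitted since it needs the group-nesting machinery):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// group is a toy stand-in for a JobGroup: it scopes goroutines and
// collects their failures. Illustrative only, not parcour's API.
type group struct {
	wg       sync.WaitGroup
	mu       sync.Mutex
	errs     []error
	panicked any // first panic recovered from a child goroutine, if any
}

// Go starts a child goroutine whose error or panic is captured by the group.
func (g *group) Go(fn func() error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		defer func() {
			if p := recover(); p != nil {
				g.mu.Lock()
				defer g.mu.Unlock()
				if g.panicked == nil {
					g.panicked = p
				}
			}
		}()
		if err := fn(); err != nil {
			g.mu.Lock()
			defer g.mu.Unlock()
			g.errs = append(g.errs, err)
		}
	}()
}

// Wait blocks until all children exit, re-panics on the current goroutine
// if any child panicked, and otherwise returns the children's errors,
// errgroup-style.
func (g *group) Wait() error {
	g.wg.Wait()
	if g.panicked != nil {
		panic(g.panicked)
	}
	return errors.Join(g.errs...)
}

func main() {
	var g group
	g.Go(func() error { return nil })
	g.Go(func() error { return errors.New("source unreachable") })

	fmt.Println(g.Wait()) // prints: source unreachable
}
```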
`producers` is an abstraction built on top of `JobGroup`s to simplify the pattern we have here: producing hosts file entries from sources in parallel and consuming them from a single goroutine. The fact that `producers` uses the `JobGroup`s we created means we know for sure it won't leave anything running, since we're closing them before returning, and we can customize how those goroutines run.
lists/list_cache.go (outdated)

```go
processingLinkJobs := len(links)
producersGrp := jobgroup.WithMaxConcurrency(unlimitedGrp, b.processingConcurrency)
```
This limit applies to the whitelist and blacklist independently.
With the new code it should be easy to make both share the same limit. That seems like more user-friendly behavior to me, since there's a single option in the config. Do you think I should make that change?
I think one limit for both is good enough.
Do you mean the current code is fine?
TBH it feels to me like we're not really respecting the user's choice here. But it's already the case, so if you want to keep it as is, that's fine by me.
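A minimal sketch of sharing one limit across both lists, using a plain channel semaphore (not the PR's actual code, which would share a single concurrency-limited `jobgroup` instead):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const concurrency = 4 // the single user-facing option

	// One semaphore shared by whitelist and blacklist processing, so the
	// user's limit bounds the total, not each list independently.
	sem := make(chan struct{}, concurrency)

	var wg sync.WaitGroup
	process := func(list string, sources int) {
		for i := 0; i < sources; i++ {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				sem <- struct{}{}        // acquire a slot
				defer func() { <-sem }() // release it
				fmt.Printf("%s source %d\n", list, i)
			}(i)
		}
	}

	process("whitelist", 8)
	process("blacklist", 8)
	wg.Wait()
}
```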
```diff
-	if err := r.parseHostsFile(context.Background()); err != nil {
-		r.log().Errorf("disabling hosts file resolving due to error: %s", err)
-		r.cfg.Filepath = "" // don't try parsing the file again
-	} else {
-		go r.periodicUpdate()
-	}
+	downloader: lists.NewDownloader(cfg.Loading.Downloads, bootstrap.NewHTTPTransport()),
+	}
+
+	err := cfg.Loading.StartPeriodicRefresh(r.loadSources, func(err error) {
+		r.log().WithError(err).Errorf("could not load hosts files")
+	})
+	if err != nil {
+		return nil, err
+	}
```
The hosts file resolver will now behave like the blocking resolver and have a `startStrategy`.
This is different from before: we never retried parsing the file if it failed at initialization.
I think this makes more sense now that we support remote files and multiple sources. Do you agree?
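As an illustration of the retry-instead-of-disable behavior (placeholder names, not blocky's actual `StartPeriodicRefresh` signature), a periodic refresh loop that keeps retrying rather than giving up on the first failure might look like:

```go
package main

import (
	"fmt"
	"time"
)

// startPeriodicRefresh calls refresh on every tick and reports failures
// through onErr instead of disabling the resolver after the first error.
// Placeholder sketch only.
func startPeriodicRefresh(period time.Duration, refresh func() error, onErr func(error)) {
	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for range ticker.C {
			if err := refresh(); err != nil {
				onErr(err) // log and retry on the next tick
			}
		}
	}()
}

func main() {
	startPeriodicRefresh(time.Second, func() error {
		return fmt.Errorf("sources not reachable yet")
	}, func(err error) {
		fmt.Println("could not load hosts files:", err)
	})

	time.Sleep(2500 * time.Millisecond) // let a couple of ticks fire
}
```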
Yes, it definitely makes more sense. This should be documented properly to avoid confusion.
This would also make more sense in the changelog IMO. Not resolving this yet: if that's not easy to do, I can add it to the docs instead.
Will finish this up this weekend :)
(force-pushed from 5bfda51 to 3486779)
Didn't have time to do the docs or run the tests locally; I should have time tomorrow for that. I refactored how we do config options migration because it was quite verbose and copy-pasty. This will cause conflicts in the Redis rework branch, but I think it should be relatively easy to fix. Also moving all the options into a single
(force-pushed from 5d4e62b to 66c6124)
Just pushed the docs and cleaned up the commits. You can see the rendered docs here.
config/config.go (outdated)

```go
Concurrency        uint              `yaml:"concurrency" default:"4"`
MaxErrorsPerSource int               `yaml:"maxErrorsPerSource" default:"5"`
RefreshPeriod      Duration          `yaml:"refreshPeriod" default:"4h"`
StartStrategy      StartStrategyType `yaml:"startStrategy" default:"blocking"`
Downloads          DownloaderConfig  `yaml:"downloads"`
```
Some of the defaults were changed here.
I don't think we currently have a way to manually write changelog text, but if there's an easy way to do it, it would be nice to mention this.
(force-pushed from 1e99703 to abdd45c)
Are there any open points, or is this ready to be merged? Can you please rebase it? There is a conflict in go.mod.
(force-pushed from a23326d to c09049b)
I like the new implementation for config deprecations. Sadly, I currently have very little time, so I won't be able to test it. It would help if there was a proper Go development suite for Android. 😕
I'd prefer to merge this without squashing if possible, but I think the repo settings prevent that.
I think you only need to rebase your branch on master; it is not required to squash commits.
Deprecated settings use pointers so we can know whether they were actually set in the user's config. They are also nested in a struct, which ensures they aren't still used (any old code would fail to compile) and makes them easy for `migration.Migrate` to discover.

Replace `IsZero` with `IsAboveZero` to help us avoid this mistake again.
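A minimal sketch of that pointer-based deprecation pattern (field and function names here are illustrative, not blocky's actual config structs or `migration.Migrate` API):

```go
package main

import "fmt"

// Deprecated groups all removed options; pointer fields distinguish
// "unset" (nil) from "explicitly set in the user's config".
// Names are illustrative only.
type Deprecated struct {
	Filepath *string `yaml:"filepath"`
}

type Config struct {
	Sources    []string   `yaml:"sources"`
	Deprecated Deprecated `yaml:",inline"`
}

// migrate copies values from deprecated fields into their replacements
// and warns the user, mimicking what a migration helper would do.
func migrate(cfg *Config) {
	if old := cfg.Deprecated.Filepath; old != nil {
		fmt.Println("config option 'filepath' is deprecated, use 'sources' instead")
		cfg.Sources = append(cfg.Sources, *old)
	}
}

func main() {
	path := "/etc/hosts"
	cfg := Config{Deprecated: Deprecated{Filepath: &path}}
	migrate(&cfg)
	fmt.Println(cfg.Sources) // [/etc/hosts]
}
```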
(force-pushed from c09049b to 0312237)
Thanks, I rebased and there was a
Fixes #867