feat: support multiple hosts files #998

ThinkChaos · 2023-04-18T02:05:06Z

Fixes #867

codecov · 2023-04-18T02:11:14Z

Codecov Report

Patch coverage: 93.08% and project coverage change: +0.22 🎉

Comparison is base (7c07de7) 93.55% compared to head (a23326d) 93.78%.

❗ Current head a23326d differs from pull request most recent head c09049b. Consider uploading reports for the commit c09049b to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #998      +/-   ##
==========================================
+ Coverage   93.55%   93.78%   +0.22%     
==========================================
  Files          63       65       +2     
  Lines        5323     5373      +50     
==========================================
+ Hits         4980     5039      +59     
+ Misses        268      260       -8     
+ Partials       75       74       -1

| Impacted Files | Coverage Δ | |
|---|---|---|
| cmd/root.go | `61.29% <0.00%> (ø)` | |
| server/server.go | `79.03% <71.42%> (+0.08%)` | ⬆️ |
| config/bytes_source.go | `88.13% <88.13%> (ø)` | |
| lists/list_cache.go | `95.65% <91.66%> (-2.89%)` | ⬇️ |
| resolver/hosts_file_resolver.go | `97.01% <91.66%> (-2.99%)` | ⬇️ |
| lists/sourcereader.go | `95.23% <95.23%> (ø)` | |
| config/config.go | `78.01% <97.50%> (+1.51%)` | ⬆️ |
| cmd/healthcheck.go | `100.00% <100.00%> (ø)` | |
| config/blocking.go | `100.00% <100.00%> (+11.36%)` | ⬆️ |
| config/caching.go | `87.50% <100.00%> (+37.50%)` | ⬆️ |
... and 7 more

... and 1 file with indirect coverage changes


go.mod

ThinkChaos · 2023-04-18T02:40:21Z

resolver/hosts_file_resolver.go

+	consumersGrp, ctx := jobgroup.WithContext(ctx)
+	defer consumersGrp.Close()
+
+	producersGrp := jobgroup.WithMaxConcurrency(consumersGrp, r.cfg.Loading.Concurrency)
+	defer producersGrp.Close()
+
+	producers := parcour.NewProducersWithBuffer[*HostsFileEntry](producersGrp, consumersGrp, producersBuffCap)
+	defer producers.Close()


Basically a JobGroup is a scope for goroutines: the defer grp.Close() ensures no goroutine continues running after the function returns. Groups can be nested: here producersGrp is a child of consumersGrp.
These are ideas from structured concurrency (highly recommend that post, it's one of the most influential CS articles I've read).

Having a clear scope for goroutines also allows for clear failure propagation:

- When explicitly checking for goroutine errors with Wait:
  - if any goroutines returned errors, it returns them (just like errgroup.Group)
  - if any goroutines panicked, it propagates the panic to the current goroutine
- When we leave the function without explicitly waiting for the goroutines, Close handles the failures:
  - if any goroutines returned errors, it propagates them to the parent JobGroup (or panics if there is no parent)
  - if any goroutines panicked, it propagates the panic to the current goroutine
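To make the scoping idea concrete, here is a minimal sketch using only the standard library. The `scope` type and its `Go`/`Wait` methods are hypothetical stand-ins written for this illustration, not the jobgroup API from the diff, and the parent propagation done by Close is left out.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// scope is a hypothetical stand-in for a job group: it tracks the goroutines
// started through it and collects their failures.
type scope struct {
	wg       sync.WaitGroup
	mu       sync.Mutex
	errs     []error
	panicked any // first recovered panic value, if any
}

// Go runs fn in a new goroutine owned by the scope.
func (s *scope) Go(fn func() error) {
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		defer func() {
			// Record a panic instead of crashing the whole process.
			if r := recover(); r != nil {
				s.mu.Lock()
				if s.panicked == nil {
					s.panicked = r
				}
				s.mu.Unlock()
			}
		}()
		if err := fn(); err != nil {
			s.mu.Lock()
			s.errs = append(s.errs, err)
			s.mu.Unlock()
		}
	}()
}

// Wait blocks until every goroutine has finished, re-panics in the caller if
// one of them panicked, and otherwise returns the joined errors (nil if none).
func (s *scope) Wait() error {
	s.wg.Wait()
	if s.panicked != nil {
		panic(s.panicked)
	}
	return errors.Join(s.errs...)
}

// loadAll starts one goroutine per source; none of them can outlive the call
// because Wait is reached on every return path.
func loadAll(sources []string) error {
	var s scope
	for _, src := range sources {
		src := src // capture per iteration (needed before Go 1.22)
		s.Go(func() error {
			if src == "" {
				return errors.New("empty source")
			}
			return nil
		})
	}
	return s.Wait()
}

func main() {
	fmt.Println(loadAll([]string{"a", "b"}))
	fmt.Println(loadAll([]string{"a", ""}))
}
```

The point of the sketch is only that the function's return is the boundary: errors and panics both surface in the goroutine that owns the scope.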

producers is an abstraction built on JobGroups that simplifies the pattern we have here: producing hosts file entries from sources in parallel and consuming them from a single goroutine.

The fact that producers uses the JobGroups we created means we know for sure it won't leave anything running since we're closing them before returning, and we can customize how those goroutines run.
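The producer/consumer shape described above can be sketched with plain channels. This is an illustration under assumptions, not the parcour implementation: `entry`, `produceAll` and the buffer size are made-up names and values, and `maxConcurrency` is assumed to be at least 1.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// entry is a hypothetical, simplified hosts file entry.
type entry struct {
	IP   string
	Host string
}

// produceAll emits the entries of every source from its own goroutine, with
// at most maxConcurrency producers active at once, and consumes them on the
// calling goroutine. maxConcurrency must be >= 1.
func produceAll(sources map[string][]entry, maxConcurrency int) []entry {
	out := make(chan entry, 16)                // buffered hand-off to the consumer
	sem := make(chan struct{}, maxConcurrency) // limits active producers

	var wg sync.WaitGroup
	for _, entries := range sources {
		entries := entries // capture per iteration (needed before Go 1.22)
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			for _, e := range entries {
				out <- e
			}
		}()
	}

	// Close the channel once all producers are done so the consumer loop ends.
	go func() {
		wg.Wait()
		close(out)
	}()

	var all []entry
	for e := range out { // single consumer
		all = append(all, e)
	}
	return all
}

func main() {
	sources := map[string][]entry{
		"local":  {{"127.0.0.1", "localhost"}},
		"remote": {{"10.0.0.1", "router"}, {"10.0.0.2", "nas"}},
	}
	all := produceAll(sources, 2)
	sort.Slice(all, func(i, j int) bool { return all[i].Host < all[j].Host })
	fmt.Println(all)
}
```

Because the consumer is the only goroutine appending to `all`, no locking is needed around the result.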

ThinkChaos · 2023-04-18T02:41:05Z

lists/list_cache.go


-	processingLinkJobs := len(links)
+	producersGrp := jobgroup.WithMaxConcurrency(unlimitedGrp, b.processingConcurrency)


This limit applies to the whitelist and blacklist independently.
With the new code it should be easy to make both share the same limit. That seems like more user-friendly behavior to me, since there's a single option in the config. Do you think I should make that change?

I think, one limit for both is good enough

Do you mean the current code is fine?
TBH, to me it feels like we're not really respecting the user's choice here. But that's already the case, so if you want to keep it as is, it's fine by me.
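For reference, sharing one limit between the two lists could look roughly like the sketch below, which uses a single buffered channel as a semaphore. `maxRunning` and its parameters are hypothetical names for this example; the real code would pass a shared job group instead.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// maxRunning loads both lists through one shared semaphore and reports the
// highest number of jobs that were running at the same time.
func maxRunning(whitelist, blacklist []string, limit int) int32 {
	sem := make(chan struct{}, limit) // one semaphore for both lists
	var running, peak atomic.Int32
	var wg sync.WaitGroup

	start := func(links []string) {
		for range links {
			wg.Add(1)
			go func() {
				defer wg.Done()
				sem <- struct{}{}        // acquire a shared slot
				defer func() { <-sem }() // release it last

				n := running.Add(1)
				defer running.Add(-1)
				// Record the peak with a compare-and-swap loop.
				for p := peak.Load(); n > p; p = peak.Load() {
					if peak.CompareAndSwap(p, n) {
						break
					}
				}
				time.Sleep(time.Millisecond) // stand-in for downloading and parsing
			}()
		}
	}

	start(whitelist)
	start(blacklist)
	wg.Wait()
	return peak.Load()
}

func main() {
	white := []string{"w1", "w2", "w3"}
	black := []string{"b1", "b2", "b3"}
	// With a shared limit of 2, the reported peak never exceeds 2, even
	// though six jobs are queued across the two lists.
	fmt.Println(maxRunning(white, black, 2))
}
```

With independent limits, each list would get its own `sem`, so up to twice the configured number of jobs could run at once.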

config/blocking.go

config/config.go

go.mod

0xERR0R · 2023-04-19T14:06:30Z

lists/list_cache.go


-	processingLinkJobs := len(links)
+	producersGrp := jobgroup.WithMaxConcurrency(unlimitedGrp, b.processingConcurrency)


I think, one limit for both is good enough

config/config.go

ThinkChaos · 2023-04-20T01:20:56Z

resolver/hosts_file_resolver.go

-	if err := r.parseHostsFile(context.Background()); err != nil {
-		r.log().Errorf("disabling hosts file resolving due to error: %s", err)
+		downloader: lists.NewDownloader(cfg.Loading.Downloads, bootstrap.NewHTTPTransport()),
+	}

-		r.cfg.Filepath = "" // don't try parsing the file again
-	} else {
-		go r.periodicUpdate()
+	err := cfg.Loading.StartPeriodicRefresh(r.loadSources, func(err error) {
+		r.log().WithError(err).Errorf("could not load hosts files")
+	})
+	if err != nil {
+		return nil, err
 	}


The hosts file resolver will now behave like the blocking resolver and have a startStrategy.
This is different from before: previously, we never retried parsing the file if it failed at initialization.

I think this makes more sense now that we support remote files and multiple sources. Do you agree?

Yes, it definitely makes more sense. This should be documented properly to avoid confusion.

This would also make more sense in the changelog IMO. I'm not resolving this yet: if that's not easy to do, I can add it to the docs.
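The behavior change discussed in this thread can be illustrated with a small sketch. Everything here is an assumption made for the example: the strategy names, the `startPeriodicRefresh` signature, and treating a non-positive period as "no periodic refresh" are not taken from the actual config package.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// startStrategy is a hypothetical enum for this sketch.
type startStrategy int

const (
	blocking    startStrategy = iota // initial error is reported, startup continues
	failOnError                      // initial error is returned to the caller
)

// startPeriodicRefresh runs refresh once before returning, then again every
// period until ctx is cancelled. A failed refresh no longer disables the
// resolver: it is reported through onErr and retried on the next tick.
func startPeriodicRefresh(
	ctx context.Context,
	strategy startStrategy,
	period time.Duration,
	refresh func(context.Context) error,
	onErr func(error),
) error {
	if err := refresh(ctx); err != nil {
		if strategy == failOnError {
			return err
		}
		onErr(err)
	}

	if period <= 0 {
		return nil // assumption: non-positive period disables periodic refresh
	}

	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				if err := refresh(ctx); err != nil {
					onErr(err)
				}
			}
		}
	}()
	return nil
}

// tryStart runs a refresh that always fails under the given strategy and
// returns how often it ran before startup finished, plus the startup error.
func tryStart(strategy startStrategy) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	calls := 0
	err := startPeriodicRefresh(ctx, strategy, 0, func(context.Context) error {
		calls++
		return errors.New("source unreachable")
	}, func(error) {})
	return calls, err
}

func main() {
	for _, s := range []startStrategy{blocking, failOnError} {
		calls, err := tryStart(s)
		fmt.Println(calls, err)
	}
}
```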

config/config.go

ThinkChaos · 2023-05-05T22:34:20Z

Will finish this up this week-end :)

ThinkChaos · 2023-05-07T22:25:20Z

Didn't have time to do the docs or run the tests locally, I should have time tomorrow for that.

I refactored how we do config options migration cause it was quite verbose and copy-pasty. This will cause conflicts in the Redis rework branch, but I think it should be relatively easy to fix.
The "x and y are both configured" and "please use x instead" logs are automatically handled once you define the migrations with the new DSL.
The pattern that branch introduces of returning an error from RedisConfig.validateConfig can be supported by making validation a second pass after migration.

Also moving all the options into a single Deprecated struct is nice cause it prevents old code from compiling, so you're guaranteed to catch all usage of old options.
That's how I noticed the old HTTP port option was used in cmd/root.go.
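As a rough illustration of why nesting the old options in one struct with pointer fields helps, consider the sketch below. It is not the migration DSL from this PR: `deprecated`, `hostsConfig`, `migrate` and the option names `filePath` and `sources` are hypothetical names chosen for the example.

```go
package main

import "fmt"

// deprecated groups the old options. Pointer fields distinguish "not set"
// (nil) from "set to the zero value". All names here are hypothetical.
type deprecated struct {
	HostsFilePath *string
}

// hostsConfig holds the new option and, nested, the deprecated ones. Old code
// that still reads cfg.HostsFilePath directly would no longer compile.
type hostsConfig struct {
	Sources    []string
	Deprecated deprecated
}

// migrate moves a deprecated option that was set to its replacement and
// returns the log messages a real implementation would emit.
func migrate(cfg *hostsConfig) []string {
	var logs []string

	old := cfg.Deprecated.HostsFilePath
	if old == nil {
		return logs // option was never set, nothing to do
	}

	if len(cfg.Sources) > 0 {
		logs = append(logs, "filePath and sources are both configured, ignoring filePath")
	} else {
		cfg.Sources = []string{*old}
		logs = append(logs, "filePath is deprecated, please use sources instead")
	}

	cfg.Deprecated.HostsFilePath = nil
	return logs
}

func main() {
	path := "/etc/hosts"
	cfg := hostsConfig{Deprecated: deprecated{HostsFilePath: &path}}

	for _, msg := range migrate(&cfg) {
		fmt.Println(msg)
	}
	fmt.Println(cfg.Sources)
}
```

A validation pass, as mentioned for the Redis branch, would then run on the migrated `hostsConfig` only.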

ThinkChaos · 2023-05-12T02:15:29Z

Just pushed the docs, and cleaned up the commits.

You can see the rendered docs here.

ThinkChaos · 2023-05-12T02:17:51Z

config/config.go

+	Concurrency        uint              `yaml:"concurrency" default:"4"`
+	MaxErrorsPerSource int               `yaml:"maxErrorsPerSource" default:"5"`
+	RefreshPeriod      Duration          `yaml:"refreshPeriod" default:"4h"`
+	StartStrategy      StartStrategyType `yaml:"startStrategy" default:"blocking"`
+	Downloads          DownloaderConfig  `yaml:"downloads"`


Some of the defaults were changed here.
I don't think we currently have a way to manually write changelog text, but if there's an easy way to do it, it would be nice to mention this.

0xERR0R · 2023-06-19T20:36:35Z

Are there any open points, or is it ready to be merged? Could you please rebase it? There is a conflict in go.mod.

kwitsch · 2023-06-20T05:35:25Z

I like the new implementation for config deprecations. ♥️

Sadly, I currently have very little time, so I won't be able to test it. It would help if there was a proper Go development suite for Android. 😕

ThinkChaos · 2023-07-05T21:30:52Z

I'd prefer to merge this without squashing if possible, but I think the repo settings prevent that.
Would you want to change the settings to allow that, or should I just squash and merge?

0xERR0R · 2023-07-06T06:10:23Z

> I'd prefer to merge this without squashing if possible, but I think the repo settings prevent that. Would you want to change the settings to allow that, or should I just squash and merge?

I think you only need to rebase your branch on master. Squashing the commits is not required.

Deprecated settings use pointers so we can tell whether they are actually set in the user config. They are also nested in a struct, which ensures they aren't still used (any old code would fail to compile) and makes them easily discoverable by `migration.Migrate`.

Replace `IsZero` with `IsAboveZero` to help us avoid this mistake again.
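A small sketch of the mistake this guards against, assuming a duration wrapper type. The `duration` type below is hypothetical and only mirrors the two method names mentioned in the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// duration is a hypothetical wrapper around time.Duration for config values.
type duration time.Duration

// IsZero reports whether d is exactly zero. Using !IsZero() as an "enabled"
// check wrongly accepts negative values.
func (d duration) IsZero() bool { return d == 0 }

// IsAboveZero reports whether d is strictly positive.
func (d duration) IsAboveZero() bool { return d > 0 }

func main() {
	values := []duration{duration(4 * time.Hour), 0, duration(-time.Minute)}
	for _, d := range values {
		// Columns: value, result of !IsZero(), result of IsAboveZero().
		fmt.Println(time.Duration(d), !d.IsZero(), d.IsAboveZero())
	}
}
```

Only the negative value distinguishes the two checks, which is why it is easy to miss.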

ThinkChaos · 2023-07-07T13:17:34Z

Thanks, I rebased and there was a go.sum conflict. So I guess that's why GitHub didn't want to do it.
The WebUI didn't mention that, or I missed it. Anyways, merged it :)

ThinkChaos commented Apr 18, 2023


0xERR0R requested changes Apr 19, 2023


ThinkChaos commented Apr 20, 2023


0xERR0R added this to the v0.22 milestone May 4, 2023

0xERR0R added the 🔨 enhancement New feature or request label May 4, 2023

ThinkChaos force-pushed the feat/multiple-hosts-sources branch from 5bfda51 to 3486779 Compare May 7, 2023 22:14

ThinkChaos force-pushed the feat/multiple-hosts-sources branch from 5d4e62b to 66c6124 Compare May 12, 2023 02:09

ThinkChaos commented May 12, 2023


ThinkChaos mentioned this pull request May 19, 2023

[Feature Request] Blocky should start resolving DNS traffic using available upstream ASAP #835

Closed

ThinkChaos force-pushed the feat/multiple-hosts-sources branch from 1e99703 to abdd45c Compare May 21, 2023 18:14

ThinkChaos force-pushed the feat/multiple-hosts-sources branch from a23326d to c09049b Compare June 19, 2023 23:54

0xERR0R approved these changes Jun 30, 2023


ThinkChaos added 6 commits July 7, 2023 09:04

fix: parse the API URL using the non-deprecated options

63385c3

feat: support multiple hosts files

7e8dfeb

fix: duration checks to take into account values can be negative

9c63aed

Replace `IsZero` with `IsAboveZero` to help us avoid this mistake again.

style: fix all existing lint errors

41fa334

ci: deploy docs on forks if they have pages enabled

0312237

ThinkChaos force-pushed the feat/multiple-hosts-sources branch from c09049b to 0312237 Compare July 7, 2023 13:04

ThinkChaos merged commit 2bd5948 into 0xERR0R:main Jul 7, 2023

ThinkChaos deleted the feat/multiple-hosts-sources branch July 7, 2023 13:17

ThinkChaos mentioned this pull request Dec 4, 2023

[Doc] config maxErrorsPerFile is missing from documents #1289

Closed
