Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add whitelist regex support #612

Merged
merged 24 commits into from
Sep 1, 2019
Merged

Add whitelist regex support #612

merged 24 commits into from
Sep 1, 2019

Conversation

DL6ER
Copy link
Member

@DL6ER DL6ER commented Jul 9, 2019

By submitting this pull request, I confirm the following (please check boxes, eg [X]) Failure to fill the template will close your PR:

Please submit all pull requests against the development branch. Failure to do so will delay or deny your request

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.

How familiar are you with the codebase?:

10


Implement regex support for the whitelist. Just as with blacklists, whitelist regex support is intended for few, specialized filters. For mass-whitelisting, the known exact whitelisting should be used*.

Regex whitelist filter will affect the entire blocking Pi-hole is performing:

  • Gravity: Domains from the blocklists that a match whitelist regex filter will not be imported.
  • Blacklist: Blacklisted domains that match a whitelist regex filter will not be imported.
  • Regex blacklist: Domains matching both, blacklist and whitelist regex filters will be permitted as the whitelist is given a higher importance in FTL.

Using whitelist regex filters can slow down the startup process of FTL, however it can also result in improved performance when fewer blocking domains have to be imported.

Performance examples on a NanoPi NEO:

  • No regex whitelist:
    Database (gravity): imported 114851 domains (took 3269.0 ms)  
    
  • One regex whitelist filter (not matching):
    Database (gravity): imported 114851 domains (took 3707.6 ms)
    
  • Five regex whitelist filter (none of them matching anything):
    Database (gravity): imported 114851 domains (took 4374.3 ms) 
    
  • One whitelist regex (\.com$, matching all .com addresses):
    Database (gravity): imported 86736 domains (took 2182.2 ms)
    
    Note the reduced number of imported domains and the reduced time it takes to import gravity.

The used not-matching filters are long filters of medium complexity (containing a few character groups and alternations).

This pull request also adds eight new tests around blocking and whitelisting.


*) Pi-hole v5.0 ships with a massively improved exact whitelist performance. Up to millions of domains on the whitelist are now possible even on low-end hardware.

DL6ER added 12 commits July 7, 2019 22:46
Signed-off-by: DL6ER <dl6er@dl6er.de>
Signed-off-by: DL6ER <dl6er@dl6er.de>
…ands of messages during gravity import.

Signed-off-by: DL6ER <dl6er@dl6er.de>
Signed-off-by: DL6ER <dl6er@dl6er.de>
…piled.

Signed-off-by: DL6ER <dl6er@dl6er.de>
…ex as well as gravity domains as expected.

Signed-off-by: DL6ER <dl6er@dl6er.de>
…compare against the whitelist table but also evaluates possible whitelist regex filters.

Signed-off-by: DL6ER <dl6er@dl6er.de>
…this routine is doing.

Signed-off-by: DL6ER <dl6er@dl6er.de>
… them globally available.

Signed-off-by: DL6ER <dl6er@dl6er.de>
…ilable. Remove log_regex() as it can easily be inlined.

Signed-off-by: DL6ER <dl6er@dl6er.de>
@DL6ER DL6ER added this to the v5.0 milestone Jul 9, 2019
@DL6ER DL6ER mentioned this pull request Jul 9, 2019
8 tasks
@DL6ER
Copy link
Member Author

DL6ER commented Jul 9, 2019

Discourse feature request: Wildcard and regex support for whitelisting

@DL6ER DL6ER marked this pull request as ready for review July 14, 2019 20:48
@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/filter-blocklist-on-update/21679/8

@dschaper
Copy link
Member

Is this a WIP?

@DL6ER
Copy link
Member Author

DL6ER commented Jul 15, 2019

No, it is ready for review (and possible discussions based thereon).

@AzureMarker
Copy link
Contributor

@DL6ER As this is ready to review, the core PR should also be made ready for review (un-draft).

src/database/gravity-db.c Outdated Show resolved Hide resolved
src/main.c Show resolved Hide resolved
src/regex.c Outdated Show resolved Hide resolved
src/regex.c Outdated Show resolved Hide resolved
test/gravity.db.sql Outdated Show resolved Hide resolved
…ing the latter to always end in _TABLE. Also remove code duplication by using a lookup array for the table names instead of using multiple select labels.

Signed-off-by: DL6ER <dl6er@dl6er.de>
…able. Added documentation to in_whitelist() that explains the idea of the chosen implementation method.

Signed-off-by: DL6ER <dl6er@dl6er.de>
…E option. We might want to separate this into DEBUG_GRAVITY_DATABASE in the future.

Signed-off-by: DL6ER <dl6er@dl6er.de>
Signed-off-by: DL6ER <dl6er@dl6er.de>
…This allows us to remove the error handler as it could never be executed.

Signed-off-by: DL6ER <dl6er@dl6er.de>
Signed-off-by: DL6ER <dl6er@dl6er.de>
Signed-off-by: DL6ER <dl6er@dl6er.de>
@AzureMarker AzureMarker dismissed their stale review July 17, 2019 03:08

Requested changes were made, will re-review once Core changes are finalized

@DL6ER DL6ER mentioned this pull request Jul 22, 2019
9 tasks
…(). We always finalize the statement afterwards - whether or not we encountered an error before.

Signed-off-by: DL6ER <dl6er@dl6er.de>
src/database/gravity-db.c Show resolved Hide resolved
@DL6ER DL6ER merged commit 0cc1dd1 into development Sep 1, 2019
@DL6ER DL6ER deleted the new/whitelist-regex-support branch September 1, 2019 12:23
@DL6ER DL6ER mentioned this pull request May 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants