Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement CIDR matching #42

Merged
merged 6 commits into from
Apr 15, 2022
Merged

Conversation

BenTheElder
Copy link
Member

@BenTheElder BenTheElder commented Apr 13, 2022

This is inspired by how the kernel does routing (with a LPC Trie), thanks @aojea for the pointer to https://vincent.bernat.ch/en/blog/2017-ipv4-route-lookup-linux

There is one existing go library for this at https://github.com/yl2chen/cidranger but it implements more of a set type (we need a map to AWS regions), doesn't use the newer net/netip types which are more efficient, etc.

We implement a reasonably efficient generic map type in relatively little code here (~260 lines TrieMap including comments, whitespace, license header etc.).

While the current implementation here lacks level and path compression, cidranger only implements path compression.
For AWS data memory usage is acceptable without compression (ballpark less than 1MB), while lookup performance is with the AWS IPv4 data is ~40ns / op for this implementation vs ~400ns / op for cidranger on my 16" intel macbook laptop in microbenchmarking (and both drastically better than brute force looping over CIDRs).

See: #42 (comment) with the full dataset ipv4 + ipv6 matching, performance is ~fine. We can continue optimizing in follow-ups as needed.

See: #39

/cc @dims @ameukam

@k8s-ci-robot k8s-ci-robot requested review from ameukam and dims April 13, 2022 17:46
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 13, 2022
mapping map[netip.Prefix]V
}

func (b *bruteForceMapper[V]) GetIP(addr netip.Addr) (value V, matched bool) {
Copy link
Member Author

@BenTheElder BenTheElder Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: named return values seem to be the easiest way to return a default generic type cleanly while also documenting the return values

otherwise we would have a local type like var value V to achieve the same thing, you cannot do say return nil, false

@aojea
Copy link
Member

aojea commented Apr 13, 2022

👁️ /cc

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 13, 2022
@BenTheElder BenTheElder force-pushed the cidr-mapping branch 2 times, most recently from 83149f9 to 2982e29 Compare April 13, 2022 22:34
@BenTheElder
Copy link
Member Author

preliminary benchmarking has brute force taking 150x time, even with the current relatively simplistic implementation.

it does take longer than i'd like to initialize the trie, we may want to pre-initialize it (expected, but didn't want to do that before measuring...)

@BenTheElder
Copy link
Member Author

NOTE: most of the lines in the diff are just pkg/net/cidrs/aws/internal/ranges2go/zz_generated_rawdata.go which is https://ip-ranges.amazonaws.com/ip-ranges.json stuffed into a go string by a small script. The actual diff is not that large.

@BenTheElder
Copy link
Member Author

BenTheElder commented Apr 14, 2022

go test -benchmem -run=^$ -coverprofile=$elided -bench . sigs.k8s.io/oci-proxy/pkg/net/cidrs/aws

goos: darwin
goarch: amd64
pkg: sigs.k8s.io/oci-proxy/pkg/net/cidrs/aws
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkNewAWSRegionMapper-16       	     786	   1487210 ns/op	  870320 B/op	   29875 allocs/op
BenchmarkNewAWSegionBruteForce-16    	   96062	     12387 ns/op	       8 B/op	       1 allocs/op
BenchmarkAWSRegionTrieMap-16         	10781480	       111.2 ns/op	       0 B/op	       0 allocs/op
BenchmarkAWSRegionBruteForce-16      	   76730	     14975 ns/op	       0 B/op	       0 allocs/op
PASS
coverage: 100.0% of statements
ok  	sigs.k8s.io/oci-proxy/pkg/net/cidrs/aws	5.771s

With the real AWS data loaded including IPv6 (unless I've somehow misread these results, I don't find myself benchmarking go all that often...)

~ 0.0015 s to initialize the TrieMap and ~ 0.87 MB memory overhead (BenchmarkNewAWSRegionMapper-16).
~ 111 ns/op lookup times (BenchmarkAWSRegionTrieMap-16).

We have room to improve these if need be, but they should be plenty acceptable already.

@BenTheElder BenTheElder force-pushed the cidr-mapping branch 2 times, most recently from 007a5e7 to 5545b69 Compare April 14, 2022 02:06
@BenTheElder BenTheElder changed the title [WIP] implement CIDR matching implement CIDR matching Apr 14, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 14, 2022
@BenTheElder
Copy link
Member Author

BenTheElder commented Apr 14, 2022

Also a conscious decision: we could fairly easily minify pkg/net/cidrs/aws/internal/ranges2go/zz_generated_rawdata.go but I chose not to do so, because when the data does change occasionally it will be easy to read the diff (we will see some new ranges inserted in some regions), versus it will not be so easy minified.

The file is not included at runtime anyhow, only in the repo. The raw data from AWS is pre-processed by ranges2go into pre-parsed CIDRs in a more useful format for our purposes (map[string][]netip.Prefix region : CIDRs), and that file is much smaller.

pkg/net/cidrs/triemap.go Outdated Show resolved Hide resolved
pkg/net/cidrs/triemap.go Outdated Show resolved Hide resolved
@BenTheElder
Copy link
Member Author

BenTheElder commented Apr 15, 2022

I think the unused lint flaked:

pkg/net/cidrs/triemap.go:53:2: field `keyToValue` is unused (unused)
	keyToValue map[int]V
	^
pkg/net/cidrs/bruteforce.go:34:2: field `mapping` is unused (unused)
	mapping map[V][]netip.Prefix
	^
pkg/net/cidrs/triemap.go:54:2: field `valueToKey` is unused (unused)
	valueToKey map[V]int

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_oci-proxy/42/pull-oci-proxy-verify/1514800357945905152

does not fail locally, and by trivial inspection these errors are not correct
/retest

@BenTheElder
Copy link
Member Author

/retest

@dims
Copy link
Member

dims commented Apr 15, 2022

/approve
/lgtm

thanks @BenTheElder looking forward to hooking up this to the main code base next!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 15, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, dims

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Development

Successfully merging this pull request may close these issues.

4 participants