Fast HTTP match #732
This was referenced May 13, 2017
Closed
krizhanovsky modified the milestones: 0.5 alpha, 1.0 Tempesta OS, 0.10 Kernel-User Space Transport (Feb 5, 2018)
krizhanovsky modified the milestones: 0.10 Kernel-User Space Transport, 0.9 Web server (Feb 10, 2018)
krizhanovsky added a commit that referenced this issue (Nov 24, 2018)
HTTP Load balancing
Separated from #76, in particular from #76 (comment). A faster implementation of HTTP field matching is required for HTTP load balancing and filtering. One option is a hash table that allows a quick jump to a rule by its key, where the key is calculated from the string and the ID of the HTTP field. Additionally or alternatively, the BNDM with q-grams (BG) algorithm can be used to quickly process many strings with a common prefix.
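To make the hash table option concrete, here is a minimal user-space sketch, not Tempesta FW code. The names `struct rule`, `rule_add()`, `rule_lookup()`, the bucket count, and the integer field IDs are all assumptions made for illustration. The key is hashed from the field ID followed by the string bytes, so exact-match rules are found with one bucket probe instead of a scan over all rules.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RULE_BUCKETS 1024 /* hypothetical size, must be a power of two */

/* Hypothetical rule: matches one exact value of one HTTP field. */
struct rule {
    int          field_id; /* assumed numeric ID of the HTTP field */
    const char  *value;    /* exact string to match */
    int          backend;  /* action: index of a backend group */
    struct rule *next;     /* collision chain within a bucket */
};

static struct rule *buckets[RULE_BUCKETS];

/* FNV-1a-style mix over the field ID and then the string bytes. */
uint32_t rule_hash(int field_id, const char *s, size_t len)
{
    uint32_t h = 2166136261u;

    h = (h ^ (uint32_t)field_id) * 16777619u;
    for (size_t i = 0; i < len; i++)
        h = (h ^ (unsigned char)s[i]) * 16777619u;
    return h & (RULE_BUCKETS - 1);
}

/* Inserts a rule at the head of its bucket chain. */
void rule_add(struct rule *r)
{
    uint32_t b = rule_hash(r->field_id, r->value, strlen(r->value));

    r->next = buckets[b];
    buckets[b] = r;
}

/* Returns the backend index of the matching rule, or -1 if none matches. */
int rule_lookup(int field_id, const char *s, size_t len)
{
    uint32_t b = rule_hash(field_id, s, len);

    for (struct rule *r = buckets[b]; r; r = r->next)
        if (r->field_id == field_id && strlen(r->value) == len
            && memcmp(r->value, s, len) == 0)
            return r->backend;
    return -1;
}
```

This only covers exact matches. Prefix, suffix, and wildcard rules would still need a different structure, which is where BG or a tree comes in.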
Issue #76 deals with a massive number of backend servers:
Currently all 1000 and more match rules are matched sequentially. The example is quite realistic for massive hosting installations. The BG algorithm implemented in #901 must be applied to the matching. The matching syntax should probably be adjusted like this (with #731 in mind):
HTTPtables
Strings matching
Also, the use case from #731 must be processed in a more efficient way, e.g. using a hash table or a tree:
Memory spatial locality
At the moment kzalloc() is used a lot during the configuration phase, so spatial locality at run time can be improved by using more local data structures.
The chains
Currently HTTPtables sequentially scans all the rules in a chain, which isn't efficient. The first option is to run only one match per header using multi-pattern matching. There are probably other optimization opportunities as well.
We need some use cases with large chains to understand the typical workload, i.e. whether there are cases with many patterns for the same header or whether most matchers target different headers.
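As an illustration of a single per-header match over many patterns, the sketch below compiles all prefix patterns for one header into a trie and finds the longest matching pattern in one pass over the header value. It is a user-space sketch under assumptions: prefix semantics, a fixed-size node pool, and the names `trie_init()`, `trie_add()`, and `trie_match()` are all hypothetical and not part of HTTPtables. A trie is used here as a simple stand-in for multi-pattern matching; it is not the BG algorithm.

```c
#include <stddef.h>
#include <string.h>

#define TRIE_MAX_NODES 1024 /* hypothetical fixed pool size */

struct trie_node {
    int child[256]; /* pool index of the child per byte, 0 = no child */
    int rule;       /* index of the rule ending at this node, -1 = none */
};

/* Node 0 is the root. One contiguous pool keeps the nodes close in memory. */
static struct trie_node pool[TRIE_MAX_NODES];
static int pool_used;

void trie_init(void)
{
    memset(pool, 0, sizeof(pool));
    pool[0].rule = -1;
    pool_used = 1;
}

/* Adds a prefix pattern. Returns 0 on success, -1 if the pool is full. */
int trie_add(const char *pat, int rule)
{
    int n = 0;

    for (; *pat; pat++) {
        unsigned char c = (unsigned char)*pat;

        if (!pool[n].child[c]) {
            if (pool_used == TRIE_MAX_NODES)
                return -1;
            pool[pool_used].rule = -1;
            pool[n].child[c] = pool_used++;
        }
        n = pool[n].child[c];
    }
    pool[n].rule = rule;
    return 0;
}

/* Returns the rule of the longest pattern that prefixes val, or -1. */
int trie_match(const char *val, size_t len)
{
    int n = 0, best = pool[0].rule;

    for (size_t i = 0; i < len; i++) {
        n = pool[n].child[(unsigned char)val[i]];
        if (!n)
            break; /* no pattern continues with this byte */
        if (pool[n].rule >= 0)
            best = pool[n].rule;
    }
    return best;
}
```

The cost of a lookup depends on the length of the header value rather than on the number of rules, and the single pool allocation is one way to address the spatial locality concern above. Whether this pays off depends on how many patterns actually share a header, which is the workload question raised here.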
Generic strings matching
Actually, Tempesta FW is full of multiple string matching. E.g. the caching policy for content type suffixes is evaluated with a FOR loop in tfw_capolicy_match(), while a powerful web resource can have a lot of various suffixes: aif, aiff, au, avi, bin, bmp, cab, carb, cct, cdf, class, css, doc, dcr, dtd, gcf, gff, gif, grv, hdml, hqx, ico, ini, jpeg, jpg, js, mov, mp3, nc, pct, ppc, pws, swa, swf, txt, vbs, w32, wav, wbmp, wml, wmlc, wmls, wmlsc, xsd, zip.
Testing
Functional tests
TBD
Performance
We need a solid estimate of the number of rules and/or chains at which performance significantly degrades.
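One rough way to get a first estimate is a user-space micro-benchmark of the sequential scan with a growing number of synthetic rules. The sketch below is only that: the rule format, the names `make_rules()`, `scan()`, and `bench()`, and the use of `clock()` are assumptions for illustration, and the numbers it produces would not reflect in-kernel behavior or real traffic.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_RULES 100000

/* Synthetic exact-match patterns in a hypothetical format. */
static char rules[MAX_RULES][32];

void make_rules(int n)
{
    for (int i = 0; i < n; i++)
        snprintf(rules[i], sizeof(rules[i]), "host%d.example.com", i);
}

/* Sequential scan over the first n rules. Returns the rule index or -1. */
int scan(const char *val, int n)
{
    for (int i = 0; i < n; i++)
        if (strcmp(rules[i], val) == 0)
            return i;
    return -1;
}

/* Average CPU seconds per lookup of the last rule (the worst-case hit). */
double bench(int n, int iters)
{
    volatile int sink = 0; /* keeps the compiler from dropping the loop */
    clock_t t0 = clock();

    for (int i = 0; i < iters; i++)
        sink += scan(rules[n - 1], n);
    return (double)(clock() - t0) / CLOCKS_PER_SEC / iters;
}
```

Calling `make_rules(MAX_RULES)` once and then `bench(n, iters)` for n = 10, 100, 1000, and so on gives a curve of lookup cost against rule count. The same harness could be pointed at a hash table or tree lookup for comparison.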