Add hashset-mode into ACL rules #627
Just do it. |
Shouldn't we support sub-domains? For example:
matches |
One of the collaborators indicates that |
Done. I've added an ad-hoc tree-like structure built on hashmaps. I'm not sure how fast it is relative to specialized collections, but in my (pathological) case it's hundreds of times faster than regexes. Now
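For readers who want a picture of what a hashmap-backed tree for sub-domain matching might look like, here is a minimal sketch. It is not the code from this PR: the `DomainTree` type, its fields, and its method names are hypothetical, and it assumes rules are plain domain names whose labels are stored from the TLD inward, so that a rule for `example.com` also covers `www.example.com`.

```rust
use std::collections::HashMap;

/// Hypothetical tree node: one level per domain label, keyed right to left
/// ("com" -> "example" -> "www").
#[derive(Default)]
struct DomainTree {
    children: HashMap<String, DomainTree>,
    /// True when a rule ends exactly at this node.
    terminal: bool,
}

impl DomainTree {
    /// Insert a rule such as "example.com", walking labels from the TLD inward.
    fn insert(&mut self, domain: &str) {
        let mut node = self;
        for label in domain.trim_end_matches('.').rsplit('.') {
            node = node
                .children
                .entry(label.to_ascii_lowercase())
                .or_default();
        }
        node.terminal = true;
    }

    /// Return true if `host` equals a stored rule or is a sub-domain of one.
    fn matches(&self, host: &str) -> bool {
        let mut node = self;
        for label in host.trim_end_matches('.').rsplit('.') {
            match node.children.get(&label.to_ascii_lowercase()) {
                // A rule ends here, so this host is the rule or below it.
                Some(child) if child.terminal => return true,
                Some(child) => node = child,
                None => return false,
            }
        }
        false
    }
}
```

Each lookup costs one hash probe per label, so the work depends on the depth of the host name rather than on the number of rules, which is the property that makes this approach attractive compared with scanning a large regex set.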
Ok. I'll leave this as is. Performance is already good enough for me. |
Excellent. |
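This is a big change. Is it compatible with the original ACL file format? |
Is there something wrong with this regex? |
Because Unicode character matching is turned off during regex matching, you cannot use |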
No, regex is fine. That's some quirk of |
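Thanks. May I ask whether enabling Unicode matching carries any hidden risks? |
The PR already explains this clearly: domain names never contain anything other than ASCII characters (if you want to match something like Turning off Unicode character matching speeds up matching. |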
Using unicode in regex matching will definitely waste memory because domain names must be ASCII characters only (punycode). #629 should fix this issue. |
I have about 320000 hostname rules, and with regexps both performance and memory consumption are abysmal. Hashsets are much faster and smaller in my case: memory decreased from 2.8GB to 35MB. It's possible to increase performance even further by replacing the default hasher with ahash, but since that requires adding a new package, I'll leave that decision to you.
I added hashset parsing in a backward-compatible fashion, so old rules will work without any changes. I also added an optimization to the regexp matcher: since all DNS records are encoded in ASCII, there is no need for Unicode support in regexes.
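To illustrate why a hashset scales better than a long list of regexes for exact hostname rules, here is a rough sketch using only the standard library. The `HostSet` type and its methods are hypothetical names chosen for this example, not the PR's actual API, and the sketch assumes each rule is a bare hostname that should match case-insensitively.

```rust
use std::collections::HashSet;

/// Hypothetical exact-match rule set: one hash probe per lookup,
/// regardless of how many rules were loaded.
struct HostSet {
    hosts: HashSet<String>,
}

impl HostSet {
    /// Build the set from plain hostname rules, skipping blank entries.
    fn from_rules<'a>(rules: impl IntoIterator<Item = &'a str>) -> Self {
        let hosts = rules
            .into_iter()
            .map(|rule| rule.trim().to_ascii_lowercase())
            .filter(|rule| !rule.is_empty())
            .collect();
        HostSet { hosts }
    }

    /// Hostnames are ASCII (punycode), so ASCII lowercasing is sufficient.
    fn contains(&self, host: &str) -> bool {
        self.hosts.contains(&host.to_ascii_lowercase())
    }
}
```

This sketch uses the default hasher. Swapping in a faster hasher such as ahash, as the description suggests, would change only the `HashSet` type parameter, at the cost of an extra dependency.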