feat(core/dns): new dns client library #12305
Merged
126 commits
6192fb9
refactor(dns): new dns client library
chobits 6d29383
add files to kong-3.7.0-0.rockspec
chobits 8d1e466
30-new-dns-client/02-old_client_spec.lua: use CI nameserver instead
chobits bf40be8
return last answer error if no available answers
chobits a46b3c7
set _G.busted_legacy_dns_client for original 21-dns-client/ tests
chobits b40e3ec
chores: better comment
chobits b40e417
add changelog
chobits 78d54a3
automatically refresh stale-but-in-use records after @stale_refresh_i…
chobits 07fbc34
revert "automatically refresh stale-but-in-use records after @stale_r…
chobits e77bd8d
add kong_dns_cache{_miss} shared dict into templates/nginx_kong.lua
chobits 2ed541e
only purge cache for test cases
chobits 18c9c65
use kong.worker_events instead of mlcache shm based ipc
chobits 8d22bdf
remove debug log for @tries
chobits 3a7104d
support req_dyn_hook.run_hooks
chobits 01dfaf2
supports __tostring of @tries table (error list)
chobits 157650b
coding style: use a 2-space indentation and localized some variables
chobits cb9293c
coding style: change table.insert to table_insert
chobits 1340b97
fix test case of stale updating task
chobits 3b4f0c6
fix bug: should insert `nil` value as missed data into mlcache
chobits 0c3f595
simplify injecting resolver.query logic in tests 30-new-dns-client/*
chobits 904f8b3
optimize cache inserting logic to avoid unnecessary IPC to broadcast …
chobits 8877f9e
avoid running callback from local worker's events and add tests for IPC
chobits 0338f1b
coding style fix and keep the error string consistent with the previo…
chobits a8cc892
fix shared_dict shm size
chobits c7f8cba
fix typo: CACHE_ONLY_MISS_ANSWERS -> CACHE_ONLY_ANSWERS
chobits cb3b499
fix test case: 01-request-debug_spec.lua: dns cache hit
chobits 50e253c
copy the provided opts table with new function copy_options
chobits 1812538
create timer using a static function instead of recreating closures
chobits 4a954c9
add constant LONG_LASTING_TTL for 10 years ttl value
chobits 92b40df
add comment for maximum TTL value: 0xffffffff
chobits 7acc0e0
fix coding styles and add more comments
chobits c57a449
add comment for sleep(0.2) in 04-round_robin_spec.lua
chobits 89882dd
coding style: removed unnecessary blank line in 04-round_robin_spec.lua
chobits de35aff
fixed flakiness of stale updating test case in 03-old_client_cache_sp…
chobits 49846e3
fix error message and update test case titles
chobits 6e3fcdf
fix bug that stale records will be not updated if querying nameserver…
chobits cb1781d
compatible with original dns client: skip the SRV record pointing to …
chobits a9916e5
revert shm_miss feature, which makes source code more complex
chobits 4a5516c
support admin API "/dns" to get statistics
chobits c4eedb7
fix lint error
chobits 936a25d
complete the release file: refactor_dns_client.yml
chobits ada4b54
chore: assign TYPE_LAST to _M.TYPE_LAST instead of -1
chobits 5d124db
Update release file
chobits 8f15f13
fix text of `dns_no_sync` option in refactor_dns_client.yml
chobits 6cc8d4e
process the scenario of timeout=0 in /etc/resolv.conf
chobits 39d18cb
chores(*): fix coding style; add comments; make constant records read…
chobits 1e19818
add a comment to explain of the concurrenct control of asynchronous t…
chobits 0d63172
fixed lock_timeout: r:query() has two IO operations send() & receive()
chobits b7a6ccc
automatically refresh stale-but-in-use records every 60s triggered by…
chobits e484ecf
added kong/resty/dns_client/README.md
chobits ad2cdb5
change statistics API path from /dns to /status/dns
chobits b0afa32
d11y: add key-value "query_last_time": "<unixtime> <duration>" into s…
chobits 3e533f7
fixed markdown format of kong/resty/dns_client/README.md
chobits 04e5a72
fix format for kong/resty/dns_client/README.md
chobits 47bae5f
add debug logs
chobits e789fc6
fix refactor_dns_client.yml to make it more user-friendly
chobits 2ccc33d
chore: use string_lower instead of <var>:lower() for debugging
chobits c3bcbcf
chores: refactor variable names
chobits 223ac01
fixed coding style(add spaces) and fix resolv.options.timeout checking
chobits e9a6485
move ip address answers generating logic into cache:get callback
chobits 6668b7c
modify some table_insert to "t[i] = v" and check order instead of che…
chobits 1ceb711
use empty table for opts as default value in _M.new()
chobits 23f0dc5
perf: return body directly instead of creating a local variable
chobits f3cca27
fix status code to 501 if dns stats not implemented for API "/status/…
chobits 22bb755
perf: convert variables (localhosts/empty_answers) to constants
chobits 4ed3f79
perf: firstly check for tailing dot in is_fqdn
chobits 408bbaf
chore: better comment for parseResolvConf-TODO
chobits c75e7d6
ensure valid_ttl doesn't exceed maximum ttl 0xffffffff
chobits 5b758b2
chore: rename get_round_robin_answers to get_next_round_robin_answers
chobits bf5f756
perf: dont use table as input parameters for APIs and add a new API `…
chobits 10be035
README.md: add apis `resolve_address` and and fix format
chobits f28ee49
perf: convert some variables local constants
chobits 5242043
improve readability: list _M.TYPE_XXX value directly
chobits e9d570f
refactor function name and fix lint issue
chobits 14fe836
refactor function names for better test
chobits 6cba090
chores: do not check for r.destroy before using it
chobits 07a2f75
1
chobits 59fa5f3
move library path to kong/dns
chobits 81746b8
mark it TODO to convert ipc to a module contant
chobits bb436a4
use do-end block to wrap init_hosts and insert_answer_into_cache
chobits 45d6d86
add comments and test cases for API utils.ipv6_bracket
chobits 46f68b5
remove unused kong.tools.utils requirement in test cases
chobits d6847bc
add comments for cwid checking and hosts
chobits 011255b
chores: fix some coding styles
chobits 4c224b4
Update kong/dns/README.md: remove `the` word
chobits 7793247
remove use of readonly function for cached DNS records
chobits 4eaace2
fix coding style: localize SWRR logic
chobits 4f31f16
re-insert hosts entries to cache if it is evicted
chobits 75f45c8
chores: remove = aligning
chobits 47a895d
remove empty table creation in hot code paths
chobits 2cb279c
fix lint error: resolve_names -> resolved_names
chobits 4727cf7
chores: fix a couple of missing localizations
chobits 7c5fd97
fix opts initialization in _M.init()
chobits 444f47b
remove local variable options for r:query
chobits 96f9329
avoid checking for `ngx.ctx.has_timing` in recursion
chobits afdfcb6
use `legacy_dns_client` switch to check if we need to reply 501 in /s…
chobits b1fa80a
added debug log for EE test cases
chobits 8ebc538
chores: fixed lines exceeding 80 characters by a large margin
chobits 95fc78e
compatible with the modified req dyc debug API
chobits 22cd510
remove the logic of CNAME and recursive detection
chobits 03aefed
remove LAST type logic
chobits 6958eb2
only use error_ttl, remove empty_ttl logic
chobits f9911ea
fix type in readme.md
chobits 34783db
change paths of test cases directory
chobits a0786a9
set legacy_dns_client off for some cases
chobits 12c63fb
update changelog yml
chobits a50282d
disable additional section & add tests
chobits 91c896c
further simplify code: either query A/AAAA or SRV
chobits 3504b26
revert pathes modification for conflicts
chobits 602ff4f
fix health check tests for SRV
chobits 654c776
fix /status/dns test cases
chobits d0d196c
chores(dns): fixed coding style
chobits 350f2bd
chores(dns): fixed coding style: MT -> _MT
chobits 91b1f2d
@chobits chores(dns): fixed coding style: remove () from srv port
chobits b6955f0
chores(dns): fix coding style
chobits 67813e2
chores(test): fix typo, return `ttl` instead of `tries`
chobits dd26cf2
fix conflicts: remove modification in test: 01-instrumentations_spec.lua
chobits b69ec7f
fix conflicts and its tests
chobits 97705b9
chores(dns/README.md): fixed types
chobits 28770c0
perf(dns): reduce table creation
chobits 593f4ed
fixed coding styles: add more blanks and rename some variables
chobits b0d5455
add option:random_resolver and fixed docs
chobits 7ce9599
change seperator from `:` to `|` in the output of API /status/dns
chobits 37cf30d
add a TODO for more structured `tries`
chobits 48994bc
doc: perf test for memory consumption
chobits 1f0bc17
stale_ttl: fix expired time caculation
chobits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
message: > | ||
  Starting from this version, a new DNS client library has been implemented and added into Kong. The new DNS client library has the following changes: | ||
  - Introduced global caching for DNS records across workers, significantly reducing the query load on DNS servers. | ||
  - Introduced observable statistics for the new DNS client, and a new Admin API `/status/dns` to retrieve them. | ||
  - Deprecated the `dns_no_sync` option. Multiple DNS queries for the same name will always be synchronized (even across workers). This option remains functional with the legacy DNS client library. | ||
  - Deprecated the `dns_not_found_ttl` option. The new client uses the `dns_error_ttl` option for all error responses. This option remains functional with the legacy DNS client library. | ||
  - Deprecated the `dns_order` option. By default, SRV, A, and AAAA are supported. Only names in the SRV format (`_service._proto.name`) enable resolving of DNS SRV records. | ||
type: feature | ||
scope: Core |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,174 @@ | ||
Name | ||
|
||
==== | ||
|
||
Kong DNS client - The module is currently only used by Kong, and builds on top of the `lua-resty-dns` and `lua-resty-mlcache` libraries. | ||
|
||
Table of Contents | ||
================= | ||
|
||
* [Name](#name) | ||
* [APIs](#apis) | ||
  * [new](#new) | ||
  * [resolve](#resolve) | ||
  * [resolve_address](#resolve_address) | ||
* [Performance characteristics](#performance-characteristics) | ||
  * [Memory](#memory) | ||
|
||
# APIs | ||
|
||
|
||
The following APIs are for internal development use within Kong only. In the current version, the new DNS library must remain compatible with the original DNS library, so the functions listed below cannot be invoked directly. For example, the `_M:resolve` function described below is replaced with one that follows the `_M.resolve` interface of the previous DNS library. | ||
|
||
## new | ||
|
||
**syntax:** *c, err = dns_client.new(opts)* | ||
**context:** any | ||
|
||
**Functionality:** | ||
|
||
Creates a DNS client object. Returns `nil` and a message string on error. | ||
|
||
Performs a series of initialization operations: | ||
|
||
* parse the `hosts` file, | ||
* parse the `resolv.conf` file (used by the underlying `lua-resty-dns` library), | ||
* initialize multiple TTL options, | ||
* create an mlcache object and initialize it. | ||
|
||
**Input parameters:** | ||
|
||
`@opts`: an options table. The following options are supported: | ||
|
||
* TTL options: | ||
  * `valid_ttl`: (default: `nil`) | ||
    * By default, answers are cached using the TTL value of the response. This optional parameter (in seconds) overrides it. | ||
  * `stale_ttl`: (default: `3600`) | ||
    * The time in seconds for keeping expired DNS records. | ||
    * Stale data remains in use from when a record expires until either the background refresh query completes or `stale_ttl` seconds have passed. This helps Kong stay resilient if the DNS server is temporarily unavailable. | ||
  * `error_ttl`: (default: `1`) | ||
    * The time in seconds for caching DNS error responses. | ||
* `hosts`: (default: `/etc/hosts`) | ||
  * The path of the `hosts` file. | ||
* `resolv_conf`: (default: `/etc/resolv.conf`) | ||
  * The path of the `resolv.conf` file. It is parsed and passed into the underlying `lua-resty-dns` library. | ||
* `family`: (default: `{ "SRV", "A", "AAAA" }`) | ||
  * The types of DNS records that the library should query. It is taken from the `kong.conf` option `dns_family`. | ||
* Options for the underlying `lua-resty-dns` library: | ||
  * `retrans`: (default: `5`) | ||
    * The total number of times the DNS request is retransmitted when waiting for a response times out (see `timeout`). On each retransmission, the next nameserver is picked according to the round-robin algorithm. | ||
    * If not given, it is taken from the `resolv.conf` option `options attempts:<value>`. | ||
  * `timeout`: (default: `2000`) | ||
    * The time in milliseconds to wait for the response to a single attempt of request transmission. | ||
    * If not given, it is taken from the `resolv.conf` option `options timeout:<value>`. Note that the unit in `resolv.conf` is seconds. | ||
  * `random_resolver`: (default: `false`) | ||
    * A boolean flag that controls whether the first nameserver to query is picked at random. If `true`, queries always start with a random nameserver. | ||
    * If not given, it is taken from the `resolv.conf` option `rotate`. | ||
  * `nameservers`: | ||
    * A list of nameservers to be used. Each nameserver entry can be either a single hostname string or a table holding both the hostname string and the port number. For example, `{"8.8.8.8", {"8.8.4.4", 53} }`. | ||
    * If not given, it is taken from the `resolv.conf` option `nameserver`. | ||
* `cache_purge`: (default: `false`) | ||
  * A boolean flag that controls whether to clear the internal cache shared by other DNS client instances across workers. | ||
|
||
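As an illustration of the options above, the following is a minimal sketch of an options table that spells out the documented defaults. The module path `kong.dns.client` is an assumption based on this PR moving the library to `kong/dns`; it is loaded with `pcall` so the snippet also runs outside Kong. Whether `new` can be called directly depends on the compatibility layer mentioned earlier.

```lua
-- Options table mirroring the defaults documented above.
local opts = {
  valid_ttl = nil,                  -- nil: keep the TTL carried by each DNS response
  stale_ttl = 3600,                 -- seconds an expired record may still be served
  error_ttl = 1,                    -- seconds an error response is cached
  hosts = "/etc/hosts",
  resolv_conf = "/etc/resolv.conf",
  family = { "SRV", "A", "AAAA" },
  -- options handed to the underlying lua-resty-dns resolver
  retrans = 5,
  timeout = 2000,                   -- milliseconds (resolv.conf uses seconds)
  random_resolver = false,
  nameservers = { "8.8.8.8", { "8.8.4.4", 53 } },
  cache_purge = false,
}

-- Assumption: the library is exposed as `kong.dns.client`.
local ok, dns_client = pcall(require, "kong.dns.client")
if ok then
  local client, err = dns_client.new(opts)
  if not client then
    print("failed to create DNS client: " .. tostring(err))
  end
else
  print("kong.dns.client is not available here; only the options table was built")
end
```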
[Back to TOC](#table-of-contents) | ||
|
||
## resolve | ||
|
||
**syntax:** *answers, err, tries? = resolve(qname, qtype, cache_only, tries?)* | ||
**context:** *rewrite_by_lua\*, access_by_lua\*, content_by_lua\*, ngx.timer.\** | ||
|
||
**Functionality:** | ||
|
||
Performs a DNS resolution. | ||
|
||
1. Check whether `<qname>` matches the SRV format (`_service._proto.name`) to determine the `<qtype>` (SRV or A/AAAA), then use the key `<qname>:<qtype>` to query mlcache. If cached results are found, return them directly. | ||
2. If no results are available in the cache, the L3 callback of `mlcache:get` is triggered to query records from the DNS servers, as follows: | ||
   1. Check if `<qname>` has an IP address in the `hosts` file; return it if found. | ||
   2. Check if `<qname>` is an IP address itself; return it if so. | ||
   3. Use `mlcache:peek` to check whether the expired key still exists in the shared dictionary. If it does, return it directly to mlcache and trigger an asynchronous background task to update the expired data (`start_stale_update_task`). Expired data can be reused for at most `stale_ttl` seconds, but the TTL returned to mlcache never exceeds 60s. This way, if the background task has not updated the expired key after 60s, the next `resolve` call from the upper layer triggers the L3 callback again, which reuses the stale record and starts another background update task. | ||
      1. For example, with a `stale_ttl` of 3600s, if the background task keeps failing to update the record due to network issues and the upper-level application keeps calling `resolve` for that domain name, a background task to query the DNS server is triggered every 60s, for a total of approximately 60 background tasks (3600s/60s). | ||
   4. Query the DNS server with `<qname>:<qtype>` combinations: | ||
      1. The `<qname>` is extended according to settings in `resolv.conf`, such as `ndots`, `search`, and `domain`. | ||
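Step 1 above can be sketched in plain Lua as follows. The helper names `is_srv_name` and `cache_key`, the Lua pattern, and the `"SRV"` / `"A/AAAA"` type labels are hypothetical and chosen for illustration; the actual client may use a different pattern and numeric type values in its keys.

```lua
-- Hypothetical helper: does the name look like `_service._proto.name`?
local function is_srv_name(qname)
  return qname:match("^_[^.]+%._[^.]+%.[^.]+") ~= nil
end

-- Hypothetical helper: build a `<qname>:<qtype>` cache key.
-- The type labels are illustrative placeholders, not the real key format.
local function cache_key(qname)
  local qtype = is_srv_name(qname) and "SRV" or "A/AAAA"
  return qname .. ":" .. qtype, qtype
end

print(cache_key("_http._tcp.example.test"))
print(cache_key("example.test"))
```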
|
||
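The stale-record rule described in the steps above (reuse for at most `stale_ttl` seconds, but never hand mlcache a TTL above 60s) can be sketched as below. The function name, its arguments, and the constant name are assumptions made for illustration, and the exact expiry arithmetic in the library may differ.

```lua
local STALE_REFRESH_CAP = 60  -- seconds; the 60s cap mentioned in the text

-- expire:    unix time at which the record's own TTL ran out
-- stale_ttl: how long an expired record may still be reused
-- now:       current unix time
-- Returns the TTL to hand back to the cache, or nil if the record is too old.
local function stale_cache_ttl(expire, stale_ttl, now)
  local remaining = expire + stale_ttl - now
  if remaining <= 0 then
    return nil  -- stale window is over: fall through to a real DNS query
  end
  return math.min(remaining, STALE_REFRESH_CAP)
end

-- Upper bound on background refresh attempts within one stale window,
-- matching the 3600s/60s example above.
local function max_refresh_attempts(stale_ttl)
  return math.floor(stale_ttl / STALE_REFRESH_CAP)
end

print(stale_cache_ttl(1000, 3600, 1010), max_refresh_attempts(3600))
```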
**Return value:** | ||
|
||
* Return value `answers, err`: | ||
  * Returns one array-like Lua table containing all the records. | ||
|
||
    * For example, `{{"address":"[2001:db8:3333:4444:5555:6666:7777:8888]","class":1,"name":"example.test","ttl":30,"type":28},{"address":"192.168.1.1","class":1,"name":"example.test","ttl":30,"type":1},"expire":1720765379,"ttl":30}`. | ||
    * IPv6 addresses are enclosed in brackets (`[]`). | ||
  * If the server returns a non-zero error code, it will return `nil` and a string describing the error in this record. | ||
    * For example, `nil, "dns server error: name error"`, the server returned a result with error code 3 (NXDOMAIN). | ||
  * In case of severe errors, such as a network error or a malformed DNS record response from the server, it will return `nil` and a string describing the error instead. For example: | ||
    * `nil, "dns server error: failed to send request to UDP server 10.0.0.1:53: timeout"`, there was a network issue. | ||
* Return value and input parameter `@tries?`: | ||
  * If provided as an empty table, it will be returned as a third result. This table will be an array containing the error message for each (if any) failed try. | ||
    * For example, `[["example.test:A","dns server error: 3 name error"], ["example.test:AAAA","dns server error: 3 name error"]]`, both attempts failed due to a DNS server error with error code 3 (NXDOMAIN), indicating a name error. | ||
|
||
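To make the shapes of `answers` and `tries` concrete, here is a small sketch that walks both tables. The data is copied from the examples above; the helpers `addresses` and `format_tries` are hypothetical and not part of the library.

```lua
-- An `answers` table shaped like the example above: array part holds the
-- records, hash part holds `expire` and `ttl`.
local answers = {
  { address = "[2001:db8:3333:4444:5555:6666:7777:8888]", class = 1,
    name = "example.test", ttl = 30, type = 28 },
  { address = "192.168.1.1", class = 1,
    name = "example.test", ttl = 30, type = 1 },
  expire = 1720765379,
  ttl = 30,
}

-- Hypothetical helper: collect the address of every record.
local function addresses(records)
  local out = {}
  for i, rec in ipairs(records) do
    out[i] = rec.address
  end
  return out
end

-- Hypothetical helper: turn a `tries` array into one log-friendly string.
local function format_tries(tries)
  local parts = {}
  for i, try in ipairs(tries) do
    parts[i] = try[1] .. " -> " .. try[2]
  end
  return table.concat(parts, "; ")
end

local tries = {
  { "example.test:A", "dns server error: 3 name error" },
  { "example.test:AAAA", "dns server error: 3 name error" },
}

print(table.concat(addresses(answers), ", "))
print(format_tries(tries))
```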
|
||
**Input parameters:** | ||
|
||
* `@qname`: the domain name to resolve. | ||
* `@qtype`: (optional: `nil` or DNS TYPE value) | ||
  * specify the query type instead of `self.order` types. | ||
* `@cache_only`: (optional: `boolean`) | ||
  * control whether to solely retrieve data from the internal cache without querying the nameserver. | ||
* `@tries?`: see the above section `Return value and input parameter @tries?`. | ||
|
||
[Back to TOC](#table-of-contents) | ||
|
||
## resolve_address | ||
|
||
**syntax:** *ip, port_or_err, tries? = resolve_address(name, port, cache_only, tries?)* | ||
**context:** *rewrite_by_lua\*, access_by_lua\*, content_by_lua\*, ngx.timer.\** | ||
|
||
**Functionality:** | ||
|
||
Performs a DNS resolution and returns a single randomly selected address (IP and port number). | ||
|
||
When calling multiple times on cached records, it will apply load-balancing based on a round-robin (RR) scheme. For SRV records, this will be a _weighted_ round-robin (WRR) scheme (because of the weights it will be randomized). It will apply the round-robin schemes on each level individually. | ||
|
||
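The weighted selection for SRV records mentioned above can be illustrated with a smooth weighted round-robin picker. This is a generic sketch of that technique under the assumption that each record carries `target`, `port`, and `weight` fields (as `lua-resty-dns` SRV answers do); it is deterministic and is not the library's own implementation, which the text describes as randomized.

```lua
-- Build a picker over SRV-like records using smooth weighted round-robin:
-- every pick adds each record's weight to its running score, selects the
-- highest score, then subtracts the total weight from the winner.
local function new_swrr(records)
  local state, total = {}, 0
  for i, rec in ipairs(records) do
    state[i] = { record = rec, current = 0 }
    total = total + rec.weight
  end

  return function()
    local best
    for _, s in ipairs(state) do
      s.current = s.current + s.record.weight
      if not best or s.current > best.current then
        best = s
      end
    end
    best.current = best.current - total
    return best.record
  end
end

local pick = new_swrr({
  { target = "a.example.test", port = 8001, weight = 5 },
  { target = "b.example.test", port = 8002, weight = 1 },
  { target = "c.example.test", port = 8003, weight = 1 },
})

local order = {}
for i = 1, 7 do
  order[i] = pick().target:sub(1, 1)
end
print(table.concat(order))  -- prints "aabacaa"
```

Over one full cycle of 7 picks, each target is chosen in proportion to its weight, and the heavier target is spread across the cycle rather than chosen five times in a row.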
**Return value:** | ||
|
||
* Return value `ip, port_or_err`: | ||
  * Return one IP address and port number from records. | ||
  * Return `nil, err` if errors occur, with `err` containing an error message. | ||
* Return value and input parameter `@tries?`: same as `@tries?` of `resolve` API. | ||
|
||
**Input parameters:** | ||
|
||
* `@name`: the domain name to resolve. | ||
* `@port`: (optional: `nil` or port number) | ||
  * default port number to return if none was found in the lookup chain (only SRV records carry port information, SRV with `port=0` will be ignored). | ||
* `@cache_only`: (optional: `boolean`) | ||
  * control whether to solely retrieve data from the internal cache without querying the nameserver. | ||
|
||
[Back to TOC](#table-of-contents) | ||
|
||
# Performance characteristics | ||
|
||
## Memory | ||
|
||
We evaluated the capacity of DNS records using the following resources: | ||
|
||
* Shared memory size: | ||
  * 5 MB (by default): `lua_shared_dict kong_dns_cache 5m`. | ||
  * 10 MB: `lua_shared_dict kong_dns_cache 10m`. | ||
* DNS response: | ||
  * Each DNS resolution response contains some number of A type records. | ||
    * Record: ~80-byte JSON string, e.g., `{address = "127.0.0.1", name = <domain>, ttl = 3600, class = 1, type = 1}`. | ||
    * Domain: ~36-byte string, e.g., `example<n>.long.long.long.long.test`. Domain names with lengths between 10 and 36 bytes yield similar results. | ||
|
||
The results are as follows: | ||
|
||
| shared memory size | number of records per response | number of loaded responses | | ||
|--------------------|-------------------|----------| | ||
| 5 MB | 1 | 20224 | | ||
| 5 MB | 2 ~ 3 | 10081 | | ||
| 5 MB | 4 ~ 9 | 5041 | | ||
| 5 MB | 10 ~ 20 | 5041 | | ||
| 5 MB | 21 ~ 32 | 1261 | | ||
| 10 MB | 1 | 40704 | | ||
| 10 MB | 2 ~ 3 | 20321 | | ||
| 10 MB | 4 ~ 9 | 10161 | | ||
| 10 MB | 10 ~ 20 | 5081 | | ||
| 10 MB | 21 ~ 32 | 2541 | | ||
|
||
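As a rough reading of the table, dividing the shared memory size by the number of loaded responses gives an upper bound on the memory consumed per cached response. The sketch below does that arithmetic for the single-record rows; the function name is hypothetical, and the estimate assumes the whole zone is available for records, which overstates the real per-entry size because the shared dictionary has its own bookkeeping overhead.

```lua
-- Hypothetical helper: approximate bytes of shared memory per cached response.
local function approx_bytes_per_response(shm_mb, loaded_responses)
  return math.floor(shm_mb * 1024 * 1024 / loaded_responses)
end

print(approx_bytes_per_response(5, 20224))   -- 5 MB zone, 1 record per response
print(approx_bytes_per_response(10, 40704))  -- 10 MB zone, 1 record per response
```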
|
||
[Back to TOC](#table-of-contents) |