Skip to content

Conversation

@serrislew
Copy link
Contributor

Proxies should serve stale content even if DNS lookup to OS is unsuccessful.

  • For hierarchical caching, if a child proxy matches successfully to a parent proxy and child has requested stale object in its cache, it should not need to make a DNS request for the origin.
    • It can validate with the parent and return the object as expected.
  • Without hierarchical caching, if stale content is being requested from a proxy and the DNS lookup to the OS is unsuccessful, it should also serve the stale object
    • this behavior is similar to when the origin server is down and the proxy serves stale content with a warning header

These changes still respect the max_stale_age configurations and do not serve stale content pass the expiration date.

Autest tests the following using PUSH command to store stale content in proxy's cache:

  • With child mapping to an origin server without a DNS entry and child matching successfully to parent, a request from child after content becomes stale returns 200
    • After max_stale_age has passed, request from child returns 500 as stale content is no longer returnable
  • Without hierarchical caching and proxy mapping to an origin server without a DNS entry, a request from proxy after content becomes stale returns 200
    • After max_stale_age has passed, request from proxy returns 500 as stale content is no longer returnable


# Object to push to proxies
stale_5 = "HTTP/1.1 200 OK\nServer: ATS/10.0.0\nAccept-Ranges: bytes\nContent-Length: 6\nCache-Control: public, max-age=5\n\nCACHED"
stale_30 = "HTTP/1.1 200 OK\nServer: ATS/10.0.0\nAccept-Ranges: bytes\nContent-Length: 6\nCache-Control: public, max-age=30\n\nCACHED"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be changed to max-age=10 (or other value less than 30)? @zwoop may only give you a fruit cake for X-mas if you make Au regression test runs take 30 seconds longer.

f'map http://localhost:{ts_parent.Variables.port} {server_name}'
)
ts_parent.Disk.remap_config.AddLine(
f'map {server_name} {server_name}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am testing for non-hierarchical scenarios as well so the ts_parent server is mapped to the OS and gets called directly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you can also do:

'proxy.config.url_remap.remap_required': 0,

But the way you have it is fine.

@ywkaras
Copy link
Contributor

ywkaras commented Dec 10, 2021

I forget now, did you say Apple is already running this on production proxies?

'proxy.config.http.parent_proxy.self_detect': 0,
})
ts_child.Disk.parent_config.AddLine(
f'dest_domain=. parent=localhost:{ts_parent.Variables.port} round_robin=consistent_hash go_direct=false'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So even with this configured, child TS will still do a DNS lookup on unkown.domain.com ? Did you see the DNS lookup in the http debug trace output?

Copy link
Contributor Author

@serrislew serrislew Dec 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The http debug trace output shows the child doing PPDNS lookup to the parent and the parent doing a DNS lookup to unknown.domain.com (OS). After the DNS lookup fails for the parent but the parent has the stale returnable object in its cache, the child will revalidate from the parent (without doing a DNS lookup) and return the object to the client. Let me know if you are seeing something different in the debug trace output!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lately we've been including DNS servers in our Au tests:

'proxy.config.dns.nameservers': f"127.0.0.1:{dns.Variables.Port}", # Only nameservers if resolv_conf NULL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why ATS would do a DNS lookup on the request host when it's not using the request host as the next hop. But I suppose that's beyond the scope of this PR.

ts_parent = Test.MakeATSProcess("ts_parent")
server_name = "http://unknown.domain.com/"

Test.testName = "STALE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of our Au tests don't assigned testName, I'm not sure what it's for:

wkaras ~/REPOS/TS3/tests/gold_tests
M$ find . -name \*.test.py | wc -l
     239
wkaras ~/REPOS/TS3/tests/gold_tests
M$ fgrep Test.testName $( find . -name \*.test.py ) | wc -l
      44
wkaras ~/REPOS/TS3/tests/gold_tests
M$

@ywkaras
Copy link
Contributor

ywkaras commented Jan 4, 2022

We discussed this in today's meeting, it turns out microdns is trickier to use than I thought, and not well-documented, so approving this as is.

@vmamidi vmamidi merged commit 85be282 into apache:master Jan 11, 2022
zwoop pushed a commit that referenced this pull request Apr 11, 2022
* serve stale when dns lookups fails

* updated max-age in autest to run shorter

Co-authored-by: Serris Lew <lserris@apple.com>
(cherry picked from commit 85be282)
@zwoop
Copy link
Contributor

zwoop commented Apr 11, 2022

Cherry-picked to v9.2.x

@zwoop zwoop modified the milestones: 10.0.0, 9.2.0 Apr 11, 2022
moonchen pushed a commit to moonchen/trafficserver that referenced this pull request May 26, 2022
* asf/9.2.x:
  Updated ChangeLog
  Serve stale content when DNS lookup fails (apache#8484)
  Reuse TSMutex for ts_lua_http_intercept_handler (apache#8687)
  include <cstring> for access to C-string operations (apache#8786)
  Propagate proxy.config.net.sock_option_flag_in to newly accepted connections (apache#8784)
hnakamur added a commit to hnakamur/trafficserver9-deb that referenced this pull request Jan 24, 2023
Delete debian/patches/0014-serve-stale-when-dns-lookup-fails.patch
since it was cherry-picked to 9.2.x
apache/trafficserver#8484 (comment)
hnakamur added a commit to hnakamur/trafficserver9-deb that referenced this pull request Jan 24, 2023
Delete debian/patches/0014-serve-stale-when-dns-lookup-fails.patch
since it was cherry-picked to 9.2.x
apache/trafficserver#8484 (comment)
hnakamur added a commit to hnakamur/trafficserver9-deb that referenced this pull request Jan 24, 2023
Delete debian/patches/0014-serve-stale-when-dns-lookup-fails.patch
since it was cherry-picked to 9.2.x
apache/trafficserver#8484 (comment)
JosiahWI pushed a commit to JosiahWI/trafficserver that referenced this pull request Jul 19, 2023
* serve stale when dns lookups fails

* updated max-age in autest to run shorter

Co-authored-by: Serris Lew <lserris@apple.com>
(cherry picked from commit 85be282)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants