Overview of the Issue
Since #7159, there is now a default limit of 100 simultaneous HTTP connections coming from a single client IP. When Vault is using Consul as its storage backend, its default max_parallel setting allows it to open up to 128 connections to Consul. When Vault exceeds the connection limit, it interprets the resulting errors as a backend failure and seals itself.
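To illustrate the mismatch, here is a minimal sketch of the two settings involved, assuming the limits.http_max_conns_per_client option added in #7159 on the Consul side and the consul storage stanza on the Vault side; the values shown are the documented defaults, not settings taken from the affected system.

  # Consul agent config (default introduced by #7159)
  limits {
    http_max_conns_per_client = 100
  }

  # Vault config, consul storage backend (default)
  storage "consul" {
    address      = "127.0.0.1:8500"
    path         = "vault/"
    max_parallel = "128"   # more connections than Consul will accept from one IP
  }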
Reproduction Steps
Steps to reproduce this issue:
I haven't tried to reproduce it directly; it's my analysis of logs after a failure that led me to conclude this is the problem. I expect it could be reproduced by inducing a high number of parallel requests to Vault with the default Consul/Vault settings. When I observed it, it was during the lease restoration phase of Vault startup, which runs 64 requests in parallel. There must have been additional traffic to account for the 37 further requests needed to exceed the limit of 100, but I don't know yet where that came from, since Vault isn't unsealed while the restoration is ongoing. Either way, I'm confident that this new 1.6.3 behaviour will be problematic for some existing Vault+Consul users.
Consul info for both Client and Server
Client info
Vault 1.2.3+ent.
Server info
I don't have direct access to the system in question, but this was observed multiple times with Consul 1.6.3, and does not manifest with Consul 1.6.1.
Operating system and Environment details
Best I can do right now is "linux, amd64".
Log Fragments
vault: 2020-02-11T12:46:53.984-0600 [ERROR] expiration: error restoring leases: error="failed to read lease entry XXXX: Get http://127.0.0.1:8500/v1/kv/vault/sys/expire/id/XXXX: read tcp 127.0.0.1:50460->127.0.0.1:8500: read: connection reset by peer"
vault: 2020-02-11T12:38:36.634-0600 [ERROR] expiration: error restoring leases: error="failed to read lease entry YYYY: Get http://127.0.0.1:8500/v1/kv/vault/sys/expire/id/YYYY: EOF"
FYI @ncabatoff, if you can access Consul logs while this is happening, we explicitly log when we reset connections due to that limit, which could confirm your hypothesis further.
That said, I'm pretty sure you're right and we should increase the default to allow this case.
A workaround for the interim is that operators can change the config in Consul (and presumably Vault) so the two limits match, but we should make them compatible by default.
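As a sketch of that interim workaround, either side can be adjusted so that Vault's parallelism stays below Consul's per-client connection limit; the values below are illustrative only, assuming the same limits.http_max_conns_per_client and max_parallel settings described above.

  # Option A: Consul agent config - raise the per-client connection limit above Vault's max_parallel
  limits {
    http_max_conns_per_client = 200
  }

  # Option B: Vault config - lower Vault's parallelism below Consul's limit
  storage "consul" {
    address      = "127.0.0.1:8500"
    max_parallel = "64"
  }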