sysctl net.ipv4.tcp_keepalive_time / other kernel parameter setting needed #165
Comments
We have created an issue in Pivotal Tracker to manage this: https://www.pivotaltracker.com/story/show/170644987 The labels on this github issue will be updated when the story is started.
Hi @h0nIg - if I understand correctly, do we in fact need people to change a global default on the container, or might it be possible to solve this by just setting the keepalive time on the relevant connection in the app code? (My hesitation about doing it at the container level is that it seems quite non-12-factor to have the app rely on the particular OS configuration it's running on, and it's not clear to me from a CF CLI point of view how it would work. We could set it for every container in garden, but then your app would only work on that tweaked CF, which seems sad.)
@julz it is quite hard to convince every library or developer to set those options for their applications in order to prevent support tickets. Let's look at how customers see this: as a customer with a deployed application running into connection problems, I barely understand what's going on with my container OS. The behaviour changed because of the kernel upgrade that came with upgrading the stemcell. If restoring the behaviour of kernel 4.4 (inherit from the host) is not a way you want to go, please make it possible to set those parameters for the full CF landscape as a configuration. If that is still not a good idea, what about adding a generic mechanism that lets the operator of the landscape set certain parameters for all garden containers?
Hi @h0nIg - really sorry for the slow response on this, somehow missed your reply. I'd still personally say it's not great if an app relies on a particular Linux kernel tuning parameter being set in order to work correctly, BUT yeah, I totally see the problem you're pointing out here, and we'd be happy to accept a PR to allow setting a global default value for all containers if you're interested in creating one (or we can prioritise a story to do it ourselves, but it might take us a little longer to get to it that way).
To chime in on your discussion here: @julz Java only added the option to configure socket keepalive settings in Java 11 (which is still relatively recent). That means that before Java 11 there was no way to configure keepalive on a socket. As a result, most libraries doing socket work have not exposed this option. This also affects, for example, JDBC drivers: they are usually compiled against an earlier Java version to keep them compatible with the most relevant JREs. The point is that the assumption that you can change it in the application itself effectively prevents all Java applications from dropping broken connections quickly, and therefore also prevents proper failover in most Java applications. Ideally this should indeed be handled by the app, but in reality in a lot of cases you still can't, because the socket is hidden by a library. The only way to configure it then is by changing the OS keepalive settings.
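For reference, a minimal sketch of the per-socket knobs Java 11 introduced via jdk.net.ExtendedSocketOptions (the endpoint and values here are placeholders, not anything from this issue); an application only gets this hook if the library actually hands it the Socket:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import jdk.net.ExtendedSocketOptions;

public class KeepaliveSketch {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket()) {
            socket.setKeepAlive(true); // enable TCP keepalive for this connection
            // Per-socket overrides of the kernel's net.ipv4.tcp_keepalive_* defaults,
            // available since Java 11 (module jdk.net):
            socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, 120);     // idle seconds before the first probe
            socket.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, 30);  // seconds between probes
            socket.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, 8);      // failed probes before the connection is dropped
            socket.connect(new InetSocketAddress("db.example.com", 5432)); // placeholder endpoint
        }
    }
}
```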
As a result of my discussions with @krumts, and since several kernel parameters might need to be changed (net.ipv4.tcp_keepalive_time / net.ipv4.tcp_keepalive_intvl / net.ipv4.tcp_keepalive_probes), I would opt for the flexibility to set several parameters instead of a single hardcoded one. If a CF landscape operator changes certain settings required by their IaaS setup, we should trust that they know what they are doing. As this is important for SAP, let's try to find a way through your backlog and our colleagues (e.g. @yulianedyalkova) who work on the backlog.
Hi @arjenw and @h0nIg - I understand where you're coming from, but I'd really like to avoid a situation where an app works on one CF but not another, which is something that so far has (I think) always held. I'd also really like to avoid a situation where we're asking an operator to tweak lots of kernel parameters (and potentially having hard-to-debug issues where the settings differ between CFs). Is there any chance there are reasonable default values for these things that we could set and that would work for all CF apps?
Hi @julz
Interesting. Those are actually the two goals we also wanted to achieve when we started the discussions, so I guess we have a different understanding :) I'll try to explain the issue from my PoV.
Let's consider an app which uses a connection pool - it creates a connection, uses it, puts it in the pool, and takes it from there again after 400s.
And it gets really funky when the NAT GW starts reusing its local ports (which are free from its PoV) as the src port for a connection for which the server may still think it has an ESTABLISHED connection (one that timed out on the NAT GW). Unfortunately these are not just theoretical issues, and they are definitely hard to debug - I've probably spent months of my own time on them, usually after the app developers and maybe the team operating the destination have already spent lots of theirs. My hope is that if we as operators use the knowledge of what the concrete constraints of the infrastructure are (apps don't know these details), we could try to come up with a setup which works on each infrastructure. Now back to your points:
I understand that. For us, too, fiddling with kernel parameters is not the first thing we looked for. However, we've seen concrete issues, and we see no real way to solve them otherwise (I don't claim there isn't another way, maybe we just don't have the experience yet). Modifying the TCP parameters from the app is hard.
Difficult to say. I can give an answer that would fit the 4 clouds mentioned above, but I guess there are other setups too. I hope this explains how we came to the discussion.
In the past, we used to default the DNS configuration to match whatever the host was set to -- on the assumption that you probably want your containers to share a similar configuration to the host. If I understand correctly, in previous kernels that was also the behaviour of the network namespace with regard to these properties (they were inherited from the host netns). What if we restored that behaviour and set these configs to match the values on the host - would that work?
@julz I think this should work. At least so far I don't have a use case where we want to have host and containers set differently, so this could be a reasonable pragmatic approach. Let's see what the others think.
So, summarising: if we allowed configuring these parameters, defaulting them to the host's values as described above - @arjenw @h0nIg, does something like that seem like a reasonable solution to you?
@julz this would be a solution for us as well
Description
This is a follow-up issue to #70, since that issue was closed.
We've seen apps on the platform (running on AWS) which talk to DBs exposed via public IPs. These connections go via an AWS NAT Gateway, which has an idle timeout of 350 sec. If the apps run some queries (we don't know what these are :)) which take longer than that to get a response from the server, then the connection is "freed" on the NAT GW, and only later, when the app tries to send data over the connection again, does it get a RST.
The connections via the NAT GW can be kept open if one of the sides sends TCP keepalive packets. However, the containers in which the apps run use the kernel defaults (net.ipv4.tcp_keepalive_time = 7200) - the first probe is sent after 2h. On the Diego Cell VM the settings are different (net.ipv4.tcp_keepalive_time = 120, see https://github.com/cloudfoundry/bosh-linux-stemcell-builder/blob/acc0c1d039be5beeb30be0c9385a1b1c54e89218/stemcell_builder/stages/bosh_sysctl/assets/60-bosh-sysctl.conf#L35), but those values are not inherited by the container namespaces; there the defaults are used. So at the moment neither the app developers nor we as operators of the platform can modify these settings for the containers (at least we haven't figured out how).
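As a quick illustration (a hypothetical diagnostic, not part of this issue): a Java 11 snippet an app could run inside its container, e.g. via cf ssh, to print the keepalive values it actually sees - on an unmodified container these come out as the kernel defaults (7200 / 75 / 9) rather than the 120s configured on the cell:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class KeepaliveCheck {
    public static void main(String[] args) throws IOException {
        // These /proc/sys files reflect the per-network-namespace values the app actually gets.
        String[] params = {
            "net/ipv4/tcp_keepalive_time",   // seconds of idle time before the first probe
            "net/ipv4/tcp_keepalive_intvl",  // seconds between probes
            "net/ipv4/tcp_keepalive_probes"  // failed probes before the connection is declared dead
        };
        for (String p : params) {
            System.out.println(p + " = " + Files.readString(Path.of("/proc/sys/" + p)).trim());
        }
    }
}
```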
There should be a mechanism to set kernel parameters inside a container to overcome problematic defaults. Modifying cflinuxfs3 with e.g. /etc/sysctl.d/20-myconfiguration.conf will not help, because a couple of kernel parameters cannot be changed from inside the container: they are read-only there and can only be set for privileged containers (you do not want to do this...) or during creation of the container.
Steps to reproduce
Log in to a CF Diego cell