Commit 07f4c90

edumazet authored and davem330 committed
tcp/dccp: try to not exhaust ip_local_port_range in connect()
A long standing problem on busy servers is the tiny available TCP port range (/proc/sys/net/ipv4/ip_local_port_range) and the default sequential allocation of source ports in connect() system call.

If a host is having a lot of active TCP sessions, chances are very high that all ports are in use by at least one flow, and subsequent bind(0) attempts fail, or have to scan a big portion of space to find a slot.

In this patch, I changed the starting point in __inet_hash_connect() so that we try to favor even [1] ports, leaving odd ports for bind() users.

We still perform a sequential search, so there is no guarantee, but if connect() targets are very different, end result is we leave more ports available to bind(), and we spread them all over the range, lowering time for both connect() and bind() to find a slot.

This strategy only works well if /proc/sys/net/ipv4/ip_local_port_range is even, ie if start/end values have different parity.

Therefore, default /proc/sys/net/ipv4/ip_local_port_range was changed to 32768 - 60999 (instead of 32768 - 61000)

There is no change on security aspects here, only some poor hashing schemes could be eventually impacted by this change.

[1] : The odd/even property depends on ip_local_port_range values parity

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
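The search described above can be sketched in user space. This is only an illustration of the arithmetic in the patched loop, not the kernel code: `pick_connect_port()` and the `port_in_use_fn` callback are hypothetical names standing in for the bind-hash lookup, and reserved-port handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's "is this port taken?" check. */
typedef bool (*port_in_use_fn)(int port);

/* Sketch of the patched search: clear the low bit of the starting
 * offset, then probe the range sequentially, wrapping around.
 * Returns the first free port, or -1 if every port is in use.
 */
static int pick_connect_port(int low, int high, uint32_t offset,
                             port_in_use_fn in_use)
{
    assert(high >= low);
    uint32_t remaining = (uint32_t)(high - low) + 1;

    offset &= ~1u;                      /* start on an even offset */

    for (uint32_t i = 0; i < remaining; i++) {
        int port = low + (int)((i + offset) % remaining);

        if (!in_use(port))
            return port;
    }
    return -1;
}

/* Two toy occupancy models used to exercise the sketch. */
static bool none_in_use(int port) { (void)port; return false; }
static bool even_ports_busy(int port) { return port % 2 == 0; }
```

With the new default range 32768 - 60999 the range size is 28232 (even), so an even offset always lands on a port with the same parity as `low`. With the old range 32768 - 61000 the size is 28233 (odd), and the modulo can flip the parity once the offset wraps, which is why the commit asks for bounds of different parity.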
1 parent 837b995 commit 07f4c90

File tree

3 files changed

+14
-6
lines changed


Documentation/networking/ip-sysctl.txt

Lines changed: 5 additions & 3 deletions
@@ -751,8 +751,10 @@ IP Variables:
 ip_local_port_range - 2 INTEGERS
 	Defines the local port range that is used by TCP and UDP to
 	choose the local port. The first number is the first, the
-	second the last local port number. The default values are
-	32768 and 61000 respectively.
+	second the last local port number.
+	If possible, it is better these numbers have different parity.
+	(one even and one odd values)
+	The default values are 32768 and 60999 respectively.
 
 ip_local_reserved_ports - list of comma separated ranges
 	Specify the ports which are reserved for known third-party
@@ -775,7 +777,7 @@ ip_local_reserved_ports - list of comma separated ranges
 	ip_local_port_range, e.g.:
 
 	$ cat /proc/sys/net/ipv4/ip_local_port_range
-	32000 61000
+	32000 60999
 	$ cat /proc/sys/net/ipv4/ip_local_reserved_ports
 	8080,9148
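The parity advice in the documentation hunk can be checked mechanically. The following is a minimal sketch, assuming the two bounds are available as whitespace-separated text in the form that `cat /proc/sys/net/ipv4/ip_local_port_range` prints. `range_has_mixed_parity()` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse "LOW HIGH" and report whether the bounds
 * have different parity (one even, one odd).
 * Returns 1 if they differ, 0 if they match, -1 on malformed input.
 */
static int range_has_mixed_parity(const char *text)
{
    int low, high;

    assert(text != NULL);
    if (sscanf(text, "%d %d", &low, &high) != 2)
        return -1;
    if (low < 0 || high < low)
        return -1;

    /* XOR of the low bits is 1 exactly when the parities differ. */
    return ((low ^ high) & 1) ? 1 : 0;
}
```

Under this check, both `32768 60999` and the documentation example `32000 60999` qualify, while the previous default `32768 61000` does not.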

net/ipv4/af_inet.c

Lines changed: 1 addition & 1 deletion
@@ -1595,7 +1595,7 @@ static __net_init int inet_init_net(struct net *net)
 	 */
 	seqlock_init(&net->ipv4.ip_local_ports.lock);
 	net->ipv4.ip_local_ports.range[0] = 32768;
-	net->ipv4.ip_local_ports.range[1] = 61000;
+	net->ipv4.ip_local_ports.range[1] = 60999;
 
 	seqlock_init(&net->ipv4.ping_group_range.lock);
 	/*

net/ipv4/inet_hashtables.c

Lines changed: 8 additions & 2 deletions
@@ -502,8 +502,14 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 		inet_get_local_port_range(net, &low, &high);
 		remaining = (high - low) + 1;
 
+		/* By starting with offset being an even number,
+		 * we tend to leave about 50% of ports for other uses,
+		 * like bind(0).
+		 */
+		offset &= ~1;
+
 		local_bh_disable();
-		for (i = 1; i <= remaining; i++) {
+		for (i = 0; i < remaining; i++) {
 			port = low + (i + offset) % remaining;
 			if (inet_is_local_reserved_port(net, port))
 				continue;
@@ -547,7 +553,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 		return -EADDRNOTAVAIL;
 
 ok:
-		hint += i;
+		hint += (i + 2) & ~1;
 
 		/* Head lock still held and bh's disabled */
 		inet_bind_hash(sk, tb, port);
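The new hint update `(i + 2) & ~1` is easier to read with a few concrete values. The sketch below isolates that expression. `hint_step()` is a hypothetical name, and the comment about the hint staying even assumes it starts out even, which the diff itself does not show.

```c
#include <stdint.h>

/* Step added to the hint after a free slot was found at loop index i.
 * (i + 2) & ~1 rounds i up to the next even number strictly greater
 * than i: 0 -> 2, 1 -> 2, 2 -> 4, 3 -> 4, ...
 * Because the step is always even, an even hint stays even (assumption:
 * the hint starts even), and the next search begins past the slot
 * that was just taken.
 */
static uint32_t hint_step(uint32_t i)
{
    return (i + 2) & ~1u;
}
```

The pre-patch code added `i` itself (with `i` starting at 1), so the hint could take either parity. Combined with `offset &= ~1`, the new step keeps successive connect() searches starting on even offsets.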
