Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP/2 vs HTTPS handling #1643

Merged
merged 26 commits into from
Sep 6, 2022
Merged

HTTP/2 vs HTTPS handling #1643

merged 26 commits into from
Sep 6, 2022

Conversation

krizhanovsky
Copy link
Contributor

@krizhanovsky krizhanovsky commented Jun 20, 2022

Fix #1624 with explicit ALPN verification against the type of the listening socket, inherited by all established sockets. Manually tested with tempesta-tech/tempesta-test#275 (review)

Fix pending active connections for 5s issue (#1377 (comment)) and related 2nd checkbox from #1422 .

Fix #861: now we send (or at least able to send) TCP RST on security events.

Linked tests pull request tempesta-tech/tempesta-test#288

The pull request also requires tempesta-tech/linux-5.10.35-tfw@19bbfd1

@krizhanovsky krizhanovsky requested a review from const-t June 20, 2022 03:16
@krizhanovsky krizhanovsky marked this pull request as draft June 20, 2022 03:16
@krizhanovsky
Copy link
Contributor Author

krizhanovsky commented Jul 14, 2022

Just caught crash on simple while :; do ./scripts/tempesta.sh --restart; sleep 10; done and wrk -d 3600 -c 1000 -t 2 -H 'connection: close' https://ubuntu:443/ and _single_CPU VM:

[  299.565199] ==================================================================
[  299.568091] BUG: KASAN: null-ptr-deref in _raw_spin_lock+0x56/0xa0
[  299.571037] Write of size 4 at addr 0000000000000004 by task ksoftirqd/0/10
[  299.573838] 
[  299.575955] CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G           O      5.10.35-kdump+ #14
[  299.579093] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[  299.582262] Call Trace:
[  299.584647]  dump_stack+0x7d/0xa3
[  299.587193]  ? _raw_spin_lock+0x56/0xa0
[  299.589855]  kasan_report.cold+0x5/0x37
[  299.592636]  ? _raw_spin_lock+0x56/0xa0
[  299.595205]  check_memory_region+0x142/0x190
[  299.597926]  _raw_spin_lock+0x56/0xa0
[  299.600436]  ? _raw_read_lock_irq+0x30/0x30
[  299.603254]  frang_tls_handler+0x4f/0xf0 [tempesta_fw]
[  299.606131]  ttls_handshake_server_step+0x1ec/0x820 [tempesta_tls]
[  299.608983]  ? ttls_handshake_server_hello+0x670/0x670 [tempesta_tls]
[  299.611868]  ? __inc_numa_state+0x14/0x90
[  299.614592]  ? __alloc_pages_slowpath.constprop.0+0xc70/0xc70
[  299.617503]  ? free_unref_page_commit+0x109/0x190
[  299.620056]  ? ttls_parse_record_hdr+0x292/0x570 [tempesta_tls]
[  299.622816]  ttls_recv+0x2fa/0x5f0 [tempesta_tls]
[  299.625572]  ss_skb_process+0xfa/0x270 [tempesta_fw]
[  299.628187]  ? ttls_handle_alert+0x50/0x50 [tempesta_tls]
[  299.630897]  ? tfw_sock_srv_exit+0x30/0x30 [tempesta_fw]
[  299.633582]  ? _raw_spin_lock+0x65/0xa0
[  299.636016]  ? _raw_read_lock_irq+0x30/0x30
[  299.638580]  tfw_tls_connection_recv+0x125/0x600 [tempesta_fw]
[  299.641310]  ? tfw_tls_connection_lost+0x30/0x30 [tempesta_fw]
[  299.644233]  ss_tcp_process_skb.constprop.0+0x280/0x470 [tempesta_fw]
[  299.647101]  ? ss_do_close+0x7b0/0x7b0 [tempesta_fw]
[  299.649818]  ss_tcp_process_data+0x1eb/0x2b0 [tempesta_fw]
[  299.652524]  ? ss_tcp_process_skb.constprop.0+0x470/0x470 [tempesta_fw]
[  299.655648]  ss_tcp_data_ready+0x79/0xe0 [tempesta_fw]
[  299.658437]  tcp_rcv_established+0x933/0xb40
[  299.661017]  ? tcp_data_queue+0xa40/0xa40
[  299.663528]  ? __bpf_redirect+0x180/0x1c0
[  299.666189]  tcp_v4_do_rcv+0x24a/0x340
[  299.668736]  tcp_v4_rcv+0x137b/0x1490
[  299.671140]  ? __inet_lookup_established+0x232/0x2c0
[  299.673743]  ? tcp_v4_early_demux+0x2d0/0x2d0
[  299.676322]  ip_protocol_deliver_rcu+0x27/0x260
[  299.678869]  ip_local_deliver_finish+0xac/0xc0
[  299.681276]  ip_local_deliver+0xe9/0x1a0
[  299.683703]  ? ip_local_deliver_finish+0xc0/0xc0
[  299.686193]  ? ip_rcv_finish_core.constprop.0+0x166/0x570
[  299.688694]  ip_rcv+0x12e/0x140
[  299.690914]  ? ip_local_deliver+0x1a0/0x1a0
[  299.693211]  ? receive_buf+0x4f0/0x4f0 [virtio_net]
[  299.695694]  ? virtqueue_get_buf_ctx_split+0x5d/0x1d0 [virtio_ring]
[  299.698372]  ? ip_rcv_finish_core.constprop.0+0x570/0x570
[  299.700831]  ? ip_local_deliver+0x1a0/0x1a0
[  299.703088]  __netif_receive_skb_one_core+0xb9/0xf0
[  299.705390]  ? napi_gro_receive+0x250/0x250
[  299.707739]  ? _raw_spin_lock+0x65/0xa0
[  299.710048]  ? _raw_read_lock_irq+0x30/0x30
[  299.712227]  ? virtnet_poll+0x201/0x30f [virtio_net]
[  299.714572]  process_backlog+0xe3/0x270
[  299.716729]  napi_poll+0xfe/0x2f0
[  299.719001]  net_rx_action+0x14f/0x300
[  299.721230]  ? napi_poll+0x2f0/0x2f0
[  299.723398]  ? __kernel_fpu_begin_mask+0x7c/0x120
[  299.725705]  __do_softirq+0xe9/0x368
[  299.727898]  ? takeover_tasklets+0x2b0/0x2b0
[  299.730237]  run_ksoftirqd+0x15/0x20
[  299.732379]  smpboot_thread_fn+0x194/0x2c0
[  299.734562]  ? smpboot_register_percpu_thread+0x180/0x180
[  299.736912]  ? smpboot_register_percpu_thread+0x180/0x180
[  299.739413]  kthread+0x1c6/0x1f0
[  299.741391]  ? kthread_create_worker_on_cpu+0xa0/0xa0
[  299.743614]  ret_from_fork+0x1f/0x30
[  299.745664] ==================================================================
[  299.748438] Disabling lock debugging due to kernel taint
[  299.750990] BUG: kernel NULL pointer dereference, address: 0000000000000004
[  299.753648] #PF: supervisor write access in kernel mode
[  299.755985] #PF: error_code(0x0002) - not-present page
[  299.758391] PGD 0 P4D 0 
[  299.760413] Oops: 0002 [#1] SMP KASAN PTI
[  299.762691] CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G    B      O      5.10.35-kdump+ #14
[  299.765459] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[  299.768339] RIP: 0010:_raw_spin_lock+0x6e/0xa0
[  299.770639] Code: be 04 00 00 00 c7 44 24 20 00 00 00 00 e8 5a 83 78 ff be 04 00 00 00 48 8d 7c 24 20 e8 4b 83 78 ff ba 01 00 00 00 8b 44 24 20 <3e> 0f b1 55 00 75 19 48 b8 00 00 00 00 00 fc ff df 48 c7 04 03 00
[  299.776718] RSP: 0018:ffffc900000d7170 EFLAGS: 00010297
[  299.779385] RAX: 0000000000000000 RBX: 1ffff9200001ae2e RCX: ffffffffba2487c5
[  299.782335] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc900000d7190
[  299.785132] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000003
[  299.787955] R10: fffff5200001ae32 R11: 0000000000000001 R12: 0000000000000000
[  299.790784] R13: 0000000000000004 R14: ffff888102248040 R15: ffff8881063133d8
[  299.793724] FS:  0000000000000000(0000) GS:ffff88811b600000(0000) knlGS:0000000000000000
[  299.796642] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  299.799337] CR2: 0000000000000004 CR3: 000000011690c006 CR4: 0000000000370ef0
[  299.802180] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  299.805062] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  299.807849] Call Trace:
[  299.810107]  ? _raw_read_lock_irq+0x30/0x30
[  299.812611]  frang_tls_handler+0x4f/0xf0 [tempesta_fw]
[  299.815265]  ttls_handshake_server_step+0x1ec/0x820 [tempesta_tls]
[  299.818027]  ? ttls_handshake_server_hello+0x670/0x670 [tempesta_tls]
[  299.820793]  ? __inc_numa_state+0x14/0x90
[  299.823211]  ? __alloc_pages_slowpath.constprop.0+0xc70/0xc70
[  299.826002]  ? free_unref_page_commit+0x109/0x190
[  299.828587]  ? ttls_parse_record_hdr+0x292/0x570 [tempesta_tls]
[  299.831350]  ttls_recv+0x2fa/0x5f0 [tempesta_tls]
[  299.833983]  ss_skb_process+0xfa/0x270 [tempesta_fw]
[  299.836795]  ? ttls_handle_alert+0x50/0x50 [tempesta_tls]
[  299.839638]  ? tfw_sock_srv_exit+0x30/0x30 [tempesta_fw]
[  299.842431]  ? _raw_spin_lock+0x65/0xa0
[  299.845088]  ? _raw_read_lock_irq+0x30/0x30
[  299.847645]  tfw_tls_connection_recv+0x125/0x600 [tempesta_fw]
[  299.850580]  ? tfw_tls_connection_lost+0x30/0x30 [tempesta_fw]
[  299.853320]  ss_tcp_process_skb.constprop.0+0x280/0x470 [tempesta_fw]
[  299.856412]  ? ss_do_close+0x7b0/0x7b0 [tempesta_fw]
[  299.859109]  ss_tcp_process_data+0x1eb/0x2b0 [tempesta_fw]
[  299.861962]  ? ss_tcp_process_skb.constprop.0+0x470/0x470 [tempesta_fw]
[  299.864959]  ss_tcp_data_ready+0x79/0xe0 [tempesta_fw]
[  299.867760]  tcp_rcv_established+0x933/0xb40
[  299.870502]  ? tcp_data_queue+0xa40/0xa40
[  299.873043]  ? __bpf_redirect+0x180/0x1c0
[  299.875557]  tcp_v4_do_rcv+0x24a/0x340
[  299.878168]  tcp_v4_rcv+0x137b/0x1490
[  299.880687]  ? __inet_lookup_established+0x232/0x2c0
[  299.883332]  ? tcp_v4_early_demux+0x2d0/0x2d0
[  299.885890]  ip_protocol_deliver_rcu+0x27/0x260
[  299.888552]  ip_local_deliver_finish+0xac/0xc0
[  299.891141]  ip_local_deliver+0xe9/0x1a0
[  299.893629]  ? ip_local_deliver_finish+0xc0/0xc0
[  299.896126]  ? ip_rcv_finish_core.constprop.0+0x166/0x570
[  299.898797]  ip_rcv+0x12e/0x140
[  299.901037]  ? ip_local_deliver+0x1a0/0x1a0
[  299.903466]  ? receive_buf+0x4f0/0x4f0 [virtio_net]
[  299.905899]  ? virtqueue_get_buf_ctx_split+0x5d/0x1d0 [virtio_ring]
[  299.908637]  ? ip_rcv_finish_core.constprop.0+0x570/0x570
[  299.911147]  ? ip_local_deliver+0x1a0/0x1a0
[  299.913431]  __netif_receive_skb_one_core+0xb9/0xf0
[  299.915897]  ? napi_gro_receive+0x250/0x250
[  299.918311]  ? _raw_spin_lock+0x65/0xa0
[  299.920562]  ? _raw_read_lock_irq+0x30/0x30
[  299.922901]  ? virtnet_poll+0x201/0x30f [virtio_net]
[  299.925218]  process_backlog+0xe3/0x270
[  299.927385]  napi_poll+0xfe/0x2f0
[  299.929731]  net_rx_action+0x14f/0x300
[  299.931934]  ? napi_poll+0x2f0/0x2f0
[  299.934133]  ? __kernel_fpu_begin_mask+0x7c/0x120
[  299.936475]  __do_softirq+0xe9/0x368
[  299.938647]  ? takeover_tasklets+0x2b0/0x2b0
[  299.940934]  run_ksoftirqd+0x15/0x20
[  299.943078]  smpboot_thread_fn+0x194/0x2c0
[  299.945195]  ? smpboot_register_percpu_thread+0x180/0x180
[  299.947499]  ? smpboot_register_percpu_thread+0x180/0x180
[  299.949838]  kthread+0x1c6/0x1f0
[  299.951914]  ? kthread_create_worker_on_cpu+0xa0/0xa0
[  299.954178]  ret_from_fork+0x1f/0x30
[  299.956244] Modules linked in: tempesta_fw(O) tempesta_db(O) tempesta_tls(O) tempesta_lib(O) intel_rapl_msr ppdev intel_rapl_common rapl sg serio_raw parport_pc parport button loop sch_fq_codel ipmi_devintf ipmi_msghandler msr configfs ip_tables x_tables ext4 crc16 mbcache jbd2 efivars linear md_mod virtio_net net_failover failover virtio_blk sr_mod cdrom ata_generic bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea ata_piix sysfillrect sysimgblt fb_sys_fops libata drm crct10dif_pclmul virtio_pci psmouse virtio_ring virtio scsi_mod i2c_piix4 [last unloaded: tempesta_lib]
[  299.971991] CR2: 0000000000000004
[  299.974368] ---[ end trace 6947d81dcc0a1ea5 ]---
[  299.976900] RIP: 0010:_raw_spin_lock+0x6e/0xa0
[  299.979495] Code: be 04 00 00 00 c7 44 24 20 00 00 00 00 e8 5a 83 78 ff be 04 00 00 00 48 8d 7c 24 20 e8 4b 83 78 ff ba 01 00 00 00 8b 44 24 20 <3e> 0f b1 55 00 75 19 48 b8 00 00 00 00 00 fc ff df 48 c7 04 03 00
[  299.986061] RSP: 0018:ffffc900000d7170 EFLAGS: 00010297
[  299.988750] RAX: 0000000000000000 RBX: 1ffff9200001ae2e RCX: ffffffffba2487c5
[  299.991695] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffc900000d7190
[  299.994830] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000003
[  299.997806] R10: fffff5200001ae32 R11: 0000000000000001 R12: 0000000000000000
[  300.000722] R13: 0000000000000004 R14: ffff888102248040 R15: ffff8881063133d8
[  300.003817] FS:  0000000000000000(0000) GS:ffff88811b600000(0000) knlGS:0000000000000000
[  300.007206] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  300.010063] CR2: 0000000000000004 CR3: 000000011690c006 CR4: 0000000000370ef0
[  300.013066] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  300.016161] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  300.019112] Kernel panic - not syncing: Fatal exception in interrupt

Apparently frang_tls_handler() is called on not yet initialized sk->sk_security. frang_conn_new() is called under the socket lock (see the upcoming patch with the spin lock assertion), so there could be no race and it seems frang_conn_new() wasn't called at all. I still didn't figure out whether this is a problem of this particular branch of the master - I saw the crash only once.

@krizhanovsky
Copy link
Contributor Author

Caught crash, which is fixed in 70daa64

[21157.520413] kernel BUG at /root/tempesta/fw/t/unit/../../str.c:564!
[21157.523559] invalid opcode: 0000 [#1] SMP KASAN PTI
[21157.527195] CPU: 2 PID: 50116 Comm: python3 Tainted: G    B      O      5.10.35-kdump+ #16
[21157.527613] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[21157.533690] RIP: 0010:tfw_str_collect_cmp+0x7a/0x170 [tempesta_fw]
[21157.537676] Code: 48 8b 03 48 89 45 00 48 8b 43 08 48 89 45 08 48 8b 43 10 48 89 45 10 48 8b 43 18 48 89 45 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 5b 48 89 ef be 20 00 00 00 5d 41 5c 41 5d 41 5e 41 5f e9 5d
[21157.544242] RSP: 0018:ffffc900001c8388 EFLAGS: 00010206
[21157.547849] RAX: 0000000000000000 RBX: ffff8880b7ae6630 RCX: dffffc0000000000
[21157.547849] RDX: 0000000000000003 RSI: ffff8880b7ae6630 RDI: ffff8880b7ae6648
[21157.553778] RBP: ffffc900001c84e8 R08: ffffffffc0aa69bf R09: 0000000000000000
[21157.558393] R10: ffffed1016f5cc7d R11: 0000000000000001 R12: ffff8880b7ae6630
[21157.563905] R13: 00000000041fd200 R14: ffffffffc0ac8e80 R15: ffff888104b070e0
[21157.568581] FS:  00007f6090579640(0000) GS:ffff88811b700000(0000) knlGS:0000000000000000
[21157.568581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[21157.573659] CR2: 00007f322887fac0 CR3: 00000000b7a06001 CR4: 0000000000370ee0
[21157.578910] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[21157.583676] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[21157.583676] Call Trace:
[21157.589066]  <IRQ>
[21157.589066]  tfw_http_search_cookie+0x1ea/0x230 [tempesta_fw]
[21157.594456]  ? tfw_http_verify_hdr_field+0x150/0x150 [tempesta_fw]
[21157.599364]  ? __http_msg_hdr_val+0x19c/0x1d0 [tempesta_fw]
[21157.603770]  match_cookie+0x1a1/0x2e0 [tempesta_fw]
[21157.604633]  ? tfw_http_search_cookie+0x230/0x230 [tempesta_fw]
[21157.609529]  ? tfw_http_req_client_link+0xbc/0x300 [tempesta_fw]
[21157.613743]  ? tfw_h2_append_predefined_body+0x540/0x540 [tempesta_fw]
[21157.613743]  do_eval+0xc1/0x190 [tempesta_fw]
[21157.619785]  tfw_http_match_req+0x38/0x50 [tempesta_fw]
[21157.623980]  tfw_http_tbl_action+0x14c/0x1d0 [tempesta_fw]
[21157.623980]  tfw_http_req_process+0x30c/0x9c0 [tempesta_fw]
[21157.629918]  ? policy_node+0x42/0x60
[21157.633732]  ? tfw_http_req_cache_cb+0x6d0/0x6d0 [tempesta_fw]
[21157.633732]  ? memset+0x20/0x40
[21157.640122]  ? tfw_http_init_parser_req+0x5f/0x70 [tempesta_fw]
[21157.643692]  ? tfw_http_conn_msg_alloc+0x1b4/0x370 [tempesta_fw]
[21157.643692]  ? tfw_http_msg_process_generic+0xc3/0x260 [tempesta_fw]
[21157.650332]  tfw_connection_recv+0x91/0xe0 [tempesta_fw]
[21157.653717]  ss_tcp_process_skb.constprop.0+0x280/0x470 [tempesta_fw]
[21157.653717]  ? ss_conn_drop_guard_exit+0x70/0x70 [tempesta_fw]
[21157.660571]  ss_tcp_process_data+0x1ef/0x2a0 [tempesta_fw]
[21157.663631]  ? ss_tcp_process_skb.constprop.0+0x470/0x470 [tempesta_fw]
[21157.663631]  ? tcp_event_data_recv+0x292/0x5d0
[21157.670709]  ss_tcp_data_ready+0x68/0x100 [tempesta_fw]
[21157.673929]  tcp_data_queue+0x5d8/0xa40
[21157.673929]  ? tcp_data_ready+0x1e0/0x1e0
[21157.680937]  ? ktime_get+0x40/0xa0
[21157.683666]  tcp_rcv_established+0x3ea/0xb40
[21157.683666]  ? tcp_data_queue+0xa40/0xa40

fw/sock.c Outdated Show resolved Hide resolved
fw/str.c Outdated Show resolved Hide resolved
fw/sock_clnt.c Show resolved Hide resolved
fw/sock_clnt.c Outdated Show resolved Hide resolved
fw/sock_clnt.c Outdated Show resolved Hide resolved
fw/sock_clnt.c Outdated Show resolved Hide resolved
fw/sock_clnt.c Show resolved Hide resolved
fw/sock_clnt.c Show resolved Hide resolved
fw/tls.c Show resolved Hide resolved
fw/connection.h Show resolved Hide resolved
fw/sock.c Show resolved Hide resolved
@s0nx
Copy link
Contributor

s0nx commented Sep 1, 2022

BTW, there are some other cases of freeing the list member without explicit list_del()/list_del_init() in sock_clnt.c, which might cause problems.

@krizhanovsky krizhanovsky marked this pull request as draft September 2, 2022 12:00
established sockets from HTTPS. Now we can use more efficient

  TFW_FSM_TYPE(conn->proto.type) == TFW_FSM_H2

instead of TFW_CONN_H2(conn) and initialize HTTP/2 connection descriptor
only for HTTP/2 connections, not HTTPS as previously.

Add a new alpn_match_cb hook for TLS, which validates client ALPN
matching one of ours ALPN against type of the socket, so if client sends
up ALPN "h2" on a TCP connection established on a socket with

  listen ... proto=https;

, then the connection won't be established.
Disable annoying warning on inability to send TLS alerts on
connections with handshakes in progress.

Add comments for ss_active_guard_enter() about the need for the
double checking.

Add more information about connections counters state to the warning
message about pending client connections.
1. remove stress.[ch] which never been used and we still don't know
   whether we need then for the QoS.
2. Remove classifier - there is only one module (http_limits) using
   it and there is no sense to play with RCU and indirect calls
3. Replace SS_CALL_GUARD_EXIT() with an (inlined) function and remove
   unused SS_CALL_GUARD_ENTER().

Add assertion that a socket is locked, when it enters frang_conn_limit()
from tcp_v4_syn_recv_sock() (the socket is locked by
tcp_create_openreq_child() -> inet_csk_clone_lock().
new listening port to the bitmap before moving the socket to listening
state - this way we guarantee that HTTP limiting is called on the
initialized accounting descriptor.

The problem appears on constant system restart under 2000 HTTPS
connections from wrk.
to static functions. Instead, just store and restore the list in
the particular test (we need this since there is a session test, which
initializes the session module in the list).

Remove unnecessary exports.
and immediately after that tfw_client_stop() closes the clients
database, so during ss_synchronize() client connections closing
callback tfw_sock_clnt_drop() removes client connection from the
dead linked list.

This patch count each sock.c users (they must set TfwMod->sock_user) and
calls ss_synchronize() exactly when last of them finishes (including
interrupted start procedure due to an error in some module).
need to save and restore FPU context before calling syncronous sockets.
krizhanovsky and others added 9 commits September 2, 2022 15:04
connection close alert, which is just normal.
tcp_write_xmit() calls tfw_tls_encrypt(). However, the may never happen
if the peer didn't announce sufficient receive window. In this case
Tempesta shutdown may find live connections and rise "pending active
connections for 5s" and leave the connections objects in the memory
cache.

ss_synchronize() now returns a success value and we abort all the client
connections in case of failure and recheck all the counters with
ss_synchronize() again. To abort all the connections we extend
ss_do_send() with __SS_F_RST and send RST on aborted connections.

The connections aborting (TCP RST) is also useful on security events:
when we see any malicious activity we should reset connection instead of
issuing normal TCP connections closing. For this purpose conn_abort
callback was introduced for TfwConnHooks and ss_tcp_data_ready()
normally closes or aborts a TCP connection depending on
ss_tcp_process_data() return value (SS_BAD and SS_DROP correspondinglY),
so several innocent places returning T_DROP/SS_DROP were adjusted to
return T_BAD/SS_BAD, most likely not all the places though.

tfw_peer_del_conn() were updated to use 1-depth nesting to be called
as a callback from tfw_peer_for_each_conn() (the ability isn't used by
the current version of connections aborting).

Also:
- several comments are updated/fixed;
- couple of macros replaced with inlined functions
- better messaging in ss_synchronize()
- some other minor code cleanups
chunk, to tfw_str_collect_cmp(). Firstly check chunk == end in
tfw_str_collect_cmp() and only after than make an assertion that chunk
is a plain string.
may free the conn and we should not dereference the pointer even
for the assertion.
it is mainly used on connection freeing since WS(S) is upgraded from
HTTP(S).
If one of sockets can't start listening in tfw_sock_clnt_start
we just breaks the initialization loop and clenup some data,
but already opened sockets won't be released, since module
has not been started yet and tfw_sock_clnt_stop must not be
called for such modules.

Now has been added realising of sockets in tfw_listen_sock_del_all
which must be called from cleanup callback even on tfw_sock_clnt_start
failure.

Also tfw_listen_sock_del_all now frees tfw_listen_socks list
tfw_conn_hook_call() or tfw_h2_context_init(): call TLS connection
destructor on yet not fully initialized TLS connection handler.
Nullify TfwHPackDTbl pool pointer after destruction to let
tfw_hpack_clean() call on the handler without double-frees.
@krizhanovsky
Copy link
Contributor Author

krizhanovsky commented Sep 3, 2022

Just caught a crash on the current version of the branch on multiple_listeners.test_on_the_fly.TestOnTheFly.test_change_config_on_the_fly test from tempesta-tech/tempesta-test#275:

[696585.558773] BUG: KASAN: use-after-free in tfw_sock_clnt_new+0x21f/0x460 [tempesta_fw]
[696585.561162] Read of size 24 at addr ffff88811af2f600 by task wrk/76566
[696585.563332]
[696585.564933] CPU: 3 PID: 76566 Comm: wrk Tainted: G           O      5.10.35-kdump+ #16
[696585.565344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[696585.565344] Call Trace:
[696585.565344]  <IRQ>
[696585.565344]  dump_stack+0x7d/0xa3
[696585.565344]  print_address_description.constprop.0+0x1c/0x1f0
[696585.565344]  ? _raw_spin_lock_irqsave+0x6b/0xb0
[696585.565344]  ? _raw_write_lock_irqsave+0xb0/0xb0
[696585.565344]  ? tfw_sock_clnt_new+0x21f/0x460 [tempesta_fw]
[696585.565344]  kasan_report.cold+0x1f/0x37
[696585.565344]  ? tfw_sock_clnt_new+0x21f/0x460 [tempesta_fw]
[696585.565344]  check_memory_region+0x142/0x190
[696585.565344]  tfw_sock_clnt_new+0x21f/0x460 [tempesta_fw]
[696585.565344]  ? tfw_cli_conn_release+0xb0/0xb0 [tempesta_fw]
[696585.565344]  ss_tcp_state_change+0x137/0x320 [tempesta_fw]
[696585.565344]  tcp_rcv_state_process+0x6b7/0xc10
[696585.565344]  ? tcp_rcv_synsent_state_process+0xd20/0xd20
[696585.565344]  ? tcp_ca_openreq_child+0x1c0/0x1c0
[696585.565344]  ? bpf_get_listener_sock+0x31/0x80
[696585.565344]  tcp_child_process+0x14f/0x2d0
[696585.565344]  tcp_v4_rcv+0x100e/0x1490
.....
[696585.565344] Allocated by task 76532:
[696585.565344]  kasan_save_stack+0x1b/0x40
[696585.565344]  __kasan_kmalloc.constprop.0+0xbf/0xd0
[696585.565344]  kmem_cache_alloc_trace+0x220/0x3b0
[696585.565344]  tfw_listen_sock_add+0x179/0x240 [tempesta_fw]
[696585.565344]  tfw_cfgop_listen+0x163/0x240 [tempesta_fw]
[696585.565344]  spec_handle_entry+0x80/0xb0 [tempesta_fw]
[696585.565344]  tfw_cfg_parse_mods+0x24a/0x2d0 [tempesta_fw]
[696585.565344]  tfw_cfg_parse+0x73/0xb0 [tempesta_fw]
[696585.565344]  tfw_start+0x5c/0x190 [tempesta_fw]
[696585.565344]  tfw_ctlfn_state_io+0x18b/0x1c0 [tempesta_fw]
[696585.565344]  proc_sys_call_handler+0x1c3/0x300
[696585.565344]  new_sync_write+0x22b/0x320
[696585.565344]  vfs_write+0x2ae/0x350
[696585.565344]  ksys_write+0xa0/0x120
[696585.565344]  do_syscall_64+0x2d/0x70
[696585.565344]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[696585.565344]
[696585.565344] Freed by task 76583:
[696585.565344]  kasan_save_stack+0x1b/0x40
[696585.565344]  kasan_set_track+0x1c/0x30
[696585.565344]  kasan_set_free_info+0x1b/0x30
[696585.565344]  __kasan_slab_free+0xec/0x130
[696585.565344]  kfree+0x8c/0x110
[696585.565344]  tfw_sock_clnt_start+0x407/0x5a0 [tempesta_fw]
[696585.565344]  tfw_start+0xd3/0x190 [tempesta_fw]
[696585.565344]  tfw_ctlfn_state_io+0x18b/0x1c0 [tempesta_fw]
[696585.565344]  proc_sys_call_handler+0x1c3/0x300
[696585.565344]  new_sync_write+0x22b/0x320
[696585.565344]  vfs_write+0x2ae/0x350
[696585.565344]  ksys_write+0xa0/0x120
[696585.565344]  do_syscall_64+0x2d/0x70
[696585.565344]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[696585.565344]
[696585.565344] The buggy address belongs to the object at ffff88811af2f600
[696585.565344]  which belongs to the cache kmalloc-96 of size 96
[696585.565344] The buggy address is located 0 bytes inside of
[696585.565344]  96-byte region [ffff88811af2f600, ffff88811af2f660)
[696585.565344] The buggy address belongs to the page:
[696585.565344] page:0000000084c5d3fb refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11af2f
[696585.565344] flags: 0x2fff80000000200(slab)
[696585.565344] raw: 02fff80000000200 ffffea00040c4488 ffffea000412eec8 ffff888100040600
[696585.565344] raw: 0000000000000000 ffff88811af2f000 0000000100000020 ffff88810096a001
[696585.565344] page dumped because: kasan: bad access detected
[696585.565344] page->mem_cgroup:ffff88810096a001

The problem is caused by the race between tfw_sock_clnt_new() callback and tfw_sock_clnt_start(): while a new child socket is created and initialized by the listening socket, the listening socket is freed during the reconfiguration.

Fixed in e8cfb48

tfw_sock_clnt_start(): while a new child socket is created and
initialized by the listening socket, the listening socket, whcih was
removed from the reloaded configuration, is freed during the reconfiguration.
There actually was no requirement to use the listening socket in the
child initialization. The patch removes `listener` from SsProto as
doubling TfwListenSock->sk, so all connections will use smaller memory.

Also remove the unused synchronous sockets tests.
1. all our sockets have sk->sk_allocation == GFP_ATOMIC, so remove
   setting of the allocation class and use sk_uid to differentiate
   client and server sockets in ss_tcp_state_change().

2. ss_tcp_state_change() uses SsHooks from SsProto, which is supposed
   to be in sk->sk_user_data to call connection_new hooks, so we do need
   to keep full SsProto in sk_user_data for listening sockets. I used
   statically allocates SsProtos to make child sockets independent from
   parent to avoid the race, attempted to fix in previous commit.

3. remove ss_wait_newconn() call from tfw_sock_clnt_start() - this was
   just a garbage from invalid way to fix the race in previous commit.

4. remove t/sync_sockets from the Makefile
@krizhanovsky krizhanovsky marked this pull request as ready for review September 4, 2022 20:13
@krizhanovsky
Copy link
Contributor Author

krizhanovsky commented Sep 4, 2022

The tests are broken. On CI I http://tempesta-vm.cloud.cherryservers.net:8080/job/tempesta-test%20PR/211/console do see these tests failed:

cache.test_purge_resp_hch.HeavyChunkedPurgeRespTest
tls.test_tls_tickets.TlsTicketTest

The both pass in my VM with Ubuntu 22 and tests master at 8efb2353a2b89a542de4e1783f9913fc32b924a7 . The last test throws though this warning:

/root/tempesta-test/tls/scapy_ssl_tls/ssl_tls_crypto.py:1093: UserWarning: Verification of GCM tag failed: MAC check failed
  warnings.warn("Verification of GCM tag failed: %s" % why)=

However, on my system these tests fail:

http_rules.test_http_tables.HttpTablesTestMarkRules
tls.test_tls_cert.TlsCertSelect
ws.test_ws_ping.WssPingProxy
ws.test_ws_ping.WssStress
test_limited_ka in regression.test_stress_failovering.HashFailovering
test_unlimited_ka in regression.test_stress_failovering.HashFailovering
test_limited_ka in regression.test_stress_failovering.RatioFailovering
test_unlimited_ka in regression.test_stress_failovering.RatioFailovering
regression.test_stress_pipeline.Pipeline
----------------------------------------------------------------------
Ran 355 tests in 2013.741s

FAILED (failures=5, errors=4, skipped=4)

The first test is about tempesta-tech/tempesta-test#285 , the TLS one is tempesta-tech/tempesta-test#302 .

I reran the failed CI tests on my VM and they do pass:

# ./run_tests.py -n cache.test_purge_resp_hch.HeavyChunkedPurgeRespTest tls.test_tls_tickets.TlsTicketTest
....
----------------------------------------------------------------------
Running functional tests...
----------------------------------------------------------------------

test_0_purge_resp_non_hch (cache.test_purge_resp_hch.HeavyChunkedPurgeRespTest) ... ok
test_1_purge_resp_hch (cache.test_purge_resp_hch.HeavyChunkedPurgeRespTest) ... ok
test_invalid_ticket (tls.test_tls_tickets.TlsTicketTest) ... /root/tempesta-test/tls/scapy_ssl_tls/ssl_tls_crypto.py:1093: UserWarning: Verification of GCM tag failed: MAC check failed
  warnings.warn("Verification of GCM tag failed: %s" % why)
ok
test_no_ticket_support (tls.test_tls_tickets.TlsTicketTest) ... ok

----------------------------------------------------------------------
Ran 4 tests in 43.270s

OK

fw/sock_clnt.c Outdated Show resolved Hide resolved
Copy link
Contributor

@s0nx s0nx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@const-t const-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

fw/sock_clnt.c Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mixed protocol handlers on multiple listening sockets TCP: reset blocked connections
4 participants