OOM while running h2load when caching is enabled #1728

Closed
s0nx opened this issue Oct 24, 2022 · 2 comments · Fixed by #1734

Comments

@s0nx
Contributor

s0nx commented Oct 24, 2022

Motivation

High memory usage is observed while running an h2load test with caching enabled in Tempesta FW.
This leads to an OOM a couple of seconds later.

Scope

[  519.654584] [tempesta fw]   client was found in tdb                                                                                                         
[  519.671747] [tempesta fw]   client was found in tdb                                                                                                         
[  519.671754] [tempesta fw]   client was found in tdb                                                                                                         
[  522.992062] [tempesta fw]   connection error: 127.0.0.1:8000                                                                                                
[  522.992994] [tempesta fw]   connection error: 127.0.0.1:8000                                                                                                
[  522.993926] [tempesta fw]   connected: 127.0.0.1:8000                                                                                                       
[  522.994849] [tempesta fw]   connected: 127.0.0.1:8000                                                                                                       
[  528.480429] ksoftirqd/3: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0                                  
[  528.480441] CPU: 3 PID: 29 Comm: ksoftirqd/3 Tainted: G           OE     5.10.35perf+ #2                                                                    
[  528.480442] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015                                                                       
[  528.480444] Call Trace:                                                                                                                                     
[  528.480452]  dump_stack+0x6b/0x83                                                                                                                           
[  528.480455]  warn_alloc.cold+0x72/0xd6                                                                                                                      
[  528.480458]  __alloc_pages_slowpath.constprop.0+0xc40/0xc70                                                                                                 
[  528.480462]  __alloc_pages_nodemask+0x2e3/0x310                                                                                                             
[  528.480472]  tfw_cache_do_action+0xe90/0xfc0 [tempesta_fw]                                                                                                  
[  528.480479]  ? tfw_http_req_redir+0x7c0/0x7c0 [tempesta_fw]                                                                                                 
[  528.480485]  ? tfw_hash_str_len+0x94/0x140 [tempesta_fw]                                                                                                    
[  528.480490]  ? tfw_http_req_redir+0x7c0/0x7c0 [tempesta_fw]                                                                                                 
[  528.480495]  ? tfw_cache_process+0xa9/0x270 [tempesta_fw]                                                                                                   
[  528.480499]  tfw_cache_process+0xa9/0x270 [tempesta_fw]                                                                                                     
[  528.480504]  tfw_http_req_process+0x397/0x800 [tempesta_fw]
[  528.480510]  ? tfw_http_conn_msg_alloc+0x1bb/0x220 [tempesta_fw]
[  528.480518]  ? ss_skb_chop_head_tail+0x20/0x1b0 [tempesta_fw]                                                                                               
[  528.480524]  ? ss_skb_process+0xe5/0x120 [tempesta_fw]                                                                                                      
[  528.480530]  tfw_h2_frame_process+0x2e3/0x400 [tempesta_fw]                                                                                                 
[  528.480536]  tfw_connection_recv+0x53/0x90 [tempesta_fw]                                                                                                    
[  528.480542]  tfw_tls_connection_recv+0x278/0x360 [tempesta_fw]                                                                                              
[  528.480549]  ss_tcp_process_data+0x1d9/0x3c0 [tempesta_fw]                                                                                                  
[  528.480555]  ss_tcp_data_ready+0x3f/0xa0 [tempesta_fw]                                                                                                      
[  528.480557]  tcp_data_queue+0x8ab/0xd20                                                                                                                     
[  528.480559]  tcp_rcv_established+0x21c/0x630                                                                                                                
[  528.480561]  ? tcp_v4_inbound_md5_hash+0x55/0x160                                                                                                           
[  528.480562]  tcp_v4_do_rcv+0x131/0x1f0                                                                                                                      
[  528.480564]  tcp_v4_rcv+0xbef/0xd40                                                                                                                         
[  528.480566]  ? nf_hook_slow+0x3f/0xb0                                                                                                                       
[  528.480567]  ip_protocol_deliver_rcu+0x2b/0x1b0                                                                                                             
[  528.480569]  ip_local_deliver_finish+0x44/0x50                                                                                                              
[  528.480571]  __netif_receive_skb_core.constprop.0+0x55f/0xff0                                                                                               
[  528.480574]  ? __alloc_skb+0x3d/0x200                                                                                                                       
[  528.480575]  ? __napi_alloc_skb+0x3d/0xe0                                                                                                                   
[  528.480576]  __netif_receive_skb_list_core+0x126/0x2a0                                                                                                      
[  528.480578]  netif_receive_skb_list_internal+0x1b1/0x2c0                                                                                                    
[  528.480579]  ? dev_gro_receive+0x2da/0x6a0                                                                                                                  
[  528.480580]  gro_normal_one+0x73/0xa0                                                                                                                       
[  528.480582]  napi_gro_receive+0xfe/0x110                                                                                                                    
[  528.480584]  virtnet_poll+0x19c/0x331 [virtio_net]                                                                                                          
[  528.480586]  net_rx_action+0x135/0x3b0
[  528.480589]  __do_softirq+0xd5/0x297 
[  528.480592]  run_ksoftirqd+0x26/0x40 
[  528.480594]  smpboot_thread_fn+0xc5/0x160
[  528.480595]  ? smpboot_register_percpu_thread+0xf0/0xf0
[  528.480597]  kthread+0x11b/0x140
[  528.480598]  ? kthread_associate_blkcg+0xa0/0xa0
[  528.480600]  ret_from_fork+0x1f/0x30 
[  528.480602] Mem-Info:
[  528.480604] active_anon:245 inactive_anon:18078 isolated_anon:0
[  528.480604]  active_file:25590 inactive_file:23110 isolated_file:0
[  528.480604]  unevictable:0 dirty:15 writeback:0
[  528.480604]  slab_reclaimable:6868 slab_unreclaimable:11581
[  528.480604]  mapped:21161 shmem:277 pagetables:1167 bounce:0
[  528.480604]  free:15897 free_pcp:1567 free_cma:0
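The Mem-Info counters above are page counts, so they are easier to read after converting to MiB. A minimal sketch, assuming 4 KiB pages (typical for x86-64, but not confirmed for this VM); `parse_mem_info` and `pages_to_mib` are hypothetical helper names introduced only for this illustration:

```python
import re

PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, not stated in the report


def parse_mem_info(log_text):
    """Return {counter: pages} for every `name:number` pair in a Mem-Info dump."""
    return {name: int(value)
            for name, value in re.findall(r"([a-z_]+):(\d+)", log_text)}


def pages_to_mib(pages, page_size_kib=PAGE_SIZE_KIB):
    """Convert a page count to MiB for the assumed page size."""
    return pages * page_size_kib / 1024


# Counters copied from the Mem-Info section of the log above.
mem_info = """
active_anon:245 inactive_anon:18078 isolated_anon:0
 active_file:25590 inactive_file:23110 isolated_file:0
 unevictable:0 dirty:15 writeback:0
 slab_reclaimable:6868 slab_unreclaimable:11581
 mapped:21161 shmem:277 pagetables:1167 bounce:0
 free:15897 free_pcp:1567 free_cma:0
"""

counters = parse_mem_info(mem_info)
for name in ("free", "slab_unreclaimable", "active_file", "inactive_file"):
    mib = pages_to_mib(counters[name])
    print(f"{name}: {counters[name]} pages = {mib:.1f} MiB")
```

Under that assumption, `free:15897` is roughly 62 MiB at the moment the order-0 `GFP_ATOMIC` allocation failed.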

Testing

Tempesta FW conf:

listen 443 proto=h2;

cache 1;
cache_fulfill * *;

srv_group ngx_local {
	# The address is on the same VM
	server 127.0.0.1:8000;
}

vhost f35tfw.local {
	tls_certificate /root/certs/tempesta/RSA/tfw-root.crt;
	tls_certificate_key /root/certs/tempesta/RSA/tfw-root.key;

	proxy_pass ngx_local;
}

http_chain {
	-> f35tfw.local;
}

NGINX conf:

# vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4
user www-data;
worker_processes 4;
pid /run/nginx.pid;

events {
    worker_connections 768;
    # multi_accept on;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    # server_tokens off;

    # server_names_hash_bucket_size 64;
    # server_name_in_redirect off;

    include /etc/nginx/mime.types;
    default_type text/html;

    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;

    gzip on;
    gzip_disable "msie6";

    # gzip_vary on;
    # gzip_proxied any;
    # gzip_comp_level 6;
    # gzip_buffers 16 8k;
    # gzip_http_version 1.1;
    # gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    server {
        listen 127.0.0.1:8000 default_server;

        root /var/www/tempesta-tech.com;

        index index;

        server_name _;

        error_page 403 404 /oops;
    }
}

h2load params: h2load https://f35tfw.local -t 2 -c 1000 -D 30
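To see how quickly free memory drops during the run, the load can be started from a small wrapper that samples `/proc/meminfo` while it runs. This is only a sketch for reproducing the issue, not part of the original test setup; `parse_meminfo` and `watch_free_memory` are hypothetical helpers, and the one-second interval is an arbitrary choice:

```python
import subprocess
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: first numeric value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def watch_free_memory(cmd, interval=1.0):
    """Run cmd and yield MemFree (in kB) every `interval` seconds until it exits."""
    proc = subprocess.Popen(cmd)
    try:
        while proc.poll() is None:
            with open("/proc/meminfo") as f:
                yield parse_meminfo(f.read())["MemFree"]
            time.sleep(interval)
    finally:
        # Make sure the load generator does not outlive the watcher.
        if proc.poll() is None:
            proc.terminate()
        proc.wait()
```

It could be used with the same h2load parameters as above, for example `for free_kb in watch_free_memory(["h2load", "https://f35tfw.local", "-t", "2", "-c", "1000", "-D", "30"]): print(free_kb, flush=True)`, printing each sample immediately so the values are not lost if the machine runs out of memory.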

@s0nx
Contributor Author

s0nx commented Oct 24, 2022

I believe the root cause of the problem is not the same as the one fixed by #1687, because the cache was disabled while testing that issue.

@krizhanovsky
Contributor

I'm wondering if it's a duplicate of #500? We need to debug the problem and learn the root cause of the OOM, but I'm not sure we should fix it right now: I'm working on TDB v0.2 and nearly the whole TDB will be replaced (it also uses a new memory allocator), so any fixes related to TDB don't make much sense for now.
