-
Notifications
You must be signed in to change notification settings - Fork 94
Low framerate with dma-buf output #35
Comments
Same here. What can be changed to improve the framerate? |
syzbot caught an infinite recursion in nsh_gso_segment(). Problem here is that we need to make sure the NSH header is of reasonable length. BUG: MAX_LOCK_DEPTH too low! turning off the locking correctness validator. depth: 48 max: 48! 48 locks held by syz-executor0/10189: #0: (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517 #1: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #1: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #2: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #2: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #3: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #3: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #4: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #4: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #5: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #5: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #6: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #6: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #7: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #7: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #8: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #8: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #9: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #9: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #10: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #10: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #11: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #11: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #12: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #12: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #13: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #13: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #14: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #14: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #15: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #15: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #16: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #16: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #17: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #17: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #18: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #18: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #19: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #19: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #20: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #20: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #21: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #21: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #22: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #22: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #23: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #23: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #24: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #24: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #25: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #25: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #26: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #26: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #27: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #27: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #28: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #28: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #29: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #29: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #30: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #30: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #31: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #31: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 dccp_close: ABORT with 65423 bytes unread #32: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #32: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #33: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #33: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #34: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #34: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #35: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #35: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #36: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #36: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #37: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #37: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #38: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #38: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #39: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #39: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #40: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #40: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #41: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #41: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #42: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #42: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #43: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #43: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #44: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #44: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #45: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #45: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #46: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #46: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 #47: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline] #47: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787 INFO: lockdep is turned off. CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #26 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920 rcu_lock_acquire include/linux/rcupdate.h:246 [inline] rcu_read_lock include/linux/rcupdate.h:632 [inline] skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865 skb_gso_segment include/linux/netdevice.h:4025 [inline] validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312 qdisc_restart net/sched/sch_generic.c:399 [inline] __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410 __dev_xmit_skb net/core/dev.c:3243 [inline] __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616 packet_snd net/packet/af_packet.c:2951 [inline] packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:639 __sys_sendto+0x3d7/0x670 net/socket.c:1789 __do_sys_sendto net/socket.c:1801 [inline] __se_sys_sendto net/socket.c:1797 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: c411ed8 ("nsh: add GSO support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Benc <jbenc@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Same here, refresh rate of whole "dmabuffed" display is really bad via Spice or GTK. OpenGL / DirectX apps FPS seems OK. Tested on Windows 10 Enterprise 1803 Guest with latest 24.20.100.6194 Intel drivers, custom compiled Qemu 2.12.1 and 4.18.0 kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/2018-08-03/ (using it because of another bug)... all on Ubuntu 18.04. Any help please? |
When using spicy client conneced to local dmabuff VM try turning off spice protocol compression, refreshrate seems normal OK then! qemu-system-x86_64 ... -spice disable-ticketing,port=5910,image-compression=off,streaming-video=off,playback-compression=off,jpeg-wan-compression=never,zlib-glz-wan-compression=never |
@bangom |
@worron, well maybe not 60fps... more like 30fps. But huge step when comparing spice compression on (~5fps) and off (~30fps). Hope there will be fix soon so we can use GVT in OpenGL game scenarios on a Windows host. |
OK, thanks for clarification. Current state of dma-buf (~30fps) is pretty good but still not perfect. I'm just following the issue hoping for improvements. |
Tried gnif/looking-glass on a win10, can achieve 60fps in testufo, when under some kind of GPU load the fps drops to 10ish, not better than dma-buf for this case. |
When under heavy load (3dmark) dma-buf stutters really bad, it could freeze for 0.5s before showing next frame, while the 3dmark runs at 12-15fp Using virt-manager, gpu is uhd620 |
Hi, DMA-Buf via GTK is throttled to 33Hz by default fresh rate of GTK display. Change definition of macro I have filed a bug at launchpad.net for qemu. |
Thanks! I wonder why is the throttle 33hz, instead of 60hz or client display refresh rate.
|
Tracking down the commit history, the interval of 30 milliseconds dates back to 10+ years ago when VGA emulation was first introduced. |
@zhangchn, thank you! Finally can enjoy fair 60fps in gvt accelerated VM. Also I'm asking all concerned to vote the bug at launchpad. |
Increase kasan instrumented kernel stack size from 32k to 64k. Other architectures seems to get away with just doubling kernel stack size under kasan, but on s390 this appears to be not enough due to bigger frame size. The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting stack overflow is fs sync on xfs filesystem: #0 [9a0681e8] 704 bytes check_usage at 34b1fc #1 [9a0684a8] 432 bytes check_usage at 34c710 #2 [9a068658] 1048 bytes validate_chain at 35044a #3 [9a068a70] 312 bytes __lock_acquire at 3559fe #4 [9a068ba8] 440 bytes lock_acquire at 3576ee #5 [9a068d60] 104 bytes _raw_spin_lock at 21b44e0 #6 [9a068dc8] 1992 bytes enqueue_entity at 2dbf72 #7 [9a069590] 1496 bytes enqueue_task_fair at 2df5f0 #8 [9a069b68] 64 bytes ttwu_do_activate at 28f438 #9 [9a069ba8] 552 bytes try_to_wake_up at 298c4c #10 [9a069dd0] 168 bytes wake_up_worker at 23f97c #11 [9a069e78] 200 bytes insert_work at 23fc2e #12 [9a069f40] 648 bytes __queue_work at 2487c0 #13 [9a06a1c8] 200 bytes __queue_delayed_work at 24db28 #14 [9a06a290] 248 bytes mod_delayed_work_on at 24de84 #15 [9a06a388] 24 bytes kblockd_mod_delayed_work_on at 153e2a0 #16 [9a06a3a0] 288 bytes __blk_mq_delay_run_hw_queue at 158168c #17 [9a06a4c0] 192 bytes blk_mq_run_hw_queue at 1581a3c #18 [9a06a580] 184 bytes blk_mq_sched_insert_requests at 15a2192 #19 [9a06a638] 1024 bytes blk_mq_flush_plug_list at 1590f3a #20 [9a06aa38] 704 bytes blk_flush_plug_list at 1555028 #21 [9a06acf8] 320 bytes schedule at 219e476 #22 [9a06ae38] 760 bytes schedule_timeout at 21b0aac #23 [9a06b130] 408 bytes wait_for_common at 21a1706 #24 [9a06b2c8] 360 bytes xfs_buf_iowait at fa1540 #25 [9a06b430] 256 bytes __xfs_buf_submit at fadae6 #26 [9a06b530] 264 bytes xfs_buf_read_map at fae3f6 #27 [9a06b638] 656 bytes xfs_trans_read_buf_map at 10ac9a8 #28 [9a06b8c8] 304 bytes xfs_btree_kill_root at e72426 #29 [9a06b9f8] 288 bytes xfs_btree_lookup_get_block at e7bc5e #30 [9a06bb18] 624 bytes xfs_btree_lookup at e7e1a6 #31 [9a06bd88] 2664 bytes xfs_alloc_ag_vextent_near at dfa070 #32 [9a06c7f0] 144 bytes xfs_alloc_ag_vextent at dff3ca #33 [9a06c880] 1128 bytes xfs_alloc_vextent at e05fce #34 [9a06cce8] 584 bytes xfs_bmap_btalloc at e58342 #35 [9a06cf30] 1336 bytes xfs_bmapi_write at e618de #36 [9a06d468] 776 bytes xfs_iomap_write_allocate at ff678e #37 [9a06d770] 720 bytes xfs_map_blocks at f82af8 #38 [9a06da40] 928 bytes xfs_writepage_map at f83cd6 #39 [9a06dde0] 320 bytes xfs_do_writepage at f85872 #40 [9a06df20] 1320 bytes write_cache_pages at 73dfe8 #41 [9a06e448] 208 bytes xfs_vm_writepages at f7f892 #42 [9a06e518] 88 bytes do_writepages at 73fe6a #43 [9a06e570] 872 bytes __writeback_single_inode at a20cb6 #44 [9a06e8d8] 664 bytes writeback_sb_inodes at a23be2 #45 [9a06eb70] 296 bytes __writeback_inodes_wb at a242e0 #46 [9a06ec98] 928 bytes wb_writeback at a2500e #47 [9a06f038] 848 bytes wb_do_writeback at a260ae #48 [9a06f388] 536 bytes wb_workfn at a28228 #49 [9a06f5a0] 1088 bytes process_one_work at 24a234 #50 [9a06f9e0] 1120 bytes worker_thread at 24ba26 #51 [9a06fe40] 104 bytes kthread at 26545a #52 [9a06fea8] kernel_thread_starter at 21b6b62 To be able to increase the stack size to 64k reuse LLILL instruction in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) value as unsigned. Reported-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Here is a link to the launchpad report: https://bugs.launchpad.net/qemu/+bug/1802915 |
@zhangchn I was wondering, is there a formula that determines what the Someone could make a function that sets the right value based on the monitor that is currently connected. |
@CuriousTommy That's just the inverse of 60hz: 1/60hz=0.016666s. |
https://gist.github.com/Saren-Arterius/b31df23d5c359523f1ef430d27b0e834 |
I have set GUI_REFRESH_INTERVAL_DEFAULT to 17 as described and then recompiled quemu, but the framerate seems to bee still as low as before. Does anyone have the same issue, or know why that might be? |
@TerrenceXu : Is there any plan to implement adaptive fps in the near future? I think that would be very important for gvt to be more practical/useful. |
@tituslabienus: I just upgraded from qemu 4.2.0 (the Arch repository package) to qemu 5.0.0 rc4 (the Arch AUR qemu-git package), which I have modified with GUI_REFRESH_INTERVAL_DEFAULT=17. I noticed a definite improvement in smoothness. In the guest I opened a YouTube video with a 30fps/60fps side-by-side comparison and could clearly see that the 60fps was way smoother. However, the mouse cursor is still very choppy and still "feels" like 30fps, which is surprising to me. Does anyone know if there is something special going on with the mouse in qemu with Intel GVT-g? |
Relevant patch |
Are you 100% sure that Qemu is using the newly compiled binaries? I was stupid enough to just run
Was this not merged to the master branch already? Am I the only one still struggling with a laggy/stuttery/"30FPS" mouse in the libvirt viewer, even after applying the "60FPS" patch? |
@matthiasmeten Nope you're not the only one. I'm stuck with Low FPS mouse cursor with 60FPS virtualized desktop. Any fix for this? |
Passing through a mouse, for instance, makes it smooth, but now I notice some general visual latency. Could be my video mode though |
I'm working this out on my laptop and I have passed through my touchpad to guest using evdev tutorial I found online. The cursor lag seems to disappear when I remove spice channel from my vm but the rendered resolution is wrong (though the guest reports 1920x1080, the display window has vertical bars on the side). If I add spice channel back, the mouse cursor starts lagging. Is this something related to spice buffers? Edit: Fixed it by adding xres and yres Refer |
@metenn @UmashankarTriforce Am the dev who send the initial patch here. I've tracked down the bug, but the repair requires input from the guys maintaining that piece of code, I've already tried to email kraxel, but I got no reply. The issue is that the check ALWAYS fails due to an oversight in the "execution procedure". At the time when So this means that the code is effectively (so called 'ghost code') that never gets executed. When does the function run exactly? Only during window initialization. |
I've got two, somewhat sane solutions, but obviously I need input, since I'm not too familiar with the QEMU codebase.
|
And no, removing the check doesn't work, since it will always default to monitor 0, which might not be the one you're using fxp. |
@kraxel Hate to bother you, but can you give your input on this, I would greatly appreciate it. Thanks for your work, by the way. |
Nevermind, one of the maintainers got to me and wrote a patch. Should be merged pretty quick. |
In the existing NVMeOF Passthru core command handling on failure of nvme_alloc_request() it errors out with rq value set to NULL. In the error handling path it calls blk_put_request() without checking if rq is set to NULL or not which produces following Oops:- [ 1457.346861] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1457.347838] #PF: supervisor read access in kernel mode [ 1457.348464] #PF: error_code(0x0000) - not-present page [ 1457.349085] PGD 0 P4D 0 [ 1457.349402] Oops: 0000 [#1] SMP NOPTI [ 1457.349851] CPU: 18 PID: 10782 Comm: kworker/18:2 Tainted: G OE 5.8.0-rc4nvme-5.9+ #35 [ 1457.350951] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214 [ 1457.352347] Workqueue: events nvme_loop_execute_work [nvme_loop] [ 1457.353062] RIP: 0010:blk_mq_free_request+0xe/0x110 [ 1457.353651] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8 [ 1457.355975] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282 [ 1457.356636] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002 [ 1457.357526] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000 [ 1457.358416] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000 [ 1457.359317] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000 [ 1457.360424] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8 [ 1457.361322] FS: 0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000 [ 1457.362337] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1457.363058] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0 [ 1457.363973] Call Trace: [ 1457.364296] nvmet_passthru_execute_cmd+0x150/0x2c0 [nvmet] [ 1457.364990] process_one_work+0x24e/0x5a0 [ 1457.365493] ? __schedule+0x353/0x840 [ 1457.365957] worker_thread+0x3c/0x380 [ 1457.366426] ? process_one_work+0x5a0/0x5a0 [ 1457.366948] kthread+0x135/0x150 [ 1457.367362] ? kthread_create_on_node+0x60/0x60 [ 1457.367934] ret_from_fork+0x22/0x30 [ 1457.368388] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corer [ 1457.368414] ata_piix crc32c_intel virtio_pci libata virtio_ring serio_raw t10_pi virtio floppy dm_] [ 1457.380849] CR2: 0000000000000000 [ 1457.381288] ---[ end trace c6cab61bfd1f68fd ]--- [ 1457.381861] RIP: 0010:blk_mq_free_request+0xe/0x110 [ 1457.382469] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8 [ 1457.384749] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282 [ 1457.385393] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002 [ 1457.386264] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000 [ 1457.387142] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000 [ 1457.388029] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000 [ 1457.388914] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8 [ 1457.389798] FS: 0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000 [ 1457.390796] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1457.391508] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0 [ 1457.392525] Kernel panic - not syncing: Fatal exception [ 1457.394138] Kernel Offset: disabled [ 1457.394677] ---[ end Kernel panic - not syncing: Fatal exception ]--- We fix this Oops by adding a new goto label out_put_req and reordering the blk_put_request call to avoid calling blk_put_request() with rq value is set to NULL. Here we also update the rest of the code accordingly. Fixes: 06b7164dfdc0 ("nvmet: add passthru code to process commands") Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
I am experiencing latency on kernel 5.8.10 and qemu built from master branch of the official repo (5.1.50). After setting GUI_REFRESH_INTERVAL_DEFAULT it's all smooth but I fee a general lag about 100-200ms (maybe due to the optimus graphics). |
Get and put the reference to the ctrl in the nvme_dev_open() and nvme_dev_release() before and after module get/put for ctrl in char device file operations. Introduce char_dev relase function, get/put the controller and module which allows us to fix the potential Oops which can be easily reproduced with a passthru ctrl (although the problem also exists with pure user access): Entering kdb (current=0xffff8887f8290000, pid 3128) on processor 30 Oops: (null) due to oops @ 0xffffffffa01019ad CPU: 30 PID: 3128 Comm: bash Tainted: G W OE 5.8.0-rc4nvme-5.9+ #35 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.4 RIP: 0010:nvme_free_ctrl+0x234/0x285 [nvme_core] Code: 57 10 a0 e8 73 bf 02 e1 ba 3d 11 00 00 48 c7 c6 98 33 10 a0 48 c7 c7 1d 57 10 a0 e8 5b bf 02 e1 8 RSP: 0018:ffffc90001d63de0 EFLAGS: 00010246 RAX: ffffffffa05c0440 RBX: ffff8888119e45a0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8888177e9550 RDI: ffff8888119e43b0 RBP: ffff8887d4768000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: ffffc90001d63c90 R12: ffff8888119e43b0 R13: ffff8888119e5108 R14: dead000000000100 R15: ffff8888119e5108 FS: 00007f1ef27b0740(0000) GS:ffff888817600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa05c0470 CR3: 00000007f6bee000 CR4: 00000000003406e0 Call Trace: device_release+0x27/0x80 kobject_put+0x98/0x170 nvmet_passthru_ctrl_disable+0x4a/0x70 [nvmet] nvmet_passthru_enable_store+0x4c/0x90 [nvmet] configfs_write_file+0xe6/0x150 vfs_write+0xba/0x1e0 ksys_write+0x5f/0xe0 do_syscall_64+0x52/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f1ef1eb2840 Code: Bad RIP value. RSP: 002b:00007fffdbff0eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1ef1eb2840 RDX: 0000000000000002 RSI: 00007f1ef27d2000 RDI: 0000000000000001 RBP: 00007f1ef27d2000 R08: 000000000000000a R09: 00007f1ef27b0740 R10: 0000000000000001 R11: 0000000000000246 R12: 00007f1ef2186400 R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000 With this patch fix we take the module ref count in nvme_dev_open() and release that ref count in newly introduced nvme_dev_release(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Brian Foster reported a lockdep warning on xfs/167: ============================================ WARNING: possible recursive locking detected 5.11.0-rc4 #35 Tainted: G W I -------------------------------------------- fsstress/17733 is trying to acquire lock: ffff8e0fd1d90650 (sb_internal){++++}-{0:0}, at: xfs_free_eofblocks+0x104/0x1d0 [xfs] but task is already holding lock: ffff8e0fd1d90650 (sb_internal){++++}-{0:0}, at: xfs_trans_alloc_inode+0x5f/0x160 [xfs] stack backtrace: CPU: 38 PID: 17733 Comm: fsstress Tainted: G W I 5.11.0-rc4 #35 Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018 Call Trace: dump_stack+0x8b/0xb0 __lock_acquire.cold+0x159/0x2ab lock_acquire+0x116/0x370 xfs_trans_alloc+0x1ad/0x310 [xfs] xfs_free_eofblocks+0x104/0x1d0 [xfs] xfs_blockgc_scan_inode+0x24/0x60 [xfs] xfs_inode_walk_ag+0x202/0x4b0 [xfs] xfs_inode_walk+0x66/0xc0 [xfs] xfs_trans_alloc+0x160/0x310 [xfs] xfs_trans_alloc_inode+0x5f/0x160 [xfs] xfs_alloc_file_space+0x105/0x300 [xfs] xfs_file_fallocate+0x270/0x460 [xfs] vfs_fallocate+0x14d/0x3d0 __x64_sys_fallocate+0x3e/0x70 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The cause of this is the new code that spurs a scan to garbage collect speculative preallocations if we fail to reserve enough blocks while allocating a transaction. While the warning itself is a fairly benign lockdep complaint, it does expose a potential livelock if the rwsem behavior ever changes with regards to nesting read locks when someone's waiting for a write lock. Fix this by freeing the transaction and jumping back to xfs_trans_alloc like this patch in the V4 submission[1]. [1] https://lore.kernel.org/linux-xfs/161142798066.2171939.9311024588681972086.stgit@magnolia/ Fixes: a1a7d05 ("xfs: flush speculative space allocations when we run out of space") Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
The guest cursor moves smoothly when using a pass-through/USB-redirected mouse. It seems like the cursor movement is somehow delayed from host to guest. |
After enabling CONFIG_SCHED_CORE (landed during 5.14 cycle), 2-core 2-thread-per-core interAptiv (CPS-driven) started emitting the following: [ 0.025698] CPU1 revision is: 0001a120 (MIPS interAptiv (multi)) [ 0.048183] ------------[ cut here ]------------ [ 0.048187] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6025 sched_core_cpu_starting+0x198/0x240 [ 0.048220] Modules linked in: [ 0.048233] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc3+ #35 b7b319f24073fd9a3c2aa7ad15fb7993eec0b26f [ 0.048247] Stack : 817f0000 00000004 327804c8 810eb050 00000000 00000004 00000000 c314fdd1 [ 0.048278] 830cbd64 819c0000 81800000 817f0000 83070bf4 00000001 830cbd08 00000000 [ 0.048307] 00000000 00000000 815fcbc4 00000000 00000000 00000000 00000000 00000000 [ 0.048334] 00000000 00000000 00000000 00000000 817f0000 00000000 00000000 817f6f34 [ 0.048361] 817f0000 818a3c00 817f0000 00000004 00000000 00000000 4dc33260 0018c933 [ 0.048389] ... [ 0.048396] Call Trace: [ 0.048399] [<8105a7bc>] show_stack+0x3c/0x140 [ 0.048424] [<8131c2a0>] dump_stack_lvl+0x60/0x80 [ 0.048440] [<8108b5c0>] __warn+0xc0/0xf4 [ 0.048454] [<8108b658>] warn_slowpath_fmt+0x64/0x10c [ 0.048467] [<810bd418>] sched_core_cpu_starting+0x198/0x240 [ 0.048483] [<810c6514>] sched_cpu_starting+0x14/0x80 [ 0.048497] [<8108c0f8>] cpuhp_invoke_callback_range+0x78/0x140 [ 0.048510] [<8108d914>] notify_cpu_starting+0x94/0x140 [ 0.048523] [<8106593c>] start_secondary+0xbc/0x280 [ 0.048539] [ 0.048543] ---[ end trace 0000000000000000 ]--- [ 0.048636] Synchronize counters for CPU 1: done. ...for each but CPU 0/boot. Basic debug printks right before the mentioned line say: [ 0.048170] CPU: 1, smt_mask: So smt_mask, which is sibling mask obviously, is empty when entering the function. This is critical, as sched_core_cpu_starting() calculates core-scheduling parameters only once per CPU start, and it's crucial to have all the parameters filled in at that moment (at least it uses cpu_smt_mask() which in fact is `&cpu_sibling_map[cpu]` on MIPS). A bit of debugging led me to that set_cpu_sibling_map() performing the actual map calculation, was being invocated after notify_cpu_start(), and exactly the latter function starts CPU HP callback round (sched_core_cpu_starting() is basically a CPU HP callback). While the flow is same on ARM64 (maps after the notifier, although before calling set_cpu_online()), x86 started calculating sibling maps earlier than starting the CPU HP callbacks in Linux 4.14 (see [0] for the reference). Neither me nor my brief tests couldn't find any potential caveats in calculating the maps right after performing delay calibration, but the WARN splat is now gone. The very same debug prints now yield exactly what I expected from them: [ 0.048433] CPU: 1, smt_mask: 0-1 [0] https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=76ce7cfe35ef Signed-off-by: Alexander Lobakin <alobakin@pm.me> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Hello,
I have dma-buf working with GVT-g on kernel 4.16 and QEMU 2.12. The framerate of the framebuffer output is very low however (seems to be around ~20 fps). I've run the Heaven benchmark in the VM, and that seems to perform about as expected, so it's only the dma-buf output that's slow. Does anyone else experience this?
I'm using Libvirt with local SPICE and OpenGL. Also tried different Intel drivers in the VM, no difference.
The text was updated successfully, but these errors were encountered: