-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vc4: Investigate scanning tile lists to skip loads/stores of unmodified tiles. #71
Comments
Could it work by setting the cs as the vs (no cs in this case), using a trivial rgba(1,1,1,1) fs and rendering to a downscaled render target, so that a pixel in the target corresponds to a tile in the "real" buffer. then reading the result back and only use the tiles, which are set in the buffer. Normally we would need only one tile, in extreme cases (HDR rendering or 4k displays, you would N´need some more). A problem could be, that the rasterizer might not draw the right pixels (see https://msdn.microsoft.com/en-us/library/windows/desktop/bb147314(v=vs.85).aspx for a example of the rasterization in DX, which should be the same as OGL). If the rasterize is conforming, then we would need to pad the resulting bitmap to the right bottom about one line, which would be not optimal. |
I think the problem there would be a frame that draws (say) a single pixel is unlikely to light up a pixel center of the downscaled image. |
commit 1c7de2b upstream. There is at least one Chelsio 10Gb card which uses VPD area to store some non-standard blocks (example below). However pci_vpd_size() returns the length of the first block only assuming that there can be only one VPD "End Tag". Since 4e1a635 ("vfio/pci: Use kernel VPD access functions"), VFIO blocks access beyond that offset, which prevents the guest "cxgb3" driver from probing the device. The host system does not have this problem as its driver accesses the config space directly without pci_read_vpd(). Add a quirk to override the VPD size to a bigger value. The maximum size is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h. We do not read the tag as the cxgb3 driver does as the driver supports writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes boundary. The quirk is registered for all devices supported by the cxgb3 driver. This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3 driver itself accesses VPD directly and the problem only exists with the vfio-pci driver (when cxgb3 is not running on the host and may not be even loaded) which blocks accesses beyond the first block of VPD data. However vfio-pci itself does not have quirks mechanism so we add it to PCI. This is the controller: Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030] This is what I parsed from its VPD: === b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K' 0000 Large item 42 bytes; name 0x2 Identifier String b'10 Gigabit Ethernet-SR PCI Express Adapter' 002d Large item 74 bytes; name 0x10 #00 [EC] len=7: b'D76809 ' #0a [FN] len=7: b'46K7897' #14 [PN] len=7: b'46K7897' #1e [MN] len=4: b'1037' #25 [FC] len=4: b'5769' #2c [SN] len=12: b'YL102035603V' #3b [NA] len=12: b'00145E992ED1' 007a Small item 1 bytes; name 0xf End Tag 0c00 Large item 16 bytes; name 0x2 Identifier String b'S310E-SR-X ' 0c13 Large item 234 bytes; name 0x10 #00 [PN] len=16: b'TBD ' #13 [EC] len=16: b'110107730D2 ' #26 [SN] len=16: b'97YL102035603V ' #39 [NA] len=12: b'00145E992ED1' #48 [V0] len=6: b'175000' #51 [V1] len=6: b'266666' #5a [V2] len=6: b'266666' #63 [V3] len=6: b'2000 ' #6c [V4] len=2: b'1 ' #71 [V5] len=6: b'c2 ' #7a [V6] len=6: b'0 ' #83 [V7] len=2: b'1 ' #88 [V8] len=2: b'0 ' #8d [V9] len=2: b'0 ' #92 [VA] len=2: b'0 ' #97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'... 0d00 Large item 252 bytes; name 0x11 #00 [VC] len=16: b'122310_1222 dp ' #13 [VD] len=16: b'610-0001-00 H1\x00\x00' #26 [VE] len=16: b'122310_1353 fp ' #39 [VF] len=16: b'610-0001-00 H1\x00\x00' #4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'... 0dff Small item 0 bytes; name 0xf End Tag 10f3 Large item 13315 bytes; name 0x62 !!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00' === Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Running ./test_verifier as unprivileged lets 1 out of 98 tests fail: [...] #71 unpriv: check that printk is disallowed FAIL Unexpected error message! 0: (7a) *(u64 *)(r10 -8) = 0 1: (bf) r1 = r10 2: (07) r1 += -8 3: (b7) r2 = 8 4: (bf) r3 = r1 5: (85) call bpf_trace_printk#6 unknown func bpf_trace_printk#6 [...] The test case is correct, just that the error outcome changed with ebb676d ("bpf: Print function name in addition to function id"). Same as with e00c7b2 ("bpf: fix multiple issues in selftest suite and samples") issue 2), so just fix up the function name. Fixes: ebb676d ("bpf: Print function name in addition to function id") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following false warning among others for LLIST_HEAD and PLIST_HEAD: WARNING: Missing a blank line after declarations #71: FILE: drivers/s390/scsi/zfcp_fsf.c:422: + struct hlist_node *tmp; + HLIST_HEAD(remove_queue); Link: http://lkml.kernel.org/r/20170614133512.89425-1-maier@linux.vnet.ibm.com Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Could we run a user shader in between binning completing and kicking off rendering, to scan for which tiles had no primitives binned to them and patch up the RCL to skip those? This could massively improve X and compositor performance. Thinking about this because I'm submitting yet another bug report for "Hey, could you scissor your rendering, please?" Would it be worth it even if we were scanning on the CPU?
The text was updated successfully, but these errors were encountered: