Some constants not included in output #315

Closed
burgerindividual opened this issue Oct 1, 2024 · 11 comments · Fixed by #323, #327 or #328

Comments

@burgerindividual

When writing SIMD code, I've noticed that some constants don't get shown, even with --include-constants.

Example:

#[no_mangle]
pub extern "C" fn test() {
    let simd_reg = unsafe {
        std::arch::x86_64::_mm_set_epi8(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
    };
    std::hint::black_box(simd_reg);
}

Command: cargo asm --include-constants test
Output:

.section .text.test,"ax",@progbits
        .globl  test
        .p2align        4, 0x90
        .type   test,@function
test:
        .cfi_startproc
        vmovaps xmm0, xmmword ptr [rip + .LCPI21_0]
        vmovaps xmmword ptr [rsp - 24], xmm0
        lea rax, [rsp - 24]
        #APP
        #NO_APP
        ret

In this example, .LCPI21_0 should be shown.

@pacak
Owner

pacak commented Oct 1, 2024

Yeah, looks like they get parsed as a "generic directive" and are simplified away by the pretty printer. Should be fixable.

@burgerindividual
Author

I'm not sure that this is entirely fixed. I have a function that has 7 constants used, but only seems to show 3 of them. .LCPI21_1, .LCPI21_4, .LCPI21_5, and .LCPI21_6 are missing. I'll try to come up with a way to reproduce this.

.section .text.test_pack,"ax",@progbits
	.globl	test_pack
	.p2align	4, 0x90
.type	test_pack,@function
test_pack:
	.cfi_startproc
	vmovd xmm2, edi
	vpshufb xmm3, xmm2, xmmword ptr [rip + .LCPI21_0]
	vpand xmm0, xmm0, xmmword ptr [rip + .LCPI21_1]
	vmovdqa xmm4, xmmword ptr [rip + .LCPI21_2]
	vinserti128 ymm0, ymm4, xmm0, 1
	vinserti128 ymm2, ymm2, xmm3, 1
	vpshufb ymm0, ymm2, ymm0
	vmovdqa xmm2, xmmword ptr [rip + .LCPI21_3]
	vinserti128 ymm1, ymm2, xmm1, 1
	vpsllw ymm2, ymm0, 4
	vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_4]
	vpsllw ymm1, ymm1, 5
	vpblendvb ymm0, ymm0, ymm2, ymm1
	vpsllw ymm2, ymm0, 2
	vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_5]
	vpand ymm1, ymm1, ymmword ptr [rip + .LCPI21_6]
	vpaddb ymm1, ymm1, ymm1
	vpblendvb ymm0, ymm0, ymm2, ymm1
	vpaddb ymm2, ymm0, ymm0
	vpaddb ymm1, ymm1, ymm1
	vpblendvb ymm0, ymm0, ymm2, ymm1
	vpmovmskb eax, ymm0
	vzeroupper
	ret

======================= Additional context =========================

.LCPI21_0:
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2

.LCPI21_2:
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0

.LCPI21_3:
	.byte	128
	.byte	128
	.byte	128
	.byte	64
	.byte	64
	.byte	64
	.byte	32
	.byte	32
	.byte	32
	.byte	16
	.byte	16
	.byte	16
	.byte	8
	.byte	8
	.byte	8

@burgerindividual
Author

burgerindividual commented Oct 14, 2024

For reference, this is what rustc generates with --emit asm (every constant that gets filtered out is a .zero directive):

	.section	.rodata.cst16,"aM",@progbits,16
	.p2align	4, 0x0
.LCPI21_0:
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2
	.byte	128
.LCPI21_1:
	.zero	16,3
.LCPI21_2:
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
.LCPI21_3:
	.byte	128
	.byte	128
	.byte	128
	.byte	64
	.byte	64
	.byte	64
	.byte	32
	.byte	32
	.byte	32
	.byte	16
	.byte	16
	.byte	16
	.byte	8
	.byte	8
	.byte	8
	.byte	4
	.section	.rodata.cst32,"aM",@progbits,32
	.p2align	5, 0x0
.LCPI21_4:
	.zero	32,240
.LCPI21_5:
	.zero	32,252
.LCPI21_6:
	.zero	32,224
	.section	.text.test_pack,"ax",@progbits
	.globl	test_pack
	.p2align	4, 0x90
	.type	test_pack,@function
test_pack:
	vmovd	xmm2, edi
	vpshufb	xmm3, xmm2, xmmword ptr [rip + .LCPI21_0]
	vpand	xmm0, xmm0, xmmword ptr [rip + .LCPI21_1]
	vmovdqa	xmm4, xmmword ptr [rip + .LCPI21_2]
	vinserti128	ymm0, ymm4, xmm0, 1
	vinserti128	ymm2, ymm2, xmm3, 1
	vpshufb	ymm0, ymm2, ymm0
	vmovdqa	xmm2, xmmword ptr [rip + .LCPI21_3]
	vinserti128	ymm1, ymm2, xmm1, 1
	vpsllw	ymm2, ymm0, 4
	vpand	ymm2, ymm2, ymmword ptr [rip + .LCPI21_4]
	vpsllw	ymm1, ymm1, 5
	vpblendvb	ymm0, ymm0, ymm2, ymm1
	vpsllw	ymm2, ymm0, 2
	vpand	ymm2, ymm2, ymmword ptr [rip + .LCPI21_5]
	vpand	ymm1, ymm1, ymmword ptr [rip + .LCPI21_6]
	vpaddb	ymm1, ymm1, ymm1
	vpblendvb	ymm0, ymm0, ymm2, ymm1
	vpaddb	ymm2, ymm0, ymm0
	vpaddb	ymm1, ymm1, ymm1
	vpblendvb	ymm0, ymm0, ymm2, ymm1
	vpmovmskb	eax, ymm0
	vzeroupper
	ret
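To make the missing constants concrete: as far as I understand the directive as LLVM emits it, `.zero N,V` stands for N bytes that all hold the value V, so `.zero 16,3` is the compact form of sixteen `.byte 3` lines. Below is a minimal Rust sketch of that expansion. The function name `expand_zero` is hypothetical and is not taken from cargo-show-asm or any other tool.

```rust
/// Expand a `.zero` directive into the bytes it stands for.
///
/// Handles `.zero N` (N zero bytes) and `.zero N,V` (N bytes of value V).
/// Returns `None` for any other line or when a number fails to parse.
/// Hypothetical helper for illustration only.
fn expand_zero(line: &str) -> Option<Vec<u8>> {
    let args = line.trim().strip_prefix(".zero")?;
    // Require whitespace after the name so that e.g. `.zerofill` does not match.
    if !args.starts_with(char::is_whitespace) {
        return None;
    }
    let mut parts = args.split(',').map(str::trim);
    let count: usize = parts.next()?.parse().ok()?;
    // The fill value is optional and defaults to 0.
    let fill: u8 = match parts.next() {
        Some(value) => value.parse().ok()?,
        None => 0,
    };
    // Reject trailing arguments rather than guess their meaning.
    if parts.next().is_some() {
        return None;
    }
    Some(vec![fill; count])
}
```

Under this reading, `.LCPI21_1` above is sixteen bytes of 3 and `.LCPI21_4` is thirty-two bytes of 240.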

@pacak
Owner

pacak commented Oct 14, 2024

I'm not sure that this is entirely fixed. I have a function that has 7 constants used, but only seems to show 3 of them.

Is it using the latest git version? It's not published on crates.io yet; I'm still looking at fixing some Windows/macOS regressions.

@burgerindividual
Author

This is using commit 34f22d8, which currently seems to be the latest.

@pacak
Owner

pacak commented Oct 14, 2024

I see. I appreciate the second report, will try to fix that a bit better :)

@pacak pacak reopened this Oct 14, 2024
@burgerindividual
Author

If it helps, this seems to be the regex for Compiler Explorer's detection for data directives

@pacak
Owner

pacak commented Oct 14, 2024

Yup, .zero is missing. I wonder if the license allows me to steal the whole regexp...
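For illustration, here is a minimal Rust sketch of a data-directive check that includes `.zero`. It is not the Compiler Explorer regex and not cargo-show-asm's actual parser: the name `is_data_directive` and the directive list are assumptions chosen for this example, and the list is deliberately incomplete.

```rust
/// Hypothetical, incomplete list of directives that emit constant data.
/// `.zero` is the entry whose absence would hide constants like `.zero 16,3`.
const DATA_DIRECTIVES: &[&str] = &[
    ".byte", ".short", ".word", ".long", ".quad", ".ascii", ".asciz", ".zero",
];

/// Return true if the first token on the line is a known data directive.
fn is_data_directive(line: &str) -> bool {
    // The first whitespace-separated token is the directive name, if any.
    let name = line.split_whitespace().next().unwrap_or("");
    DATA_DIRECTIVES.contains(&name)
}
```

With `.zero` in the list, a line such as `.zero 32,240` is treated as data and kept, while `.p2align 4, 0x0` and label lines are not.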

@burgerindividual
Author

I wonder if license allows me to steal the whole regexp...

It's BSD-2 so I think you need to include the license and copyright. Not a lawyer, though, so not totally sure.

@burgerindividual
Author

burgerindividual commented Oct 14, 2024

I just tested the latest commit, and it seems to still have a small issue. The directive seems to get recognized, but the actual .zero statement doesn't seem to be included in the output. Actually, all of the constants seem to have one line cut off at the end of each one. Not sure if you want me to open a new issue for it, just lmk.

.section .text.test_pack,"ax",@progbits
	.globl	test_pack
	.p2align	4, 0x90
.type	test_pack,@function
test_pack:
	.cfi_startproc
	vmovd xmm2, edi
	vpshufb xmm3, xmm2, xmmword ptr [rip + .LCPI21_0]
	vpand xmm0, xmm0, xmmword ptr [rip + .LCPI21_1]
	vmovdqa xmm4, xmmword ptr [rip + .LCPI21_2]
	vinserti128 ymm0, ymm4, xmm0, 1
	vinserti128 ymm2, ymm2, xmm3, 1
	vpshufb ymm0, ymm2, ymm0
	vmovdqa xmm2, xmmword ptr [rip + .LCPI21_3]
	vinserti128 ymm1, ymm2, xmm1, 1
	vpsllw ymm2, ymm0, 4
	vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_4]
	vpsllw ymm1, ymm1, 5
	vpblendvb ymm0, ymm0, ymm2, ymm1
	vpsllw ymm2, ymm0, 2
	vpand ymm2, ymm2, ymmword ptr [rip + .LCPI21_5]
	vpand ymm1, ymm1, ymmword ptr [rip + .LCPI21_6]
	vpaddb ymm1, ymm1, ymm1
	vpblendvb ymm0, ymm0, ymm2, ymm1
	vpaddb ymm2, ymm0, ymm0
	vpaddb ymm1, ymm1, ymm1
	vpblendvb ymm0, ymm0, ymm2, ymm1
	vpmovmskb eax, ymm0
	vzeroupper
	ret

======================= Additional context =========================

.LCPI21_0:
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2
	.byte	128
	.byte	0
	.byte	1
	.byte	2

.LCPI21_1:

.LCPI21_2:
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0
	.byte	2
	.byte	1
	.byte	0

.LCPI21_3:
	.byte	128
	.byte	128
	.byte	128
	.byte	64
	.byte	64
	.byte	64
	.byte	32
	.byte	32
	.byte	32
	.byte	16
	.byte	16
	.byte	16
	.byte	8
	.byte	8
	.byte	8

.LCPI21_4:

.LCPI21_5:

.LCPI21_6:

@pacak
Owner

pacak commented Oct 14, 2024

I just tested the latest commit, and it seems to still have a small issue

Hmm... Off by one somewhere it seems. Checking.
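The symptom reported above (every multi-line constant loses its last line, and every single-line `.zero` constant comes out empty) is what an end index that is one too small would produce. The Rust sketch below shows one way to slice a constant's body with the bounds spelled out. It is only a guess at the shape of the problem, not the actual cargo-show-asm code, and `constant_body` is a hypothetical name.

```rust
/// Return the lines that belong to `label`: everything after the `label:`
/// line up to, but not including, the next line that ends with ':'.
/// Hypothetical helper for illustration only.
fn constant_body<'a>(lines: &[&'a str], label: &str) -> Vec<&'a str> {
    let header = format!("{label}:");
    let Some(start) = lines.iter().position(|l| l.trim() == header) else {
        return Vec::new();
    };
    let body = &lines[start + 1..];
    // `end` is the index of the next label inside `body`, or the end of input.
    let end = body
        .iter()
        .position(|l| l.trim_end().ends_with(':'))
        .unwrap_or(body.len());
    // `..end` is already exclusive, so the next label is not included.
    // Slicing with `..end - 1` instead would drop the last data line of each
    // constant and leave a one-line constant such as `.zero 16,3` empty.
    body[..end].to_vec()
}
```

A 16-line `.byte` constant sliced with the too-small bound would show 15 lines, which matches the listings above.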
