movabs incorrect? #108

Open
mlgiraud opened this issue Nov 26, 2024 · 4 comments

Comments

@mlgiraud

mlgiraud commented Nov 26, 2024

Hi, I'm currently writing a JIT compiler and need to load a 64-bit immediate value into a register. As far as I know, the only way to do this in one instruction is the mov encoding movabs rax, imm64. However, this currently emits code that corresponds to something like movabs rax, [imm64], i.e. it tries to load from the immediate address. These encodings exist, but IMHO they should be generated when writing movabs rax, [imm64].
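
To make the difference concrete, here is a minimal sketch of the two machine-code encodings being discussed, written as plain Rust byte builders. The function names are hypothetical and this is not how dynasm-rs emits code internally; it only shows the byte layout as I understand it from the x86-64 instruction reference.

```rust
/// `mov rax, imm64` (REX.W prefix, opcode B8 + register 0):
/// places the 64-bit constant itself into rax.
fn encode_mov_rax_imm64(imm: u64) -> Vec<u8> {
    let mut code = vec![0x48, 0xB8];
    code.extend_from_slice(&imm.to_le_bytes()); // immediate, little-endian
    code
}

/// `mov rax, [moffs64]` (REX.W prefix, opcode A1):
/// loads 8 bytes from the absolute 64-bit address into rax.
fn encode_mov_rax_load_abs64(addr: u64) -> Vec<u8> {
    let mut code = vec![0x48, 0xA1];
    code.extend_from_slice(&addr.to_le_bytes()); // address, little-endian
    code
}
```

Both forms are ten bytes long and differ only in the opcode byte, which is why the same 64-bit operand can end up meaning either "this value" or "the value at this address".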

The language spec in the documentation says:

movabs al, imm64
movabs ax, imm64
movabs eax, imm64
movabs imm64, al
movabs imm64, ax
movabs imm64, eax
movabs imm64, rax
movabs rax, imm64

Which should probably be

movabs al, [imm64]
movabs ax, [imm64]
movabs eax, [imm64]
movabs [imm64], al
movabs [imm64], ax
movabs [imm64], eax
movabs [imm64], rax
movabs rax, [imm64]

movabs reg64, imm64 <---- This is missing

The spec in the documentation says that mov reg64, imm64 can be used, but this results in an error (or in imm64 being truncated to a 32-bit value if you leave the type up to the compiler with as _).
I think this should be moved to movabs reg64, imm64.
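
The truncation mentioned above can be illustrated with ordinary Rust casts. This is only a sketch under the assumption that the value ends up in a sign-extended 32-bit immediate; the helper name is hypothetical and not part of dynasm-rs.

```rust
/// Returns true if `value` survives a round trip through a sign-extended
/// 32-bit immediate, i.e. if the short immediate form would not lose bits.
fn fits_in_sign_extended_imm32(value: u64) -> bool {
    let truncated = value as i32; // keeps only the low 32 bits
    truncated as i64 == value as i64 // sign-extend back and compare
}

/// What a plain `as i32` cast does to a 64-bit value: the upper half is dropped.
fn truncate_to_imm32(value: u64) -> i32 {
    value as i32
}
```

For example, `truncate_to_imm32(0x1_0000_0001)` yields `1`, so any pointer-sized constant above the 32-bit range needs the full 64-bit immediate form.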

EDIT: So I misinterpreted the spec: it actually is possible to encode movabs reg64, imm64 by writing mov rax, QWORD immediate as _, but I think the syntax should be corrected to align with other assembly tools.

@CensoredUsername
Owner

The correct way to do that is indeed to do mov reg, QWORD imm.

movabs isn't actually an opcode in x64. For some reason AT&T style uses it, but Intel/NASM don't. The issue is that Intel-style assembly has no explicit way to denote a 64-bit displacement in a memory reference, so different assemblers have settled on different notations for it. Dynamic assemblers like dynasm-rs or LuaJIT cannot just look at the value, so they use alternative assembler mnemonics. LuaJIT uses mov64, I picked movabs (because the operation is a move to/from an absolute address). This is explicitly stated in the documentation.

Yes, this does cause some confusion with AT&T-style movabs, but considering that's a whole other dialect from the Intel/NASM style we use here, I don't find that the biggest problem.
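
For reference, the eight "move to/from absolute address" forms listed in the first comment map onto four opcodes plus prefixes. The sketch below encodes them by hand in Rust; the `Acc` enum and `encode_moffs` function are hypothetical names for illustration, and the byte values are taken from my reading of the x86-64 instruction reference rather than from dynasm-rs itself.

```rust
/// The accumulator register used by the absolute-address mov forms.
#[derive(Clone, Copy)]
enum Acc {
    Al,
    Ax,
    Eax,
    Rax,
}

/// Encode a load (`store == false`) or store (`store == true`) between the
/// accumulator and an absolute 64-bit address, as used in 64-bit mode.
fn encode_moffs(acc: Acc, store: bool, addr: u64) -> Vec<u8> {
    let mut code: Vec<u8> = Vec::new();
    match acc {
        Acc::Ax => code.push(0x66),  // operand-size prefix selects 16 bits
        Acc::Rax => code.push(0x48), // REX.W selects 64 bits
        Acc::Al | Acc::Eax => {}     // no prefix needed
    }
    // A0/A1 are loads, A2/A3 are stores; the low bit selects byte vs. wider.
    let base: u8 = if store { 0xA2 } else { 0xA0 };
    let wide: u8 = if matches!(acc, Acc::Al) { 0 } else { 1 };
    code.push(base + wide);
    code.extend_from_slice(&addr.to_le_bytes()); // 8-byte absolute address
    code
}
```

Note that none of these forms can target a register other than the accumulator, which is part of why they are treated as a special case.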

@mlgiraud
Author

Yeah, I see the reasoning and agree that it is probably nicer to use the mov reg, QWORD imm syntax. However, I would still suggest changing the movabs syntax to take the address via [imm64]. This was confusing for me, since most (all?) instructions use the [] syntax when an address is involved, right? Also, other assemblers/disassemblers use the same syntax with the []. IMHO the current syntax will inevitably lead to confusion.

@CensoredUsername
Owner

I understand that this would be clearer. Unfortunately with how the x64 backend works right now this would entail a pretty big rewrite of the parser/compiler. It is built towards memory references only being of the regular kind, which have a specific instruction structure, while 64-bit displacement mov is actually encoded as a single register + 64-bit immediate operand.
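
The structural difference described above can be sketched by comparing byte layouts. A regular memory reference goes through ModRM (and here SIB) bytes and carries at most a 32-bit displacement, while the 64-bit displacement form is just an opcode followed by eight address bytes. The function names below are hypothetical and the code is a hand-written illustration, not the dynasm-rs backend.

```rust
/// `mov rax, [disp32]` as a regular memory reference:
/// REX.W, opcode 8B, ModRM, SIB, then a 32-bit displacement.
fn encode_mov_rax_modrm_abs32(disp: i32) -> Vec<u8> {
    let modrm: u8 = 0b00_000_100; // mod=00, reg=rax, rm=100 (SIB byte follows)
    let sib: u8 = 0b00_100_101;   // no index, no base: displacement only
    let mut code = vec![0x48, 0x8B, modrm, sib];
    code.extend_from_slice(&disp.to_le_bytes());
    code
}

/// `mov rax, [moffs64]`: REX.W, opcode A1, then the 64-bit address.
/// There is no ModRM byte at all, so it looks like "opcode + immediate".
fn encode_mov_rax_moffs64(addr: u64) -> Vec<u8> {
    let mut code = vec![0x48, 0xA1];
    code.extend_from_slice(&addr.to_le_bytes());
    code
}
```

Seen this way, a parser built around ModRM-style memory operands has no natural slot for the second form, which matches the explanation given for why `[imm64]` syntax would need a larger rewrite.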

@mlgiraud
Author

Yeah, I suspected as much. Maybe we can add a small disclaimer to the x86 documentation in the movabs part?
