inconsistent use of immediate operand placeholders #17
Comments
Can you elaborate? I don't see the issue personally. In the first example:
The first In the second example:
It describes an instruction using a 16-bit register or memory operand, followed by an 8-bit immediate, which is always signed and sign-extends to 16 bits. Then in the opcode
In general, instruction manuals don't really care about the signedness or unsignedness of immediates, but AsmJit, which uses asmdb, does distinguish between signed and unsigned.
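To make the sign-extension point concrete, here is a minimal Python sketch of how an 8-bit immediate widens to 16 bits. The helper name `sign_extend` is hypothetical and not part of asmdb or AsmJit; it only illustrates the arithmetic.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low `from_bits` of `value` and return the
    result as an unsigned `to_bits`-wide bit pattern."""
    value &= (1 << from_bits) - 1           # keep only the source bits
    sign_bit = 1 << (from_bits - 1)
    signed = (value ^ sign_bit) - sign_bit  # reinterpret as two's complement
    return signed & ((1 << to_bits) - 1)    # wrap to the target width


# An ib of 0xFF (-1) becomes 0xFFFF as a 16-bit operand,
# while 0x7F (+127) stays 0x007F.
print(hex(sign_extend(0xFF, 8, 16)))
print(hex(sign_extend(0x7F, 8, 16)))
```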
BTW: thanks for this really awesome project. As you probably figured, I am processing the table programmatically for my own instruction encoder/decoder. Having said that, saying that this is just a transcription of the manuals is totally fine. I would love to base my encoder completely on your table without having to consult additional documentation. ;-)
Although I designed the tables, I still found it pretty difficult to programmatically generate an assembler or disassembler out of the table. The problem is categorizing instructions into groups that you can use to implement parts of the decoder / encoder. Not saying it's impossible, but it's difficult to group stuff - so maybe you will end up generating each instruction separately, which is wasteful :)
For the case of encoding - you can actually describe immediate value as
For the case of decoding, signed would be preferred when working with GP instructions, and unsigned when working with SIMD - for example a predicate in PSHUFD would be decoded as unsigned,
(not sure I answered all the questions)
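As a rough illustration of the decoding side, the sketch below reads the same immediate byte either as signed (as suggested for GP instructions) or as unsigned (as suggested for a SIMD predicate such as the one in PSHUFD). The function `decode_imm8` is a hypothetical helper, not something taken from asmdb or AsmJit.

```python
def decode_imm8(raw: bytes, signed: bool) -> int:
    """Interpret the first byte of `raw` as an 8-bit immediate."""
    return int.from_bytes(raw[:1], "little", signed=signed)


imm = bytes([0xE4])

# GP-style interpretation: the byte is a signed displacement/constant.
print(decode_imm8(imm, signed=True))   # -28

# SIMD-style interpretation: the byte is a bit-field predicate.
print(decode_imm8(imm, signed=False))  # 228
```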
I am quite confident that at least the decoder can be done entirely table-driven. I only care about the rather simple instructions that a typical compiler would generate, so segment stuff is not important. I am also rather new to x86 encodings, so I am not 100% sure that knowing signedness is as important as I think it is.
mov.q reg, -1
the immediate is only one byte and will be extended to a quad word by the CPU. Obviously, the value stored in the register
I have completed the work on my decoder based on your tables and have confirmed that my output matches (The exact list of opcodes considered is here: https://github.com/robertmuth/Cwerg/blob/master/CpuX64/opcode_tab.py) I am super pleased with asmdb and will focus on an encoder next.
For example: since ib gets sign-extended to id before adding
More importantly: Should change the
Example:
"add" , "x:al, ib/ub" , "I" , "04 ib"
The placeholders do not match: ib/ub vs. ib.
On the other hand,
"add" , "x:r16/m16, ib" , "MI" , "66 83 /0 ib"
uses ib consistently.
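The mismatch can also be detected mechanically. Below is a minimal Python sketch that compares the immediate placeholders in the operand field against those in the opcode field for the two rows quoted above. It assumes placeholders follow the pattern `i`/`u` plus a size letter (`b`, `w`, `d`, `q`) and that a prefix such as `x:` only marks operand access; the function names are hypothetical and this is not how asmdb itself parses the table.

```python
import re

# Assumed placeholder shape: "ib", "ub", "iw", "id", "uq", ...
IMM = re.compile(r"^[iu][bwdq]$")


def operand_imms(operands: str) -> set:
    """Collect immediate placeholders from an operand string."""
    found = set()
    for op in operands.split(","):
        op = op.strip().split(":")[-1]  # drop an access prefix such as "x:"
        found.update(alt for alt in op.split("/") if IMM.match(alt))
    return found


def opcode_imms(opcode: str) -> set:
    """Collect immediate placeholders from an opcode string."""
    return {tok for tok in opcode.split() if IMM.match(tok)}


def placeholders_match(row) -> bool:
    _name, operands, _encoding, opcode = row
    return operand_imms(operands) == opcode_imms(opcode)


rows = [
    ("add", "x:al, ib/ub", "I", "04 ib"),
    ("add", "x:r16/m16, ib", "MI", "66 83 /0 ib"),
]

for row in rows:
    print(row[1], "->", "consistent" if placeholders_match(row) else "mismatch")
```

With these assumptions the first row is reported as a mismatch and the second as consistent, which mirrors the observation in this issue.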