Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First pass at ARM64 Dissassembler #207

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

joyent-automation
Copy link

First pass at ARM64 Dissassembler

Dissassemble more load store, decode sysregs

Fix tests from failing

This PR was migrated-from-gerrit, https://cr.joyent.us/#/c/4242/.
The raw archive of this CR is here.
See MANTA-4594 for info on Joyent Eng's migration from Gerrit.

CR discussion

@rmustacc commented at 2018-06-19T00:06:10

Patch Set 1:

(76 comments)

Patch Set 1 code comments
usr/src/lib/libdisasm/Makefile.com#63 @rmustacc

This should all line up with the lines above it.

usr/src/lib/libdisasm/Makefile.com#63 @TheSafo

Done

usr/src/lib/libdisasm/Makefile.com#73 @rmustacc

These lines need to line up with the ones above it.

usr/src/lib/libdisasm/Makefile.com#73 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#12 @rmustacc

Why is this '' character here?

usr/src/lib/libdisasm/common/dis_arm64.c#12 @rmustacc

No need for arguments in the forward declaration.

Is there a reason this function isn't in your header file?

usr/src/lib/libdisasm/common/dis_arm64.c#12 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#12 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#38 @rmustacc

The intention of the construct is that you can just return the value of dis_seterrno which whill be -1.

usr/src/lib/libdisasm/common/dis_arm64.c#38 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#51 @rmustacc

This should almost certainly be static. You should still prefix the name. Actually, I may consider just making it part of dis_arm64_disassemble.

usr/src/lib/libdisasm/common/dis_arm64.c#51 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#55 @rmustacc

This doesn't make any sense.

usr/src/lib/libdisasm/common/dis_arm64.c#55 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#56 @rmustacc

Should be joined with previous line.

usr/src/lib/libdisasm/common/dis_arm64.c#56 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#59 @rmustacc

It's probably worth a comment explaining why this is being done.

usr/src/lib/libdisasm/common/dis_arm64.c#59 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#66 @rmustacc

Other platforms assign dh_addr to addr here.

usr/src/lib/libdisasm/common/dis_arm64.c#66 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#69 @rmustacc

This is the only place the cur_instr is used. I don't think that you need to deal with allocating the private structure and moving things into it. Just declare the value on the stack as it isn't used elsewhere. Then you can also get rid of the dis_arm64_handle_attach and dis_arm64_handle_detach entry points.

usr/src/lib/libdisasm/common/dis_arm64.c#69 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64.c#73 @rmustacc

Is there any reason you don't use dh_addr?

usr/src/lib/libdisasm/common/dis_arm64.c#73 @TheSafo

The arm64ins_t doesn't have access to the dhp, so I used this field.

usr/src/lib/libdisasm/common/dis_arm64.c#75 @rmustacc

Please declare the value at the top with everything else.

usr/src/lib/libdisasm/common/dis_arm64.c#75 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#0 @rmustacc

In general, you should make sure functions that aren't needed outside this file are static, which is most of them. Also you have wildly different prefixes for functions. I'd probably make sure it's all prefixed with arm64 or aarch64.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#0 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#15 @rmustacc

This might be clearer just calling it instr_extract_bits and in the others, especially as you spell it out later on in dis_extract_reg(), etc.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#15 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#21 @rmustacc

Please declare variables at the top of the function. This is true of everything, but I'm only going to call it out here.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#21 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#32 @rmustacc

Type warning, hibit and lowbit are unsigned and can therefore overflow an int from this arithmetic.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#32 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#33 @rmustacc

Missing spaces.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#33 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#49 @rmustacc

This argument is always used as a boolean. I'd suggest you instead use a boolean_t and B_TRUE and B_FALSE for values.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#49 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#53 @rmustacc

Probably worth asserting that we're not going to overflow the array before we write to it.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#53 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#63 @rmustacc

Probably worth asserting that we're not going to overflow the array before we write to it.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#63 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#65 @rmustacc

This isn't a macro, right? Why is this here?

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#65 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#72 @rmustacc

This is in libc, see fls(3C).

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#72 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#74 @rmustacc

May prefer i > 0 here in case someone ever manipulates i down below.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#74 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#86 @rmustacc

Probably should assert someone doesn't give you a value that'll overflow.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#86 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#90 @rmustacc

It's worth a comment to explain what this is trying to do for the rotation.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#90 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#98 @rmustacc

Is the pre-increment important here, otherwise just prefer the post-increment.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#98 @rmustacc

Shift is a uint8_t, use the same type please.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#98 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#98 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#109 @rmustacc

Technically yes, 64 / len will always fit inside of an int, is there a reason that the type values are shifting around here?

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#109 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#120 @rmustacc

Can an immediate value never be negative?

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#120 @TheSafo

This immediate value is solely for logical operations, so its not really signed/unisgned, just bits. Comments have been updated on this function to make that clear.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#139 @rmustacc

This really deserves a comment explaining what it's trying to do and where this algorithm is based on. If it's coming from the ARM manual, reference that.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#139 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#153 @rmustacc

Probably want the assertion here on usage.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#153 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#154 @rmustacc

Here you're casting the uint64_t to an int64_t. Are all the other users of this going to know that it should be an unsigned value? Or if it should be a signed value, should dis_decode_bitmasked_imm return a signed value?

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#154 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#185 @rmustacc

I'm going to stop mentioning the overflow check in subsequent occurences, but we should do it here as well.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#185 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#206 @rmustacc

Another set of names we should probably make lowercase.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#206 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#253 @rmustacc

Explicit comparison.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#253 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#594 @rmustacc

Explicit comparisons, please.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#594 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#810 @rmustacc

It may be worth using the c99 element declaration style. The tradeoff is that you'll need to list the number of entries in the array, bu the rest will be automatically NULL. See lib/libdisasm/common/dis_s390x.c+2423 for an example.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#810 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#1161 @rmustacc

Please use constants to indicate the group names.

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#1161 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#1191 @rmustacc

Any reason not to just let these fall into the default case?

usr/src/lib/libdisasm/common/dis_arm64_decoder.c#1191 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#2 @rmustacc

Needs to be fixed, see prototypes.h.

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#2 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#17 @rmustacc

Missing prefixes?

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#17 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#48 @rmustacc

Where does four come from? How did you determine the number. This should be a constant (macro) with text that explains where the number comes from.

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#48 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#52 @rmustacc

There's no need to place the name of the argument in the definition here.

usr/src/lib/libdisasm/common/dis_arm64_decoder.h#52 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#0 @rmustacc

We should probably lower case all the condition codes, register names, and extensions to be consistent with what we do on other platforms.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#0 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#14 @rmustacc

Should we consider using the ABI naming for these? As in the values that are used in the PCS? For example, I think it will be much clearer when we're working through this when we're dealing with the lr, ip, etc.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#14 @TheSafo

I think this would be viable. Arm seems to suggest in the procedure call manual:

"Note The special register names (IP0, IP1, FP and LR) should be used only in the context in which they are special. It is recommended that disassemblers always use the architectural names for the registers."

However, I don't exactly understand why one wouldn't want the special registers named/ what contexts they are used that are not "special".

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#18 @rmustacc

The spacing between values is inconsistent. If we're trying to line everything up, then everything should be lined up and not subsets of it and it should be consistent. Otherwise, we should just do a single space.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#18 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#25 @rmustacc

The spacing between values is inconsistent. If we're trying to line everything up, then everything should be lined up and not subsets of it and it should be consistent. Otherwise, we should just do a single space.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#25 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#52 @rmustacc

It appears this isn't used outside of this file. It should be static.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#52 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#61 @rmustacc

Should be declared at the start of the function.

How did you pick the size for this?

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#61 @TheSafo

Fixed. I picked arbitrarily. I now have switched it to a define constant, and gave it length 1024- which is the same size as BUFSIZE used in the dis_main.c

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#62 @rmustacc

This should be sizeof (fmt_buf).

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#62 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#68 @rmustacc

Missing punctuation. Why are we shifting it. The fact that you're shifting is clear from the code. The why isn't to someone who doesn't intuitively know all of the arm ISA.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#68 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#74 @rmustacc

Worth asserting that opnd->a64_value isn't greater than the number of array entries?

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#74 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#85 @rmustacc

Seems like this is sized based on fmt_buf, I think you want sizeof (fmt_buf).

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#85 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#88 @rmustacc

Put this comment above so you don't throw off the compiler. They're getting a bit cleverer about how they look for fall through of case statements.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#88 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#90 @rmustacc

Here and for all the others, why strlcat it to the buffer if you're just going to end up doing the snprintf below and concatenate that? Why not just make it part of the snprintf buffer and the concatenate it all at the end?

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#90 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#91 @rmustacc

Use sizeof, here and in all the others.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#91 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#119 @rmustacc

Please use explicit comparisons for non-boolean values.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#119 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#132 @rmustacc

What's this doing here?

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#132 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#136 @rmustacc

Probably wroth a comment about what's going on here and why.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#136 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#137 @rmustacc

Probably should be a macro for this constant.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#137 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#138 @rmustacc

Should probably be a macro somewhere for this constant.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#138 @TheSafo

Done

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#161 @rmustacc

libdis is built as c89, this needs to be pulled out.
Also, a64_num_opnds is a uint_t, therefore this needs to be a uint_t.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#161 @TheSafo

Done

usr/src/lib/libdisasm/common/libdisasm.c#72 @rmustacc

We should set up a gcc instance and see what the actual macro it puts out is here.

usr/src/lib/libdisasm/common/libdisasm.c#72 @TheSafo

Set up ubuntu to run on qemu for aarch64. Gcc on that compiles with this flag.

Other flags present include: __ARM_64BIT_STATE which according to the docs is set for AARCH64 targets, when they are in the 64-bit state, __ARM_ARCH_ISA_A64 which set for targets in AARCH64 bit state, when target supports A64 instruction set.

usr/src/test/util-tests/tests/dis/Makefile#67 @rmustacc

Why are these commented out?

usr/src/test/util-tests/tests/dis/Makefile#67 @TheSafo

Those aren't yet implemented. Uncommented though in latest.

usr/src/test/util-tests/tests/dis/arm64/tst.addsub-carry.s#13 @rmustacc

Here and others, the copyright year should be the current year, 2018.

usr/src/test/util-tests/tests/dis/arm64/tst.addsub-carry.s#13 @TheSafo

Done

usr/src/test/util-tests/tests/dis/arm64/tst.branches.s#25 @rmustacc

Remove whitespace

usr/src/test/util-tests/tests/dis/arm64/tst.branches.s#25 @TheSafo

Done

usr/src/test/util-tests/tests/dis/arm64/tst.ldst-reg-unsigned.s#19 @rmustacc

Figure out the XXX.

usr/src/test/util-tests/tests/dis/arm64/tst.ldst-reg-unsigned.s#19 @TheSafo

Done

usr/src/test/util-tests/tests/dis/arm64/tst.simd-modimm.s#51 @rmustacc

What's this about?

usr/src/test/util-tests/tests/dis/arm64/tst.simd-modimm.s#51 @TheSafo

This was taken from jperkin's test suite, which I assume must not have had proper FP output. I will uncomment these out, as I did with other not yet supported tests in makefile

usr/src/test/util-tests/tests/dis/arm64/tst.simd-scalarshift.out#1 @rmustacc

Why is this an empty file?

usr/src/test/util-tests/tests/dis/arm64/tst.simd-scalarshift.out#1 @TheSafo

Hmm my guess is jerkin never got around to testing/implementing properly the simd-scalarshift.s file so there is no disasembly to compare to.

usr/src/test/util-tests/tests/dis/distest.ksh#46 @rmustacc

I don't think this should be going back like this.

usr/src/test/util-tests/tests/dis/distest.ksh#46 @TheSafo

Done

usr/src/test/util-tests/tests/dis/distest.ksh#188 @rmustacc

Why did we switch to a case-insensitive diff?

usr/src/test/util-tests/tests/dis/distest.ksh#188 @TheSafo

Done

usr/src/test/util-tests/tests/dis/distest.ksh#193 @rmustacc

Why did this change? Why are we doing a toupper/tolower pass on this? Feels like this is trying to paper over something.

usr/src/test/util-tests/tests/dis/distest.ksh#193 @TheSafo

Fixed cases in code + put back case sensitivity

usr/src/test/util-tests/tests/dis/distest.ksh#206 @rmustacc

The comment should probably be expanded and explain why it's necessary. This isn't useful as is.

usr/src/test/util-tests/tests/dis/distest.ksh#206 @TheSafo

Done - see below

usr/src/test/util-tests/tests/dis/distest.ksh#229 @rmustacc

Why does arm64 need to be special cased here? It appears this is trying to work around the fact that we don't have a 32-bit arm dis. In that case, the tests should just be prefixed with '64' not 'tst'.

usr/src/test/util-tests/tests/dis/distest.ksh#229 @TheSafo

"-64" on the arm gas fails, thus arm64 needs to be special cased. Added a comment on this in the code

usr/src/test/util-tests/tests/dis/distest.ksh#255 @rmustacc

So are newlines actually getting embedded in here? If so, we should probably write up a ticket that explains what's going on. The comment should probably be expanded and explain why it's necessary.

usr/src/test/util-tests/tests/dis/distest.ksh#255 @TheSafo

In latest patch, I've updated the script to require fewer changes to the IFS. The issue was a lot of the script relies on the default IFS value, however specifying -p changed the IFS, and then did not properly restore it. This Is also commented on in latest

@TheSafo commented at 2018-06-21T19:26:10

Patch Set 1:

(75 comments)

@TheSafo commented at 2018-06-22T01:19:11

Patch Set 2:

(1 comment)

@rmustacc commented at 2018-07-17T00:31:42

Patch Set 3:

(4 comments)

Thanks, this looks pretty good. There's a small amount of clean up, but I think we can get it back shortly.

Patch Set 3 code comments
usr/src/lib/libdisasm/common/dis_arm64_fmt.c#98 @rmustacc

If you put something like this here, I think that gets confusing from a licensing perspective. We should make sure to credit him in the commit as he also works at Joyent, we don't have copyright concerns.

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#98 @TheSafo

Removed comment. Do I add jprekin through Gerritt or is that something I modify on the commit?

usr/src/lib/libdisasm/common/dis_arm64_fmt.c#98 @rmustacc

We'll do it on the commit. Remind me and I'll show you when we're ready to get this back.

usr/src/test/util-tests/tests/dis/distest.ksh#184 @rmustacc

It's not clear what this change is here for now.

usr/src/test/util-tests/tests/dis/distest.ksh#184 @TheSafo

This change was just a shortcut for what I often found myself doing with the output files (printing out a diff between the two). I will revert back to be more consistent with the other error cases.

usr/src/test/util-tests/tests/dis/distest.ksh#221 @rmustacc

Missing space after the '#' character.

usr/src/test/util-tests/tests/dis/distest.ksh#221 @TheSafo

Done

usr/src/test/util-tests/tests/dis/distest.ksh#285 @rmustacc

I don't think the comment here is necessary.

usr/src/test/util-tests/tests/dis/distest.ksh#285 @TheSafo

Done

@TheSafo commented at 2018-07-18T17:43:44

Patch Set 4:

(4 comments)

@rmustacc commented at 2018-07-19T21:11:22

Patch Set 4: Code-Review+1

(1 comment)

Dissassemble more load store, decode sysregs

Fix tests from failing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant