Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement RISC-V ASM-to-JSON conversion #506

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

anishathalye
Copy link

CompCert implements ASM-to-JSON conversion for the ARM and PowerPC architectures, but this functionality is missing for RISC-V. This patch implements this functionality. The implementation is based on the existing ARM/PowerPC AsmToJSON.ml implementations.

@anishathalye
Copy link
Author

anishathalye commented Nov 26, 2023

I'm neither an expert in CompCert or in OCaml, so any feedback on this PR would be appreciated 🙂

I needed this functionality for a research project I'm working on, so I went ahead and implemented it, but I thought I might as well try contributing upstream in case anyone else could benefit from this.

@xavierleroy
Copy link
Contributor

Thanks for the code! Maybe @bschommer or @m-schmidt could review it, as they wrote the ASM-to-JSON converters for the other platforms, and know this code better than me.

@bschommer
Copy link
Member

Thanks for the code from us too. The ASM-to-JSON converters are
currently only used for some internal tool, which is only available for
certain Architectures, however that might change in the future. So we
cannot guarantee that if we might need the code in the future or that
we do not change it.

I needed this functionality for a research project I'm working on, so
I went ahead and implemented it, but I thought I might as well try
contributing upstream in case anyone else could benefit from this.

Can you already tell for which research project you will us this?

@anishathalye
Copy link
Author

So we cannot guarantee that if we might need the code in the future or that we do not change it.

This is used by the Valex tool, right? Anything else?

I wouldn't mind if this functionality were changed in the future in a backward-incompatible way, I will pin a version of CompCert, and I can also update my code if there are breaking changes in CompCert.

Can you already tell for which research project you will us this?

Sure! I'm working on a follow-up project to Knox (paper, code). I want to dump CompCert's RISC-V Asm AST (before running the Asmexpand stage, so pseudo-instructions like Pallocframe are preserved, and so that I don't have to write a parser for the .s files) so that I can consume this output in a different tool not written in Coq. I'm working on rewriting the CompCert Asm semantics in Rosette, so that I can use Rosette to symbolically execute programs that are compiled using CompCert.

Reusing this AsmToJSON machinery seemed like the easiest way to do it. In my version, I actually need to remove the Asmexpand stage that runs before AsmToJSON. Right now, this is hard-coded, but I was considering adding a flag for that and submitting a PR, so one could do e.g., ccomp -noexpandasm -sdump to dump the pre-expansion version of the Asm. That would require some fixup in the ARM/PPC AsmToJSON code as well, and would also require editing those to support dumping the pseudo-instructions that are currently expanded away (and adding some logic to panic when those pseudo-instructions exist and -noexpandasm is not passed, to match the current behavior and error-checking).

@bschommer
Copy link
Member

This is used by the Valex tool, right? Anything else?

Yes, it used only by Valex atm.

I wouldn't mind if this functionality were changed in the future in a backward-incompatible way, I will pin a version of CompCert, and I can also update my code if there are breaking changes in CompCert.

Can you already tell for which research project you will us this?

Sure! I'm working on a follow-up project to Knox (paper, code). I want to dump CompCert's RISC-V Asm AST (before running the Asmexpand stage, so pseudo-instructions like Pallocframe are preserved, and so that I don't have to write a parser for the .s files) so that I can consume this output in a different tool not written in Coq. I'm working on rewriting the CompCert Asm semantics in Rosette, so that I can use Rosette to symbolically execute programs that are compiled using CompCert.

Reusing this AsmToJSON machinery seemed like the easiest way to do it. In my version, I actually need to remove the Asmexpand stage that runs before AsmToJSON. Right now, this is hard-coded, but I was considering adding a flag for that and submitting a PR, so one could do e.g., ccomp -noexpandasm -sdump to dump the pre-expansion version of the Asm. That would require some fixup in the ARM/PPC AsmToJSON code as well, and would also require editing those to support dumping the pseudo-instructions that are currently expanded away (and adding some logic to panic when those pseudo-instructions exist and -noexpandasm is not passed, to match the current behavior and error-checking).

Okay, sounds interesting. Unfortunately for our use cases it's quite the opposite, we want to have less pseudo-instructions in the dumped assembly output, so I would say that we are not that interested in having a switch -noexpandasm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants