Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RISC-V Assembly (RISCV64) #199

Closed
5 tasks done
kazk opened this issue Jul 28, 2022 · 29 comments
Closed
5 tasks done

Add RISC-V Assembly (RISCV64) #199

kazk opened this issue Jul 28, 2022 · 29 comments
Assignees

Comments

@kazk
Copy link
Member

kazk commented Jul 28, 2022

We should be able to support by creating a container image for linux/riscv64 with:

FROM ubuntu:22.04

RUN apt-get update && apt-get install -y binutils

# TODO Install dependencies and compile test framework

WORKDIR /workspace
docker buildx build --platform linux/riscv64 -t $I .

Running

Running it with QEMU emulation (the host must have /proc/sys/fs/binfmt_misc/qemu-riscv64 enabled).

.data
.MSG: .string "Hello, RISCV64!\n"
.equ MSG_LEN, . - .MSG

.text
.globl  _start
_start:
        li a7, 64
        li a0, 1
        lla a1, .MSG
        li a2, MSG_LEN
        scall

        li a7, 93
        li a0, 0
        scall
W=/workspace
C=$(docker container create --rm --platform linux/riscv64 -w $W $I sh -c "as main.s -o main.o && ld -s main.o -o main && ./main")

docker container cp main.s $C:$W
docker container start --attach $C

Testing

Unfortunately, Criterion doesn't support RISC-V (dependencies fails to compile with invalid architecture). Found Cgreen which compiles, runs test in separate process, and supports custom reporter.

W=/workspace
C=$(docker container create --rm --platform linux/riscv64 -w $W $I sh -c "gcc -O2 solution.s solution_tests.c -lcgreen codewars_reporter.c tests.c -o tests && ./tests")

docker container cp solution.s $C:$W
docker container cp solution_tests.c $C:$W
docker container start --attach $C

  • Building
  • Executing
  • Testing
  • CodeMirror mode
  • Icon

👍 reaction might help to get this request prioritized.

@kazk kazk added this to Code Runner Jul 28, 2022
@kazk kazk moved this to Backlog in Code Runner Jul 28, 2022
@DonaldKellett
Copy link
Member

Amazing! Sounds like we already have testing figured out, but feel free to indicate whether there's anything the community could assist with.

@kazk
Copy link
Member Author

kazk commented Jul 28, 2022

No, I haven't figured out the testing yet. I've only confirmed Cgreen compiles in RISCV64.

I'm hoping we can test with something like:

as solution.s -o solution.o

gcc \
  solution.o \
  solution_tests.c \
  codewars_reporter.o \
  tests.c \
  -lcgreen \
  -o tests

./tests

with tests.c

#include <cgreen/cgreen.h>

TestSuite *solution_tests(); // defined in `solution_tests.c`

int main(int argc, char **argv) {
  TestSuite *suite = create_test_suite();
  add_suite(suite, solution_tests());
  return run_test_suite(suite, create_codewars_reporter()); // use our reporter
}

We need to define create_codewars_reporter() by following https://cgreen-devs.github.io/cgreen/cgreen-guide-en.html#_rolling_our_own

Maybe you can play with the test framework and see how it works.

@DonaldKellett
Copy link
Member

Created a repository https://github.com/DonaldKellett/codewars-riscv64 to experiment with RISCV64 support

@DonaldKellett
Copy link
Member

@kazk Initial support for Codewars output format has been added, feel free to take a look. I've also added basic examples of a syntax error and invalid register name.

@kazk
Copy link
Member Author

kazk commented Jul 30, 2022

Can you prefix all the commands with \n?

  printf("\n<DESCRIBE::>%s\n", name);

This way, the outputs from the solution without trailing \n won't affect them.

Also, you need to replace all \n within the message with <:LF:>.

<FAILED::>Expected [multiply(3, 5)] to [equal] [15]<:LF:>		actual value:			[8]<:LF:>		expected value:			[15]

We should skip the outermost <DESCRIBE::> for main to avoid adding an unnecessary level.


You can test how the output is rendered with Kumite.

  1. open https://www.codewars.com/kumite/new?language=python
  2. disable tests
  3. print the output
print("""
<DESCRIBE::>solution_tests
<IT::>works_for_some_fixed_tests
a = 3, b = 5
<FAILED::>Expected [multiply(3, 5)] to [equal] [15]
		actual value:			[8]
		expected value:			[15]
<COMPLETEDIN::>
<COMPLETEDIN::>
""")

image
(shows why you need to replace \n)

@kazk kazk moved this from Backlog to In Progress in Code Runner Jul 30, 2022
@DonaldKellett
Copy link
Member

@kazk Done, please review

@DonaldKellett
Copy link
Member

DonaldKellett commented Jul 30, 2022

For the CodeMirror mode, we can probably re-use syntax highlighting for NASM.

For the icon, maybe we can use an uncolored version of https://antmicro.com/images/03-06-01-riscv-hero.svg , perhaps without the "RISC-V" at the bottom?

@kazk
Copy link
Member Author

kazk commented Jul 30, 2022

Nitpicks:

  char buf[2];
  buf[1] = '\0';
  while (*message) {
    if (*message == '\n') { // <- add `{`
      strcat(escaped_message, CODEWARS_LF);
    } else {
      // instead of `sprintf(buf, "%c", *message);`
      buf[0] = *message;
      strcat(escaped_message, buf);
    }
    ++message;
  }

Also, why destroy_codewars_escape_message(escaped_message) instead of free(escaped_message)?

Can you try adding durations? I'd use https://github.com/cgreen-devs/cgreen/blob/master/src/cdash_reporter.c as a reference. Looks like the reporter can have a memo to persist state.

@kazk
Copy link
Member Author

kazk commented Jul 30, 2022

For CodeMirror mode, I don't think reusing NASM mode will work.

  • Comment starts with #
  • Different instructions
  • Different registers
  • Different directives

It shouldn't be difficult to create one.

@DonaldKellett
Copy link
Member

Also, why destroy_codewars_escape_message(escaped_message) instead of free(escaped_message)?

It's the counterpart to create_codewars_escape_message(message), included for (IMHO) readability and maintainability, in case create_codewars_escape_message(message) returns something more complex than a single calloc()'d buffer string in the future. But since that's extremely unlikely to ever happen, I can optimize it out to just free(escaped_message) if you insist.

Nitpicks:

Thanks, I'll change it shortly.

Can you try adding durations? I'd use https://github.com/cgreen-devs/cgreen/blob/master/src/cdash_reporter.c as a reference. Looks like the reporter can have a memo to persist state.

For CodeMirror mode, I don't think reusing NASM mode will work.

* Comment starts with `#`

* Different instructions

* Different registers

* Different directives

It shouldn't be difficult to create one.

Thanks for the feedback, I'll look into them shortly.

@kazk
Copy link
Member Author

kazk commented Jul 30, 2022

But since that's extremely unlikely to ever happen, I can optimize it out to just free(escaped_message) if you insist.

Yeah, it's just confusing. Even if that's likely to happen, I don't think it's necessary because it's an internal function only used once. Let's keep it simple and just call free(escaped_message).

@DonaldKellett
Copy link
Member

DonaldKellett commented Jul 31, 2022

@kazk Durations have been added. If the repo looks fine to you, I'd like to transfer it over to @codewars for further development and maintenance.

For CodeMirror, I might look into it over the next week or so if time allows.

@DonaldKellett
Copy link
Member

CodeMirror has a GAS mode for syntax highlighting that I suspect is directly applicable to RISC-V assembly once we define the architecture-dependent registers, since we use the GNU toolchain to assemble the solution and compile the tests: https://github.com/codemirror/codemirror5/blob/master/mode/gas/gas.js

If so, I'll skip creating my own repo and demo. Valid register names in RISC-V can be found at https://en.wikichip.org/wiki/risc-v/registers

@kazk
Copy link
Member Author

kazk commented Aug 1, 2022

Can you transfer the repo again? The link expired.

For CodeMirror mode, are you planning to contribute upstream to add riscv support?

By the way, I'm thinking of using language id riscv and version RV64. I don't know whether we'd ever support RV32 or RV128, but I don't think it should be separate if we do.

@DonaldKellett
Copy link
Member

@kazk Noted, I have re-initiated the transfer.

For CodeMirror mode, are you planning to contribute upstream to add riscv support?

Sure, I'll try to look into it over the next few days, perhaps tonight. I guess we can wait about a week after that, and if the PR isn't yet accepted (or gets rejected) by then, I'll consider creating my own repo with the changes.

By the way, I'm thinking of using language id riscv and version RV64. I don't know whether we'd ever support RV32 or RV128, but I don't think it should be separate if we do.

Sounds fine by me.

@kazk
Copy link
Member Author

kazk commented Aug 2, 2022

It might be easier to just create a new mode. It's not difficult.

Also, GAS mode requires passing options like:

const editor = CodeMirror.fromTextArea(code, {
  mode: {name: "gas", architecture: "riscv"},
});        

which will require some changes to Codewars' codebase.

@kazk
Copy link
Member Author

kazk commented Aug 3, 2022

Can you add an example with preloaded?

@kazk
Copy link
Member Author

kazk commented Aug 3, 2022

Failing tests have exit code 0.

Unless I'm misunderstanding, run_test_suite(solution_tests(), create_codewars_reporter()) should be returning non-zero on failure, but it's not.

@kazk
Copy link
Member Author

kazk commented Aug 3, 2022

Found the cause. The reporter's total_failures must be updated. I'll fix it soon.

I'll also look into avoid spamming Test Passed.

@DonaldKellett
Copy link
Member

Can you add an example with preloaded?

Sure, I'll open a PR tonight if time allows.

@kazk
Copy link
Member Author

kazk commented Aug 3, 2022

Do we want preloaded in C?
See #71 (comment)

@DonaldKellett
Copy link
Member

DonaldKellett commented Aug 4, 2022

Do we want preloaded in C? See #71 (comment)

Yes, we do, for the following reasons:

  • There's no point having preloaded in assembly since the solver does not control it
  • Having preloaded in C is a better user experience for both Kata authors and translators, like how we have tests in C instead of assembly
  • A preloaded section written in C is much more readable and maintainable, and easier to debug (for the Kata author / translator) when things go wrong

I'll also address some of the points mentioned in the linked issue thread:

In my opinion, preloaded should contain NASM code because you usually hide a copy of your solution in there.

Yes for random tests. If you use NASM in Preloaded, you can just reuse your own solution for it.

Unless I'm terribly mistaken, placing the / a reference solution in Preloaded is just wrong and definitely a Kata issue since the solver can access and invoke it directly as long as they know the name(s) of the function(s) involved, bypassing the Kata. So this point is moot. Preloaded should only be used for additional boilerplate common to both sample and submission tests, or helper functions fully intended to be utilized by the solver.

Having Preloaded in the same language as the solution language is as far as I can see standard practice for all languages.

Having the sample / submission tests in the same language as the solution is also standard practice on CW, but there are exceptions sometimes because it's simply not possible to use the same language across solution and tests (e.g. SQL with tests in Ruby, BF with tests in JS), or doing so will degrade the experience for Kata authors and translators (e.g. NASM with tests in C). So I don't see why we shouldn't make similar exceptions for preloaded code which the solver does not control.

There's little point to make it in C because there's already a module in C, the tests.

I fail to see how having tests in C preclude preloaded code from being written in C. If the argument is that all C-related boilerplate for assertions can be simply placed directly in the sample / submission tests, I mentioned above that it could be useful to place them once in Preloaded instead so they can be reused by both sample and submission tests, instead of copy-pasting the same thing twice between sample and submission tests.

The question is whether the code in Preloaded should be assembled separately or concatenated with user code. In the latter case it could contain some compile-time things like macros.

Now this seems a bit interesting. But (1) concatenating Preloaded with the solver's solution often creates more problems than it solves, such as making it harder to determine where the error is when there is a bug in the solution (or whether the bug is in the solution at all), and (2) I suppose this (concatenating preloaded code with the solution) precludes assembly-level macros(?) from being used in the sample / submission tests, and one important use case of preloaded code is to share testing-related boilerplate code between sample and submission tests.

@kazk
Copy link
Member Author

kazk commented Aug 4, 2022

Yeah, preloaded shouldn't be where you put the reference solution in any language. I get that it's annoying to write a reference solution twice, but that shouldn't be the reason to make preloaded code assembly.

I also don't want to concatenate because we know it'll be confusing for solvers.

@error256 @nomennescio What do you think?


We can also decide this later like NASM.

@error256
Copy link

error256 commented Aug 4, 2022

If the argument is that all C-related boilerplate for assertions can be simply placed directly in the sample / submission tests, I mentioned above that it could be useful to place them once in Preloaded instead so they can be reused by both sample and submission tests, instead of copy-pasting the same thing twice between sample and submission tests.

I still don't like non-self-sufficient sample tests, maybe except some special cases. So the question is if there are more useful applications of preloaded in asm that those special cases for preloaded in C. I haven't looked at the RISC-V environment yet, so I'm not really sure.

@DonaldKellett
Copy link
Member

I still don't like non-self-sufficient sample tests

Fair point, but if there's common boilerplate for tests placed in the Preloaded section that shouldn't actually be hidden from the solver other than to avoid cluttering the sample tests, it's possible to replicate the Preloaded section in the Kata description, perhaps under a collapsible section, such that the solver can replicate the entire Kata setup locally.


Since we don't (yet) have Preloaded support in NASM either and it hasn't caused us any major issues so far, I say we can postpone the decision for RISC-V as well.

@kazk
Copy link
Member Author

kazk commented Aug 6, 2022

I decided to create a new RISC-V mode for CodeMirror using the simple mode. https://github.com/codewars/codemirror-riscv/blob/main/riscv.js

Demo at https://codewars.github.io/codemirror-riscv

The runner has been ready, so I'll try to enable it on Codewars over the weekend or next week.

@kazk
Copy link
Member Author

kazk commented Aug 11, 2022

Deployed.
You can play with it with Kumite https://www.codewars.com/kumite/new?language=riscv
Let me know if you find any issues. I'll probably announce it on Discord tomorrow.

@kazk kazk closed this as completed Aug 11, 2022
Repository owner moved this from In Progress to Done in Code Runner Aug 11, 2022
@dramforever
Copy link

dramforever commented Oct 27, 2022

Can we get some extra extensions like vector, bit-manipulation and scalar crypto enabled? It seems that for user-mode emulation setting QEMU_CPU to rv64,v=on etc would do the trick, though it does require a sort of recent version of QEMU, and I'm not sure what's the minimum required and what's the version on the runner. I've tested on QEMU 7.1.0/7.0.0 and GCC 11.3.0.

I roughly checked https://gitlab.com/qemu-project/qemu/-/blob/master/target/riscv/cpu.c and it seems that passing argument -march=rv64gcv_zba_zbb_zbc_zbs_zbkx_zk_zks to GCC and environment QEMU_CPU=rv64,v=on,zbkx=on,zk=on,zks=on to QEMU should enable everything that QEMU supports.

Though it seems that the crypto ones (Zbk* and Zk*) are not supported on 7.0.0 yet though, so -march=rv64gcv_zba_zbb_zbc_zbs and QEMU_CPU=rv64,v=on would be a great start (Zb* are enabled on QEMU by default). In addition, on QEMU 7.1.0, it complains with a warning unless you add vext_spec=v1.0 like rv64,v=on,vext_spec=v1.0, so that's something to keep in mind.

I've written a kumite that should work (at least compile fine (at least should fail with a different message)) if the changes above are implemented: https://www.codewars.com/kumite/635ac4a48f20014af2cf522e

@kazk
Copy link
Member Author

kazk commented Oct 31, 2022

@dramforever Can you open a new issue to track this? The image is maintained at https://github.com/codewars/riscv if you want to experiment with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

No branches or pull requests

4 participants