[RFC] Relax the canonical order checking of arch string for -march option #11
Comments
Thanks for writing down this proposal Kito. I like the idea that tools that consume human-generated input would be flexible in the order they would accept, but that they would always canonicalize the ISA string. So when a tool prints something it's always in canonical order and you can easily match that string. But why should the human have to remember that order? One possible exception that was mentioned in the LLVM RISC-V sync-up call was requiring that the |
I think the main problem is with the z and x extensions, which have unobvious orders. The single letter extensions are pretty clear about the right ordering. I'm Ok with relaxing the order check for user input for the multicharacter extensions. |
Nice idea. I often run into this problem during development and testing. Since we usually just use 'imafdc', it is not easy to always type the other extensions in the right sequence. Relaxing the order for the less frequently used extensions would be really helpful for both users and developers. |
Option D. As it was mentioned in the GCC call, the interpretation of the string should always be after the canonical order. If anything, to future proof it. When displaying the string, wherever in the toolchain, it should always be in the canonical order, whatever the user input. Also, the shorthand extension |
I support option c (Don't relax order check, but emit an error to tell user the right order). It's not just the GNU toolchain (and LLVM) that is emitting and parsing ISA strings. I've written helper scripts that help users find the right ISA string for the features they have or want, and having a specification that says There Is Only One Correct Answer makes things easier to do such tasks. |
Option a. Is there another option that relaxes the order only within the same kind of multi-letter extension? |
Options b and d are both better: they let users use the -march option easily, without needing to know the complex details of the spec. |
I prefer option d. It is flexible and reads clearly. |
For: Dynamically building up arch strings is awkward (e.g. you can't take rv64gc and then append q, because the order is rv64gqc) if you're taking in user input and appending
Against: Comparing arch strings visually becomes much harder if they're not in canonical order |
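The "For" point can be made concrete with a small sketch. It assumes a canonical single-letter order of `eigmafdqlcbkjtpvnh` (an assumption for illustration; the ISA spec has the authoritative table) and a hypothetical helper `add_single_letter`. Adding an extension to a canonical string then means inserting it at the right position rather than appending it:

```python
import re

# Assumed canonical order of single-letter extensions (illustration only;
# consult the ISA spec for the authoritative ordering).
CANONICAL_SINGLE = "eigmafdqlcbkjtpvnh"


def add_single_letter(arch: str, ext: str) -> str:
    """Insert single-letter extension `ext` into `arch` at its canonical position.

    Hypothetical helper: it only touches the single-letter part of the string
    (everything before the first underscore) and does not handle versions.
    """
    if len(ext) != 1 or ext not in CANONICAL_SINGLE:
        raise ValueError(f"not a known single-letter extension: {ext!r}")
    m = re.fullmatch(r"(rv(?:32|64|128))([a-z]+)((?:_.*)?)", arch)
    if m is None:
        raise ValueError(f"unsupported arch string: {arch!r}")
    prefix, letters, rest = m.groups()
    unknown = [c for c in letters if c not in CANONICAL_SINGLE]
    if unknown:
        raise ValueError(f"unexpected letters in single-letter part: {unknown}")
    # Sort the union of existing letters and the new one by canonical position.
    ordered = sorted(set(letters) | {ext}, key=CANONICAL_SINGLE.index)
    return prefix + "".join(ordered) + rest


print(add_single_letter("rv64gc", "q"))  # rv64gqc, not rv64gcq
```

With a relaxed order check, a script could instead append `q` and let the toolchain canonicalize the result.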
Thanks for all your feedback. It sounds like most people are happy to relax the canonical order checking. Statistics so far:
|
@palmer-dabbelt has told me he is fine with relaxing the order, but we should take care about ambiguity when interpreting the arch string, especially between multi-letter and single-letter extensions. e.g.
So I think we should write down a few more explicit rules, such as: multi-letter extensions must be separated from all adjacent extensions by an underscore. |
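A minimal sketch of why such a rule matters, assuming that multi-letter extension names start with `z`, `s` or `x` and that a name runs until the next underscore. The function `split_arch` and the example strings are hypothetical, and version numbers are not handled; this is not how GCC or binutils actually parse the string:

```python
import re

# Assumed prefixes that start a multi-letter extension name (illustration only).
MULTI_LETTER_PREFIXES = ("z", "s", "x")


def split_arch(arch: str) -> list[str]:
    """Split an arch string into extension names (versions are not handled)."""
    m = re.fullmatch(r"rv(32|64|128)([a-z_]+)", arch.lower())
    if m is None:
        raise ValueError(f"unsupported arch string: {arch!r}")
    exts = []
    for token in m.group(2).split("_"):
        if not token:
            raise ValueError("empty extension name between underscores")
        i = 0
        # Leading characters of a token are single-letter extensions...
        while i < len(token) and token[i] not in MULTI_LETTER_PREFIXES:
            exts.append(token[i])
            i += 1
        # ...but from the first z/s/x onward the rest of the token is one name.
        if i < len(token):
            exts.append(token[i:])
    return exts


print(split_arch("rv32izam"))   # ['i', 'zam']
print(split_arch("rv32zam_i"))  # ['zam', 'i']
print(split_arch("rv32zami"))   # ['zami'], read as a single name, not zam + i
```

Once the order is relaxed, only the underscore tells the parser where a multi-letter name ends, which is why an explicit separator rule removes the ambiguity.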
@kito-cheng proposes "option (d)":
I am against adding constraints that aren't in the (official) ISA string specification (Vol I Ch. 25). In this case, RV32IZam and RV32I_Zam are both legal, so mandating the latter is problematic in my opinion. So I'm strongly opposed to option (d). I am (mildly) in favor of option (c): do not relax the specification. As a toolchain user I have never found the law to be particularly onerous. My vote is motivated by concern of a (hypothetical) situation where a user becomes accustomed to writing illegal ISA strings, switches to a tool which enforces the law, and then files a (spurious) bug report, wasting some law-abiding developer's time. (Spare the rod, spoil the child?) This is a subjective argument, and certainly not a hill I will die on. |
@nick-knight I think the details could be discussed, since the intention is to prevent parsing ambiguity.
That sounds like a valid concern: once we relax the check, people will get used to writing relaxed orders. But I am still concerned about the order of multi-letter extensions, which is really hard to sort, so I am changing my mind (back) to option a. |
Hey guys, thanks for all your feedback. I have now created PR #14 for this RFC; any further comments are welcome on that PR :) |
In today's GCC RISC-V call we discovered the concern that a future ISA string format might cause a conflict. Addressing that in the specification (e.g. by relaxing the order there, or by getting hard guarantees there which can be relied upon) seemed unlikely. One way that would still be possible is to do something similar to the push_arch proposal in the RISC-V Assembly Programmer's Manual. E.g. |
I agree with @palmer-dabbelt and @nick-knight that option d will cause more ambiguity problems. It is not worth spending time to resolve the ambiguity problems caused by option d. PR #14 looks like option b, but adds a restriction that |
I think this is a good way to specify |
This is essentially a collection of all recent change requests for the -march string:
* Relax the ISA string order ([1])
* Add custom extensions ([2])
* Add profiles support ([3])
Most of this patch is based on proposals from Kito Cheng <kito.cheng@gmail.com> (see linked resources below).
[1] riscv-non-isa#14
[2] riscv-non-isa#1
[3] https://lists.riscv.org/g/sig-toolchains/message/379
[4] riscv-non-isa#11
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
* riscv-toolchain-conventions,
  PR, riscv-non-isa/riscv-toolchain-conventions#14
  Issue, riscv-non-isa/riscv-toolchain-conventions#11
* Refer to the commit afc41ff,
  RISC-V: Reorder the prefixed extensions which are out of order.
In the past we only allowed the prefixed extensions to be reordered. But according to PR 14 in riscv-toolchain-conventions, we can also relax the order checking to allow all extensions to be written out of order, including the single standard extensions and the prefixed multi-letter extensions. We still need to follow these rules as usual:
1. prefixed extensions need to be separated with `_'.
2. prefixed extensions need a complete <major>.<minor> version if set.
Please see the details in the march-ok-reorder gas testcase.
Passed the riscv-gnu-toolchain regressions.
bfd/
* elfxx-riscv.c (enum riscv_prefix_ext_class): Changed RV_ISA_CLASS_UNKNOWN to RV_ISA_CLASS_SINGLE, since everything that does not belong to the multi-keyword will possibly be a single extension for the current parser.
(parse_config): Likewise.
(riscv_get_prefix_class): Likewise.
(riscv_compare_subsets): Likewise.
(riscv_parse_std_ext): Removed, and merged with riscv_parse_prefixed_ext into riscv_parse_extensions.
(riscv_parse_prefixed_ext): Likewise.
(riscv_parse_subset): Only need to call riscv_parse_extensions to parse both single standard and prefixed extensions.
gas/
* testsuite/gas/riscv/march-fail-order-std.d: Removed since the relaxed order checking.
* testsuite/gas/riscv/march-fail-order-std.l: Likewise.
* testsuite/gas/riscv/march-fail-order-x-std.d: Likewise.
* testsuite/gas/riscv/march-fail-order-z-std.d: Likewise.
* testsuite/gas/riscv/march-fail-order-zx-std.l: Likewise.
* testsuite/gas/riscv/march-fail-unknown-std.l: Updated.
* testsuite/gas/riscv/march-ok-reorder.d: New testcase.
Changes
The ISA spec specifies how a set of ISA extensions is described by an arch string, and toolchains use that string to control code generation. Currently, toolchains follow the order described by the ISA spec and emit an error if the order is wrong.
However, we now have a bunch of sub-extensions from several ISA extensions (zv*, zb*, zk*, zp*, zf* and zi*), and there are also x*, s* and zxm* extensions; the ordering rules for these are also a little bit different:
Okay, so how do we describe an arch that has rv64gc, sfoo, zba, zfh, zicsr, zxmbar, xsf and xabc?
The canonical order is: rv64gc_zicsr_zfh_zba_sfoo_zxmbar_xabc_xsf.
I think it is hard to write this correctly without checking the ISA spec.
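To show how much bookkeeping that ordering involves, here is a minimal sketch that reproduces the example string. The rules are inferred from the example only and are assumptions: z* first (grouped by their second letter in an assumed category order, then alphabetically), then s*, then zxm*, then x*. The names `Z_CATEGORY_ORDER`, `multi_letter_key` and `canonical_arch` are hypothetical, and versions and h* extensions are not handled:

```python
# Assumed category order for the second letter of z* extensions
# (illustration only; the ISA spec is the authority).
Z_CATEGORY_ORDER = "imafdqlcbkjtpvnh"


def multi_letter_key(ext: str):
    """Sort key for a multi-letter extension name under the assumed rules."""
    if ext.startswith("zxm"):
        return (2, 0, ext)
    if ext.startswith("z"):
        if len(ext) < 2 or ext[1] not in Z_CATEGORY_ORDER:
            raise ValueError(f"unknown z* category in {ext!r}")
        # z* extensions sort by category first, then alphabetically.
        return (0, Z_CATEGORY_ORDER.index(ext[1]), ext)
    if ext.startswith("s"):
        return (1, 0, ext)
    if ext.startswith("x"):
        return (3, 0, ext)
    raise ValueError(f"not a multi-letter extension: {ext!r}")


def canonical_arch(base: str, multi_letter: list[str]) -> str:
    """Join `base` with the multi-letter extensions in the assumed canonical order."""
    return "_".join([base] + sorted(multi_letter, key=multi_letter_key))


exts = ["sfoo", "zba", "zfh", "zicsr", "zxmbar", "xsf", "xabc"]
print(canonical_arch("rv64gc", exts))
# rv64gc_zicsr_zfh_zba_sfoo_zxmbar_xabc_xsf
```

A tool can apply a key function like this mechanically, but a person typing -march by hand has to remember every tier of it.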
So my proposal is to relax the canonical order checking for the -march option. Here are a few possible options:
a. Relax the order check for multi-letter extensions, e.g. the order between z*, s*, h*, zxm* and x*. [*1]
b. Relax the order check for the whole arch string, including single-letter extensions. [*1]
c. Don't relax the order check, but emit an error that tells the user the right order.
d. Relax the order check for the whole arch string, including single-letter extensions, except I, G, E. [*1]
*1: ELF attribute emission must still be in canonical order to simplify the parser logic; that also keeps open the possibility of caching the check result if we check it in the OS or runtime in the future.