Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Handling of X0 register #95

Open
PhilippvK opened this issue May 22, 2023 · 15 comments
Open

Improve Handling of X0 register #95

PhilippvK opened this issue May 22, 2023 · 15 comments

Comments

@PhilippvK
Copy link

PhilippvK commented May 22, 2023

Similarly to #94, theif(rd != 0) behavior is coded into every single instruction. We probably should find a way to specify the X[0]==0 constraint on a higher level. (Alternatively we could add a way to set access rights (R/W) for memory ranges, which could also be reused for some CSRs)

What do you think?

@AtomCrafty
Copy link
Contributor

I remember a while back @jopperm and I had a discussion about making aliases much more versatile by allowing concatenation aliases. We ultimately didn't continue with the idea, but it could almost solve the X[0] issue. The remaining issue I see with this example is that it would violate the "no non-const aliases to const values" rule.

image

@eyck
Copy link
Contributor

eyck commented May 23, 2023

Actually this does not really solve the actual problem. The RISC-V ISA spaec says that x0 can be written but will always read 0. Defining a register as const or setting a constraint on it would prevent this behavior.
Moreover, even if a read from memory (or more important CSR) would update X0 the read operation has to happen although X0 will not change.
Therefore I currently do not see any option to get rid of this check in the behavior

@PhilippvK
Copy link
Author

Thank you for your feedback. I can totally live with that check in the behavior.

@PhilippvK
Copy link
Author

PhilippvK commented May 23, 2023

What would you think about adding optional read/write wrappers to (subsets of) registers files? This would on the other hand make it easier to implement the RISC-V Exception/Interrupt Handling behavior in CoreDSL.

See https://github.com/tum-ei-eda/etiss_arch_riscv/blob/2dca88c533a145b50d07194fc6c10b552d96ee71/tum_mod.core_desc#L100 by @ wysiwyng.

Currently we need to overwrite each of the CSR* instructions with our own versions which calling read_csr(…)/write_csr(…) functions which take care of masking, aliasing etc. for relevant CSR addresses instead of indexing into CSR[] directly.

With this supported it would be straightforward to handle the X0 special case by replacing its write-behavior with a no-op and read-behavior with a constant 0.

In addition this might be helpful when implementing CSR/Memory-Mapped-Peripherals in CoreDSL (not sure if this is relevant). An example would be clearing an interrupt flag bit implicitly on a register read.

Please let me now if this is nonsense or rather desirable?

@neithernut
Copy link

It doesn't work for all cases relevant for CSRs, but what about the following?

always {
    X[0] = 0;
}

(please ignore any syntactical errors)

@PhilippvK
Copy link
Author

PhilippvK commented May 23, 2023

@neithernut Thank you for that proposal. This makes sense at a first glance and is very minimalistic. However have we already settled on how always blocks are scheduled? I mean for this we would need to execute the always block after each instruction while for other usecases (implicit pc increment) it would make more sense to execute the always block first.

@wysiwyng
Copy link

@neithernut per the always spec here: https://github.com/Minres/CoreDSL/wiki/Structure-and-concepts#always-blocks

Updates inside the behavior part of regular instructions take precedence over all always blocks.

would mean that x[0] = 0 in an always-block will always be overwritten.

@wysiwyng
Copy link

i support the read/write overrides proposed above, i suggested them in the past especially for CSR handling.

syntax suggestion 1 (using existing syntactical elements):

arch_state {
    register unsigned<XLEN> X[32];
    ...
}

functions {
    unsigned<XLEN> x0_read() [[ read_override_for=X[0] ]] {
         return 0;
    }
    
    void x0_write(unsigned<XLEN> val) [[ write_override_for=X[0] ]] {
         // nothing
    }
}

alternatively, a separate section for overrides could be introduced, using a syntax similar to Python or C# properties:

arch_state {
    register unsigned<XLEN> X[32];
    ...
}

accessors {
    X[0]->get() {
         return 0;
    }
    
    X[0]->set(unsigned<XLEN> val) {
         // nothing
    }
}

for the latter, it would be interesting whether the . operator can be used instead of ->.

open questions for any such implementation:

  1. should accessors be allowed on aliased entities or only on the parent entity
  2. how is inheritance handled
  3. which accessors are present by default, or rather does only specifying one accessor imply read-only / write-only access like in Python / C#

@AtomCrafty
Copy link
Contributor

AtomCrafty commented May 23, 2023

I strongly dislike the idea of these accessors and override functions. All they do is introduce syntactic sugar that converts expressions of the form X[0] = xyz to implicit function calls. Just make those explicit and write set_x(0, xyz).

void set_x(unsigned<5> rd, unsigned<XLEN> value) {
    if(rd != 0) X[rd] = value;
}

The proposals above are way too involved if their only goal is to avoid a handful of if(rd != 0) throughout the specification.

@PhilippvK
Copy link
Author

PhilippvK commented May 23, 2023

The proposed accessor feature would be overkill for the X0 "problem" but essential for handling the CSRs in a straightforward fashion.

@AtomCrafty Your proposed alternative would either require updating the upstream RISCV_ISA_COREDSL code (which is not always feasible) or reimplementing every CSR instruction in a downstream codebase (see tum_csr). The latter leads to many redundant lines of code (encoding & assembly is basically the same), which need to be maintained/synced with the original implementation. Providing read/write wrappers would avoid this workaround.

@AtomCrafty
Copy link
Contributor

Then I have fundamentally misunderstood the issue you're trying to solve here. Could you please explain that whole CSR issue? I'm not familiar with RISC-V, so this is going over my head.

It sounds to me like you saying the accessors are necessary because they allow you to change the behavior of existing code without having to change the code itself. Wouldn't it make more sense to allow for underspecified ("abstract") functions in the base implementation that can then be specified by the derived cores?

@PhilippvK
Copy link
Author

Let me try to motivate it using our implementations for the RVF, Zicsr and Zicntr Extensions.

Currently using read_csr/write_csr wrappers defined as functions we need 109 (98 excluding constants) lines of code due to the reimplementation of all 6 CSR instructions: https://github.com/tum-ei-eda/etiss_arch_riscv/blob/2dca88c533a145b50d07194fc6c10b552d96ee71/tum_mod.core_desc#L100-L207

With the syntax proposed above, I was able to cut this down to just 32 (21 excluding constants) lines:

InstructionSet tum_csr extends Zicsr {
    architectural_state {
        unsigned int FFLAGS_N = 0x001;
        unsigned int FRM_N = 0x002;
        unsigned int FCSR_N = 0x003;
        unsigned int CYCLE = 0xC00;
        unsigned int CYCLEH = 0xC80;
        unsigned int TIME = 0xC01;
        unsigned int TIMEH = 0xC81;
        unsigned int INSTRET = 0xC02;
        unsigned int INSTRETH = 0xC82;
    }

    functions {
        extern unsigned<64> etiss_get_cycles() [[ etiss_needs_arch ]];
        extern unsigned<64> etiss_get_time();
        extern unsigned<64> etiss_get_instret() [[ etiss_needs_arch ]];
    }
    accessors {
        CSR[FFLAGS_N]->get() { return CSR[FCSR_N] & 0x1F ; }
        CSR[FFLAGS_N]->set(unsigned<XLEN> val) { CSR[FCSR_N] = (CSR[FCSR_N] & (0x07 << 5)) | (val & 0x1F); }
        CSR[FRM_N]->get() { return (CSR[FCSR_N] >> 5) & 0x07; }
        CSR[FRM_N]->set(unsigned<XLEN> val) { CSR[FCSR_N] = ((val & 0x07) << 5) | (CSR[FCSR_N] & 0x1F); }
        CSR[FCSR_N]->set(unsigned<XLEN> val) { CSR[FCSR_N] = val & 0xFF; }
        CSR[CYCLE]->get() { return etiss_get_cycles(); }
        CSR[CYCLEH]->get() { return etiss_get_cycles() >> 32; }
        CSR[TIME]->get() { return etiss_get_time(); }
        CSR[TIMEH]->get() { return etiss_get_time() >> 32; }
        CSR[INSTRET]->get() { return etiss_get_instret(); }
        CSR[INSTRETH]->get() { return etiss_get_instret() >> 32; }
    }
}

Of course there are more CSRs which require this handling such as MSTATUS/SSTATUS (shaddowing due to different privilege levels) or MISA (read only).

@AtomCrafty
Copy link
Contributor

@PhilippvK Thank you for the explanation, I think I understand the issue now.
Zicsr defines a number of regular read/write registers and specific extensions like tum_csr redefine a number of them to return dynamically calculated results on reads or perform specific actions on writes.

From a software developer's perspective, this should be solved by defining extension points in Zicsr (like an on_read or on_write filter that can be overwritten in the derived instruction set), but I take it that this is not always feasible in the context of ISAs. I can already see myself how this approach would break down with multiple inheritance. Even the implementation of tum_mod you linked would not work together with another extension that needs to define other CSRs, so every combination of such extensions would require a separate implementation.

So now that I actually understand the issue, let me address the proposed solution properly.
You basically want to give the implementors the possibility to intercept reads and writes to certain elements in an address space, in this example the CSR register file. Schematically, multiple extensions would define accessors like this:
CSR Filters-Individual

I will retract my previous disagreement with the approach, but let me propose a slight variation that could make this even more flexible. Currently, it only allows the interception of individual elements, but I could conceive a scenario where one would like to intercept entire ranges of an address space - e.g. memory mapped ROM. How practically relevant that is I cannot judge.

Anyway, my idea would be to instead define filter layers. Instead of defining multiple interceptors, you would define one or more filters over the entire address space. The filters could check whether the index is in range and decide whether to take action. If that should not be the case, control can be passed to the next filter layer.

CSR Filters-Layers

This would allow the specification of features like Intel's APIC, which is memory mapped to a dynamically configurable range of addresses, so it cannot be modelled by accessors since those seem to have fixed indices. But I don't know whether modelling a memory controller is a valid use case for CoreDSL. I'm much more familiar with x86 than RISC-V, so that is usually my point of reference, but I don't think accurately specifying x86 is a design goal of CoreDSL.

I realize this approach might not work in practice because these kinds of dynamic checks are prohibitively expensive compared to the zero overhead accessors. Maybe both are relevant for different scenarios. Or maybe one is just a special case of the other, where all indices are known at compile time and dynamic checks can be optimized away.

@eyck
Copy link
Contributor

eyck commented May 24, 2023

As far as I can tell the whole dicussion took the wrong turn. First and foremost: the point was handling of X0. Secondly: CoreDSL as it stands describes instructions in an implementation independent way and not just for RSIC-V.

All the discussion above bring implementation details into this description. Implementation details in the sense of platform specfiics like the existence and behavior of specific CSRs (which might vary from core to core) and in the sense of downstream tools consuming the CoreDSL descriptions (in this case ETISS).
The latter has to be avoided and should be handled in the downstream tools. E.g. a write to a specific exterm space can be transformed into whatever function is being needed. Btw, this is what is being done in our code generation backend.
For the first set of details we may discuss some extension to the language to describe the behavior of specific memory locations and areas in an implementation agnostic way. But this is not the topic of this issue...

@PhilippvK
Copy link
Author

@eyck I agree that I should habe opened up a new issue for the topic. But the discussion was already going and I was too lazy to summarize everything again over there.

The discussed approach would as a side effect also solve the X0 handling.

All the discussion above bring implementation details into this description

Yes for Zicsr and probably Interrupt/Exception Handling. But for the FCSR and MISA this behavior shouldn’t be implementation specific at all.

For the first set of details we may discuss some extension to the language to describe the behavior of specific memory locations and areas in an implementation agnostic way. But this is not the topic of this issue...

Yes, let’s not forget this completely.

This issue however can be closed, since getting rid of the if (rd != ) { is not really relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants