@AlexJones0 AlexJones0 commented Nov 14, 2025

See the relevant commit messages and comments for more details. This PR introduces the logic for both Earlgrey and Darjeeling to consider system resets that are not requested by the rstmgr hardware (e.g. a system-reset command sent over the QEMU monitor) as Power-on Resets (PORs). This information is then forwarded to the rstmgr, which uses it during its own reset to determine the RESET_INFO that it should display. This PR introduces the logic for the rstmgr to consider any reset not triggered by its own internal request to be a POR, rather than just the first reset after initialization. This PR hence allows SW to correctly determine the reset reason after a system reset.

@AlexJones0 AlexJones0 changed the title ot_rstmgr, ot_darjeeling, ot_earlgrey: Reflect Power-on Reset (POR) on system reset ot_rstmgr: Reflect Power-on Reset (POR) on system reset Nov 14, 2025
@rivos-eblot

I've not thought deeply about it, but can't this feature be achieved with an ot_rstmgr property rather than being handled in the machine, since it does not seem to be a HW feature but more of a hack to use the QEMU monitor?
Another question: why not add a dedicated monitor feature for doing so? It seems it would be more flexible (as in selecting a specific kind of reset).

@AlexJones0
Author

can't this feature be achieved with an ot_rstmgr property rather than being handled in the machine

Yes, thinking about it, I think you are correct: this should be on the rstmgr. I put this here because in reality the POR signal exists as a direct manual pad at the SoC level that is then passed through, but from looking at the RTL, all the logic that distinguishes between the POR and the HW reset is actually internal to the rstmgr, so I think that is the more correct place. I'll change that.

it does not seem to be a HW feature but more of a hack to use the QEMU monitor?

That is the motivation, but I am not clear why you think this is not a HW feature: the rstmgr RTL always resets the POR reset info on a power-on-reset and otherwise assigns the reset info according to the reset type. The current QEMU rstmgr implementation effectively signals that POR should only be set in the first reset after realization and never again. It does not distinguish between reset sources. So you would need to recreate the entire machine to be able to emulate a POR signal. But in actual HW you can send Power-on Resets to the rstmgr at any point via the pad.

why not add a dedicated monitor feature for doing so, as it seems it would be more flexible

From my understanding I don't think the current monitor feature is wrong - calling a system-reset directly invokes resettable_reset on the system - and so another feature isn't really needed for this. The error is the expectation that POR can only happen in the reset after realization, which doesn't make sense if I have to essentially destroy and recreate the whole machine just to emulate a signal on one pad. If there is a mechanism that resets the system that doesn't come from the rstmgr or the QEMU Monitor (not sure if it actually exists, just a hypothetical) then I would expect that to be considered a POR from the rstmgr's perspective as well.

@AlexJones0
Author

I've pushed the change to handle the logic internal to the rstmgr now - it turns out a lot simpler and I think it is more clear that the change should match the rstmgr HW.

@rivos-eblot

That is the motivation but I am not clear why you think this is not a HW feature - the rstmgr RTL always resets the POR reset info on a power-on-reset and otherwise assigns the reset info according to the reset type.

I think I am starting to understand your point, but QEMU is not really designed to support multiple kinds of resets. Although the Resettable API is supposed to support several kinds of reset, they are not actually implemented for now.

The OT ResetManager tracks different kinds of resets (PoR, SW, Watchdog, etc.).

The way it has been designed, IIRC, is that you need to connect the PoR pad to the reset manager (OT_RSTMGR_RST_REQ) and request an OT_RSTMGR_RESET_POR. This way, the reset manager should perform a system reset with the PoR reason. Here I think it could be nice to have a dedicated QEMU monitor handler (if you wish to use it).

I do not think it is possible to achieve a true PoR with system-reset, or at least each OT component needs to be checked, as several of them rely on the VM being initialized as the true power-up sequence. It might work... or not.

Trying to map QEMU reset onto OT reset is likely to break something somewhere. If there is a way to preserve the existing behavior, that should be fine.

@AlexJones0
Author

AlexJones0 commented Nov 18, 2025

I think I am starting to understand your point, but QEMU is not really designed to support multiple kinds of resets. Although the Resettable API is supposed to support several kinds of reset, they are not actually implemented for now.

I see - I understand what you are saying now. I guess based on the docs there is also a mismatch here between QEMU / OpenTitan. So for example on a rstmgr HW reset request we call a "Cold" reset on the SoC (and later assert that all EG/DJ machine reset reasons are cold resets), but the rstmgr technically does not follow QEMU's definition of a cold reset as it does not get reset to the same initial state as at the start of QEMU. And I guess there is no nice mapping of this reset to the current reset types.

I personally expected a monitor system reset to be a POR, but I guess it is actually a cold reset where cold ≠ POR.

The way it has been designed, IIRC, is that you need to connect the PoR pad to the reset manager (OT_RSTMGR_RST_REQ) and request an OT_RSTMGR_RESET_POR. This way, the reset manager should perform a system reset with the PoR reason. Here I think it could be nice to have a dedicated QEMU monitor handler (if you wish to use it).

I think this makes sense; I am trying to think about the best way to expose this functionality. It doesn't seem correct to me to add OT-specific functionality to the QEMU monitor (hence I was avoiding that), but I am not sure what the best way to expose this POR pad is - a CharDev on the SoC / rstmgr? The goal is to implement a method of emulating the use of reset strappings to perform a POR without needing to kill and recreate the QEMU process, as a lot of OpenTitan SW is reliant on this rstmgr reset cause.

Trying to map QEMU reset onto OT reset is likely to break something somewhere. If there is a way to preserve the existing behavior, that should be fine.

That makes sense, I understand better now why you say the QEMU/OT resets are so different to each other...
I think luckily this reset cause register is the only example I have seen so far where this causes a difference in a way that is very noticeable to SW, so I'm hoping that adding this small patch is sufficient. If it turns out not to be then I guess it would require a much more careful examination of the whole resetting / power management in QEMU (same problem I am running into with the RV_DM as well...)

For now I'll change this PR to retain the existing behaviour and gate this new functionality behind a property, and try and document it a bit better - does that sound okay to you? Otherwise, if you have any ideas of nice ways to expose the ot_rstmgr_reset_req(..., OT_RSTMGR_RESET_POR) then that might be preferable.

Adds an optional property (default disabled) to the rstmgr to treat any
reset of unknown origin (i.e. not triggered by an `ot_rstmgr_reset_req`)
as a Power-on-Reset (PoR) rather than using the currently latched cause.

This will enable resets through e.g. the QEMU monitor to be treated as a
PoR, rather than just assuming that PoR can only occur when the system
is created (in the first reset after realization).

In the future, it might be that there should instead be some external
mechanism for exposing a POR pad that will be used to signal
`ot_rstmgr_reset_req(..., OT_RSTMGR_RESET_POR)` externally.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
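
As a rough illustration of the behaviour this commit message describes, here is a toy Python model of the reset-cause bookkeeping. It is a sketch under stated assumptions, not the QEMU device code: the class, the `por_on_unknown_reset` flag name and the `pending_request` field are hypothetical, and `reset_req` merely stands in for `ot_rstmgr_reset_req`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ResetCause(Enum):
    POR = "por"
    SW = "sw"
    WATCHDOG = "watchdog"


@dataclass
class RstmgrModel:
    """Toy model of the reset-cause bookkeeping; not the QEMU device."""

    # Hypothetical name for the optional property; disabled by default.
    por_on_unknown_reset: bool = False
    # The first reset after realization reports PoR.
    latched: ResetCause = ResetCause.POR
    # True only while a reset requested through reset_req() is in flight.
    pending_request: bool = False
    reset_info: Optional[ResetCause] = None

    def reset_req(self, cause: ResetCause) -> None:
        """Stand-in for ot_rstmgr_reset_req(): latch the cause, then reset."""
        self.latched = cause
        self.pending_request = True
        self.system_reset()

    def system_reset(self) -> None:
        """Any reset of the device, whatever its origin."""
        if not self.pending_request and self.por_on_unknown_reset:
            # Reset of unknown origin (e.g. the monitor): treat it as PoR.
            self.latched = ResetCause.POR
        self.reset_info = self.latched
        self.pending_request = False
```

With the flag left at its default, a reset of unknown origin that follows a software reset still reports the stale `SW` cause; with the flag enabled, the same reset reports `POR`.
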
@AlexJones0 AlexJones0 changed the title ot_rstmgr: Reflect Power-on Reset (POR) on system reset ot_rstmgr: Add option to treat unexpected resets as Power-on-Reset Nov 18, 2025
@rivos-eblot

rivos-eblot commented Nov 18, 2025

It doesn't seem correct to me to add OT-specific functionality to the QEMU monitor (hence I was avoiding that),

Why would it be? We already have an OT-specific monitor command, or rather Ibex-specific command: "ibex", see 74de0eb.

but I am not sure what the best way to expose this POR pad is - a CharDev on the SoC / rstmgr?

I think you do not need to add a new device & protocol if you already use the monitor: you can add device properties and use the qom-set command to toggle the property.

I think this boils down to a recurring task for EarlGrey, which is that it needs a padring device, where you can connect those kinds of signals, GPIOs, etc.

We have an implementation for our chip for example, and this is how we trigger such reset or other I/O signals; to sum up the involved components look like the following:

python script -> python module -> QMP protocol -> QOM property set on padring -> irq_set -> rstmgr

I think it is far better than to add a new CharDev w/ a custom protocol: this implementation only uses existing QEMU components to trigger the reset manager PoR line from an external pad.

From the monitor, that would look something like

qom-set por true
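
For the QMP step of the chain above, a minimal sketch of the message a host script might send could look like the following. The QMP `qom-set` command takes `path`, `property` and `value` arguments; the object path `/machine/padring` is a placeholder assumption, since no padring device path is given here, and the helper name is hypothetical. A real session would also need to negotiate `qmp_capabilities` on the socket before sending any command.

```python
import json
from typing import Any


def qom_set_message(path: str, prop: str, value: Any) -> str:
    """Serialize a QMP `qom-set` command as a single JSON line."""
    command = {
        "execute": "qom-set",
        "arguments": {"path": path, "property": prop, "value": value},
    }
    return json.dumps(command)


# "/machine/padring" is a placeholder path, not a path confirmed by this thread.
line = qom_set_message("/machine/padring", "por", True)
```

The resulting line would be written to the QMP socket, after which the property setter on the device side is responsible for driving the IRQ line towards the rstmgr.
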

@AlexJones0
Author

Why would it be? We already have an OT-specific monitor command, or rather Ibex-specific command: "ibex", see 74de0eb.

I see - I thought generally the preference was to avoid changes to QEMU that would make it more difficult to rebase on upstream unless they are necessary, but I see there are other device-specific commands there as well so I guess this is fine to add to the monitor?

I think this boils down to a recurring task for EarlGrey, which is that it needs a padring device, where you can connect those kinds of signals, GPIOs, etc.

Yes probably this should be added to the open padring issue, though unfortunately I don't know if I have the capacity to look into properly resolving that at the moment. The solution you describe with QOM setting properties on the padring sounds a lot better than using a CharDev or adding a new monitor command.

I could look into stubbing a basic padring device that just supports this property for now and then link that up to the rstmgr, do you think that sounds like a better fix? I wonder if the rstmgr property would be useful anyway for any other cases of resets coming from sources external to the rstmgr?

@rivos-eblot

rivos-eblot commented Nov 18, 2025

but I see there are other device-specific commands there as well so I guess this is fine to add to the monitor?

I'd say this would generate some easy-to-fix conflicts.
For this very specific need, I think the padring approach is better, as it would avoid customizing generic files + add the base to support other I/O related needs.

I could look into stubbing a basic padring device that just supports this property for now and then link that up to the rstmgr, do you think that sounds like a better fix? I wonder if the rstmgr property would be useful anyway for any other cases of resets coming from sources external to the rstmgr?

I've created a fully untested skeleton, should you want to use it for bootstrapping a padring: 011f00e

I do not think you are using Python to drive the monitor; if you are, let me know.

@rivos-eblot

rivos-eblot commented Nov 18, 2025

That makes sense, I understand better now why you say the QEMU/OT resets are so different to each other...

Yes, we are trying to address different needs with different time sequences with a single, very limited and yet-to-be-completed Resettable API...

I think this is the topic I've refactored the most over the last nearly 3 years (with many trial-and-error sessions :-)). It does not mean it has reached the best solution yet. However, this version has not bumped into show stoppers, is much simpler and more generic than the previous iterations, and scales well with several OT instances within the same machine.

@AlexJones0
Author

For this very specific need, I think the padring approach is better, as it would avoid customizing generic files + add the base to support other I/O related needs.

Ok great I will look into this, thanks for the link to the skeleton.

I do not think you are using Python to drive the monitor; if you are, let me know.

I am going through our host-side Rust code (OpenTitanLib) but I'd prefer if any solution I implement is compatible with the current Python scripting, so it'd be useful to know if there's anything extra I'd need to add there to support it.

However this version has not bumped into show stoppers and is much simpler and more generic than the previous iterations and scales well with several OT instances within the same machine

I think the only issue that I have run into so far is with handling the Debug Module's HALTREQ for an unresponsive (resetting) hart, where ideally I need a way to have the vCPU call into the RV_DM to be halted when it begins executing instructions, which I would expect to map to a riscv_cpu_reset_exit. But this is not the case in OpenTitan, where we reset and then subsequently disable it. Maybe this is an example of a case that is very difficult to support correctly in QEMU though, and I should just patch OpenOCD to work around it in SW. Though this is perhaps too unrelated to this PR :)

@rivos-eblot

I am going through our host-side Rust code (OpenTitanLib) but I'd prefer if any solution I implement is compatible with the current Python scripting, so it'd be useful to know if there's anything extra I'd need to add there to support it.

Fortunately, there is nothing specific to Python; the issue arises when using Python: QEMU provides a QMP wrapper module, which is nice; however it is an async IO module, which means that everything else needs to be async... I've added a de-async-ifier module on top of it, in order to be able to combine it with other regular Python modules. You should not need to use this if you call the QMP protocol from Rust.
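
To illustrate the general idea of such a de-async-ifier (this is not the actual module mentioned above, just one common way to do it), a sketch could run a private asyncio event loop in a background thread and block on each coroutine's result. `SyncBridge` and `fake_qmp_execute` are hypothetical names; the latter only stands in for an async QMP client call and does not talk to QEMU.

```python
import asyncio
import threading


class SyncBridge:
    """Run coroutines on a private event-loop thread and block for the result."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def call(self, coro, timeout=None):
        """Submit a coroutine to the loop thread and wait for its return value."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self) -> None:
        """Stop the loop, wait for the thread to exit, then release the loop."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_qmp_execute(command: str) -> dict:
    """Stand-in for an async QMP client call; no QEMU is involved."""
    await asyncio.sleep(0)
    return {"return": {}, "echo": command}
```

Synchronous code can then call `bridge.call(fake_qmp_execute("qom-set"))` without being async itself, at the cost of one extra thread per bridge.
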

I think the only issue that I have run into so far is with handling the Debug Module's HALTREQ for an unresponsive (resetting) hart, where ideally I need a way to have the vCPU call into the RV_DM to be halted when it begins executing instructions, which I would expect to map to a riscv_cpu_reset_exit.

I think we would need a reference from the CPU to the DM module, which is not supposed to exist in QEMU. There may be a way through the RISC-V CPU class, but this would for sure carry a high probability of upstream merge conflicts, as the RISC-V target keeps being heavily modified with each QEMU major version. I do agree there is a missing piece of code here (and there are other bugs in the RISC-V "HW" debug support...)

Though this is perhaps too unrelated to this PR :)

😄
