Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] change the software interrupt to syscall instruction #14672

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

chirping78
Copy link

@chirping78 chirping78 commented Nov 6, 2024

Summary

Currently xtensa arch porting is using software interrupt (through write intset register) to trigger context switch.
But this put a strict requirement to the chip configuration:
there must be a software interrupt, it's priority is less or equal to XCHAL_EXCM_LEVEL;
and it's priority must be greater than XCHAL_SYSCALL_LEVEL (the XCHAL_SYSCALL_LEVEL is not a cadence macro, it just means normal hardware interrupt priority).

Since the xtensa arch is very flexible, the chip designer might have not noticed above requirement in chip design phase.
And if the chip has been taped out without this requirement, there is no posibility to port nuttx to the chip.

This PR will try to change the software interrupt to syscall instruction.
The syscall instruction is a synchronous exception, and is not locked by PS.intlevel, which means it's priority is high enough.

Impact

This PR will impact context switch, interrupt handling, and protect build.

Testing

test summary (Will be continuously updated when the issues are fixed)

11/18: rebased, better than last time.

rebase to version: nuttx master 637a0f5, apps master fee82bd3d,
plus this PR.

esp32s3-devkit:nsh ostest pass

esp32s3-devkit:tickless ostest pass

esp32s3-devkit:smp ostest pass with percentage ~20%
The possibility of passing the test is higher if power off then on through plug and unplug usb.

esp32s3-devkit:knsh ostest pass with percentage ~40%

11/13:

With #14760 esp32s3-devkit:smp ostest is more stable, but still not 100% pass.

11/05: has regression problem

base version: nuttx master 56f292f, apps master 400a8e326,
plus this PR.

esp32s3-devkit:nsh (+CONFIG_TESTING_OSTEST)
ostest pass

esp32s3-devkit:tickless
ostest pass

esp32s3-devkit:smp
ostest failed (only "nested signal handler test" failed)

esp32s3-devkit:knsh
ostest failed (fail at random position with message "xtensa_panic: Unhandled Exception 2", 2 means double exception)

Since there is no esp32 and esp32s2 boards in hand (have placed order), so these boards have not been tested.

@github-actions github-actions bot added Arch: xtensa Issues related to the Xtensa architecture Area: OS Components OS Components issues Size: M The size of the change in this PR is medium labels Nov 6, 2024
@xiaoxiang781216
Copy link
Contributor

@tmedicci could you help review this pr?

@nuttxpr
Copy link

nuttxpr commented Nov 6, 2024

[Experimental Bot, please feedback here]

This PR does not fully meet the NuttX requirements yet, though it's on its way. Here's a breakdown:

Missing/Incomplete Information:

  • Summary: While the "why" is explained well, the "how" is a bit vague. More detail is needed on precisely how switching to syscall will work within the context switch mechanism. Mentioning specific functions or code sections that are modified would be beneficial. Also, a related NuttX issue should be linked if one exists (even if it's just a discussion about the software interrupt limitations).
  • Impact: The description of impact is too general. Specifically address:
    • User Impact: Will applications need to be recompiled? Will there be any behavioral changes?
    • Build Impact: Will any Kconfig options be added/removed/changed?
    • Hardware Impact: While the intent is to reduce hardware impact, specify which architectures/boards are affected (Xtensa, specifically ESP32, ESP32S2, ESP32S3). Will this change require any modifications to existing board configurations?
    • Documentation Impact: Even if no documentation update is required, mention that explicitly. If documentation is needed, what needs to be updated?
    • Security Impact: While it might not seem like a direct security impact, analyze whether changing the exception handling mechanism introduces any potential vulnerabilities.
    • Compatibility Impact: Address both backward compatibility (with older Xtensa chips) and forward compatibility (with future Xtensa designs).
  • Testing: The testing section is a good start but needs more formality.
    • Provide the full commit hashes for the base NuttX and apps versions.
    • Instead of "ostest pass," provide a summary of passed tests or ideally, a link to a CI log.
    • For the failing tests, provide the full error messages and backtraces if possible. The information about the double exception is useful, but more context is needed.
    • The statement about ESP32 and ESP32S2 being ordered is unnecessary; focus on the tests you have performed. Once you have those boards, update the PR with the results.

Recommendations:

  1. Expand the Summary: Provide more technical details on the implementation of the syscall mechanism for context switching.
  2. Detailed Impact Assessment: Go through each impact category and provide specific information, even if it's "NO."
  3. Structured Testing Results: Present the test results in a more organized way, including full commit hashes, clear pass/fail criteria, and detailed error messages for failures. Consider using a table format.
  4. Create a NuttX Issue: Even if it's just for discussion, creating a related issue can be beneficial for tracking the problem and the solution. This can then be linked in the PR.
  5. Address the Regressions: The reported regressions are a major blocker. Prioritize fixing these issues and updating the testing results accordingly.

By addressing these points, you'll significantly strengthen your PR and make it easier for reviewers to understand and approve the changes.

@tmedicci
Copy link
Contributor

tmedicci commented Nov 7, 2024

I'm busy during these days, so I'll take a while to fully review it. For now, I'll run our internal pipeline and report its results.

But I have question: IINM, syscall was being used for context switch and it was reverted to the software interrupt. I don't recall the context of the discussion, but it'd be good to try to retrieve it in this PR. We can check the commit/PR history and gather information from there.

(@Ouss4 and @gustavonihei may have some thoughts about it. If they have spare time, I'd like to hear about...).

@tmedicci
Copy link
Contributor

tmedicci commented Nov 7, 2024

Hi @xiaoxiang781216 , can we mark it as a draft until we fully review this PR to avoid it being merged by mistake?

@xiaoxiang781216 xiaoxiang781216 marked this pull request as draft November 7, 2024 13:18
@tmedicci
Copy link
Contributor

tmedicci commented Nov 8, 2024

For now, I'll run our internal pipeline and report its results.

The CI shows the same failed results regarding ostest for Xtensa devices.

@tmedicci
Copy link
Contributor

For now, I'll run our internal pipeline and report its results.

The CI shows the same failed results regarding ostest for Xtensa devices.

Hi @chirping78 , were you able to verify why ostest fails?

@chirping78
Copy link
Author

Hi @chirping78 , were you able to verify why ostest fails?

@tmedicci
I'm still investigating these failing cases, and have not root caused them.
Any help or patch is welcome.

@chirping78
Copy link
Author

With #14760 esp32s3-devkit:smp ostest is more stable, but still not 100% pass.

@tmedicci
Copy link
Contributor

With #14760 esp32s3-devkit:smp ostest is more stable, but still not 100% pass.

There were some changes recently that affect ostest behavior, can you please rebase your code?

chenxiaoyi added 3 commits November 18, 2024 20:17
Signed-off-by: chenxiaoyi <chenxiaoyi@xiaomi.com>
Signed-off-by: chenxiaoyi <chenxiaoyi@xiaomi.com>
Signed-off-by: chenxiaoyi <chenxiaoyi@xiaomi.com>
/* Enable the software interrupt. */

up_enable_irq(XTENSA_IRQ_SWINT);
irq_attach(XTENSA_IRQ_SYSCALL, (xcpt_t)xtensa_swint, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we remove (xcp_t)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arch: xtensa Issues related to the Xtensa architecture Area: OS Components OS Components issues Size: M The size of the change in this PR is medium
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants