Skip to content

poll is not blocking in presence of rtos #9112

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
embeddedteam103 opened this issue Dec 16, 2018 · 14 comments
Closed

poll is not blocking in presence of rtos #9112

embeddedteam103 opened this issue Dec 16, 2018 · 14 comments

Comments

@embeddedteam103
Copy link
Contributor

Description

mbed_poll does not do proper blocking in the presence of rtos.
It is a form of busy waiting that prevents the microcontroller sleeping.

Note: this issue is in a TODO comment in the file, but quick search in this github does not result in any issue matching it.

Issue request type

[ ] Question
[x] Enhancement
[ ] Bug
@ciarmcom
Copy link
Member

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-323

@kjbracey
Copy link
Contributor

(Comment moved from #9113)

Thanks for looking at this (and noticing it, and caring).

This is a sore point in the design, and if it was as simple as #9113, it would have been done some time ago.

poll is a particular case, and the less common one - we have the same issue in ordinary blocking reads and writes. Writes in particular are the most likely to cause problems, as people do tend to do writes to console from multiple threads. This sort of mechanism does not scale to that case - it can't wake multiple threads blocked on output when TX buffer space becomes available.

Now, I do have my own prototype solution, but it's been parked for a while, due to combination of lack of interest and some contention over design (mainly "does this encourage critical section use too much?", iirc). What I was aiming for with it:

  • fully copes with multiple threads blocked on a device
  • doesn't interfere with user's sigio API
  • can in principle work equally well without the RTOS

First attempt is here: https://github.com/kjbracey-arm/mbed-os/commits/cv_cs
And a follow-up trying to pull out the common logic as a helper is here: https://github.com/kjbracey-arm/mbed-os/commits/cv_cs_fhdwh, but I don't recall the details of how solid that ended up

Care to try that cv_cs branch? I can see about kicking it off as a proper PR for discussion.

@embeddedteam103
Copy link
Contributor Author

@kjbracey-arm The non busy waiting is important for us as we want to sleep a lot to save power. So we rolled this as our own ad-hoc solution. Lack of interest is definetly not the case if this was the reason for your delay :)
Skimming through your branch with the common logic (and looking at UARTSerial as a usage example) it seems your solutions is to wrap the sigio of driver like file handles with another event that is used with a separate wake api.
The whole is_stream should be just the usage policy of the specific driver. Am I right (Logically, we could split the FileHandleDeviceWakeHelper class to a base class WakebleFileHandle and a child WakebleStreamFileHanlde that also inherits StreamBase and overrides the read/write according to WakebleFileHandle with the streaming policy on top of it)?
In any case, threads must use the wake/wait api to do proper polling (and this is given in the driver level). Am I right in my understanding?

@kjbracey
Copy link
Contributor

Right, you're probably more familiar than that than I am - I had totally forgotten about FileHandleDeviceWakeHelper until I went to find the link for cv_cs, and I haven't had my coffee yet.

Let me try to get this clear in my own head too.

The FileHandle abstract API already provides:

  • sigio callback
  • read and write calls - nonblocking and blocking
  • non-blocking poll() status read - (used by ::poll(), but no hook to wake ::poll)

There are currently no helpers for someone implementing FileHandle, so they have to do that themselves. (Apart from Stream, which doesn't work for any of this stuff - it's pure blocking only, no poll support. Should be deprecated IMO - #5655 ).

FileHandleDeviceWakeHelper was an attempt to do pull out the common stuff for the above plus the newly-added poll wake which complicates it further, so the driver only has to implement

  • poll status read
  • non-blocking read and write
    and then it has to make the wake callback to the helper - that sends sigio and wakes poll and wakes its blocking reads and writes

All blocks are inside the driver/helper/poll, and are always based on a ConditionVariableCS::wait, which is currently ultimately a ThreadFlag for the RTOS. (Earlier CMSIS-RTOS had a low-level primitive to wake a thread without an OS object, but that vanished in CMSIS-RTOS 2. Maybe it would have been useful here)

is_stream is a fixed policy for the driver, yes, so conceivably could have two classes for that. I'm generally resistant to trying to dictate policies like that in the class hierarchy - the number of possible device behaviour variations seems too potentially multiplicative, even if you think you're trying to pin it down to fundamentals like "stream" versus "datagram".

@embeddedteam103
Copy link
Contributor Author

Okay. So it seems I understand the design. What I still don't quite get is how the global poll function can be implemented without busy waiting?
The sigio of FileHandleDeviceWakeHelper should notify it somehow and therefore the global poll impl it needs to wait for this event (via ConditionVar or EventFlag or whatever). Therefore the poll impl should dispatch on whether the file handles are poll-wake style or not.
I see your impl of poll does exactly this.

If there is such divergence in functionality, maybe this should be a separate function, it is less intrusive as a change:

  • Have WakebleFileHandle that inherits FileHandle and extends it with an abstract poll_wake function.
  • Add global poll_wake function that takes an array of WakebleFileHandle.
  • Add helpers for driver implementers as you already provided, but the drivers (and StreamBase) would now inherit from WakebleFileHandle instead.

The fundamental design question is - should non-blocking be imposed on the user of FileHandle?

For the drivers the question is silly as they should use poll-wake if they can, but for normal down-to-earth file handles, this is contentious.

What do you think?

@kjbracey
Copy link
Contributor

kjbracey commented Dec 17, 2018

The work on that branch is not supposed to change anything about the application-facing API of FileHandle, or poll. It extends the driver API between FileHandle and poll to allow poll to block properly.

Add global poll_wake function that takes an array of WakebleFileHandle.

The non-wakeable filehandles are, in my version, not publically distinguished at all to users. The same API works either way, it's just an efficiency issue. Ultimately the "non-wake" case could be deprecated and retired. (Maybe it doesn't even go in in the first place? Almost every FileHandle out there I know of is either a real file or Stream derived, so has no use for poll. Who is this legacy support for?)

So, anyway, poll currently takes non-wakeable filehandles and works, and I see no reason the user can't mix wakeable and non-wakeable. Conceivably we could have an optimisation for one that took an array of wakeable ones, but that means a new type, new PollFH, and I don't know how to work it into the POSIX poll taking integer descriptors.

I don't see any benefit to making in the dual modes into public-facing API when the old mode will be retired. And I don't want to force applications to change their existing working code to get the benefit of waking file handles.

Maybe it simplifies the current implementation of poll, but it's not simpler in the long run. The complication in poll can be removed as soon as we decide we don't support poll-without-wake FileHandles.

The fundamental design question is - should non-blocking be imposed on the user of FileHandle?

You've lost me there. Users always have the choice of blocking or non-blocking, or they can kind of mix by setting the FileHandle to nonblocking and using poll on it. Nothing about this wake stuff affects that public API.

@embeddedteam103
Copy link
Contributor Author

If you think the poll-wake mechansim should not be reflected in application API then I agree, but I raised it as a design question that should be answered (as it can offer an adavantage).
If the goal is to ultimately deprecate and remove non-wakeble file handles, then I 100% agree with you on all points.

So, anyway, poll currently takes non-wakeable filehandles and works, and I see no reason the user can't mix wakeable and non-wakeable. Conceivably we could have an optimisation for one that took an array of wakeable ones, but that means a new type, new PollFH, and I don't know how to work it into the POSIX poll taking integer descriptors.

Maybe add to FileHandle a virtual member function is_wakeble() which defaults to false (and following deprecation, to change this function's behaviour). What do you think?

You've lost me there. Users always have the choice of blocking or non-blocking, or they can kind of mix by setting the FileHandle to nonblocking and using poll on it. Nothing about this wake stuff affects that public API.

The last sentance came out wrong :)
I meant the waking API in the global poll

@kjbracey
Copy link
Contributor

Maybe add to FileHandle a virtual member function is_wakeble() which defaults to false

So that would then be a precheck on every file descriptor in the poll array, and if every one is wakeable, call the wakeable version else the non-wakeable version. And that will always pull in both versions of poll because of that runtime test.

Don't see how that's really any better than doing the test inside the first poll loop. I guess it leaves the door open to the optimised wakeable-only form for the C++, but leaves the POSIX form seriously code-size disadvantaged, which seems unhelpful. Total code size is probably bigger if you use that POSIX version.

I meant the waking API in the global poll

In which case I still don't understand what the question was. Are you asking about code size cost of supporting both? If so, then my answer is to address that by dropping support for the hypothetical old FileHandles, or just not putting it in in the first place. I wouldn't want to address it by requiring anything of the application users, who have done nothing wrong. It's not their fault that poll isn't working well - I really want to fix it without them needing to change anything.

@embeddedteam103
Copy link
Contributor Author

embeddedteam103 commented Dec 18, 2018

Don't see how that's really any better than doing the test inside the first poll loop. I guess it leaves the door open to the optimised wakeable-only form for the C++, but leaves the POSIX form seriously code-size disadvantaged, which seems unhelpful. Total code size is probably bigger if you use that POSIX version.

You are right here.
From my POV there is only the C++ version, so I didn't even think about the POSIX form.

In which case I still don't understand what the question was. Are you asking about code size cost of supporting both? If so, then my answer is to address that by dropping support for the hypothetical old FileHandles, or just not putting it in in the first place. I wouldn't want to address it by requiring anything of the application users, who have done nothing wrong. It's not their fault that poll isn't working well - I really want to fix it without them needing to change anything.

If the design goal is no user intervention whatsoever then I agree, but you did not state it in the design goals you mentioned :)

You have convinced me fully of your approach now :)

If the way to go forward is deprecation of non-wakeble file handles with plans to remove them in the future, then how does this work? It's a deprecation of a "concept" and not specific function or something - How do users get notified of such change?

I think the way forward is to kick-start your branch as a PR as you mentioned earlier.
I will definitely help here (removing all forking of mbed in our code-base is a priority for us right now, and this is a major hurdle).

@kjbracey
Copy link
Contributor

It's a deprecation of a "concept" and not specific function or something - How do users get notified of such change?

Good question. Best mechanism we've come up with so far is "MBED_ERROR in debug builds". Issue came up in adjustments to ::wait_ms() - trying to get people to not call it in non-thread context.

I think the way forward is to kick-start your branch as a PR as you mentioned earlier.

Okay, I'll recheck at least the base cv_cs version and make a PR for it this week.

I'm not sure if I also want to incorporate the FileHandleDeviceWakeHelper into that review in its entirety now - but maybe point to it as a future path to offset increased driver complexity.

@embeddedteam103
Copy link
Contributor Author

Is there a TODO list for the PR? We want to help.

@kjbracey
Copy link
Contributor

Afraid I've had other stuff on my plate. #10104 in particular, and this ties in with it. ConditionVariableCS's non-RTOS implementation will want to use the work from that.

I'll raise this up in 5.13 planning - this issue is another facet of power saving, but I think very few people are using these blocking APIs in such a way that they notice the issue, so it's not really got any attention. The focus has been more on identifying background periodic tickers you see generally. These tickers only show up while you're blocking, and the vast majority of our middleware is non-blocking and event-driven...

@0xc0170
Copy link
Contributor

0xc0170 commented Aug 14, 2019

Afraid I've had other stuff on my plate. #10104 in particular, and this ties in with it. ConditionVariableCS's non-RTOS implementation will want to use the work from that.

PR #10104 was integrated for 5.14. What is left for this issue?

@ciarmcom
Copy link
Member

Thank you for raising this issue. Please note we have updated our policies and
now only defects should be raised directly in GitHub. Going forward questions and
enhancements will be considered in our forums, https://forums.mbed.com/ . If this
issue is still relevant please re-raise it there.
This GitHub issue will now be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants