Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use CLOCK_MONOTONIC_RAW when available #1124

Merged
merged 2 commits into from
Dec 8, 2021
Merged

Conversation

piru
Copy link

@piru piru commented Dec 5, 2020

Use CLOCK_MONOTONIC_RAW (when available) which isn't affected by system time adjustments (such as NTP or adjtime).
Overall I think this is a better fix for #1014 since this is a generic solution (this issue might also affect other platforms, not just Mac OS).

I think the reason CLOCK_MONOTONIC manifests this behaviour is due to Apple's libc clock_gettime() CLOCK_MONOTONIC implementation here: https://opensource.apple.com/source/Libc/Libc-1353.41.1/gen/clock_gettime.c.auto.html

clock_gettime() CLOCK_MONOTONIC code path uses _mach_boottime_usec() which and ends up using gettimeofday() - boottime. gettimeofday() is affected by system time changes. Thus the time returned may suddenly jump backwards.

CLOCK_MONOTONIC_RAW on the other hand uses mach_continuous_time() directly, which doesn't get adjusted by system time changes.

In my personal opinion this behaviour of CLOCK_MONOTONIC is wrong, and CLOCK_MONOTONIC should rather be made to behave identical to CLOCK_MONOTONIC_RAW instead. But I digress.

@drosehn
Copy link

drosehn commented Dec 30, 2020

For what it's worth: I've seen the crash described in #1014 many times on a Mac Mini running macOS Catalina (see my comments on that issue). I would have 8-10 mosh sessions running concurrently, and it was rare that I could go more than 3 days without seeing that crash at least once. Sometimes I'd see the crash 10-15 times in a single day.

I recompiled mosh with this change to use CLOCK_MONOTONIC_RAW. I have now had 10 mosh sessions running concurrently for more than two weeks, and have not seen a single crash. So I think this is a good change to make.

@jsquyres
Copy link

It would be great if this PR could be merged so that it could be pulled into Homebrew as a patch. I highly doubt they'd take this fix until it's actually merged into the mosh repo.

@drosehn
Copy link

drosehn commented Oct 20, 2021

I'll update this one more time to note that I've now had 10 more months of running with the change to use CLOCK_MONOTONIC_RAW, and have not seen any crashes on MacOS 10 Catalina.

@keithw
Copy link
Member

keithw commented Oct 21, 2021

Hmm, thank you for submitting this. I'm a little hesitant to work around a macOS-specific bug with a fix that isn't macOS-specific. Would you be open to making this an architecture-specific workaround? (And, has any Mac user actually heard back from Apple on their bug reports? It feels suboptimal to be working around an OS defect with zero communication from the vendor or understanding why the behavior was changed.)

@piru
Copy link
Author

piru commented Oct 21, 2021

I see no problem in making it #if defined(__APPLE__) && defined(CLOCK_MONOTONIC_RAW)
although I see really no problem in using CLOCK_MONOTONIC_RAW on other platforms, too, if available. After all we don't know if apple is the only one having a time traveling CLOCK_MONOTONIC implementation...

@nanonyme
Copy link

It really sounds like a good idea to me to use this always when available. It is always a better including on Linux. CLOCK_MONOTONIC is subject to NTP-adjustments on Linux as well.

@piru
Copy link
Author

piru commented Oct 24, 2021

NTP changes can't make linux CLOCK_MONOTONIC go backwards in time however:

The CLOCK_MONOTONIC clock is not affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the clock), but is affected by the incremental adjustments performed by adjtime(3) and NTP. This clock does not count time that the system is suspended. All CLOCK_MONOTONIC variants guarantee that the time returned by consecutive calls will not go backwards, but successive calls may—depending on the architecture—return identical (not-increased) time values.

That being the case, I really don't see much of a problem in applying this fix in either form. AFAIK it makes no difference here.

@maximilianh
Copy link

I'm still getting this error, every few days. Is there a plan to merge this?

@maximilianh
Copy link

Can we help with anything, e.g. testing, to help with merging this patch?

@eminence
Copy link
Member

eminence commented Dec 5, 2021

I don't have a MacOS machine that I can use to test this, but I'm inclined to merge this, since: the change is small, platform specific, easy to revert if there are problems, and several people have tested it and reported it to have solved the problem. I would be grateful if others MacOS users could leave a comment here if this ends up fixing the issue.

@jsquyres
Copy link

jsquyres commented Dec 5, 2021

Previous to the patch on this PR, I would run into the problems described in #1014 at least once a week (sometimes more frequently). Since I built my own mosh with the patch from this PR on 20 Oct 2021, I have not run into the #1014 problem at all (i.e., I've been using the patch on this PR for about 1.5 months).

I use mosh extensively on both both macOS Big Sur (v11.x) and Monterey (v12.0.x). 👍

carlocab added a commit to carlocab/homebrew-core that referenced this pull request Dec 6, 2021
Let's apply the patches from mobile-shell/mosh#1124 that are intended to
fix mobile-shell/mosh#1014. The patch makes sense, and several users
have successfully used the patch for quite some time now.

Fixes Homebrew/discussions#2583.
@carlocab
Copy link

carlocab commented Dec 6, 2021

I've opened Homebrew/homebrew-core#90580 to apply the patches here to Homebrew's mosh.

Edit: Now merged.

BrewTestBot pushed a commit to Homebrew/homebrew-core that referenced this pull request Dec 6, 2021
Let's apply the patches from mobile-shell/mosh#1124 that are intended to
fix mobile-shell/mosh#1014. The patch makes sense, and several users
have successfully used the patch for quite some time now.

Fixes Homebrew/discussions#2583.

Closes #90580.

Signed-off-by: Michka Popoff <3406519+iMichka@users.noreply.github.com>
Signed-off-by: BrewTestBot <1589480+BrewTestBot@users.noreply.github.com>
@eminence
Copy link
Member

eminence commented Dec 8, 2021

Ok, no negative feedback, so I"m merging this. @carlocab, thanks for getting this patched into homebrew, that will be helpful for our MacOS users.

@eminence eminence merged commit 3117647 into mobile-shell:master Dec 8, 2021
@eminence eminence added this to the 1.4.0 milestone Aug 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants