Excessive clock queries causing slowdown on freebsd #38877

Closed
Keno opened this issue Dec 14, 2020 · 13 comments
Labels
ci (Continuous integration), system:freebsd (Affects only FreeBSD)

Comments

Keno commented Dec 14, 2020

On FreeBSD, the jl_hrtime calls that we use to compute time spent in the compiler appear to be taking 70% of the total execution time of compile-heavy julia processes. On Linux it's not quite as bad, because this syscall has a special fast path, but either way, this time is clearly excessive. cc @ianshmean, who added this timer in #37678. At the very least, it should only be on in @time.
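To get a rough feel for what a single clock query costs on a given machine, something like the following sketch could be used. It is not how the 70% figure above was obtained, and `mean_clock_query_ns` is a name made up for this illustration. The number it reports includes Python's own call overhead, so it is only meaningful when compared between systems or timecounter settings.

```python
import time


def mean_clock_query_ns(calls: int = 200_000) -> float:
    """Return the average cost, in nanoseconds, of one monotonic clock query.

    Hypothetical helper for illustration; the result includes interpreter
    overhead, so compare it across systems rather than reading it as an
    absolute syscall cost.
    """
    query = time.monotonic_ns  # bind locally to keep loop overhead small
    start = time.perf_counter_ns()
    for _ in range(calls):
        query()
    elapsed = time.perf_counter_ns() - start
    return elapsed / calls


if __name__ == "__main__":
    print(f"{mean_clock_query_ns():.0f} ns per clock query")
```

If the per-query figure jumps by a large factor between two otherwise similar hosts, that hints that one of them is not using a userspace fast path for the clock.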

Keno added the system:freebsd label Dec 14, 2020

Keno commented Dec 14, 2020

I'm a bit confused though, because FreeBSD appears to have the same vDSO-based acceleration for this system call (https://github.com/lattera/freebsd/blob/401a161083850a9a4ce916f37520c084cff1543b/lib/libc/sys/__vdso_gettimeofday.c).

Keno commented Dec 14, 2020

Hmm:

# sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: i8254(0) ACPI-fast(900) TSC-low(-100) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 30709
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 7891073
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.TSC-low.quality: -100
kern.timecounter.tc.TSC-low.frequency: 1500038608
kern.timecounter.tc.TSC-low.counter: 4233127671
kern.timecounter.tc.TSC-low.mask: 4294967295

This is on AWS, but it looks like if the TSC timecounter is disabled, then so is the gettime fastpath: https://github.com/lattera/freebsd/blob/401a161083850a9a4ce916f37520c084cff1543b/sys/x86/x86/tsc.c#L760
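The check described above can be scripted against the `sysctl kern.timecounter` output. The sketch below is only an illustration: the function names are invented here, and it assumes that TSC-based timecounters are the ones whose name starts with `TSC` (as with `TSC-low` in the listing above) and that the fast path depends on one of them being selected, as the linked tsc.c suggests.

```python
import re

PREFIX = "kern.timecounter."


def parse_timecounter(sysctl_output: str) -> dict:
    """Parse `sysctl kern.timecounter` output into {short_name: value}."""
    values = {}
    for line in sysctl_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith(PREFIX):
            values[key[len(PREFIX):]] = value.strip()
    return values


def choice_qualities(values: dict) -> dict:
    """Turn the `choice` entry, e.g. 'ACPI-fast(900) TSC-low(-100)', into a dict."""
    pairs = re.findall(r"(\S+)\((-?\d+)\)", values.get("choice", ""))
    return {name: int(quality) for name, quality in pairs}


def uses_tsc(values: dict) -> bool:
    """True if the active timecounter looks TSC-based (assumed naming convention)."""
    return values.get("hardware", "").startswith("TSC")


# A few lines copied from the listing in this comment.
SAMPLE = """\
kern.timecounter.smp_tsc: 0
kern.timecounter.choice: i8254(0) ACPI-fast(900) TSC-low(-100) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
"""

values = parse_timecounter(SAMPLE)
print("active:", values["hardware"], "| TSC in use:", uses_tsc(values))
print("qualities:", choice_qualities(values))
```

On this sample the active counter is ACPI-fast and TSC-low has a negative quality, which matches the situation described here.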

Keno commented Dec 14, 2020

Two action items here:

  • Stop spamming clock queries
  • Somebody should talk to upstream about the lack of vDSO clock acceleration in virtualized environments (AWS in particular). Since Linux can get away with it, I assume FreeBSD could too.

@ararslan

@kostikbel, apologies for the ping, but do you have any insight on the FreeBSD internals aspect here? I ask because you're on the copyright for the file in question. 🙂

kostikbel commented Dec 14, 2020

The problem is that smp_tsc == 0. For a start, try setting the loader tunables kern.timecounter.smp_tsc=1 and kern.timecounter.smp_tsc_adjust=1 and see:

  1. if smp_tsc is reported as 1 and default timecounter hardware choice changed
  2. if 1 is fine, whether ntpd can pace the hardware, i.e. ntpd should not give up after one or two hours of uptime
  3. if 2 is fine, whether it actually helps julia.

Keno commented Dec 14, 2020

@ararslan Can you experiment with this? E.g. just time the build of corecompiler.ji with the various options.
Here's what happens if I manually turn on TSC-low:

$ time /usr/home/ec2-user/julia/usr/bin/julia -C "native" --output-ji /usr/home/ec2-user/julia/usr/lib/julia/corecompiler.ji.tmp --startup-file=no --warn-overwrite=yes -g0 -O0 compiler/compiler.jl
      740.55 real       111.85 user       615.77 sys
$ su
# sysctl kern.timecounter.hardware=TSC-low
kern.timecounter.hardware: ACPI-fast -> TSC-low 
$ time /usr/home/ec2-user/julia/usr/bin/julia -C "native" --output-ji /usr/home/ec2-user/julia/usr/lib/julia/corecompiler.ji.tmp --startup-file=no --warn-overwrite=yes -g0 -O0 compiler/compiler.jl
       93.66 real        81.22 user         0.60 sys
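For reference, the arithmetic on the two `time` results above works out as follows. This is just a small sketch that parses the two output lines from this comment; `parse_bsd_time` is a helper name invented for the example.

```python
def parse_bsd_time(line: str) -> dict:
    """Parse a line like '740.55 real 111.85 user 615.77 sys' into floats by label."""
    tokens = line.split()
    return {label: float(value) for value, label in zip(tokens[0::2], tokens[1::2])}


# The two result lines from the runs above (ACPI-fast first, then TSC-low).
before = parse_bsd_time("      740.55 real       111.85 user       615.77 sys")
after = parse_bsd_time("       93.66 real        81.22 user         0.60 sys")

speedup = before["real"] / after["real"]
sys_share_before = before["sys"] / before["real"]
sys_share_after = after["sys"] / after["real"]

print(f"wall-clock speedup: {speedup:.1f}x")
print(f"sys share before:   {sys_share_before:.0%}")
print(f"sys share after:    {sys_share_after:.1%}")
```

That is roughly a 7.9x wall-clock speedup, with system time dropping from about 83% of the run to under 1%.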

Keno commented Dec 14, 2020

Note that I'm just using the stock AMI that's available on EC2.

Keno added the ci label Dec 14, 2020
@IanButterworth

Meanwhile, I'm working on a PR to stop querying the timer except during @time.

@ararslan

> @ararslan Can you experiment with this?

Yes but only on a local machine; I don't have a virtualized FreeBSD environment. Will that still be useful?

Keno commented Dec 15, 2020

I can spin you up a freebsd box on AWS with identical configuration.

@IanButterworth

If anyone has a test system up, it may be good to double check that #38885 removes all unwanted jl_hrtime usage

Keno closed this as completed in fa6077e Dec 15, 2020

Keno commented Dec 15, 2020

Alright, this issue is resolved for us, but I think it's worth @ararslan continuing to follow up with upstream to see why the vDSO optimization is disabled on the default AWS images. After all, what's the point of such an optimization if it doesn't actually run a lot of the time?

KristofferC pushed a commit that referenced this issue Dec 17, 2020

emaste commented Dec 17, 2020

For reference, upstream code review to stop avoiding TSC in VMs (which will enable the vdso optimization): https://reviews.freebsd.org/D27629

markjdb pushed a commit to markjdb/freebsd that referenced this issue Dec 23, 2020
I suspect that virtualization techniques improved from the time when we
have to effectively disable TSC use in VM.  For instance, it was reported
(complained) in JuliaLang/julia#38877 that
FreeBSD is groundlessly slow on AWS with some loads.

Remove the check and start watching for complaints.

Reviewed by:	emaste, grehan
Discussed with:	cperciva
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27629
staticfloat pushed a commit that referenced this issue Jan 15, 2021
staticfloat pushed a commit that referenced this issue Dec 23, 2022