Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Julia v1.11.1 regression: open(::Cmd) is 4X slower than v1.10.4 #56352

Closed
cihga39871 opened this issue Oct 26, 2024 · 6 comments · Fixed by #56380
Closed

Julia v1.11.1 regression: open(::Cmd) is 4X slower than v1.10.4 #56352

cihga39871 opened this issue Oct 26, 2024 · 6 comments · Fixed by #56380
Labels
performance Must go faster regression 1.11 Regression in the 1.11 release

Comments

@cihga39871
Copy link

cihga39871 commented Oct 26, 2024

Hi,

I found Base.Process created by open(x::Cmd) is extremely slower than v1.10.4.

Minimal example in Linux:

First create a large file:

file = joinpath(Sys.BINDIR, "../include/julia/uv.h")
large_file = "z.large.file"

for i in 1:1000
    run(pipeline(`cat $file`, stdout = large_file, append=true))
end

Then, run the following command in Julia v1.11.1 and v1.10.4:

large_file = "z.large.file"

c = 0
io = open(`cat $large_file`)

@time while !eof(io)
    line = readline(io)
    c += length(line)
end

close(io)

Here is the differences:

  • v1.11.1: 2.123290 seconds (12.57 M allocations: 496.226 MiB, 5.34% gc time, 2.38% compilation time)
  • v1.10.4: 0.579683 seconds (8.07 M allocations: 440.373 MiB, 3.59% gc time, 3.00% compilation time)

I also wrote a package to solve the regression, which is BiBufferedStreams.jl

Its readme has a thorough benchmark, eliminating the effects of precompilation.

@nsajko nsajko added the performance Must go faster label Oct 26, 2024
@PallHaraldsson

This comment has been minimized.

@vtjnash
Copy link
Member

vtjnash commented Oct 26, 2024

Do you know what caused the slow down? I notice your read implementations don't acquire the required locks that the other functions use, so they are somewhat dangerous to call those functions (unlike the default implementations), though most typical uses are probably quite fine as the locks may be often unnecessary for single threaded use

@cihga39871
Copy link
Author

I did not look into the detail implement ation of readline in Julia v1.11 because it calls a C function. Unfortunately, I am not a C person and cannot tell what lead to the regression.

Actually, when I found readline of Base.Process is slow, I did not know it was a performance regression. I thought it was always that speed. So I wrote that package to fit my need (a faster readline). I do not use multi-thread when reading data. It would be nice to benchmark after adding a lock for readline.

On the other hand, I am quite impressed that my code runs faster in v1.11, so the new version definitely has performance improvement on very low level codes. Probably it is not a very huge bug, just some extra copy?

@KristofferC
Copy link
Member

KristofferC commented Oct 28, 2024

Regressed in c14d4bb, cc @stevengj.

Looks like heavy locking + unlocking in

julia/base/io.jl

Lines 1000 to 1007 in 2cdfe06

output! = isa(out, IO) ? write : push!
for c in readeach(s, T)
if c == delim
keep && output!(out, c)
break
end
output!(out, c)
end

@KristofferC KristofferC added the regression 1.11 Regression in the 1.11 release label Oct 28, 2024
@KristofferC
Copy link
Member

Ref #50615

@vtjnash
Copy link
Member

vtjnash commented Oct 28, 2024

Looks like there is a readuntil optimization in base/stream for this, which missed getting renamed to copyuntil in #48273

vtjnash pushed a commit that referenced this issue Oct 29, 2024
Fixes #56352, with the repro in
that issue:

```
Master:
  1.114874 seconds (13.01 M allocations: 539.592 MiB, 3.80% gc time)

After:
   0.369492 seconds (12.99 M allocations: 485.031 MiB, 10.73% gc time)

1.10:
    0.341114 seconds (8.36 M allocations: 454.242 MiB, 2.69% gc time)
```
KristofferC added a commit that referenced this issue Nov 11, 2024
Fixes #56352, with the repro in
that issue:

```
Master:
  1.114874 seconds (13.01 M allocations: 539.592 MiB, 3.80% gc time)

After:
   0.369492 seconds (12.99 M allocations: 485.031 MiB, 10.73% gc time)

1.10:
    0.341114 seconds (8.36 M allocations: 454.242 MiB, 2.69% gc time)
```

(cherry picked from commit 07530bc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster regression 1.11 Regression in the 1.11 release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants