Fix Format parsing in Printf (escaped %) #37807
Conversation
Sorry, no idea what caused the error on 32-bit during bootstrap on the new
Fix 32-bit version Co-authored-by: Simeon Schaub <simeondavidschaub99@gmail.com>
Could you walk through the changes here? It's a bit hard to tell what changed with the refactoring. What was the core issue? What is the fix?
It's hard to follow mainly because the GitHub diff failed to detect that I moved ~80% of the code unchanged to a separate function; that is where the proposed code now lives.
I would suggest viewing with the "ignore whitespace" box checked: https://github.com/JuliaLang/julia/pull/37807/files?diff=unified&w=1
I have tried to create a micro-benchmark focused mainly on format parsing. Computational times are in nanoseconds, measured by `@belapsed`; `NaN` means the run failed.
The test instances were as follows:

```julia
# T1
Printf.@sprintf("short");
# T2
Printf.@sprintf("longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong");
# T3
Printf.@sprintf(" %15d ", 10^6);
# T4
Printf.@sprintf(" %d %d %d %d %d ", 1, 2, 3, 4, 5);
# T5
Printf.@sprintf(" %f %f %f %f %f ", pi, pi, pi, pi, pi);
# T6
Printf.@sprintf(" %% %% %% %% %% %% %% "); Surprisingly cases T2, T4, T5 are slower than in v1.5.2 (at least on my CPU). During playing around, I also found that unrolling in output function All the branches have been tested on the same Julia binary:
Full Julia source code to replicate the results:

```julia
module BenchmarkPrintf
using BenchmarkTools
using DataFrames
using Test
# where the tested codes are stored
dir = "versions"
branches = [
"JuliaLang/julia/v1.5.2",
"JuliaLang/julia/master",
"JuliaLang/julia/jq/37784",
"petvana/julia/jq/37784",
"petvana/julia/Printf-no-unroll",
]
#rm(dir, recursive=true, force=true)
data = DataFrame()
data.PrintfVersion = branches
# initialize result columns with NaN (NaN = benchmark failed)
data.T1 = fill(NaN, length(branches))
data.T2 = fill(NaN, length(branches))
data.T3 = fill(NaN, length(branches))
data.T4 = fill(NaN, length(branches))
data.T5 = fill(NaN, length(branches))
data.T6 = fill(NaN, length(branches))
for (idx, name) in enumerate(branches)
@info "Benchmarking $(name)"
actdir = dir * "/" * replace(name, "/" => "-")
source_file = actdir * "/Printf.jl"
mkpath(actdir)
if name == "local"
# cp(local_file, source_file, force=true)
else
url = "https://raw.githubusercontent.com/$(name)/stdlib/Printf/src/Printf.jl"
if !isfile(source_file)
@info "Downloading from $(url)"
download(url, source_file)
run(`sed -i 's/using Base.Grisu/using Grisu/g' $(source_file)`)
end
end
include(source_file)
@eval using .Printf  # `using` must run at top level, so evaluate it explicitly
instance = :T1
try
t = @belapsed Printf.@sprintf("short");
data[idx, instance] = t * 10^9
catch
end
instance = :T2
try
t = @belapsed Printf.@sprintf("longlonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglonglong");
data[idx, instance] = t * 10^9
catch
end
instance = :T3
try
t = @belapsed Printf.@sprintf(" %15d ", 10^6);
data[idx, instance] = t * 10^9
catch
end
instance = :T4
try
t = @belapsed Printf.@sprintf(" %d %d %d %d %d ", 1, 2, 3, 4, 5);
data[idx, instance] = t * 10^9
catch
end
instance = :T5
try
t = @belapsed Printf.@sprintf(" %f %f %f %f %f ", pi, pi, pi, pi, pi);
data[idx, instance] = t * 10^9
catch
end
instance = :T6
try
t = @belapsed Printf.@sprintf(" %% %% %% %% %% %% %% ");
data[idx, instance] = t * 10^9
catch
end
println(data)
end
println("")
show(stdout, MIME("text/html"), data)
println("")
println("")
end
```
Remove unnecessary unroll, and add comments to the code.
To take it more seriously, I have prepared a benchmark closer to real usage. It prints either 1 or 10^7 formatted lines. @quinnj What do you think about the PR and the results? The formats used were as follows (a minimal timing-loop sketch follows the list and result headings below):

```julia
# F1
" %f \n",
# F2
" %f %f \n"
# F3
" %f %f %f %f \n",
# F4
" %f %f %f %f %f %f %f %f \n"
# F5
" Influence %f [%%], data %f \n"
# F6
" Influence %f [%%], data %f, %f, %f, %f, %f, %f, %f, \n"
# F7
" Very very very very very very very very very very very very very very very very very very very very long text %f \n"
# D1
" %d \n",
# D2
" %d %d \n"
# D3
" %d %d %d %d \n",
# D4
" %d %d %d %d %d %d %d %d \n"
# D5
" Influence %d [%%], data %d \n"
# D6
" Influence %d [%%], data %d, %d, %d, %d, %d, %d, %d, \n"
# D7
" Very very very very very very very very very very very very very very very very very very very very long text %d \n" Print 1 line with float value(s) [s]
Print 10^7 lines with float value(s) [s]
Print 1 line with integer value(s) [s]
Print 10^7 lines with integer value(s) [s]
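For reference, a minimal sketch of what one such timing loop could look like, assuming the lines are written to an in-memory IOBuffer; the actual harness and sink behind the result headings above are not shown in this thread and may differ.

```julia
using Printf

# Hedged sketch of a single case (format F2), assuming an IOBuffer sink.
function bench_lines(n::Int)
    io = IOBuffer()
    for i in 1:n
        Printf.@printf(io, " %f %f \n", i * 1.0, i * 2.0)
    end
    return io
end

bench_lines(1)            # warm up / compile first
@time bench_lines(10^7);  # rough wall-clock time for 10^7 formatted lines
```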
Going to try and dig into this now; sorry for the delay. Those benchmarks are... a little weird, since they seem like very obscure/corner-case uses of printf. Most of the benchmarking I've done is just regular floats/ints/strings and sometimes mixed. The unrolling code should be much faster in the mixed-type-args case. The only benchmark you posted that really concerns me is T4, where printing 5 integers seems to be slower than 1.5.2.
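For context, an illustrative example of the mixed-type-args case referred to above (not one of the benchmark formats):

```julia
using Printf

# Arguments of several different types in one call; this is the case where
# the unrolled output path is expected to help most.
Printf.@sprintf("%s: %d items, %.2f%% done", "stage", 42, 87.5)
# "stage: 42 items, 87.50% done"
```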
Ok, I pushed another simple commit that fixes the rest of the escaping test cases (and adds those tests) (d2f65db). I do appreciate the performance benchmarking you've done across all these cases @petvana; I think that's a bit outside of the scope of the specific fix for the original issue (the fix doesn't itself introduce any performance issues). So let's merge the fixes in my branch and maybe open a new issue for the problematic performance issues you've found.
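For illustration, tests for escaped '%' could look roughly like this; these are hypothetical examples, not necessarily the exact tests added in the commit mentioned above.

```julia
using Test, Printf

# A few escaped-'%' cases in the spirit of the fix.
@test Printf.@sprintf("%%") == "%"
@test Printf.@sprintf(" %% ") == " % "
@test Printf.@sprintf("%d%%", 50) == "50%"
@test Printf.@sprintf("%%d%d", 1) == "%d1"
```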
Thank you for the fix @quinnj. I have gone through the code, and it seems to handle all possible cases. I have also tested performance with mixed types, and I can confirm the unrolling code is much faster in such a scenario. It seems the compiler can optimize (unroll) the code automatically if all the types are the same. Now, the only performance drop (10-13%) of your branch compared to master is for very long substring ranges (F7, D7), but I guess those are rare. An alternative would be to pre-process the format string and remove escaped '%' symbols during Format parsing (at compile time), but it would be necessary to store the modified format as a copy. The reason for the performance drop in T4 (compared to 1.5.2) is still unknown to me, but it may be somehow related to the fact that the inputs are constant. I will close this PR for now, as the bug is fixed.
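A minimal sketch of that pre-processing alternative, assuming it would operate on the literal segments between conversion specs (hypothetical helper, not part of this PR):

```julia
# Hypothetical helper: collapse "%%" to "%" once in a literal segment while
# the format is being parsed, and store the rewritten copy, so the output
# path can emit literal bytes without re-scanning for escapes on every call.
collapse_escapes(segment::AbstractString) = replace(segment, "%%" => "%")

collapse_escapes(" %% done ")  # " % done "
```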
Thanks for the response @petvana; I looked a little into the performance of T4 last night and I think it has to do with our creating a
This PR aims to fix #37784 by updating the `Format(f::AbstractString)` function. It moves format detection into a separate inlined function (only refactoring, no behavior changes) for better readability. There are also new tests for the issue. Note that performance has not been properly tested yet.
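For reference, a minimal example of the case this targets, assuming the `Printf.Format`/`Printf.format` API on master at the time:

```julia
using Printf

# A format string containing an escaped '%' should parse via Format(::AbstractString)
# and render the escape as a literal '%'.
f = Printf.Format(" %d%% ")
Printf.format(f, 50)  # expected: " 50% "
```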