Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect escaping in Printf on master (@sprintf("GAP[%%]")) #37784

Closed
petvana opened this issue Sep 28, 2020 · 6 comments · Fixed by #37785
Closed

Incorrect escaping in Printf on master (@sprintf("GAP[%%]")) #37784

petvana opened this issue Sep 28, 2020 · 6 comments · Fixed by #37785
Assignees
Labels
bug Indicates an unexpected problem or unintended behavior

Comments

@petvana
Copy link
Member

petvana commented Sep 28, 2020

On master, the Printf library is redisegned but the escape sequences seems not to be hanled correctly (or like in C as written in docs). Now, it is not possible to print single '%' in the middle of string.

julia> using Printf
julia> @sprintf("GAP[%%]")
ERROR: LoadError: ArgumentError: invalid format string: 'GAP[%%]', invalid type specifier: ']'
Stacktrace:
 [1] Printf.Format(f::String)
   @ Printf ~/software/julia/julia-dev/usr/share/julia/stdlib/v1.6/Printf/src/Printf.jl:157
 [2] @sprintf(__source__::LineNumberNode, __module__::Module, fmt::Any, args::Vararg{Any, N} where N)
   @ Printf ~/software/julia/julia-dev/usr/share/julia/stdlib/v1.6/Printf/src/Printf.jl:783
in expression starting at REPL[3]:1

C style

julia> @ccall printf("GAP[%%]\n"::Cstring)::Cvoid
GAP[%]

The problem of the new design comes from the commit #32859 where the structure in stdlib/Printf/src/Printf.jl

struct Format{S, T}
    str::S # original full format string as CodeUnits
    # keep track of non-format specifier strings to print
    # length(substringranges) == length(formats) + 1
    # so when printing, we start with printing
      # str[substringranges[1]], then formats[1] + args[1]
      # then str[substringranges[2]], then formats[2] + args[2]
      # and so on, then at the end, str[substringranges[end]]
    substringranges::Vector{UnitRange{Int}}
    formats::T # Tuple of Specs
end

expects that length(substringranges) == length(formats) + 1 but it is not true in general if "%%" occurs in the middle of the string format. To provide simple example, there is no format but several substrings in the following:

@assert @sprintf("a%%f%%d%%x") == "a%f%d%x"

Utilized version of master branch

julia> versioninfo()
Julia Version 1.6.0-DEV.unknown
Commit b89c4a3 (2020-09-28 08:12 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-10.0.1 (ORCJIT, icelake-client)

Btw, all it works well on 1.5.2.

@petvana
Copy link
Member Author

petvana commented Sep 28, 2020

There is also a newly added unit test that fails in Julia 1.5.2, see (pull/37470).

@test @sprintf("%s%%%s", "a", "b") == "a%%b"

I propose to update it to

@test @sprintf("%s%%%s", "a", "b") == "a%b"

quinnj added a commit that referenced this issue Sep 28, 2020
Fixes #37784. The classic blunder of not unescaping the escaped when
printing. I considered trying to detect escaped chars during format
construction-time, but there isn't a clean way to utilize the current
`Printf.Format` structure to achieve this, since it requires
`length(f.formats) == length(f.substringranges) + 1`, and
`f.substringranges` must be the concretely typed
`Vector{UnitRange{Int}}`. Doing the unescaping in `Printf.format` when
we're outputting seems to not incur any performance penalty and seems
like a pretty clean fix since all outputting is handled in this one
function anyway.
quinnj added a commit that referenced this issue Sep 28, 2020
Fixes #37784. The classic blunder of not unescaping the escaped when
printing. I considered trying to detect escaped chars during format
construction-time, but there isn't a clean way to utilize the current
`Printf.Format` structure to achieve this, since it requires
`length(f.formats) == length(f.substringranges) + 1`, and
`f.substringranges` must be the concretely typed
`Vector{UnitRange{Int}}`. Doing the unescaping in `Printf.format` when
we're outputting seems to not incur any performance penalty and seems
like a pretty clean fix since all outputting is handled in this one
function anyway.
@quinnj
Copy link
Member

quinnj commented Sep 28, 2020

Thanks for reporting @petvana! Fix is up: #37785.

@petvana
Copy link
Member Author

petvana commented Sep 28, 2020

Wow! Thanks for the rapid response. You solved the existing test case, but the original issue with "[%%]" is still unsolved.

The printing in function format(buf::Vector{UInt8}, pos::Integer, f::Format, args...) is fixed, but the substringranges seem not be correctly detected in Format(f::AbstractString).

Thus, I propose to add to more tests like:

@test Printf.@sprintf("1%%") == "1%"
@test Printf.@sprintf("%%1") == "%1"
@test Printf.@sprintf("1%%2") == "1%2"
@test Printf.@sprintf("1%%%d", 2) == "1%2"
@test Printf.@sprintf("1%%2%%3") == "1%2%3"

@musm musm added the bug Indicates an unexpected problem or unintended behavior label Sep 29, 2020
@petvana
Copy link
Member Author

petvana commented Sep 29, 2020

I have tried to fix that in my fork petvana/julia/tree/jq/37784. I have moved the format detection into a separate inlined function for better readability. If you like it, I can submit a pull request to the branch. It succeeds in all tests, but I have concerns about its performance. @quinnj Which instances you use for performance testing?

@KristofferC
Copy link
Member

KristofferC commented Sep 29, 2020

Make sure to run any performance testing on top of #37802. At least for floats.

@quinnj
Copy link
Member

quinnj commented Sep 29, 2020

@petvana, yes, feel free to do a PR against my branch and I can review/try it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants