This repository has been archived by the owner on Mar 1, 2024. It is now read-only.

RFC: Adding a sfmt simple format function #8

Open
ScottPJones opened this issue Jun 18, 2015 · 15 comments

Comments

@ScottPJones
Collaborator

I had an idea to add a simple formatter, which could have different methods for different types of values:

sfmt( value [, format [, format arguments] ] )

The reason to put the value first is that methods could then be added to give specific types a default format.

For example, the C sprintf(buf, "%*s", 8, string) would become something like:
sfmt( string, "*s", 8)

I realize this duplicates functionality already present; it is just an attempt to make the syntax simpler and easier to use... (note: I'd originally suggested naming it fmt, before I found out that Formatting.jl already had a function with that name... unfortunately, at least in the case of fmt(string, string), it would be ambiguous).
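
The value-first call shape proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: it relies on `Printf.Format` and `Printf.format` from the Printf standard library of recent Julia releases (1.6 or later), which did not exist when this issue was opened, and the `sfmt` methods and the way each `*` is filled in are hypothetical, not part of Formatting.jl.

```julia
using Printf

# Hypothetical sketch: the value comes first, then a C-style spec written
# without the leading '%', then any arguments that fill in '*' placeholders.
function sfmt(value, spec::AbstractString, args...)
    # Substitute each '*' in the spec with the next format argument, in order.
    for a in args
        spec = replace(spec, "*" => string(a); count = 1)
    end
    return Printf.format(Printf.Format("%" * spec), value)
end

# Because the value is first, each type can carry its own default format.
sfmt(value::AbstractString) = sfmt(value, "s")
sfmt(value::Integer) = sfmt(value, "d")
sfmt(value::AbstractFloat) = sfmt(value, "f")

sfmt("abc", "*s", 8)   # "     abc"
sfmt(3.14159, ".2f")   # "3.14"
sfmt(42)               # "42"
```

Substituting the `*` by hand keeps the sketch independent of whether the underlying formatter supports dynamic widths.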

@tbreloff

Sorry for the delayed response, Scott. I come from the perspective where there are a few very common things to do, and those should be as easy as possible:

  • Print floating point numbers consistently
  • Add commas as thousand-separators
  • Pad (either left or right) the resulting string with blanks or zeros

I also feel like it should be clear (just by reading the code) what the operation is doing. So sfmt(str, "*s", 8) is a little unclear as to exactly what is done, especially since your average person won't recognize the C equivalent.

As an exercise, let's try out some possible method calls and the resulting strings that we would expect:

x = 1234.56789
fmt(x)   # "1234.568"
fmt(x, 1)  # "1234.6"
fmt(x, 1, w=8)  # "  1234.6"
fmt(x, 1, :left, w=8)  # "1234.6  "
fmt(x, 2, :commas) # "1,234.57"
fmt(rand(5), 2)  # "[0.12 0.55 0.32 0.97 0.43]"

If there is no ambiguity in the arguments (because they mix ints, symbols, and keyword args), then we can support arbitrary reordering of arguments as well as default values. We could also keep global defaults that the user can change, like:

Formatting.set_default_precision(1)
Formatting.set_default_width(8)
fmt(x) # "  1234.6"
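
One way the calls above could be made to work is sketched below. It is a minimal illustration, not the package's implementation: `fmt_sketch`, `add_commas`, and the `DEFAULTS` table are hypothetical names, and `Printf.Format` / `Printf.format` assume Julia 1.6 or later. Integers are read as the precision and symbols as flags, which is what allows the positional arguments to appear in any order.

```julia
using Printf

# Hypothetical global defaults; these names are illustrative assumptions,
# not part of Formatting.jl.
const DEFAULTS = Dict{Symbol,Int}(:precision => 3, :width => 0)
set_default_precision(p::Integer) = (DEFAULTS[:precision] = p)
set_default_width(w::Integer) = (DEFAULTS[:width] = w)

# Insert a comma wherever a multiple of three digits remains to the right.
function add_commas(intpart::AbstractString)
    n = length(intpart)
    io = IOBuffer()
    for (i, c) in enumerate(intpart)
        print(io, c)
        remaining = n - i
        if isdigit(c) && remaining > 0 && remaining % 3 == 0
            print(io, ',')
        end
    end
    return String(take!(io))
end

# Integers set the precision and symbols set flags, so positional arguments
# can come in any order; the width is a keyword argument.
function fmt_sketch(x::Real, args...; w::Integer = DEFAULTS[:width])
    prec = DEFAULTS[:precision]
    flags = Symbol[]
    for a in args
        if a isa Integer
            prec = a
        elseif a isa Symbol
            push!(flags, a)
        else
            throw(ArgumentError("unsupported format argument: $a"))
        end
    end
    s = Printf.format(Printf.Format("%.$(prec)f"), x)
    if :commas in flags
        dot = findfirst(==('.'), s)
        s = dot === nothing ? add_commas(s) : add_commas(s[1:dot-1]) * s[dot:end]
    end
    return :left in flags ? rpad(s, w) : lpad(s, w)
end

x = 1234.56789
fmt_sketch(x)                   # "1234.568"
fmt_sketch(x, 1, :left, w = 8)  # "1234.6  "
fmt_sketch(x, :commas, 2)       # "1,234.57"
```

The keyword default is looked up at call time, so changing the defaults affects later calls without redefining anything.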

@ScottPJones
Collaborator Author

@tbreloff I think there are already, in the very nice Formatting.jl package, the sorts of explicit formatting, all spelled out with keyword arguments, that you want.
I was trying to get something that would immediately be familiar precisely to all the C/C++/C#/Java programmers, that wouldn't require any interpolation, or keyword arguments, just a constant format string.
Maybe I should have called it cfmt, instead of sfmt (which was for "simple format").
Julia does have @printf and @sprintf, but only as macros, which a number of people have complained about, hence this RFC.
I like your idea of the type specific cfmt being able to use settable defaults... that makes them much more useful.
Note: I can't call this function fmt, because that's already used in Formatting.jl, and would be ambiguous for the case of fmt(str1::AbstractString,str2::AbstractString)

@tbreloff

I agree that the major functionality is already there... this RFC is just to specify a nicer way to call it. Remember that the @sprintf macro exists for speed... the format string doesn't need to be re-parsed every time. One possible solution would be to add memoization to a cfmt method call, so that if you repeatedly call it with the same string it can look up a cached formatter.

I would prefer to split fmt's current functionality into cfmt (for c-style formatting) and pfmt (for python-style formatting) and create a new "julian" way of formatting that is called with fmt that depends less on parsing format strings, and more on methods with smart dispatch. Thoughts?

@tbreloff

Actually the memoization already happens (see generate_formatter), so it would only involve adding some additional method signatures to simplify calling for common use-cases.
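
For readers unfamiliar with the idea, memoizing a format string could look roughly like the sketch below. It does not reproduce generate_formatter; `FORMAT_CACHE`, `cached_format`, and this `cfmt` signature are hypothetical, and `Printf.Format` / `Printf.format` assume Julia 1.6 or later.

```julia
using Printf

# Hypothetical cache from a spec string to its parsed format object.
const FORMAT_CACHE = Dict{String,Printf.Format}()

# Parse the spec only the first time it is seen; later calls reuse the
# stored Printf.Format instead of re-parsing the string.
function cached_format(spec::AbstractString)
    get!(FORMAT_CACHE, String(spec)) do
        Printf.Format("%" * spec)
    end
end

# Value-first C-style formatting backed by the cache.
cfmt(value, spec::AbstractString) = Printf.format(cached_format(spec), value)

cfmt(3.14159, ".2f")  # "3.14"
cfmt(2.5, ".2f")      # "2.50", reusing the cached format
```

A plain `Dict` like this is not thread-safe, so a real implementation would need a lock or a per-thread cache.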

@ScottPJones
Collaborator Author

Splitting them up into C-like, Python-like, and best practices Julian way sounds like a great idea, as well as doing memoization. Thanks for all the great ideas! (Are you able to do some work on them yourself? 😀 I'm just doing stuff as I find I need it...)

@tbreloff

Yes, I can do some of this; however, my use-case is limited to technical applications in the US. Maybe I'll fork and make a first pass at this, then ask for comments on julia-users?

@ScottPJones
Collaborator Author

Sounds good! Which are you mostly interested in working on: pfmt, the "julian" fmt (where you'll have to deal with the fact that there is already a fmt in the package, and make sure that your methods are not ambiguous), or cfmt (which is the one that interests me 😀)?
Just curious, why does your use-case change what you'd do? (my use case would just be for cfmt)

@tbreloff

My goal is to be as julian as possible. I only really care about the fmt method; however, C-style and Python-style formatting sometimes have their place. Both cfmt and pfmt would be light wrappers around what already exists, whereas fmt would be a new approach which uses dispatch heavily (but still builds a FormatSpec object and calls the existing methods). In an ideal world, the user never needs to know about the horror of printf. (yes... the HORROR)

@tbreloff

Also, regarding the existence of fmt... there are enough breaking changes flowing through Julia that it's not that big a deal for users to find/replace all calls to fmt with calls to cfmt.

@ScottPJones
Collaborator Author

Hehe... yes, I agree about the horror... why else do you think I'm so active with Julia, which I only discovered 3 months ago 😀?

@ScottPJones
Collaborator Author

The one thing I dislike about the current fmt is that it puts the format spec first, instead of the value, and also accepts a string in place of a compiled format spec... which prevents having a simple default fmt(x) dispatched on the type of x...

@tbreloff

Agreed... I would like the option of doing something like:

default_width!(10)  # change the default width
v = rand(10)
strs = fmt(v)  # Vector of strings... all length 10. defined as:  fmt(v::AbstractArray) = map(fmt, v)

This particular example might not be a big win, but it could be nice for more complicated stuff.
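
A self-contained sketch of that idea follows, using a `Ref` to hold the mutable default. The names `DEFAULT_WIDTH`, `default_width!`, and `fmt_value` are hypothetical stand-ins chosen so the example does not collide with the package's existing fmt.

```julia
using Printf

# Hypothetical module-level default, stored in a Ref so it can be changed.
const DEFAULT_WIDTH = Ref(0)
default_width!(w::Integer) = (DEFAULT_WIDTH[] = w)

# Scalar methods: dispatch on the value's type picks the default format.
fmt_value(x::AbstractFloat) = lpad(@sprintf("%.6f", x), DEFAULT_WIDTH[])
fmt_value(x::Integer) = lpad(string(x), DEFAULT_WIDTH[])

# Array method: format each element with whichever scalar method applies.
fmt_value(v::AbstractArray) = map(fmt_value, v)

default_width!(10)               # change the default width
strs = fmt_value([0.5, 0.25])    # ["  0.500000", "  0.250000"]
```

Since the array method just maps over the elements, every string picks up the same width without any extra arguments.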

@tbreloff

Ok I threw together a preview in my fork: https://github.com/tbreloff/Formatting.jl/tree/tom_fmt

This doesn't touch the python formatting or formatting more complex strings... it's only for formatting individual values right now. Here's a sample of how you use it:

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+5199 (2015-06-04 05:45 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 5ea501d* (18 days old master)
|__/                   |  x86_64-redhat-linux

julia> using Formatting

julia> x = 1234.56789
1234.56789

julia> fmt(x)
"1234.567890"

julia> fmt(x,2,20,:commas)
"             1,234.57"

julia> fmt(x,2,20,:commas,:left)
"1,234.57             "

julia> default!(:commas, :left, width=20)

julia> fmt(x)
"1,234.567890         "

julia> fmt(x,2)
"1,234.57             "

julia> map(fmt,rand(10))
10-element Array{UTF8String,1}:
 "0.543696            "
 "0.920439            "
 "0.642571            "
 "0.220117            "
 "0.804826            "
 "0.637169            "
 "0.507218            "
 "0.812312            "
 "0.736187            "
 "0.745050            "

Note that using symbols could also let us set "themes"... for example I could see doing:

default!(:finance)
fmt(1234.56789)  # returns "$1,234.57"
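
A theme could be little more than a named bundle of settings. The sketch below is an assumption about how that might look, not code from the fork: `THEMES`, `default_theme!`, `fmt_themed`, and the field names are all hypothetical, and `Printf.Format` / `Printf.format` assume Julia 1.6 or later.

```julia
using Printf

# Hypothetical theme table; the field names are assumptions for illustration.
const THEMES = Dict(
    :plain   => (prefix = "",   commas = false, precision = 6),
    :finance => (prefix = "\$", commas = true,  precision = 2),
)
const CURRENT_THEME = Ref(THEMES[:plain])
default_theme!(name::Symbol) = (CURRENT_THEME[] = THEMES[name])

# Insert a comma wherever a multiple of three digits remains to the right.
function group_thousands(intpart::AbstractString)
    n = length(intpart)
    io = IOBuffer()
    for (i, c) in enumerate(intpart)
        print(io, c)
        remaining = n - i
        if isdigit(c) && remaining > 0 && remaining % 3 == 0
            print(io, ',')
        end
    end
    return String(take!(io))
end

# Format a number according to whichever theme is currently active.
function fmt_themed(x::Real)
    t = CURRENT_THEME[]
    s = Printf.format(Printf.Format("%.$(t.precision)f"), x)
    if t.commas
        dot = findfirst(==('.'), s)
        s = dot === nothing ? group_thousands(s) : group_thousands(s[1:dot-1]) * s[dot:end]
    end
    return t.prefix * s
end

default_theme!(:finance)
fmt_themed(1234.56789)  # "\$1,234.57", which prints as $1,234.57
```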

I'm looking forward to comments...

@ScottPJones
Collaborator Author

I like, I like! 👍

@tbreloff

See pull request #10.


2 participants