Whole number float values marshal to ambiguous integer values #430

Open
sclevine opened this issue Jan 18, 2019 · 10 comments

@sclevine

Using go-yaml v2.2.2, whole number floats (e.g., 1.0) are marshaled to values that are ambiguous with integers (e.g., 1). This makes it difficult to use go-yaml to modify arbitrary YAML without converting floats to integers.

The YAML v1.2 spec makes it seem like 1 is a valid representation of a float:
https://yaml.org/spec/1.2/spec.html#id2804923

[-+]? ( \. [0-9]+ | [0-9]+ ( \. [0-9]* )? ) ( [eE] [-+]? [0-9]+ )?

While the YAML v1.1 spec suggests otherwise:
https://yaml.org/type/float.html

 [-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)? (base 10)

However, it isn't clear to me whether these expressions are intended to apply to untagged values (without !!float), or only to the values allowed when !!float is used to specify the type.

Given that YAML has distinct float and integer types, I propose that whole number floats be marshaled unambiguously.

For example:

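package main

import (
	"fmt"

	yaml "gopkg.in/yaml.v2"
)

func main() {
	t := struct{ F float64 }{1.0}
	fmt.Printf("%T\n", t.F)
	// Marshal the whole-number float and print the resulting YAML.
	r, err := yaml.Marshal(t)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(r))
	// Unmarshal into an untyped value to see which Go type the scalar resolves to.
	var out interface{}
	if err := yaml.Unmarshal(r, &out); err != nil {
		panic(err)
	}
	fmt.Printf("%T\n", out.(map[interface{}]interface{})["f"])
}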

Currently outputs:

float64
f: 1
int

Proposed change:

float64
f: 1.0
float64

@niemeyer
Contributor

The key reason to use YAML as a configuration format is for the content to be human-friendly, both when reading and when writing. From that perspective, "42" is a significantly more usual way for a person to express that number than "42.0". So if an application is behaving in a different way when someone enters "42" or "42.0", that sounds like a bug worth fixing in the application. Along the same lines, forcing everybody to read "42.0" in their configuration files just because the code is accepting a non-integer number also sounds unfriendly to those that will be reading the file.

With that said, the v3 of go-yaml that I've been working on will allow you to unmarshal into an intermediate format that preserves exactly the original representation, including octals, hexadecimals, specific tags, etc. That will enable you to roundtrip the original representation much more precisely. In fact, even comments will be preserved to a good degree. The v3 release will be out for testing as soon as I manage to finish the last couple of incompatible changes.

@sclevine
Author

Thanks for the quick response @niemeyer!

Your explanation makes sense to me. This request is a nice-to-have formatting enhancement and not a failure of go-yaml to implement the YAML spec.

That said, I would prefer to preserve the original value for my use case, and I look forward to the v3 release. 😄

@july2993

Is v3 available to do such formatting now?

@philomory

I would argue that the current behavior (where a float64 value of 1.0 is rendered in the YAML text as 1) is simply wrong, according to the YAML specification, because float values do not round-trip successfully: https://play.golang.org/p/4NiEY5ZOwb8

In other words, because go-yaml is able to successfully decode 1.0 into a float64, when it encodes it back into YAML again, it needs to encode it as something that will still successfully decode as a float64. It doesn't need to encode to the original string; !!float 1 would technically be fine, but you need to satisfy encode(data) == encode(decode(encode(data))) and decode(string) == decode(encode(decode(string))).

The regex that @sclevine quotes from the YAML 1.2 spec is the specification of a float... sort of. There is a list of regexes used to categorize untagged values, and, per the specification, "first match wins". So in order for an untagged value to be parsed as a float, it doesn't only have to match [-+]? ( \. [0-9]+ | [0-9]+ ( \. [0-9]* )? ) ( [eE] [-+]? [0-9]+ )?, it also has to fail to match any previous expression in the list. In particular, 1 (without a tag) is not a valid representation of a float, because it matches the integer regex ([-+]? [0-9]+) first. You can use 1 to represent a float in a tagged value, however (e.g. !!float 1).
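
The "first match wins" ordering can be sketched in a few lines of Go. This is only an illustration that uses the two patterns quoted in this thread (with the spec's whitespace removed and anchors added); the helper name resolveTag is hypothetical, and the other patterns in the spec's list (null, booleans, octal, hexadecimal, infinity, NaN) are left out for brevity. It is not how go-yaml itself resolves scalars.

```go
package main

import (
	"fmt"
	"regexp"
)

// The integer and float patterns quoted above, anchored so that the whole
// scalar has to match.
var (
	intRE   = regexp.MustCompile(`^[-+]?[0-9]+$`)
	floatRE = regexp.MustCompile(`^[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?$`)
)

// resolveTag (hypothetical helper) tries the integer pattern before the
// float pattern, so a scalar that matches both resolves as an integer.
func resolveTag(s string) string {
	switch {
	case intRE.MatchString(s):
		return "!!int"
	case floatRE.MatchString(s):
		return "!!float"
	default:
		return "!!str"
	}
}

func main() {
	for _, s := range []string{"1", "1.0", "1.", "1e3", "one"} {
		fmt.Printf("%q -> %s\n", s, resolveTag(s))
	}
}
```

Although 1 also matches the float pattern, the integer pattern is tried first, which is why an untagged 1 cannot come back as a float.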

@niemeyer
Contributor

@philomory Hi Adam. For a 1-to-1 type matching I suggest looking into v3's yaml.Node. It goes even further by preserving the text. Otherwise, for general encoding/decoding what was described above applies and the human factor wins over the strict representation of types. I care too much about my inbox to consider !!float 1 to be "fine". :)

@philomory

@niemeyer Just for the sake of clarity, I never meant to suggest that !!float 1 was advisable; merely that it was a valid syntax for rendering a float in YAML, whereas simply 1 is not a valid way of rendering a float. The preferred way of rendering a float in YAML would be 1.0.

Is there an example anywhere of how to use yaml.Node to turn map[string]interface{}{"a": 1.0, "b": 2.0} into a: 1.0\nb: 2.0?

@niemeyer
Contributor

I suggest just decoding that string into a yaml.Node with yaml.Unmarshal, and printing it out. You can move it back and forth and it should roundtrip.

@philomory

philomory commented Feb 24, 2021

@niemeyer And if I want to go the other way?

Edit: To be clear, I mean, I have an arbitrary map[string]interface{} that contains floats, and I want to turn it into a YAML document that actually renders floats such that when decoded they will still be floats. How do I accomplish that?

@niemeyer
Contributor

If you're not using yaml.Node, the conversation above has all the details about what works when and why.

@egor-ryashin

egor-ryashin commented Aug 6, 2024

Could somebody give an example of how to output 2.0 here without recursively traversing the map (outside of Marshal)?

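package main

import (
	"testing"

	"github.com/stretchr/testify/require"
	"gopkg.in/yaml.v3"
)

func TestA(t *testing.T) {
	data := make(map[interface{}]interface{})
	data["a"] = 2.0

	bts, err := yaml.Marshal(data)
	require.NoError(t, err)

	// The desired output; this assertion currently fails.
	require.Equal(t, "a: 2.0\n", string(bts))
}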

Right now it's like this:

        	Error:      	Not equal: 
        	            	expected: "a: 2.0\n"
        	            	actual  : "a: 2\n"

For example, this works for a flat map:

func TestA(t *testing.T) {
	data := make(map[interface{}]yaml.Node)
	node := yaml.Node{}
	node.Kind = yaml.ScalarNode
	node.Value = "2.0"
	data["a"] = node

	bts, err := yaml.Marshal(data)
	require.NoError(t, err)

	require.Equal(t, "a: 2.0\n", string(bts))
}

But if there's a map of maps, then one needs to traverse the map and generate a mirror structure built from yaml.Nodes.

5 participants