-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby 2.1/2.2/2.3 yaml parsing anomaly for comma separated integers #273
Comments
Apparently Japan and China will separate at four digits, but to be honest Psych internals don't care, they just remove all commas and cast to a number. The only reason for this is legacy support for Syck documents. I agree that both values should evaluate to strings, but I'm not sure how to do that and support old formats. |
- Otherwise multiline values break like crazy - Update specs that would fail otherwise, to show that you must quote numbers containing commas - See ruby/psych#273 for why..
I've hit this issue today after trying to parse some YAML output by another tool (kustomize build in my case, but not that relevant). I think as YAML becomes more and more ubiquitous it might be a good idea to provide an option for "strict" parsing according to the standard? It seems to me that we use those regexp to decide whether or not to parse a Scalar as a Float or Integer: psych/lib/psych/scalar_scanner.rb Lines 11 to 20 in 9f8c365
I did this patch on my side as a workaround, which solved the issue I had: Psych::ScalarScanner.send(:remove_const, "INTEGER")
Psych::ScalarScanner::INTEGER = /^(?:[-+]?0b[0-1_]+ (?# base 2)
|[-+]?0[0-7_]+ (?# base 8)
|[-+]?(?:0|[1-9][0-9_]*) (?# base 10)
|[-+]?0x[0-9a-fA-F_]+ (?# base 16))$/x (basically replacing the regexp with a subset of the standard one from from http://yaml.org/type/int.html) Would it be possible to have say two constants: INTEGER_ENRICHED=/.../
INTEGER_STANDARD=/.../ And chose one or the other based on a flag to keep backward compatibility but offer a stricter/more standard alternative? I can offer to work on a patch if that can help. |
We would love to see this fixed as well. The current parsing logic does not conform to the YAML specification, so we believe something like @Jell's proposed solution should be considered. Would the maintainers (@tenderlove) of Psych be amenable to a PR implementing the above solution or similar? |
@tenderlove Do you have a stance on whether or not you would accept a PR that adds a flag to change the behavior to something stricter? We are willing to work on this, but don't want to get started if you don't think it would be worthwhile. |
Yes, I'm definitely open to it. There was a PR submitted here for strict hash keys. I made a commit here that demonstrates how to accomplish it. Maybe this commit should just be extended to be "strict parsing" or something (rather than just specific to hash keys). |
@tenderlove I created a PR that implements this (#537). Let me know what you think! |
#537 fixed this issue. |
A string of digits separated by comma (without quotes) is evaluated as integer. Is it intentional?
I understand that the first example can be seen as a number in American notation but the 2nd example is not following any notation/standard that I'm aware of. I was expecting to have both of the values evaluated as string.
With the quotes it works as expected:
Thanks
The text was updated successfully, but these errors were encountered: