How can I only parse the first level and get the value as string? #1175

Closed
kgbook opened this issue Jul 26, 2018 · 39 comments
Labels
state: needs more info the author of the issue needs to provide more details

Comments

@kgbook

kgbook commented Jul 26, 2018

I use the latest release version 3.1.2.

I want to parse the JSON data:

"{"json":"{"error":true,"message":"404 not found","code":0,"nonce":"0.8054632351656703","timestamp":1532525070907}", "signature":"Q5s1tDJVXWcNXp"}"

And my code:

main.cpp

#include <iostream>
#include "main.h"

using json = nlohmann::json;
using namespace std;
using namespace tp;

int main() {
    stCfgStr cfgStr;
    json in = "{\"json\":\"{\"error\":true,\"message\":\"404 not found\",\"code\":0,\"nonce\":\"0.8054632351656703\",\"timestamp\":1532525070907}\", \"signature\":\"Q5s1tDJVXWcNXp\"}"_json;

    try {
        decode(in, cfgStr);
        cout <<"json: "<<cfgStr.json <<endl;
        cout <<"signature: " <<cfgStr.signature <<endl;
    }
    catch (json::parse_error &e){
        cout <<"msg: "<<e.what()<<"\n"
              <<"exception id: "<<e.id<<"\n"
              <<"byte position of error: "<<e.byte<<endl;
    }

    return 0;
}

main.h

#ifndef NLOHMANNJSON_SAMPLE_MAIN_H
#define NLOHMANNJSON_SAMPLE_MAIN_H

#include <iostream>
#include <nlohmann/json.hpp>
#include <cstdint>

using namespace std;
using nlohmann::json;

namespace tp {
    struct stCfgStr {
        string json;
        string signature;
    };
}

namespace tp{
    void encode(const stCfgStr &cfg, nlohmann::json &j){
        j = nlohmann::json{{"json", cfg.json}, {"signature", cfg.signature}};
    }

    void decode(const nlohmann::json &j, stCfgStr &cfg){
        cfg.json = j.at("json").get<string>();
        cfg.signature = j.at("signature").get<string>();
    }
}

#endif //NLOHMANNJSON_SAMPLE_MAIN_H

Then I got an exception of type json.exception.parse_error. The details are:

parse error at 12: syntax error - invalid literal; last read: '"{"e'; expected '}'

I just want to decode the value of the json key as a std::string, and then decode that string again.

Note that:

"{"json":"{"error":true,"message":"404 not found","code":0,"nonce":"0.8054632351656703","timestamp":1532525070907}", "signature":"Q5s1tDJVXWcNXp"}"
"{"json":{"error":true,"message":"404 not found","code":0,"nonce":"0.8054632351656703","timestamp":1532525070907}, "signature":"Q5s1tDJVXWcNXp="}"

These are not the same!
In the first one, the value of json is enclosed in double quotes ", and in the other it is not!

So I have to decode the value of json as a string, not object.

I tried the RapidJSON library but it failed with the same issue, parseErrorCode: kParseErrorObjectMissCommaOrCurlyBracket.

Is it possible to decode the JSON data I describe above with the nlohmann JSON library?

@nlohmann
Owner

The syntax error "{"json":"{"error... is already in your input json in = "{\"json\":\"{\"error....

@nlohmann
Owner

(The quote after the colon is invalid JSON.)

@gregmarr
Contributor

I don't think it's the quote after the colon that's invalid, it's that there isn't enough quoting of the quotes inside the string, which is itself JSON. I think using a raw string literal would make this more obvious:

json in = R"({
    "json": "{
        \"error\": true,
        \"message\": \"404 not found\",
        \"code\": 0,
        \"nonce\": \"0.8054632351656703\",
        \"timestamp\": 1532525070907
    }", 
    "signature": "Q5s1tDJVXWcNXp"
})"_json;

@kgbook
Author

kgbook commented Jul 26, 2018

Pretty-printed:

{
    "json": "{\"error\":true,\"message\":\"404 not found\",\"code\":0,\"nonce\":\"0.8054632351656703\",\"timestamp\": 1532525070907}", 
    "signature": "Q5s1tDJVXWcNXp"
}

Get the response with Python or from your browser:

URL: http://recognition.bi.tuputech.com/v3/recognition

>>> import json
>>> import requests
>>> r = requests.get('http://recognition.bi.tuputech.com/v3/recognition')
>>> respond = r.json()
>>> print(json.dumps(respond, indent=4))
{
    "json": "{\"error\":true,\"message\":\"404 not found\",\"code\":0,\"nonce\":\"0.11027675858112329\",\"timestamp\":1532617356385}", 
    "signature": "r5HFxSFJR7qb1eJqDYQUhc3KOMaFK6GO9zNIAp5ypAXxNfdjC6mbK4gRtCxhzHX6aETP6oJscZgRa8GNsN+WMmVEFxQQ9oxXLRMScyPXQKRXeiGgB8zOU+xDo263BEyGF6knLIjv+P/m9sEyrblwLIC21eLdOp6EoHVt0axZycI="
}

It seems reasonable to treat the value of json as a string.

@gregmarr
Contributor

@kgbook If you put that text in a file, and read that, or wrap it in a raw string literal, it should work. If you just paste that in your code and put plain quotes around it, you have to quote each " character AND each \ character.

@kgbook
Author

kgbook commented Jul 26, 2018

@gregmarr I did, just see main.cpp.

@gregmarr
Contributor

@kgbook You mean the one in your first message? That's wrong. You put it in your code, surrounded it in plain quotes, and quoted SOME of the characters.

If you just use a regular string literal instead of a raw string literal, you need this:

"{\"json\": \"{\\\"error\\\":true,\\\"message\\\":\\\"404 not found\\\",\\\"code\\\":0,\\\"nonce\\\":\\\"0.8054632351656703\\\",\\\"timestamp\\\": 1532525070907}\",\"signature\": \"Q5s1tDJVXWcNXp\"}"

@kgbook
Author

kgbook commented Jul 26, 2018

@gregmarr oh, you are right!
But what is the meaning of the triple \?

@gregmarr
Contributor

gregmarr commented Jul 26, 2018

You want the resulting string to contain \", but both \ and " are special in regular string literals, so you need to quote the \ as \\ and then the " as \". So \" together end up as \\\".

Raw string literals avoid all that, so you just wrap the string as-is in R"( and )", as long as the character sequence )" doesn't appear in your string. If it does, then you need to use an additional delimiter, such as R"foo( and )foo".
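
To make the escaping rules above concrete, here is a minimal sketch that uses only the C++ standard library. The payload is a shortened stand-in for the one in this thread, and the function names (regular_literal, raw_literal, raw_literal_custom_delimiter) are made up for illustration. The first two functions should yield the same characters; the third shows a custom delimiter for text that itself contains )".

```cpp
#include <string>

// Regular literal: the C++ compiler consumes one level of escaping.
//   \"    becomes  "   (a quote that delimits a JSON string)
//   \\\"  becomes  \"  (an escaped quote inside the nested JSON string)
std::string regular_literal() {
    return "{\"json\": \"{\\\"error\\\":true}\", \"signature\": \"abc\"}";
}

// Raw literal: the text between R"( and )" is taken as-is, so only the
// JSON-level escaping (\") has to be written.
std::string raw_literal() {
    return R"({"json": "{\"error\":true}", "signature": "abc"})";
}

// The text f(x)" contains the sequence )" and would end a plain R"( ... )"
// literal too early, so a delimiter (here "foo", chosen arbitrarily) is added.
std::string raw_literal_custom_delimiter() {
    return R"foo({"note": "f(x)"})foo";
}
```

Comparing regular_literal() with raw_literal() is a quick way to check that a hand-escaped literal really matches the text you meant to write.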

@kgbook
Author

kgbook commented Jul 26, 2018

@gregmarr Great! I really appreciate your support!

@kgbook kgbook closed this as completed Jul 26, 2018
@kgbook
Author

kgbook commented Jul 26, 2018

Decoding failed when using raw string literals, but it works with regular string literals.

[json.exception.parse_error.101] parse error at 17: syntax error - invalid string: control character must be escaped; last read: '"{<U+000A>'

environment:

  • the latest release version 3.1.2
kang:~ kang$ /Library/Developer/CommandLineTools/usr/bin/c++ -v
Apple LLVM version 9.1.0 (clang-902.0.39.2)
Target: x86_64-apple-darwin17.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

@kgbook
Author

kgbook commented Jul 26, 2018

  1. decode ok
json in = "{\"json\": \"{\\\"error\\\":true,\\\"message\\\":\\\"404 not found\\\",\\\"code\\\":0,\\\"nonce\\\":\\\"0.8054632351656703\\\",\\\"timestamp\\\": 1532525070907}\",\"signature\": \"Q5s1tDJVXWcNXp\"}";
  2. decode ok
json in  = R"({"json": "{\"error\": true, \"message\": \"404 not found\", \"code\": 0, \"nonce\": \"0.8054632351656703\", \"timestamp\": 1532525070907}", "signature": "Q5s1tDJVXWcNXp"})"_json;
  3. decode failed
json in = R"({
    "json": "{
        \"error\": true,
        \"message\": \"404 not found\",
        \"code\": 0,
        \"nonce\": \"0.8054632351656703\",
        \"timestamp\": 1532525070907
    }", 
    "signature": "Q5s1tDJVXWcNXp"
})"_json;

@kgbook kgbook reopened this Jul 26, 2018
@gregmarr
Contributor

@kgbook Are you running on Windows? It is reporting a parse error at a carriage return character.

@gregmarr
Contributor

Ah, I see above that you're not. I'm not sure why it doesn't like the newlines in the raw literal.

@kgbook
Author

kgbook commented Jul 26, 2018

@gregmarr Mac OS X, 10.13.5 (17F77)

@kgbook
Author

kgbook commented Jul 27, 2018

Three situations when using raw string literals:

  1. decode ok
json in  = R"({"json": "{\"error\": true, \"message\": \"404 not found\", \"code\": 0, \"nonce\": \"0.8054632351656703\", \"timestamp\": 1532525070907}", "signature": "Q5s1tDJVXWcNXp"})"_json;
  2. decode ok
json jStr = R"({
                        "json":"{\"error\":true,\"message\":\"404 not found\",\"code\":0,\"nonce\":\"0.8054632351656703\",\"timestamp\":1532525070907}",
                        "signature":"Q5s1tDJVXWcNXp"
                   })"_json;
  3. decode failed
json in = R"({
    "json": "{
        \"error\": true,
        \"message\": \"404 not found\",
        \"code\": 0,
        \"nonce\": \"0.8054632351656703\",
        \"timestamp\": 1532525070907
    }", 
    "signature": "Q5s1tDJVXWcNXp"
})"_json;

Only the last one fails to decode!

@gregmarr
Contributor

Well, there are only two { characters in there, so the problem would have to be the one before error.

@kgbook
Author

kgbook commented Jul 27, 2018

I can't agree with that.

json in = R"({
    "json": "{\"error\": true,
        \"message\": \"404 not found\",
        \"code\": 0,
        \"nonce\": \"0.8054632351656703\",
        \"timestamp\": 1532525070907}",
    "signature": "Q5s1tDJVXWcNXp"
})"_json;

Decoding also fails in this case.
Neither the { character nor the , character causes the problem.

@nlohmann
Owner

@kgbook What is the exact issue you are having? As far as I can see, you seem to get parse error exceptions as you pass invalid JSON to the parser (which is implicitly called by the suffix _json).

@nlohmann nlohmann added the state: needs more info the author of the issue needs to provide more details label Jul 27, 2018
@kgbook
Author

kgbook commented Jul 27, 2018

@nlohmann The issue in my first message was solved with the help of gregmarr.
There is a new issue with the raw string literals described above; it may be related to the newlines in the raw literal.

@nlohmann
Owner

Which issue do you mean exactly?
#1175 (comment) ?

@gregmarr
Contributor

@kgbook Then the error must have changed, as before it was

[json.exception.parse_error.101] parse error at 17: syntax error - invalid string: control character must be escaped; last read: '"{<U+000A>'

@kgbook
Author

kgbook commented Jul 27, 2018

@nlohmann Yes, and there is more detailed information in #1175 (comment) and #1175 (comment).

@nlohmann
Owner

Yes, the 3rd example is not valid JSON.

@gregmarr
Contributor

@nlohmann Can you explain why 2 is valid but 3 is not? I've been trying to figure that out.

@kgbook
Author

kgbook commented Jul 27, 2018

@gregmarr Actually, the error info you reference is the latest.

#1175 (comment)

parse error at 12: syntax error - invalid literal; last read: '"{"e'; expected '}'

#1175 (comment) :

[json.exception.parse_error.101] parse error at 17: syntax error - invalid string: control character must be escaped; last read: '"{<U+000A>'

@nlohmann
Owner

{
  "json": "{\"error\": true, \"message\": \"404 not found\", \"code\": 0, \"nonce\": \"0.8054632351656703\", \"timestamp\": 1532525070907}",
  "signature": "Q5s1tDJVXWcNXp"
}
{
  "json": "{\"error\":true,\"message\":\"404 not found\",\"code\":0,\"nonce\":\"0.8054632351656703\",\"timestamp\":1532525070907}",
  "signature": "Q5s1tDJVXWcNXp"
}
parse_error: [json.exception.parse_error.101] parse error at 17: syntax error - invalid string: control character must be escaped; last read: '"{<U+000A>'

after fixing the whitespace

{
  "json": "{ \"error\": true, \"message\": \"404 not found\", \"code\": 0, \"nonce\": \"0.8054632351656703\", \"timestamp\": 1532525070907 }",
  "signature": "Q5s1tDJVXWcNXp"
}

@nlohmann
Owner

@kgbook So the parser is telling you where the parse error is. What is the issue then?

@kgbook
Author

kgbook commented Jul 27, 2018

@nlohmann The JSON data of the 3rd example is valid, but it fails to decode, and I want to figure out why.

@kgbook
Author

kgbook commented Jul 27, 2018

U+000A is the Unicode character named LF (line feed), which means newline.
Some information I found by searching is below:
Unicode Character 'LINE FEED (LF)' (U+000A)
Newline-wikipedia

@nlohmann
Owner

The error message says that that control character must be escaped.

@kgbook
Author

kgbook commented Jul 30, 2018

json jStr = R"({
                        "json": "{
                            \"error\": true,
                            \"message\": \"404 not found\",
                            \"code\": 0,
                            \"nonce\": \"0.8054632351656703\",
                            \"timestamp\": 1532525070907}",
                        "signature": "Q5s1tDJVXWcNXp"
                    })"_json;

It doesn't work.

[json.exception.parse_error.101] parse error at 37: syntax error - invalid string: control character must be escaped; last read: '"{\n<U+000A>'

I modified the formatting, and there is a small difference:
previously the parse error was at 17, now it is at 37.

Break at nlohmann/json.hpp:7643, and print the value.

(gdb) p s
$1 = 0x10002fa8f "{\n                        \"json\": \"{\n                            \\\"error\\\": true,\n                            \\\"message\\\": \\\"404 not found\\\",\n                            \\\"code\\\": 0,\n                            \\\"nonce\\\": \\\"0.8054632351656703\\\",\n                            \\\"timestamp\\\": 1532525070907}\",\n                        \"signature\": \"Q5s1tDJVXWcNXp\"\n                    }"

However, the control character LF has actually been escaped to \n.

@nlohmann
Owner

I guess when you just print the string to std::cout rather than parsing it, you will quickly see the error:

{
    "json": "{
    \"error\": true,
    \"message\": \"404 not found\",
    \"code\": 0,
    \"nonce\": \"0.8054632351656703\",
    \"timestamp\": 1532525070907}",
    "signature": "Q5s1tDJVXWcNXp"
}

The error message indicates that a newline (0x0A) occurs inside a string, and JSON demands it to be escaped.
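
As an illustration of what the parser is objecting to, here is a minimal sketch that scans a text for the first raw control character (a byte below 0x20) inside a JSON string. It is not how nlohmann/json is implemented, and first_raw_control_char is a hypothetical helper name. It only tracks quotes and backslashes, so it is a diagnostic aid rather than a JSON validator.

```cpp
#include <cstddef>
#include <string>

// Returns the 0-based byte offset of the first raw control character
// (byte < 0x20) found inside a JSON string, or std::string::npos if none.
// Control characters outside strings (e.g. newlines between members) are
// ordinary JSON whitespace and are ignored.
std::size_t first_raw_control_char(const std::string& text) {
    bool in_string = false;  // currently between an opening and closing "
    bool escaped = false;    // previous character was a backslash
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (!in_string) {
            if (c == '"') in_string = true;
            continue;
        }
        if (c < 0x20) return i;  // raw LF, CR, TAB, ... inside a string
        if (escaped) { escaped = false; continue; }
        if (c == '\\') { escaped = true; continue; }
        if (c == '"') in_string = false;
    }
    return std::string::npos;
}
```

Feeding it the contents of a multi-line raw string literal, where the value of "json" spans several lines, should report the newline right after the opening "{, while a version with the nested value on a single line should report nothing.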

@kgbook
Author

kgbook commented Jul 30, 2018

No, you perhaps misunderstand something.

using json = nlohmann::json;
...
catch (json::parse_error &err){
        cout <<"msg: " <<err.what() <<endl;
        cout <<"exception id: " <<err.id <<endl;
        cout <<"byte: " <<err.byte <<endl;
    }

So I got the error message from std::cout.

You can test it and will get the same result.

@kgbook
Author

kgbook commented Jul 30, 2018

And I printed the value of s at nlohmann/json.hpp:7643.

LF has been escaped to \n. I saw it with gdb, and you can also check it with printf or std::cout as you like.

@nlohmann
Owner

I see no error in the parser. The string that you pass is not valid JSON. If you print the string to std::cout and copy/paste it to https://jsonlint.com you get a similar error message.

@kgbook
Author

kgbook commented Jul 30, 2018

Sorry, I can't open your URL.
I copied and pasted it into https://jsonformatter.curiousconcept.com/ , and it is reported as invalid JSON.

json jStr = R"({
                        "json":"{\"error\":true,\"message\":\"404 not found\",\"code\":0,\"nonce\":\"0.8054632351656703\",\"timestamp\":1532525070907}",
                        "signature":"Q5s1tDJVXWcNXp"
                   })"_json;
json jStr = R"({
                        "json": "{
                            \"error\": true,
                            \"message\": \"404 not found\",
                            \"code\": 0,
                            \"nonce\": \"0.8054632351656703\",
                            \"timestamp\": 1532525070907}",
                        "signature": "Q5s1tDJVXWcNXp"
                    })"_json;

The first is valid, and the other is not.

In short, the value of the key shouldn't contain a raw LF character. Right?

@nlohmann
Owner

Yes. A JSON string must escape control characters.
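
As a rough sketch of that rule, the function below escapes a piece of text so that it can sit between the quotes of a JSON string. The name escape_json_string is hypothetical and the code uses only the C++ standard library. A JSON library normally does this for you when it serializes a string value, so this is only meant to show what "escaped" means here.

```cpp
#include <cstdio>
#include <string>

// Escapes text for use as the body of a JSON string: quotes, backslashes
// and control characters (bytes < 0x20) are replaced by escape sequences.
std::string escape_json_string(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (unsigned char c : raw) {
        switch (c) {
            case '"':  out += "\\\""; break;  // "  -> \"
            case '\\': out += "\\\\"; break;  // \  -> \\ (two characters)
            case '\n': out += "\\n";  break;  // LF -> \n (two characters)
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters use the \uXXXX form.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", c);
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}
```

Applied to a nested JSON document that was formatted over several lines, this turns every line break into the two characters \ and n, which is the form the outer string has to contain.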

@kgbook
Author

kgbook commented Aug 2, 2018

Thanks

@kgbook kgbook closed this as completed Aug 2, 2018