Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for streaming JSON encoder/decoder #1755

Merged
merged 22 commits into from
Mar 22, 2017
Merged

Conversation

pjsg
Copy link
Member

@pjsg pjsg commented Jan 22, 2017

Fixes #1666 (partially).

  • This PR is for the dev branch rather than for master.
  • This PR is compliant with the other contributing guidelines as well (if not, please describe why).
  • I have thoroughly tested my contribution.
  • The code changes are reflected in the documentation at docs/en/*.

This makes use of the jsonsl package (which is under a very liberal license). It wraps it in a layer to interface to lua for decoding json strings. There is also an incremental encoder.

The decoder can call out to lua code during the decoding to give the lua full control over which parts of the data structure should be retained (or handled inline).

It does provide an API which is compatible with cjson.

This is the first step to resolve the issue #1666. The next is to deal with the http client....

@djphoenix
Copy link
Contributor

Overflows iROM... again. Hmmm...

@pjsg
Copy link
Member Author

pjsg commented Jan 22, 2017

I think that we are getting close to where we can't increase the irom any more.... We have enough modules that they will not be able to be built at the same time!

@djphoenix
Copy link
Contributor

Safe size for iROM is 0x100000 (1M - max mapped size) - 0x8000 (iRAM size) - 0x14000 (dRAM size) = 0xE4000.
If we will use a bootloader (e.g. rBoot), we may increase it to 0x100000 - 0x2000 (bootloader + config size) = 0xFE000, because iROM starts right after loader, and iRAM/dRAM may be placed over 1M bound.

@pjsg
Copy link
Member Author

pjsg commented Jan 23, 2017

Also should resolve #1316 #1580


- `setpath({}, {})` the `table` argument is the object that will correspond with the value of the JSON object.
- `setpath({}, {"foo"})` the `table` argument is the object that will correspond with the value of the outer JSON array.
- `setpath({}, {"foo", 3})` the `table` argument is the object that will correspond to the empty inner JSON array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intended use of setpath? The first argument appears to always be an empty table. Can I do something interesting with that table? Can I return something from the setpath call to indicate that I don't want this piece decoded/kept?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added documentation to describe it better. I'm definitely open to making this a checkpath method instead that returns true if you want to keep the element, and false if not. Default would be true if the checkpath method was not supplied.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really thought through what the likely usecases are here, but I suspect a checkpath would be more readily understood. Or it could be that we just need some more docs or examples?

- `string` the next piece of JSON encoded data

####Returns
The constructed lua object or `nil` if the decode is not yet complete.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is the same value as from sjson.decoder:result() then? Do we really want both of these then? It seems redundant at first glance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a way to signal that decode is complete (if that is of interest). You want the result method so that you can always get the final result.

@marcelstoer
Copy link
Member

Safe size for iROM is .... 0xE4000.

Then why don't we use that value - at least during PR builds?

@pjsg
Copy link
Member Author

pjsg commented Feb 2, 2017

I changed the setpath to checkpath to allow you to suppress an entire object. Also, I fixed the maximum length of a string to be limited by memory rather than artificially constrained.

@marcelstoer
Copy link
Member

Do you really want to maintain two JSON modules? If it's a drop-in replacement for cjson don't we want to dump, or at least deprecate, that one?

@pjsg
Copy link
Member Author

pjsg commented Feb 14, 2017

I'd like to replace CJSON entirely. I didn't want to call this new code cjson, but maybe that is the easiest thing.....

@@ -5,7 +5,7 @@ MEMORY
dport0_0_seg : org = 0x3FF00000, len = 0x10
dram0_0_seg : org = 0x3FFE8000, len = 0x14000
iram1_0_seg : org = 0x40100000, len = 0x8000
irom0_0_seg : org = 0x40210000, len = 0xD0000
irom0_0_seg : org = 0x40210000, len = 0xE0000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is also good enough to build #1231, I just tested. So, if we merge this one first we can finally merge that one as well.

@marcelstoer
Copy link
Member

I didn't want to call this new code cjson, but maybe that is the easiest thing.....

Even though JSON support is quite popular (among top-10 non-default modules) I wouldn't do that and introduce a breaking change instead. Anyone building a new firmware will notice that CJSON is gone and will check the documentation. Of course the release notes will also state this but who reads release notes anyway.

@marcelstoer
Copy link
Member

Oh, and ideally you keep cjson.md to avoid dead links even when the CJSON code is gone. The page would only need to show a brief note that the module was replaced by SJSON.

@pjsg
Copy link
Member Author

pjsg commented Mar 9, 2017

This removes cjson (but leaves the documentation that now just points to sjson). Examples updated.

@marcelstoer marcelstoer added this to the 2.0.0-follow-up milestone Mar 9, 2017
@jmattsson jmattsson merged commit b09cac0 into nodemcu:dev Mar 22, 2017
@jmattsson
Copy link
Member

Sorry for the delay, I had it in my head this had already been merged. Many thanks @pjsg, I won't miss the cjson mess :)

@hanfengcan
Copy link
Contributor

hi, I'm write a article about this module。 I have some difficulty understanding metatable when I use sjson.decoder()。

http://nodemcu.readthedocs.io only one example。Can anyone give more than one example?

and this is my articles http://www.jianshu.com/nb/7000517 ,Chinese Only

thanks

@pjsg
Copy link
Member Author

pjsg commented Jun 11, 2017

The idea is to allow you to decode large JSON responses without needing to get the entire response into memory at once. One of the motivations that I had was to be able to store a nodemcu lua project in GIT and have the nodemcu auto-provision itself and auto-update itself. I haven't actually done this yet, but it is not that difficult. The steps are:

[Actually, now I read it, this is basically the example]

There are a bunch of JSON endpoints which return more data than the client actually wants. This allows that filtering. I'll dig up some more examples.

What aspect are you interested in?

@hanfengcan
Copy link
Contributor

such as a weather @pjsg . Usually, weather api return json, and I need some infomation like temperature

@hanfengcan
Copy link
Contributor

hanfengcan commented Jun 17, 2017

here is a free weater api
https://api.caiyunapp.com/v2/TAkhjf8d1nlSlspN/121.6544,25.1552/forecast.json
and return json.
But, the return data is larger(24K byte).

http model can't deal it, How can I do?

@pjsg
Copy link
Member Author

pjsg commented Jun 18, 2017

I wrote this example piece of code that just gets the weather in London. It actually doesn't query the live API, but uses a sample data blob so it should print out "Drizzle"

The __newindex function is just selecting the single attribute main from the JSON and saving its value. Everything else is discarded. The HTTP fetch function doesn't build a single buffer with all the data in it, it just incrementally calls the decoder:write() method.

Does this help?

function getdata(host, path, callback, meta)
    local conn=net.createConnection(net.TCP, 0) 
    conn:on("connection", function(conn) 
        conn:send("GET " .. path .. " HTTP/1.0\r\nHost: " .. host .. "\r\n"
            .. "Accept: */*\r\n\r\n")            
    end)
    local decoder = nil
    
    conn:on("receive", function(conn, pl)    
        if not decoder then
            local location = string.find(pl, "\r\n\r\n")        
            if location ~= nil then
                pl = string.sub(pl, location + 4)
                decoder = sjson.decoder({ metatable=meta })
            else
                return
            end
        end    
        decoder:write(pl)
    end)
    conn:on("disconnection", function(conn)
       callback(decoder:result())
    end)
    conn:connect(80, host)
end

local main = nil
getdata("samples.openweathermap.org",
        "/data/2.5/weather?q=London,uk&appid=b1b15e88fa797225412429c1c50c122a1", 
        function(r) print(main) end, 
        {checkpath=function() return false end,
         __newindex=function(t,k,v) if k == "main" then main = v end end})

@hanfengcan
Copy link
Contributor

yes, thinks @pjsg

@dnc40085
Copy link
Contributor

@pjsg Recently, I was looking at the Travis CI build log and saw a couple of warnings ELF binary has undefined symbol calloc and ELF binary has undefined symbol free which I traced back to app/sjson/jsonsl.c, I'm not sure if you noticed this when testing this PR.

@jmattsson
Copy link
Member

I can confirm that I too am seeing these undefined symbol warnings. I'm surprised that they're only warnings, I'm sure they used to be errors.

eiselekd pushed a commit to eiselekd/nodemcu-firmware that referenced this pull request Jan 7, 2018
@spotrevoc
Copy link

The docs explain that the __newindex method can be used to filter what goes into the new table by checking the key value and skipping the 'rawset' function.

When the checkpath method is called, the metatable has already be associated with the new table. Thus the checkpath method can replace it if desired. For example, if you are decoding { "foo": { "bar": [1,2,3,4], "cat": [5] } } and, for some reason, you did not want to capture the value of the "bar" key, then there are various ways to do this:

In the __newindex metamethod, just check for the value of the key and skip the rawset if the key is "bar". This only works if you want to skip all the "bar" keys.]
Can you please provide a working example of this for a nested table? I'm pulling the forecast data from api.weather.gov/gridpoints/STO/60,69/forecast and I only want to store everything under the "periods" path. I have figured out a way to break up the returned string into a string array and feed it back to the decoder via the decode:write() function.

This is the code i have so far
decode = sjson.decoder({metatable={__newindex=function(t,k,v) if k == "periods" then rawset(t,k,v) end end}})

But when I try to recall or iterate through decode:result() it turns out to be nil. What am I missing here?

@HHHartmann
Copy link
Member

Please do not post on closed items.
And for any sort of developer support questions like "how do I ..." or a "I can't get this to work ...". See our support page for alternatives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants