Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialize/deserialize binary values to the serialization format #1302

Merged
merged 26 commits into from
Apr 26, 2017

Conversation

jasongrout
Copy link
Member

@jasongrout jasongrout commented Apr 19, 2017

CC @maartenbreddels.

@SylvainCorlay - the version 1 specification was too restrictive - it didn't allow for any additional properties. I bumped the specification to version 2, added the extra fields for binary buffers, and removed the additionalProperties: false keys so we can add optional fields in the future.

@jasongrout
Copy link
Member Author

@SylvainCorlay - What do you think of the schema change (I'd like to get your thoughts before publishing the new schema package).

1. We add optional buffer_paths and buffers keys to a model’s state dictionary.
2. We remove the requirement that each schema defines *all* of the properties an object may have, so that we can have backwards-compatible updates by adding optional properties. In other words, removed a number of additionalProperties: false entries in the schema.
@jasongrout
Copy link
Member Author

Travis is failing because I haven't published the schema package yet.

@jasongrout
Copy link
Member Author

Restarting the build now that we published the jupyter-widgets-schema package

@jasongrout
Copy link
Member Author

I'll plan on merging this tomorrow. @SylvainCorlay, @maartenbreddels, please let me know if you have any comments.

Copy link
Member

@maartenbreddels maartenbreddels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! I have some concerns about using just base64 (hardcoded). And also, the python side needs a similar code, otherwise the jupyter_sphinx extension would be broken.

"items": {
"description": "A base64-encoded binary buffer",
"type": "string"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say, include the encoding(s) in the description, since if at some moments we decide to support compression, we may have to change the schema again. If we put the encoding in the state file it's more flexible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So have one more piece of metadata in the field saying how the buffers are encoded? That makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say a list of encoding, say, to make it more concrete with a python code:

import codecs
data_encoded = data # data is the original data we'd like to encode
encodings = ["bz2", "base64"] # this should come from the json
for encoding in encodings:
    data_encoded = codecs.encode(data_encoded, encoding)

And for the decoding:

data_decoded = data_encoded
for encoding in encodings[::-1]:
    data_decoded = codecs.decode(data_decoded, encoding)

so if we store the list of used encodings, and specify what we support (now just base64), we are a bit more future proof I think, what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SylvainCorlay? With what I just pushed, we have a single encoding for each model's binary buffers (base64 or hex), rather than an encoding per buffer. Since the state serialization happens at the model level right now, and the buffer splitting and encoding is completely independent of knowledge of what is in the buffer, I don't think it makes sense to support a more fine-grained per-buffer encoding specification. I mean, how would a model specify the encoding of a single buffer somewhere in its serialized state?

@jasongrout
Copy link
Member Author

(When the next json schema spec is published and tools adopt it, we can use the new "const" property instead of single-valued enums for the buffer encoding. See json-schema-org/json-schema-spec#139, which is in draft 6 of the spec (https://tools.ietf.org/html/draft-wright-json-schema-validation-01#section-6.24).

…w encoding attribute

We initially have the ‘hex’ and ‘base64’ encodings.

We unpublished the 0.2.0 package so we could replace version 2 of the spec with these updates, so we bump the package number to 0.3.0-beta.0
@maartenbreddels
Copy link
Member

So just a single encoding, not a list of, to support compression? Maybe it is overkill, and it can be just a single encoding, and later on we can include a compression option.

@SylvainCorlay
Copy link
Member

Are you guys available for a quick chat?

@SylvainCorlay
Copy link
Member

@jasongrout
Copy link
Member Author

jasongrout commented Apr 20, 2017

From the appear.in conversation between @SylvainCorlay, @maartenbreddels, and me: the buffer information should be zipped, so that each buffer is represented by an object of the form {path: [...], data: '...'}.

Now a model state can have structure like:

```
state: ...
buffers: {
  buffers: [
    {path: ["a", 1, "b"], data: "AFE0139AB"},
    {path: ["x", 0], data: "45E01FE"}
  ],
  encoding: "hex"
}
```
@jasongrout
Copy link
Member Author

I updated so that the following is a valid state:

{
  version_major: 2,
  version_minor: 0,
  state: {
    "id-1": {
      model_name: "asdf",
      model_module: "asdf",
      state: {
        'a': [0,null],
        'b': [{}, {}]
      },
      buffers: {
        buffers: [
          {path: ['a', 1], data: "0123BEEF"},
          {path: ['b', 1, 'c'], data: "AFEF"}
        ],
        encoding: "hex"
      }
    }
  }
}

I'd rather not have the same name repeated twice, buffers.buffers, but there is precedence for it here, state.state. Thoughts?

@jasongrout
Copy link
Member Author

(I published this jupyter-widgets-schema as 0.3.0-beta.0)

@jasongrout
Copy link
Member Author

Another thing that would be backwards-incompatible changes, but now might be the right time to do it: how about we have the version as a single field:

version: [2,0] or version: {major: 2, minor: 0}. That's quite a bit cleaner, I think.

@jasongrout
Copy link
Member Author

jasongrout commented Apr 20, 2017

Another thing to consider: I think it might type better to have the widget manager state be an array of objects, each of which has an 'id' key, and absorb the model name/version/module back into the widget state, where it exists anyway...

state: [
  {id: 'uuid', state: {
    _module_name: 'name', 
    _model_module_version: '1', 
    _model_module: 'module', 
    other_state: 'some other state'
  }},
... 
]

Still thinking about these, though.

@maartenbreddels
Copy link
Member

I'm testing it, but it doesn't seem to work with this line (in embed-webpack.ts):

var widget_state_schema = require('jupyter-widgets-schema').v2.state;

only v1 is defined, any idea what is going on?

@jasongrout
Copy link
Member Author

jasongrout commented Apr 21, 2017 via email

@jasongrout
Copy link
Member Author

only v1 is defined, any idea what is going on?

Should hopefully be fixed now.

@maartenbreddels
Copy link
Member

I don't think I can modify or do a PR to this PR, in fact I did, but it went here

Thanks to @maartenbreddels for his changes at #1
@maartenbreddels
Copy link
Member

embedding is quite broken now
If I do

import ipywidgets as widgets
widgets.Button(description="click")

The the generated code gives me this:

<script src="https://unpkg.com/jupyter-js-widgets@~3.0.0/dist/embed.js"></script>
<script type="application/vnd.jupyter.widget-state+json">
{
    "version_major": 2,
    "version_minor": 0,
    "state": {
        "e496e580048740cfb6203a9d4c08d9cf": {
            "model_name": "LayoutModel",
            "model_module": "jupyter-js-widgets",
            "model_module_version": "3",
            "state": {
                "_view_module_version": "3",
                "_model_module_version": "3",
                "_view_count": 1
            }
        },
        "82db7d86f07e4eb783c40dd249c7ad4a": {
            "model_name": "ButtonStyleModel",
            "model_module": "jupyter-js-widgets",
            "model_module_version": "3",
            "state": {
                "_view_module_version": "3",
                "_model_module_version": "3",
                "_view_count": 1
            }
        },
        "1d5f7f75499049c98375918b8cfd3d2a": {
            "model_name": "ButtonModel",
            "model_module": "jupyter-js-widgets",
            "model_module_version": "3",
            "state": {
                "_view_module_version": "3",
                "description": "click",
                "style": "IPY_MODEL_82db7d86f07e4eb783c40dd249c7ad4a",
                "_view_count": 1,
                "layout": "IPY_MODEL_e496e580048740cfb6203a9d4c08d9cf",
                "_model_module_version": "3"
            }
        }
    }
}
</script>

For testing I just copy the embed.js to the directory where the html file where I put this code in resides , and change the src tag to 'embed.js'. What I missing are I believe the views, and the script tags for the visible views.

@jasongrout
Copy link
Member Author

Yes, it looks like we're missing some script tags that look like this in the old version:

<script type="application/vnd.jupyter.widget-view+json">
{
    "model_id": "21150b295bd84591adc9724e747e7f9f"
}
</script>

@maartenbreddels
Copy link
Member

Ok, couldn't resist, seems to work now using this PR. This doesn't really seem like a good workflow, where I do a PR to your repo, is there a proper way to do this in github?

jasongrout and others added 2 commits April 25, 2017 15:51
Also, make sure the display_data type actually conforms to the mimetype spec.
@jasongrout
Copy link
Member Author

You should be able to push directly to my embeddedoutput2 branch, since the checkbox up above to the right is checked (" Allow edits from maintainers.")

git push git@github.com:jasongrout/ipywidgets.git embeddedoutput2 I think should work.

@maartenbreddels
Copy link
Member

I don't think I am a maintainer right?

@maartenbreddels
Copy link
Member

Exciting to see this progress btw, I can clean up quite some code in ipyvolume soon, using binary transfer only from now on.

@jasongrout
Copy link
Member Author

I don't think I am a maintainer right?

Fixed!

@jasongrout
Copy link
Member Author

seems to work now

Feel free to merge when it looks good to you. I'm working on the package version PR, and then I think it will be time to release a beta (or alpha?) and shift focus to testing and docs.

@jasongrout
Copy link
Member Author

seems to work now

Great. I'll merge this and we can iterate.

@jasongrout jasongrout merged commit 30d4de0 into jupyter-widgets:master Apr 26, 2017
@maartenbreddels
Copy link
Member

Ok, I wanted to add some changes, but I'll do that in a separate PR, the Python version of this needs some changes.

@github-actions github-actions bot added the resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion. label Feb 15, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants