Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protocol based js<->clj key value mapping #94

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

vizanto
Copy link

@vizanto vizanto commented Nov 19, 2022

This PR introduces the BeanContext, mainly for use in our malli-ts library to enable low overhead (simple benchmarks show ~10%) mapping of custom/namespaced keywords to javascript properties.

See also:


We have 2 reasons to make this a protocol:

  1. Remove the overhead of allocating 3 partials on every access (prop->key, key->prop, transform)
  2. Add (a lot) more context to transform, allowing us to lookup the malli schema for the key that needs to be transformed

To remain backwards compatible, we didn't change the transform option signature.
Instead if a call to ->clj is supplied with the :context option, it's used instead.

The transform (transform [_ v p k n] ... gets called with:

  • value
  • js property
  • cljs keyword
  • nth index (array/vector access) or nil (map access)

In malli-ts we create a context implementation that wraps a "property/keyword → malli :map schema mapping" (also following :refs) and use the additional context to find the nested schema defined keyword mapping. Perhaps in the future we will add support for coercing values lazily.

The default options have a static reified instance, for example for the default keywordize behavior:

(def ^:private keywordize-ctx
  (reify BeanContext
    (keywords? [_] true)
    (key->prop [_ x] (default-key->prop x))
    (prop->key [_ prop] (keyword prop))
    (transform [ctx v _ _ _] (->val ctx v))))

@mfikes
Copy link
Owner

mfikes commented Nov 19, 2022

Cool. Will take a look at this. Seems like a mixture of perf enhancements along with some new public API.

The new public API being committed is the bit that I'd like to best understand. Two immediate thoughts:

  1. Was primitive? meant to be made public?
  2. I suppose this would affect doc strings and other documentation.

@vizanto
Copy link
Author

vizanto commented Nov 19, 2022

primitive? is pretty useful, but we ended up not needing it anymore.
We were also discussing to maybe move the gory details into an impl namespace and make everything public there.

I haven't updated the docstrings yet, but the only new option is :context.

@mfikes
Copy link
Owner

mfikes commented Nov 19, 2022

@vizanto Are there new public types being added? Can K-Transform, etc. be made private?

I suppose BeanContext and bean-context are meant to be public... they seem to be at the core of this PR.

@flow-danny flow-danny force-pushed the bean-context-protocol branch from 11bfbdd to 4d8c604 Compare November 19, 2022 17:57
@vizanto
Copy link
Author

vizanto commented Nov 19, 2022

Made the types private now, bean-context would be useful to keep public so that one can reuse the instance

(compatible-value? v recursive?))))

(deftype ^:private TransientBean [^:mutable ^boolean editable?
obj prop->key key->prop transform ^boolean recursive?
obj ^BeanContext ctx ^boolean recursive?
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type hint is probably not doing anything

@mfikes
Copy link
Owner

mfikes commented Nov 19, 2022

I'm attempting to get my mind around the problems that this solves, or the extra ability that this affords by way of example use.

Let's say you had things set up where you had a couple of mapping functions (these are taken from the unit tests):

cljs.user=> (require '[cljs-bean.core :refer [->clj ->js]])
nil
cljs.user=> (defn prop->key [prop]
  (cond-> prop
    (some? (re-matches #"[A-Za-z_\*\+\?!\-'][\w\*\+\?!\-']*" prop)) keyword))
#'cljs.user/prop->key
cljs.user=> (defn key->prop [key]
  (cond
    (simple-keyword? key) (name key)
    (and (string? key) (string? (prop->key key))) key
    :else nil))
#'cljs.user/key->prop

And you had a value you wanted to convert from JavaScript to ClojureScript:

cljs.user=> (def js #js [#js {:a 1, "a/b" 2, "d" 3 "v" #js [#js {:c 2 "d" 4 "x/y" 7}]}])
#'cljs.user/js

With this setup, the default mapping would look like:

cljs.user=> (->clj js)
[{:a 1, :a/b 2, :d 3, :v [{:c 2, :d 4, :x/y 7}]}]

You can override it with the existing public API as follows:

cljs.user=>(->clj js :prop->key prop->key :key->prop key->prop)
[{:a 1, "a/b" 2, :d 3, :v [{:c 2, :d 4, "x/y" 7}]}]

If you wanted to define a single context value that encapsulates all of this, with the current public API you could do

cljs.user=> (def ctx {:prop->key prop->key :key->prop key->prop})
#'cljs.user/ctx

and then achieve the same result by simply passing this context map:

cljs.user=> (->clj js ctx)
[{:a 1, "a/b" 2, :d 3, :v [{:c 2, :d 4, "x/y" 7}]}]

Is it mainly that this PR, by introducing a protocol, speeds up things by 10%? (Really, a perf benefit.)

Or is having a protocol available adding some fundamental new capability over a map-based (or keyword abs-based) approach? I'm trying to think of new use cases that this enables.

(Even if it doesn't enable new use cases, it would be interesting if using a protocol speeds things up... an attempt was done with the ClojureScript compiler itself in the past where its internal data structure, which is a map, was heavily "protocolized" in an experiment to see if it sped up compilation, but interestingly the results with that experiment were negative.)

@flow-danny
Copy link

flow-danny commented Nov 19, 2022 via email

@mfikes
Copy link
Owner

mfikes commented Nov 19, 2022

Ahh, right... somehow I missed that this is more than just "protocolizing" things... it is the extra stuff going on with transform which is a big part of the PR.

Feels like two (separate?) things going on (sorry for being slow to comprehend the big picture):

  1. Whereas the existing :transform function takes a single argument, this PR is conceptually extending that, providing much more context that allows the transform to be more refined by leveraging that additional context.
  2. Introducing a protocol that can be used instead of a keyword args

The first appears to be mostly about new functionality and the second appears to be mostly about perf. Not suggesting this, but it makes me wonder if these could in theory be separate PRs. But, perhaps it is difficult to separate the two things because the protocol aspect is helping with the "extension of the transform" aspect.

@flow-danny
Copy link

The first commit in this PR is indeed passing the context as extra arguments to the existing transform option, but that would break existing library users. (transform arity change)

The added performance benefit and making it a non breaking change is why we decided to open the PR 😀

@mfikes
Copy link
Owner

mfikes commented Nov 20, 2022

If there is a clean way to supply additional contextual information to the transform function without breaking existing clients, that would be a small extension to the public API, much easier to assess, etc.

The notion of leveraging protocols for performance could be treated as a completely independent thing.

@vizanto
Copy link
Author

vizanto commented Nov 20, 2022

Do you have any projects where you could measure a change in performance with this branch? I'm curious 😄

@mfikes
Copy link
Owner

mfikes commented Nov 20, 2022

@vizanto No I don't have any such projects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants