This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Implement Expression.serialize() (issue #10174) #11156

Merged 3 commits on Feb 16, 2018

ChrisLoer
Contributor

@ChrisLoer ChrisLoer commented Feb 8, 2018

Fix to issue #10714.

  • Each expression stores its "operator" as a string, and default serialization is [operator, serialize(child1), ...]
  • For compound expressions, store the operator name in the Signature object
  • Custom implementations of serialize for Expression types that don't follow the pattern
  • expression::Value -> mbgl::Value converter
  • node_expression bindings to expose serialize
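The default pattern in the list above can be sketched as follows. This is a minimal illustration, not the PR's code: `Value`, `toText`, `StringLiteral`, `Compound`, and `serializeExample` are hypothetical stand-ins (the real change works with `mbgl::Value` and the existing expression classes), and a virtual `children()` accessor is assumed in place of whatever child-iteration API the real classes use.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for mbgl::Value: either a string atom or a list.
struct Value {
    std::string atom;
    std::vector<Value> items;
    bool isList = false;

    static Value str(std::string s) { Value v; v.atom = std::move(s); return v; }
    static Value list(std::vector<Value> xs) {
        Value v; v.items = std::move(xs); v.isList = true; return v;
    }
};

// Renders a Value as JSON-like text (no string escaping; illustration only).
std::string toText(const Value& v) {
    if (!v.isList) return "\"" + v.atom + "\"";
    std::string out = "[";
    for (std::size_t i = 0; i < v.items.size(); ++i) {
        if (i > 0) out += ",";
        out += toText(v.items[i]);
    }
    return out + "]";
}

class Expression {
public:
    virtual ~Expression() = default;
    virtual std::string getOperator() const = 0;
    virtual std::vector<const Expression*> children() const { return {}; }

    // Default pattern: [operator, serialize(child1), ...].
    virtual Value serialize() const {
        std::vector<Value> out;
        out.push_back(Value::str(getOperator()));
        for (const Expression* child : children()) out.push_back(child->serialize());
        return Value::list(std::move(out));
    }
};

// A leaf that overrides serialize() because it does not follow the pattern.
class StringLiteral : public Expression {
public:
    explicit StringLiteral(std::string s) : value(std::move(s)) {}
    std::string getOperator() const override { return "literal"; }
    Value serialize() const override { return Value::str(value); }
private:
    std::string value;
};

// A compound expression that relies on the default serialize().
class Compound : public Expression {
public:
    Compound(std::string name_, std::vector<std::unique_ptr<Expression>> args_)
        : name(std::move(name_)), args(std::move(args_)) {}
    std::string getOperator() const override { return name; }
    std::vector<const Expression*> children() const override {
        std::vector<const Expression*> out;
        for (const auto& arg : args) out.push_back(arg.get());
        return out;
    }
private:
    std::string name;
    std::vector<std::unique_ptr<Expression>> args;
};

// Builds ["concat", ["get", "name"], "!"] and returns its serialized text.
std::string serializeExample() {
    std::vector<std::unique_ptr<Expression>> getArgs;
    getArgs.push_back(std::make_unique<StringLiteral>("name"));
    std::vector<std::unique_ptr<Expression>> concatArgs;
    concatArgs.push_back(std::make_unique<Compound>("get", std::move(getArgs)));
    concatArgs.push_back(std::make_unique<StringLiteral>("!"));
    Compound concat("concat", std::move(concatArgs));
    return toText(concat.serialize());
}
```

Only the types that break the pattern need their own `serialize()`; everything else inherits the default.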

TODO:

  • Figure out right treatment of literals: are there times we need to wrap them in "literal" to preserve meaning? (@anandthakker I think you already explained this to me but I'm having trouble finding where)
  • Figure out appropriate constant folding behavior -- current implementation saves off a serialization of expressions before applying constant folding, which is useful for debugging, but from discussion with @1ec5 it sounds like this may not be important (and it's a perf optimization not to hold on to extra serializations).
  • Benchmarking: ideally this should impose almost no cost outside of the actual serialization calls
  • Assertions and coercions: the serialization is maximally verbose right now, which is a little bit ugly, but maybe it's fine to just leave them in since sometimes they're necessary for a correct round-trip?
  • Testing: the plan @anandthakker and I discussed was something along the lines of: build an expected serialization into each test.json and wire up the harness to check the serializations.
  • Testing: maybe re-evaluate after round-tripping through the serialization to make sure we get the same value?
  • Factor out common patterns in serialize implementations? Figure out what to do with unused getOperator? Do a pass to minimize unnecessary copy construction...

/cc @anandthakker @1ec5

@ChrisLoer ChrisLoer added the ⚠️ DO NOT MERGE Work in progress, proof of concept, or on hold label Feb 8, 2018
: Expression(type_)
, value(value_)
, serialized(std::move(serialized_))
{}
Contributor

I'd say let's not add this extra bit of storage / complexity for now, and just serialize folded literals as-is.

@@ -35,6 +35,23 @@ ParseResult Assertion::parse(const Convertible& value, ParsingContext& ctx) {
return ParseResult(std::make_unique<Assertion>(it->second, std::move(parsed)));
}

const char* Assertion::getOperator() const {
// TODO: Should we strip assertions entirely from serialization?
Contributor

We should leave them -- as of now, we don't track whether an assertion was automatically inferred or actually written as part of the expression. If the latter, it could easily affect the expression's meaning.

} else {
assert(false);
return "";
}
Contributor

This can just be:

return type::toString(getType());

Contributor Author

Not with the char * interface because there wouldn't be anyone to own the string. But yeah maybe getOperator should just return a std::string...
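The ownership concern can be illustrated with a small sketch. `typeName` and `AssertionSketch` are hypothetical stand-ins for `type::toString(getType())` and the real expression class; the point is only that a freshly built string has no owner once a `const char*` to it is returned.

```cpp
#include <string>

// Hypothetical stand-in for type::toString(getType()): builds a new string.
std::string typeName(bool isNumber) {
    return isNumber ? std::string("number") : std::string("string");
}

struct AssertionSketch {
    bool isNumber;

    // Unsafe with a const char* interface: the temporary std::string is
    // destroyed when the return statement finishes, so the pointer dangles.
    // const char* getOperator() const { return typeName(isNumber).c_str(); }

    // Returning std::string by value hands ownership of the text to the caller.
    std::string getOperator() const { return typeName(isNumber); }
};
```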

} else {
assert(false);
return "";
}
Contributor

return getType().match(
  [](const type::Number&) { return "to-number"; },
  [](const type::Color&) { return "to-color"; },
  [](const auto&) { assert(false); return ""; });


EvaluationResult apply(const EvaluationContext& evaluationParameters, const Args& args) const {
return applyImpl(evaluationParameters, args, std::index_sequence_for<Params...>{});
}

- std::unique_ptr<Expression> makeExpression(const std::string& name,
+ std::unique_ptr<Expression> makeExpression(const std::string& name_,
Contributor

I think we can remove the name parameter from makeExpression now that Signatures know their own names.

serialized.emplace_back(std::vector<mbgl::Value>{{ cubicBezierTag, p1.first, p1.second, p2.first, p2.second }});
} else {
assert(false);
}
Contributor

interpolator.match(
  [&](const ExponentialInterpolator& exponential) {
    serialized.emplace_back(std::vector<mbgl::Value>{{ exponentialTag, exponential.base }});
  },
  [&](const CubicBezierInterpolator& cubicBezier) {
    // ...
  });
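For readers without the project's variant type at hand, the same exhaustive-match style can be sketched with the standard library. This assumes C++17 and uses `std::variant` with `std::visit` as a stand-in for the `match` member used above; the two structs, the `Overloaded` helper, and `interpolationTag` are hypothetical names for illustration.

```cpp
#include <string>
#include <variant>

struct ExponentialInterpolator { double base; };
struct CubicBezierInterpolator { double x1, y1, x2, y2; };
using Interpolator = std::variant<ExponentialInterpolator, CubicBezierInterpolator>;

// Helper that merges several lambdas into one visitor (C++17).
template <class... Fs> struct Overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> Overloaded(Fs...) -> Overloaded<Fs...>;

// Returns the interpolation tag for each alternative. There is no fallback
// branch: the visitor must handle every alternative or the code will not compile.
std::string interpolationTag(const Interpolator& interpolator) {
    return std::visit(Overloaded{
        [](const ExponentialInterpolator&) { return std::string("exponential"); },
        [](const CubicBezierInterpolator&) { return std::string("cubic-bezier"); }
    }, interpolator);
}
```

Compared with an `is<T>()` / `get<T>()` chain ending in `assert(false)`, a missing case becomes a compile-time error rather than a runtime assertion.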

if (interpolator.is<ExponentialInterpolator>()) {
const ExponentialInterpolator& exponential = interpolator.get<ExponentialInterpolator>();
// TODO: Could do a separate serialization for "linear", but probably no need?
serialized.emplace_back(std::vector<mbgl::Value>{{ exponentialTag, exponential.base }});
Contributor

Is there any downside to just saying std::vector<mbgl::Value>{{ "exponential", exponential.base }}? The string's gotta get copied into the vector anyway, right?

Contributor Author

The strings are really just there to help the std::vector<mbgl::Value> constructor deduce the type -- a plain char* is ambiguous for it.

// static const std::string literalTag = "literal";
// result.emplace_back(literalTag);
// result.emplace_back(serialized ? *serialized : *fromExpressionValue<mbgl::Value>(value));
// return result;
Contributor

^ yeah, this is what we need for type::Array and type::Object literals.
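The reason array literals need the wrapper is that a bare JSON array in the expression language is read as an operator call. A rough sketch of that rule, restricted to numbers, might look like this; `LiteralSketch`, `numberText`, and `serializeLiteral` are hypothetical and produce text rather than an `mbgl::Value`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical literal holding either one number or an array of numbers.
struct LiteralSketch {
    bool isArray;
    std::vector<double> numbers;
};

// Formats a number and trims trailing zeros (illustration only).
std::string numberText(double n) {
    std::string s = std::to_string(n);        // e.g. "1.000000"
    s.erase(s.find_last_not_of('0') + 1);     // e.g. "1."
    if (!s.empty() && s.back() == '.') s.pop_back();
    return s;
}

// Scalars are emitted as-is; arrays are wrapped as ["literal", [...]] so that
// a parser does not mistake the array for an expression call.
std::string serializeLiteral(const LiteralSketch& literal) {
    if (!literal.isArray) {
        assert(literal.numbers.size() == 1);
        return numberText(literal.numbers[0]);
    }
    std::string out = "[\"literal\",[";
    for (std::size_t i = 0; i < literal.numbers.size(); ++i) {
        if (i > 0) out += ",";
        out += numberText(literal.numbers[i]);
    }
    return out + "]]";
}
```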

};
serialized.emplace_back(otherwise->serialize());
return serialized;
}
Contributor

Note that this could serialize some expressions much more verbosely:

[
  "match",
  [1, 2, 3, 4, 5], some_complex_expression,
  "otherwise"
]

would become:

[
  "match",
  1, some_complex_expression,
  2, some_complex_expression,
  3, some_complex_expression,
  4, some_complex_expression,
  5, some_complex_expression,
  "otherwise"
]

That said, I'm not sure whether it's really worth it to produce the more compact form.

Contributor Author

Yeah, I bucketed this under "OK as long as it's semantically equivalent". I'm inclined to leave as is in the name of simplicity, unless you have a hunch that we may actually see a fair number of expressions that blow up in size with the flattened format...

Contributor

> unless you have a hunch that we may actually see a fair number of expressions that blow up in size with the flattened format

I wasn't thinking we needed to worry about this, but on second thought, I think we actually probably should consolidate 😬

@ChrisLoer
Contributor Author

I've hacked expression.test.js and GL JS's integration/lib/expression.js to test that every one of the integration tests has the same output when recompiled after round-tripping. That worked pretty well for catching two small mistakes (I hadn't implemented "!=" or Color serialization properly).

There's one test failure to deal with right now: array basic fails because the inferred typing makes it all the way up to the GetType public interface:

-   "type": "array"
+   "type": "array<number, 3>"

I could just make the test harness ignore types, but not sure if we want to treat a top-level type as being more meaningful.

@anandthakker
Contributor

@ChrisLoer yeah, we can probably just ignore the type for the roundtripped expression. (We could use checkSubtype to check that its type satisfies the expected type without necessarily being identical to it... but that's probably unnecessary)

I still think we should add a "serializedExpression:" field to the test fixtures so we can explicitly verify that expressions are being serialized as expected.

@ChrisLoer ChrisLoer removed the ⚠️ DO NOT MERGE Work in progress, proof of concept, or on hold label Feb 12, 2018
@ChrisLoer ChrisLoer changed the title WIP: expression serialization Implement Expression.serialize() (issue #10174) Feb 12, 2018
@ChrisLoer
Contributor Author

Nothing jumps out at me from the benchmark results:

Benchmark results on serialize-expression

Benchmark                                              Time           CPU Iterations
API_queryRenderedFeaturesAll                  1677392083 ns 1671001000 ns          1
API_queryRenderedFeaturesLayerFromLowDensity         860 ns        851 ns     814436
API_queryRenderedFeaturesLayerFromHighDensity   36959291 ns   36555050 ns         20
API_renderStill_reuse_map                      188499123 ns  179758000 ns          3
API_renderStill_reuse_map_switch_styles        495481124 ns  437384000 ns          2
API_renderStill_recreate_map                   627526565 ns  551854000 ns          1
Parse_CameraFunction/1                             17420 ns      17299 ns      37274 1
Parse_CameraFunction/2                             20459 ns      20406 ns      33777 2
Parse_CameraFunction/4                             27162 ns      27178 ns      25943 4
Parse_CameraFunction/6                             34418 ns      34407 ns      20481 6
Parse_CameraFunction/8                             41021 ns      40985 ns      16968 8
Parse_CameraFunction/10                            47936 ns      47907 ns      14600 10
Parse_CameraFunction/12                            55362 ns      55243 ns      12859 12
Evaluate_CameraFunction/1                            388 ns        386 ns    1809033 1
Evaluate_CameraFunction/2                            532 ns        531 ns    1246572 2
Evaluate_CameraFunction/4                            621 ns        619 ns    1138582 4
Evaluate_CameraFunction/6                            637 ns        635 ns    1112842 6
Evaluate_CameraFunction/8                            640 ns        638 ns    1103701 8
Evaluate_CameraFunction/10                           648 ns        646 ns    1076178 10
Evaluate_CameraFunction/12                           651 ns        649 ns    1071992 12
Parse_CompositeFunction/1                          32756 ns      32845 ns      21304 1
Parse_CompositeFunction/2                          59061 ns      59014 ns      12035 2
Parse_CompositeFunction/4                         149438 ns     149154 ns       4638 4
Parse_CompositeFunction/6                         290048 ns     289436 ns       2392 6
Parse_CompositeFunction/8                         482943 ns     481977 ns       1469 8
Parse_CompositeFunction/10                        724908 ns     723302 ns        922 10
Parse_CompositeFunction/12                       1028208 ns    1025422 ns        680 12
Evaluate_CompositeFunction/1                        2244 ns       2237 ns     313157 1
Evaluate_CompositeFunction/2                        3083 ns       3075 ns     227839 2
Evaluate_CompositeFunction/4                        3542 ns       3530 ns     198521 4
Evaluate_CompositeFunction/6                        3718 ns       3708 ns     187109 6
Evaluate_CompositeFunction/8                        3867 ns       3855 ns     184629 8
Evaluate_CompositeFunction/10                       3777 ns       3768 ns     183696 10
Evaluate_CompositeFunction/12                       3922 ns       3912 ns     181017 12
Parse_SourceFunction/1                             22256 ns      22319 ns      31598 1
Parse_SourceFunction/2                             25312 ns      25381 ns      27225 2
Parse_SourceFunction/4                             31661 ns      31707 ns      21893 4
Parse_SourceFunction/6                             38011 ns      38041 ns      18187 6
Parse_SourceFunction/8                             44486 ns      44498 ns      15623 8
Parse_SourceFunction/10                            51073 ns      51080 ns      13628 10
Parse_SourceFunction/12                            57200 ns      57201 ns      11750 12
Evaluate_SourceFunction/1                           1881 ns       1874 ns     371189 1
Evaluate_SourceFunction/2                           2050 ns       2045 ns     336467 2
Evaluate_SourceFunction/4                           2083 ns       2078 ns     329031 4
Evaluate_SourceFunction/6                           2153 ns       2148 ns     327222 6
Evaluate_SourceFunction/8                           2162 ns       2157 ns     317555 8
Evaluate_SourceFunction/10                          2154 ns       2149 ns     323658 10
Evaluate_SourceFunction/12                          2182 ns       2174 ns     321749 12
Parse_Filter                                        2770 ns       2763 ns     252358
Parse_EvaluateFilter                                 322 ns        321 ns    2215183
TileMaskGeneration                                  8289 ns       8270 ns      84217
Parse_VectorTile                                 6539642 ns    6521743 ns        109
Util_dtoa                                           5215 ns       5199 ns     134455
Util_standardDtoa                                   2775 ns       2768 ns     255721
Util_dtoaLimits                                      845 ns        843 ns     823578
Util_standardDtoaLimits                            35911 ns      35803 ns      19942

Benchmark results on master

Benchmark                                              Time           CPU Iterations
API_queryRenderedFeaturesAll                  2045191981 ns 2020557000 ns          1
API_queryRenderedFeaturesLayerFromLowDensity        1100 ns       1073 ns     601902
API_queryRenderedFeaturesLayerFromHighDensity   45573861 ns   44903667 ns         15
API_renderStill_reuse_map                      616806352 ns  578550000 ns          1
API_renderStill_reuse_map_switch_styles        632038727 ns  577438000 ns          1
API_renderStill_recreate_map                   639489023 ns  598645000 ns          1
Parse_CameraFunction/1                             23655 ns      23601 ns      29654 1
Parse_CameraFunction/2                             27152 ns      27107 ns      25326 2
Parse_CameraFunction/4                             34428 ns      34356 ns      20377 4
Parse_CameraFunction/6                             41619 ns      41491 ns      16784 6
Parse_CameraFunction/8                             50079 ns      49828 ns      13583 8
Parse_CameraFunction/10                            57831 ns      57594 ns      11007 10
Parse_CameraFunction/12                            66445 ns      66082 ns       9976 12
Evaluate_CameraFunction/1                            433 ns        430 ns    1622210 1
Evaluate_CameraFunction/2                            608 ns        605 ns    1109139 2
Evaluate_CameraFunction/4                            683 ns        678 ns     951798 4
Evaluate_CameraFunction/6                            736 ns        732 ns     957501 6
Evaluate_CameraFunction/8                            740 ns        735 ns     952459 8
Evaluate_CameraFunction/10                           743 ns        738 ns     890925 10
Evaluate_CameraFunction/12                           789 ns        781 ns     893336 12
Parse_CompositeFunction/1                          42278 ns      42187 ns      16676 1
Parse_CompositeFunction/2                          73031 ns      72800 ns       9252 2
Parse_CompositeFunction/4                         180944 ns     180247 ns       3820 4
Parse_CompositeFunction/6                         340672 ns     336690 ns       2015 6
Parse_CompositeFunction/8                         540393 ns     537132 ns       1247 8
Parse_CompositeFunction/10                        761852 ns     759030 ns        866 10
Parse_CompositeFunction/12                       1059845 ns    1055145 ns        653 12
Evaluate_CompositeFunction/1                        2322 ns       2308 ns     297471 1
Evaluate_CompositeFunction/2                        3256 ns       3239 ns     211478 2
Evaluate_CompositeFunction/4                        3802 ns       3781 ns     187518 4
Evaluate_CompositeFunction/6                        3965 ns       3942 ns     173135 6
Evaluate_CompositeFunction/8                        4118 ns       4091 ns     172193 8
Evaluate_CompositeFunction/10                       4033 ns       4013 ns     171778 10
Evaluate_CompositeFunction/12                       4136 ns       4111 ns     167271 12
Parse_SourceFunction/1                             24221 ns      24199 ns      27937 1
Parse_SourceFunction/2                             27493 ns      27524 ns      25561 2
Parse_SourceFunction/4                             34395 ns      34380 ns      20256 4
Parse_SourceFunction/6                             42039 ns      41938 ns      16975 6
Parse_SourceFunction/8                             48458 ns      48391 ns      14076 8
Parse_SourceFunction/10                            55795 ns      55662 ns      12046 10
Parse_SourceFunction/12                            60845 ns      60673 ns      11276 12
Evaluate_SourceFunction/1                           2006 ns       1993 ns     354256 1
Evaluate_SourceFunction/2                           2196 ns       2186 ns     320988 2
Evaluate_SourceFunction/4                           2239 ns       2229 ns     312032 4
Evaluate_SourceFunction/6                           2357 ns       2341 ns     305629 6
Evaluate_SourceFunction/8                           2301 ns       2287 ns     297495 8
Evaluate_SourceFunction/10                          2332 ns       2320 ns     299946 10
Evaluate_SourceFunction/12                          2445 ns       2436 ns     304324 12
Parse_Filter                                        2945 ns       2926 ns     234178
Parse_EvaluateFilter                                 330 ns        329 ns    2124283
TileMaskGeneration                                  8766 ns       8715 ns      80613
Parse_VectorTile                                 6822576 ns    6788694 ns         98
Util_dtoa                                           5335 ns       5306 ns     134241
Util_standardDtoa                                   2958 ns       2945 ns     235765
Util_dtoaLimits                                      882 ns        877 ns     799534
Util_standardDtoaLimits                            38164 ns      37982 ns      18650

@ChrisLoer
Contributor Author

Even bootstrapping the expected serialization exposed what I think is a legitimate problem (surprisingly not related to the "weirdjsonissue" in the test):

  [
    "match",
    [
      "string",
      [
        "get",
        "x"
      ]
    ],
    "weird
json
issue",
    "match",
+   "a.b",
    "match",
    "{}",
    "match",
-   "a.b",
    "match",
    "0-1",
    "match",
    "otherwise"
  ]

Our serialization of Match depends on the order of iteration over a std::unordered_map (and that ended up being different on different CI platforms). While the order of the serialization doesn't change the meaning of the expression, we should probably sort the output just so that serializations stay more consistent.
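One way to get a platform-independent order is to copy the branches out of the hash map and sort them before emitting anything. The sketch below assumes a simplified label-to-output map of strings; `sortedBranches` is a hypothetical helper, not a function from the PR.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Copies the branches of a hypothetical label -> output map into a vector
// sorted by label, so the result does not depend on hash iteration order.
std::vector<std::pair<std::string, std::string>>
sortedBranches(const std::unordered_map<std::string, std::string>& branches) {
    std::vector<std::pair<std::string, std::string>> sorted(branches.begin(), branches.end());
    std::sort(sorted.begin(), sorted.end());
    return sorted;
}
```

With the labels from the failing fixture, the sorted order would be "0-1", "a.b", "{}" on every platform, since `std::string` comparison is by character code.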

@ChrisLoer
Contributor Author

Possibly Match::eachChild should also be changed to have a defined branch iteration order?

I did some looking for cases where the order might matter and didn't find any, but I came across MGLJSONObjectFromMBGLExpression -- I feel silly, but I didn't realize you had already implemented a JSON serialization at the SDK level, @1ec5! Luckily our logic seems to mostly agree. I take it the plan is to re-implement that on top of Expression::serialize() once it's available?

Contributor

@anandthakker anandthakker left a comment

@ChrisLoer this is looking great to me overall, but I'm having doubts about the match collapsing issue...

mbgl::Value ArrayAssertion::serialize() const {
std::vector<mbgl::Value> serialized;
serialized.emplace_back(getOperator());
getType().match(
Contributor

In this case, since we know it's a type::Array, you can do const auto& array = getType().get<type::Array>() instead of going through match

@@ -176,14 +175,14 @@ struct Signature<Lambda, std::enable_if_t<std::is_class<Lambda>::value>>
using Definition = CompoundExpressionRegistry::Definition;

template <typename Fn>
- static std::unique_ptr<detail::SignatureBase> makeSignature(Fn evaluateFunction) {
- return std::make_unique<detail::Signature<Fn>>(evaluateFunction);
+ static std::unique_ptr<detail::SignatureBase> makeSignature(Fn evaluateFunction, const std::string& name) {
Contributor

Should we use a non-reference std::string and std::move here (and similarly in the Signature constructors above)?
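A sketch of the by-value-and-move idiom being suggested, with hypothetical names (`SignatureSketch`, `makeSignatureSketch`) standing in for the real `Signature` constructors and `makeSignature`:

```cpp
#include <string>
#include <utility>

struct SignatureSketch {
    // Taking the name by value lets callers move a temporary in; the
    // constructor then moves it into the member instead of copying again.
    explicit SignatureSketch(std::string name_) : name(std::move(name_)) {}
    std::string name;
};

// The factory forwards its by-value parameter with std::move as well.
SignatureSketch makeSignatureSketch(std::string name) {
    return SignatureSketch(std::move(name));
}
```

When the caller passes a temporary, this avoids the copy that a `const std::string&` parameter would force when the string is stored.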

},
[&](const auto&) {
assert(false);
}
Contributor

This noop branch shouldn't be necessary, because the specific cases above cover every member type of the variant.

@ChrisLoer
Contributor Author

OK, I implemented label coalescing for shared output expressions in Match. The code turned out a bit funky because of the "predictable sort order" requirement, but hopefully it's not too confusing.
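The coalescing idea can be sketched independently of the real Match class. This is an assumption-laden illustration: outputs are represented by strings and compared by value, whereas the actual implementation would presumably detect shared output expressions; `Labels` and `coalesceBranches` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Labels = std::vector<int64_t>;

// Groups the labels of a hypothetical label -> output map by shared output.
// Labels are sorted inside each group, and groups are ordered by their
// smallest label, so the result is independent of hash iteration order.
std::vector<std::pair<Labels, std::string>>
coalesceBranches(const std::unordered_map<int64_t, std::string>& branches) {
    std::map<std::string, Labels> byOutput;
    for (const auto& branch : branches) {
        byOutput[branch.second].push_back(branch.first);
    }

    std::vector<std::pair<Labels, std::string>> groups;
    for (auto& entry : byOutput) {
        std::sort(entry.second.begin(), entry.second.end());
        groups.emplace_back(std::move(entry.second), entry.first);
    }
    // Pairs compare by label list first; groups are disjoint and non-empty,
    // so this orders them by their smallest label.
    std::sort(groups.begin(), groups.end());
    return groups;
}
```

Each resulting group corresponds to one `[labels], output` pair in the serialized match expression, which recovers the compact form discussed earlier in the review.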

@1ec5
Contributor

1ec5 commented Feb 15, 2018

> I did some looking for cases where the order might matter and didn't find any, but I came across MGLJSONObjectFromMBGLExpression -- I feel silly, but I didn't realize you had already implemented a JSON serialization at the SDK level, @1ec5! Luckily our logic seems to mostly agree. I take it the plan is to re-implement that on top of Expression::serialize() once it's available?

Sorry, I mentioned this transformation in #10726 but should also have mentioned it in #10714 and #10944. I view MGLJSONObjectFromMBGLExpression() as an incomplete, slapdash implementation that will be removed with prejudice once this PR lands.

@1ec5
Contributor

1ec5 commented Feb 15, 2018

Do you think serialize() would be a good foundation for #10944, which is needed for #10713? For #10713, we’d like to traverse the expression tree, replacing occurrences of e.g. ["get", "name_en"] with ["var", "localized_name"].

@ChrisLoer
Contributor Author

> Do you think serialize() would be a good foundation for #10944

I think I'd have to talk to @anandthakker to get a brain dump on what "expression transformations" might mean -- definitely interested in taking that on after landing this PR. My first reaction would be to try to do something built on top of the serialization, as you suggested in #10944 (comment). But I think maybe I need to understand more use cases -- right now all I have in mind is "find and replace string values within the expression".
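As a rough idea of what a find-and-replace built on top of the serialized form could look like, here is a sketch over a simplified tree. `Node`, `makeAtom`, `makeList`, `sameTree`, and `replaceAll` are all hypothetical; a real version would operate on `mbgl::Value` and re-parse the result into an expression.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialized expression: a string atom or a list.
struct Node {
    std::string atom;
    std::vector<Node> items;
    bool isList = false;
};

Node makeAtom(std::string s) { Node n; n.atom = std::move(s); return n; }
Node makeList(std::vector<Node> xs) { Node n; n.items = std::move(xs); n.isList = true; return n; }

// Structural equality of two trees.
bool sameTree(const Node& a, const Node& b) {
    if (a.isList != b.isList) return false;
    if (!a.isList) return a.atom == b.atom;
    if (a.items.size() != b.items.size()) return false;
    for (std::size_t i = 0; i < a.items.size(); ++i) {
        if (!sameTree(a.items[i], b.items[i])) return false;
    }
    return true;
}

// Returns a copy of `tree` with every subtree equal to `from` replaced by `to`.
Node replaceAll(const Node& tree, const Node& from, const Node& to) {
    if (sameTree(tree, from)) return to;
    if (!tree.isList) return tree;
    std::vector<Node> out;
    for (const Node& item : tree.items) out.push_back(replaceAll(item, from, to));
    return makeList(std::move(out));
}
```

Replacing `["get", "name_en"]` with `["var", "localized_name"]` then amounts to one `replaceAll` call on the serialized tree, which matches the "find and replace" use case mentioned above but says nothing about more structural transformations.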

Issue #10714
- Each expression stores its operator as a string, and default serialization is [operator, serialize(child1), ...]
- Custom implementations of `serialize` for Expression types that don't follow the pattern
- expression::Value -> mbgl::Value converter
- node_expression bindings to expose `serialize`
 - Round-tripping expressions through serialization and checking that outputs don't change
 - Checking expression serialization against expected value from fixture
@friedbunny friedbunny added the Core The cross-platform C++ core, aka mbgl label Feb 19, 2018
@ChrisLoer ChrisLoer restored the serialize-expression branch February 20, 2018 20:19
@jfirebaugh jfirebaugh deleted the serialize-expression branch July 27, 2018 22:46