Add some new documentation
szeiger committed Jun 25, 2021
1 parent 4355af6 commit 972b41a
Showing 7 changed files with 57 additions and 12 deletions.
51 changes: 43 additions & 8 deletions readme.md
@@ -114,17 +114,22 @@ The only stack size limit is the one of the JVM.

## Architecture

Sjsonnet is implemented as a straightforward AST interpreter. There are
roughly 3 phases:
Sjsonnet is implemented as an optimizing interpreter. There are roughly 4
phases:

- `sjsonnet.Parser`: parses an input `String` into a `sjsonnet.Expr`, which is a
[Syntax Tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree) representing
the Jsonnet document syntax, using the
[Fastparse](https://github.com/lihaoyi/fastparse) parsing library

- `sjsonnet.Evaluator`: recurses over the `sjsonnet.Expr` and converts it into a
`sjsonnet.Val`, a data structure representing the Jsonnet runtime values
(basically lazy JSON which can contain function values).
- `sjsonnet.StaticOptimizer`: a single AST transform that performs static
checking, essential rewriting (e.g. assigning indices in the symbol table for
variables) and optimizations. The result is another `sjsonnet.Expr` per input
file that can be stored in the parse cache and reused.

- `sjsonnet.Evaluator`: recurses over the `sjsonnet.Expr` produced by the
optimizer and converts it into a `sjsonnet.Val`, a data structure representing
the Jsonnet runtime values (basically lazy JSON which can contain function values).

- `sjsonnet.Materializer`: recurses over the `sjsonnet.Val` and converts it into
an output `ujson.Expr`: a non-lazy JSON structure without any remaining
@@ -143,11 +148,15 @@ Some notes on the values used in parts of the pipeline:
- `sjsonnet.Val`: essentially the JSON structure (objects, arrays, primitives)
but with two modifications. The first is that functions like
`function(a){...}` can still be present in the structure: in Jsonnet you can
pass around functions as values and call them later on. The second is that and
pass around functions as values and call them later on. The second is that
object values & array entries are *lazy*: e.g. `[error 123, 456][1]` does not
raise an error because the first (erroneous) entry of the array is un-used and
thus not evaluated.

- Classes representing literals extend `sjsonnet.Val.Literal`, which in turn extends
*both* `Expr` and `Val`. This allows the evaluator to skip over them instead of
having to convert them from one representation to the other.
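
The effect of a literal being both an `Expr` and a `Val` can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Sjsonnet's actual class hierarchy: `Num`, `Add` and `eval` are hypothetical names, and the real `Val` is an abstract class rather than a trait.

```scala
// Hypothetical miniature of the Expr/Val relationship; not sjsonnet's real classes.
object LiteralSketch {
  trait Expr // a node of the syntax tree
  trait Val  // a runtime value

  // A literal is a syntax node and a finished value at the same time.
  final case class Num(value: Double) extends Expr with Val
  final case class Add(lhs: Expr, rhs: Expr) extends Expr

  def eval(e: Expr): Val = e match {
    // Literal: hand back the very same instance, no conversion or allocation.
    case v: Val => v
    case Add(l, r) =>
      (eval(l), eval(r)) match {
        case (Num(a), Num(b)) => Num(a + b)
        case _                => throw new IllegalArgumentException("expected numbers")
      }
  }
}
```

Evaluating `Num(1.0)` in this sketch returns the node itself, which is the saving the bullet above describes.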

## Performance

Due to pervasive caching, sjsonnet is much faster than google/jsonnet. See
@@ -174,15 +183,38 @@ f59758d1904bccda99598990f582dd2e1e9ad263, while google/go-jsonnet was
benchmark in
[SjsonnetTestMain.scala](https://github.com/databricks/sjsonnet/blob/master/sjsonnet/test/src-jvm/sjsonnet/SjsonnetTestMain.scala)

Sjsonnet 0.4.0 and 0.5.0 further improve the performance significantly on our
internal benchmarks. A set of new JMH benchmarks provides detailed
performance data of an entire run (`MainBenchmark`) and the
non-evaluation-related parts (`MaterializerBenchmark`, `OptimizerBenchmark`,
`ParserBenchmark`). They can be run from the (JVM / Scala 2.13 only) sbt build.
The Sjsonnet profiler is located in the same sbt project.

The Sjsonnet command line that all of these run is defined in
`MainBenchmark.mainArgs`. You need to change it to point to a suitable input
before running a benchmark or the profiler. (For Databricks employees who
want to reproduce our benchmarks, the pre-configured command line is expected
to be run against databricks/universe @ 7cbd8d7cb071983077d41fcc34f0766d0d2a247d.)

Benchmark example:
```
sbt bench/jmh:run -jvmArgs "-XX:+UseStringDeduplication" sjsonnet.MainBenchmark
```

Profiler:
```
sbt bench/run
```

## Laziness

The Jsonnet language is *lazy*: expressions don't get evaluated unless
their value is needed, and thus even erroneous expressions do not cause
a failure if un-used. This is represented in the Sjsonnet codebase by
`sjsonnet.Val.Lazy`: a wrapper type that encapsulates an arbitrary
`sjsonnet.Lazy`: a wrapper type that encapsulates an arbitrary
computation that returns a `sjsonnet.Val`.

`sjsonnet.Val.Lazy` is used in several places, representing where
`sjsonnet.Lazy` is used in several places, representing where
laziness is present in the language:

- Inside `sjsonnet.Scope`, representing local variable name bindings
@@ -191,6 +223,9 @@ laziness is present in the language:

- Inside `sjsonnet.Val.Obj`, representing the contents of object values

`Val` extends `Lazy` so that an already computed value can be treated as
lazy without having to wrap it.

Unlike [google/jsonnet](https://github.com/google/jsonnet), Sjsonnet caches the
results of lazy computations the first time they are evaluated, avoiding
wasteful re-computation when a value is used more than once.
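
The caching behaviour, and the trick of letting `Val` extend `Lazy`, can be sketched as follows. This is a simplified illustration that assumes a `force` method guarding a `cached` field; the names `cached` and `compute` appear in the `Val.scala` hunk of this commit, while `force`'s body, `Num` and the example thunks are assumptions made for the sketch.

```scala
// Simplified sketch of cached laziness; not the actual sjsonnet implementation.
object LazySketch {
  abstract class Lazy {
    protected var cached: Val = null
    def compute(): Val
    // Run compute() at most once and remember the result.
    final def force: Val = {
      if (cached eq null) cached = compute()
      cached
    }
  }

  // An already computed value is its own cached result, so it needs no wrapper.
  abstract class Val extends Lazy {
    cached = this
    final def compute(): Val = this
  }

  final case class Num(value: Double) extends Val // hypothetical value type

  def demo(): (Val, Int) = {
    var evaluations = 0
    val thunk = new Lazy {
      def compute(): Val = { evaluations += 1; Num(42.0) }
    }
    thunk.force
    thunk.force // second use hits the cache
    // Mirrors `[error 123, 456][1]`: the failing entry is never forced.
    val failing = new Lazy {
      def compute(): Val = throw new RuntimeException("error 123")
    }
    val entries = Array[Lazy](failing, Num(456.0))
    (entries(1).force, evaluations)
  }
}
```

In this sketch `demo()` forces the thunk twice but computes it once, and indexing past the failing entry raises no error.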
1 change: 1 addition & 0 deletions sjsonnet/src/sjsonnet/ExprTransform.scala
@@ -2,6 +2,7 @@ package sjsonnet

import Expr._

/** Simple tree transformer for the AST. */
abstract class ExprTransform {

def transform(expr: Expr): Expr
1 change: 1 addition & 0 deletions sjsonnet/src/sjsonnet/Renderer.scala
@@ -177,6 +177,7 @@ class PythonRenderer(out: Writer = new java.io.StringWriter(),
}
}

/** Renderer used by std.manifestJson and std.manifestJsonEx */
case class MaterializeJsonRenderer(indent: Int = 4, escapeUnicode: Boolean = false, out: StringWriter = new StringWriter())
extends BaseCharRenderer(out, indent, escapeUnicode) {

1 change: 1 addition & 0 deletions sjsonnet/src/sjsonnet/ScopedExprTransform.scala
@@ -5,6 +5,7 @@ import sjsonnet.Expr._

import scala.collection.immutable.HashMap

/** Tree transformer that keeps track of the bindings in the static scope. */
class ScopedExprTransform extends ExprTransform {
import ScopedExprTransform._
var scope: Scope = emptyScope
2 changes: 2 additions & 0 deletions sjsonnet/src/sjsonnet/StaticOptimizer.scala
@@ -3,6 +3,8 @@ package sjsonnet
import Expr._
import ScopedExprTransform._

/** StaticOptimizer performs necessary transformations for the evaluator (assigning ValScope
* indices) plus additional optimizations (post-order) and static checking (pre-order). */
class StaticOptimizer(ev: EvalScope) extends ScopedExprTransform {
def optimize(e: Expr): Expr = transform(e)
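
The "assigning indices" part of this transform can be pictured as a scoped tree rewrite that replaces each variable name with a slot number. The sketch below is an assumption-laden miniature: `Id`, `ValidId`, `Local`, `Scope` and `resolve` are hypothetical names chosen for illustration and do not reproduce the real `ScopedExprTransform` or `StaticOptimizer` code.

```scala
// Hypothetical miniature of index assignment in a static scope.
object IndexSketch {
  sealed trait Expr
  final case class Num(value: Double) extends Expr
  final case class Id(name: String) extends Expr                   // unresolved reference
  final case class ValidId(name: String, index: Int) extends Expr  // resolved to a slot
  final case class Local(name: String, value: Expr, body: Expr) extends Expr
  final case class Add(lhs: Expr, rhs: Expr) extends Expr

  // Static scope: variable name -> slot index; `size` is the next free slot.
  final case class Scope(slots: Map[String, Int], size: Int) {
    def bind(name: String): Scope = Scope(slots.updated(name, size), size + 1)
  }
  val emptyScope: Scope = Scope(Map.empty, 0)

  def resolve(e: Expr, scope: Scope = emptyScope): Expr = e match {
    case Id(name) =>
      scope.slots.get(name) match {
        case Some(i) => ValidId(name, i)
        // Static checking: an unbound name is rejected before evaluation.
        case None => throw new IllegalArgumentException("Unknown variable: " + name)
      }
    case Local(name, value, body) =>
      // Jsonnet locals may refer to themselves, so the value sees the new binding too.
      val inner = scope.bind(name)
      Local(name, resolve(value, inner), resolve(body, inner))
    case Add(l, r) => Add(resolve(l, scope), resolve(r, scope))
    case other     => other // Num and ValidId need no rewriting
  }
}
```

After such a pass, an evaluator can look variables up by array index instead of by name, which is one plausible reason for doing the rewrite statically.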

12 changes: 8 additions & 4 deletions sjsonnet/src/sjsonnet/Val.scala
@@ -31,7 +31,7 @@ abstract class Lazy {
* the tree can contain functions.
*/
sealed abstract class Val extends Lazy {
cached = this
cached = this // avoid a megamorphic call to compute() when forcing
final def compute() = this

def pos: Position
@@ -419,15 +419,14 @@ object Val{
}
}

/** Superclass for standard library functions */
abstract class Builtin(paramNames: Array[String], defaults: Array[Expr] = null)
extends Func(null, ValScope.empty, Params(paramNames.toArray,
extends Func(null, ValScope.empty, Params(paramNames,
if(defaults == null) new Array[Expr](paramNames.length) else defaults)) {

override final def evalDefault(expr: Expr, vs: ValScope, es: EvalScope): Val = expr.asInstanceOf[Val]

final def evalRhs(scope: ValScope, ev: EvalScope, fs: FileScope, pos: Position): Val = {
// if(!(new Throwable).getStackTrace.map(_.toString).mkString("\n").contains("StaticOptimizer"))
// println(s"----------- generic evalRhs in $this via ${pos} ${ev.prettyIndex(pos)}")
val args = new Array[Val](params.names.length)
var i = 0
var j = scope.length - args.length
@@ -448,6 +447,11 @@ object Val{
if(params.names.length != 2) apply(Array(argVal1, argVal2), null, outerPos)
else evalRhs(Array(argVal1.force, argVal2.force), ev, outerPos)

/** Specialize a call to this function in the optimizer. Must return either `null` to leave the
* call-site as it is or a pair of a (possibly different) `Builtin` and the arguments to pass
* to it (usually a subset of the supplied arguments).
* @param args the positional arguments for this function call. Named arguments and defaults have
* already been resolved. */
def specialize(args: Array[Expr]): (Builtin, Array[Expr]) = null

/** Is this builtin safe to use in static evaluation */
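
The `specialize` contract documented above (return `null` to keep the call site, or a pair of a builtin and its arguments) can be illustrated with a toy example. Everything here is hypothetical: the simplified `Builtin` with an `eval` over `Double`s, the `Pow` and `Square` objects and the `optimizeCall` helper are assumptions for the sketch and are not part of Sjsonnet's standard library code.

```scala
// Toy illustration of the specialize contract; not sjsonnet's real Builtin.
object SpecializeSketch {
  sealed trait Expr
  final case class Num(value: Double) extends Expr
  final case class Ref(name: String) extends Expr

  abstract class Builtin(val name: String) {
    def eval(args: Array[Double]): Double
    // null means "leave the call site as it is".
    def specialize(args: Array[Expr]): (Builtin, Array[Expr]) = null
  }

  object Square extends Builtin("square") {
    def eval(args: Array[Double]): Double = args(0) * args(0)
  }

  object Pow extends Builtin("pow") {
    def eval(args: Array[Double]): Double = math.pow(args(0), args(1))
    // A literal exponent of 2 lets the call be replaced by a cheaper builtin
    // that takes a subset of the original arguments.
    override def specialize(args: Array[Expr]): (Builtin, Array[Expr]) = args match {
      case Array(base, Num(2.0)) => (Square, Array(base))
      case _                     => null
    }
  }

  // What an optimizer pass might do at each call site.
  def optimizeCall(fn: Builtin, args: Array[Expr]): (Builtin, Array[Expr]) = {
    val specialized = fn.specialize(args)
    if (specialized eq null) (fn, args) else specialized
  }
}
```

Because the decision is made once per call site in the optimizer, the evaluator never has to re-inspect the argument shapes at run time.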
1 change: 1 addition & 0 deletions sjsonnet/src/sjsonnet/ValVisitor.scala
@@ -7,6 +7,7 @@ import upickle.core.{ArrVisitor, ObjVisitor, Visitor}

import scala.collection.mutable

/** Parse JSON directly into a literal `Val` */
class ValVisitor(pos: Position) extends JsVisitor[Val, Val] { self =>
def visitArray(length: Int, index: Int): ArrVisitor[Val, Val] = new ArrVisitor[Val, Val] {
val a = new mutable.ArrayBuilder.ofRef[Lazy]