Tzvetan Mikov <tmikov@gmail.com>
An "ahead-of-time" compiler for the JavaScript programming language.
Warning
|
The compiler is still under heavy development and not even "beta" quality yet. Even though many things work, many still don’t. This document is mainly for potential contributors. |
JSC is a classic "ahead-of-time" compiler. At compile time it generates static executable code (currently C++). It implements ECMAScript 5 (with some minor exceptions) and provides an ever increasing level of Node.js API compatibility.
The informal goals of the project are:
-
Modern ECMAScript compatibility
-
Acceptable performance
-
Predictability, observability, simplicity of the execution model and small memory footprint
It has been tested on Mac and Debian Linux. Experimental Windows support is available under the cygwin branch; for Windows/Cygwin, please check-out that branch and follow the instructions in its README (note that it may be lagging behind the main development branch). Better Windows support is coming in the future (contributions are welcome).
To build JSC You need:
-
Node.js v0.12
-
a C++11 compiler (Apple Clang or gcc-4.9 both work)
-
CMake version at least 2.8
-
Clone the repository
-
From the root of the repository, build the runtime:
npm install make runtime
-
From the root of the repository, build the compiler:
make
-
Run the compiler (only from the project root):
./jsc
-
Compile and run an example
./jsc examples/factorial.js -o factorial ./factorial
It is not expected that the performance of compiled applications will ever rival 'v8'. JavaScript is an awful language for static compilation - it almost seems designed to foil any attempts at optimization, and so a JIT will always have a significant performance advantage.
With that said, performance still can be acceptable and even good. We have a number of potential optimizations planned, including whole program type inference.
JSC may feel slow at times. It generates a huge C++ source file containing all used modules and runtime. There are plans afoot to speed this process greatly, but even now it should be taken in context of the goals:
-
Produce executables with minimal runtime dependencies
-
Perform whole-program optimizations
JSC is still very young. It was started on Jun 6th, after coming back from JSConf 2015, while still riding high on JavaScript enthusiasm, and yet keeping strong a life-long aversion to interpreters :-)
-
Syntax: it uses a mature and tested JavaScript parser (Acorn), so syntactically it is able to handle the complete language.
-
Currently we parse and support ECMAScript 5, with emphasis on strict mode. We do support non-strict mode, but are not putting a lot of effort into testing it. The entire application runs either in strict or non-strict mode. According to the standard, the mode can be changed on a per-function basis, but we do not intend to support that ever - the cost and complexity is not worth it. If an application absolutely needs that, it should be fixed :-), or it should use 'v8'. (Note that strict code can usually run fine in non-strict mode).
-
Not implemented yet:
-
with : it is last on our list. We think we have come up with a reasonable way to implement it without a huge performance penalty, but since we don’t use with at all, our motivation is low.
-
Object.setPrototypeOf() and \_\_proto__ : the plan is not to support prototype modification. This is not a judgement, but if you really need that you should use a different JavaScript implementation.
-
-
Differing from the spec:
-
bind : it is a true implementation (not a shim), where the only difference from the spec is that it has a read-only prototype property (the spec says it shouldn’t have one at all). We need that property due to how our object allocation work and we believe that well behaved code shouldn’t be relying on its absence.
-
arguments : implemented only in its 'strict mode' functionality. In other words, changes to arguments do not propagate back to the actual arguments. There is also no caller/callee, properties. We can implement all of that (non-strict mode stuff), but it is not high on our current list of priorities.
-
Regular expressions: implemented using PCRE2. PCRE2 has some incompatibilities with JavaScript regexes, specifically the treatment of unicode whitespace. It is also too flexible for its own good - lots of options can be changed from within the pattern itself - which is naturally not compatible with with JavaScript. The implementation should be considered experimental and should not be used with "untrusted" regular expressions.
-
-
The object system is implemented, but some of the built-in constructors and many methods are not available. The plan is to implement as mush as possible in JavaScript.
-
Some built-in methods are still missing or have limited functionality.
The compiler generates C code that must be compiled with a C 11 compiler. The runtime is extremely inefficient - for example it uses std::map<> for property storage. Our excuses are:
-
We concentrated on correctness and wanted to get something working ASAP
-
We craftily left obvious optimizations opportunities alone because we wanted to get easy performance boosts later in the project (with the corresponding boosts in morale). See the TODO section.
Node.js compatibility is achieved by compiling unmodified Node.js built-in JavaScript modules (we use no C/C code from either Node or v8). This can be an occasionally painfull process, as these modules rely on internal C interfaces which must be reverse engineered and recreated. Since these modules are unmodified they serve a dual purpose - validate our compiler and environment as well as provide great Node.js compatibility.
Since this is a static compiler, connecting C/C++ and JavaScript is conceptually simpler than the interfaces provided by V8 and/or Node. However we are still working on defining interfaces which would be easier to use in practice without in-depth knowledge of the internals of the compiler and runtime system.
The __asm__ built-in is conceptually similar to its equivalent in GCC. Examples of its usage can be seen all over the runtime library (e.g. in runtime/js/core).
-
Transition the runtime to C
-
Use 'hidden classes' instead of property maps.
-
'NaN boxing' instead of explicit tagging
-
Copying generattional garbage collector (we believe it is important to do this work as early as possible as it has signigicant implications on code generation and the runtime).
-
Better implementation of Node.js 'Buffer' - currently we are using an inefficient implementation from Browserify.
-
Fill in missing runtime APIs (e.g. Date).
-
Speed up compilation by caching compiled modules
-
Better source-level debugging
-
Support for source maps
-
ES6 support
-
IR-level optimizations and register allocation
-
TypeScript integration
-
V8 compatibility layer for existing Node extensions
When released, jsc will support the ECMAScript 6 standard (or later), and will be compatible with 'Node.js' libraries and extensions. Module level eval will also be supported (with performance cost). The goal is to be able to recompile most existing 'Node.js' applications without changes.
As we mentioned, a static JavaScript compiler can never rival the performance of a JIT, due to the design of the language itself. But, it can still produce binaries with 'sufficient' or 'useful' performance.
Perhaps even more importantly, the statically compiled binaries will have very predictable performance, which doesn’t change. The produced code can be trivially examined, debugged, and reasoned about - it is not hidden in a huge opaque JIT compiler. 'v8' has excellent diagnostic and visualization tools, but by its very nature it is very complex and so are its tools. Even for an experienced assembler programmer (not to say a casual JavaScript developer), it can be very difficult to decipher or predict what 'v8' is doing.
A JIT, also by its very nature, has big and somewhat unpredictable memory requirements. Different versions of code are kept around, compiled, decompiled, etc. It can get very challenging especially when running multiple ones in parallel, given that nothing can be shared between them. A static compiler avoids all of these problems.
Lastly, the biggest and more important motivation is for fun. We like making compilers, languages and runtimes. So, why not?
Copyright (c) 2015 Tzvetan Mikov and contributors. See AUTHORS.
This project (with the exception of components with different licenses, listed below) is licensed under the Apache License v2.0. See LICENSE in the project root.
Components with different licenses:
-
Acorn is licensed under the terms of its license in acorn/LICENSE.
-
pcre2 is licensed under the terms of its license in runtime/deps/pcre2/LICENSE.
-
dtoa and g_fmt are licensed under the terms of the license in runtime/deps/dtoa/dtoa.c and runtime/deps/dtoa/g_fmt.c.
-
buffer is licensed under the terms of runtime/js/modules/buffer/LICENSE
-
base64-js is licensed under the terms of runtime/js/modules/buffer/node_modules/base64-js/LICENSE.MIT
-
ieee754 is licensed under the terms of runtime/js/modules/buffer/node_modules/ieee754/LICENSE
-
is-array is licensed under the terms of runtime/js/modules/buffer/node_modules/is-array/Readme.md
-
JSON-js (from https://github.com/douglascrockford/JSON-js) is in the public domain.
-
Node code is licensed under the terms of its license in "runtime/js/nodelib/LICENSE+.
-
libuv : runtime/deps/libuv/LICENSE
-
gyp: runtimr/deps/gyp/LICENSE