kripken edited this page Feb 5, 2012 · 46 revisions

Building Projects

The Tutorial shows how emcc, the drop-in replacement for gcc, can be used to compile single files very easily into JavaScript. Building large projects with Emscripten is also very simple: You basically use emcc instead of gcc in your makefiles. This can usually be done by setting CC to emcc, or with a flag to configure, but it can be even easier than that: For example, if you normally build with

   ./configure
   make

then the process with Emscripten looks like

   emconfigure ./configure
   make
   emcc [-Ox] project.bc -o project.js
  • The first change is to run emconfigure, with the normal configure command as an argument. emconfigure runs configure but tells it to use emcc instead of gcc, and sets up a few other useful things (for details, see the docs inside emcc).
  • The second change is, once the project is built, to add a command to convert the compiled project into JavaScript: emcc is run on the compiled project bitcode, and told to generate JavaScript output (we will see below why [-Ox] appears there). This additional command is necessary because emcc, when called from the makefile, will not automatically generate JavaScript during linking. If it did, many projects would generate a lot of JavaScript in intermediate steps, which is unnecessary and inefficient to link. Instead, emcc when called from the makefile will generate bitcode. A single line, as shown above, then converts the bitcode into JavaScript.
  • In other words, a conventional native code build system will generate native code in object files as an intermediate form, while building with Emscripten uses LLVM bitcode as an intermediate form.
  • In general you don't need to care about this, except that you need one extra line for the final transformation to JavaScript. However, optimization can be confusing. Assume that you compile individual files to bitcode, then link them, then compile the result to JavaScript. In that case, the stage at which you optimize matters: Optimizations specified when compiling individual files to bitcode will not affect the bitcode to JavaScript compilation process, since that does not happen at that stage. Optimizations specified during the last stage do affect the bitcode to JavaScript compilation process, and those optimizations are crucial for good performance. Therefore, when building projects, you should specify -O2 or some other optimization level (see Optimizing Code) in the final additional command so that the code is fully optimized. Note that because optimization must be specified in the last stage anyhow, bitcode is not optimized until then either: In other words, optimization flags are ignored until the last stage (this avoids doing bitcode optimization work that would have to be done later anyhow).
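
The three-step sequence above, with the optimization level given only in the final command, can be sketched as a small shell script. This is a minimal sketch, not part of Emscripten: the run helper, the DRY_RUN variable and the build_js function are hypothetical names introduced here so the commands can be previewed without executing them, and project.bc stands in for whatever bitcode file your build actually produces.

```shell
# Hypothetical helper (not an Emscripten tool): print the command when
# DRY_RUN is 1 (the default in this sketch), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Configure and build to bitcode, then convert the bitcode to JavaScript.
# The optimization level is passed only to the last command, since that is
# the stage where it takes effect.
build_js() {
  opt="${1:--O2}"   # default to -O2 when no level is given
  run emconfigure ./configure &&
  run make &&
  run emcc "$opt" project.bc -o project.js
}

# Preview the commands; set DRY_RUN=0 to actually run them.
build_js -O2
```

Passing a different level, for example build_js -O1, changes only the final emcc command; the configure and make steps stay the same.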

Manually Using emcc

As a drop-in replacement for gcc, emcc can be used in all the normal ways you would expect:

    emcc src.cpp
    # Generates a.out.js from C++. Can also take as input .ll (LLVM assembly) or .bc (LLVM bitcode)

    emcc src.cpp -c
    # Generates src.o containing LLVM bitcode.

    emcc src.cpp -o result.js
    # Generates result.js containing JavaScript.

    emcc src.cpp -o result.bc
    # Generates result.bc containing LLVM bitcode (the suffix matters).

    emcc src1.cpp src2.cpp
    # Generates a.out.js from two C++ sources.

    emcc src1.cpp src2.cpp -c
    # Generates src1.o and src2.o, containing LLVM bitcode

    emcc src1.o src2.o
    # Combine two LLVM bitcode files into a.out.js

    emcc src1.o src2.o -o combined.o
    # Combine two LLVM bitcode files into another LLVM bitcode file

For more on emcc's capabilities, run emcc --help (it can also optimize, change parameters to how Emscripten generates code, generate HTML instead of JavaScript, etc.).

Build System Self-Execution

Some large projects, as part of their build procedure, generate executables and run them in order to generate input for later parts of the build system (for example, a parser may be built and then run on a grammar, which generates C/C++ code that implements that grammar). This is a problem when cross-compiling, including with Emscripten, since you cannot directly run the code you are generating.

The simplest solution is usually to build the project twice: Once natively, and once to JavaScript. When the JavaScript build procedure fails because it cannot run a generated executable, you copy that executable from the native build, and continue to build normally. This works for Python, for example (for more details, see tests/python/readme.txt).
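
As a rough sketch of that workflow, assuming two separate copies of the project tree: the directory names native and js, the tool name gen_parser, and the copy_native_tool function below are all hypothetical and only illustrate the copy step; the real file names depend on the project being built.

```shell
# Assumed overall flow (names are placeholders, adjust to your project):
#   1. Native build:      (cd native && ./configure && make)
#   2. JavaScript build:  (cd js && emconfigure ./configure && make)
#      ... this stops when the build tries to run the generated executable.
#   3. copy_native_tool native js gen_parser
#   4. Resume the build:  (cd js && make)

# Hypothetical helper: copy a natively built executable over the one the
# JavaScript build generated but could not run.
copy_native_tool() {
  native_dir="$1"   # tree that was built with the native compiler
  js_dir="$2"       # tree that is being built with emcc
  tool="$3"         # path of the executable, relative to each tree
  cp "$native_dir/$tool" "$js_dir/$tool" &&
  chmod +x "$js_dir/$tool"
}
```

Depending on the build system, you may also need to make sure the copied file is not considered out of date and rebuilt when make resumes.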

Another possible solution that makes sense in some cases is to modify the build scripts so that they build the generated executable natively. For example, this can be done by specifying two compilers in the build scripts, emcc and gcc, and using gcc just for generated executables. However, this can be more complicated than the previous solution because you need to modify the project build scripts, and also you need to work around cases where code is compiled and used both for the final result and for a generated executable (so you need to make sure it is built both natively and for JS).
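
One way to picture the two-compiler approach is a helper that decides, per build target, which compiler to use. This is only an illustration under stated assumptions: HOST_TOOLS, compiler_for and the target names are hypothetical, and a real project would express the same choice inside its own makefiles or configure scripts.

```shell
# Hypothetical list of targets that the build itself must execute.
# These are compiled natively; everything else is compiled with emcc.
HOST_TOOLS="gen_parser gen_tables"

# Hypothetical helper: print the compiler to use for a given target name.
compiler_for() {
  for t in $HOST_TOOLS; do
    if [ "$1" = "$t" ]; then
      echo gcc      # generated executable: must run on the build machine
      return 0
    fi
  done
  echo emcc         # part of the final result: compile toward JavaScript
}

# Usage: choose the compiler for each target.
compiler_for gen_parser
compiler_for libproject
```

Code that is used both in the final result and in a generated executable would have to be compiled once with each compiler, which is the extra complication mentioned above.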

Alternatives to emcc

You can in theory call clang, llvm-ld, etc. yourself. However, not using emcc is dangerous. One reason is that emcc will use the Emscripten bundled headers, while using Clang by itself will not, by default. This can lead to various errors. Also, using things like llvm-ld will result in unsafe/unportable LLVM optimizations being done by default. When you use emcc, it automatically handles all of that for you so that things work properly.

Examples

You can see how the large tests in tests/runner.py are built - the C/C++ projects there are built using their normal build systems, using emcc as detailed on this page. Specifically, the large tests include: freetype, openjpeg, zlib, bullet and poppler.

It is also worth looking at the build scripts in the following projects, although several are not yet updated to use the new emcc tool: