"Note that running untrusted code is a tricky business requiring great care. -- Node.js - "vm"
Autoclave.js is a sandbox based on a transpiler and a shim. It enforces rules, restrictions, rate limits, and caps on the behavior of code of unknown origin, so that the host is protected from malicious, spammy, or inappropriate operations, whether intentional or not. Autoclave is intended to support the creation of a learning tool that, together with a web-based Git network (GitHub, GitLab, ...), gives visitors a complete toolchain for building a bot that runs on the server.
A student may program a bot to...
- respond to HTTP requests routed to their application's pathname
- access the file system in response to a web visitor or other event
- fetch data from the web via HTTP with request (not implemented yet)
- depend on other code—including peer code—with require-by-URL
Since a bot ordinarily runs on a Node.js server, there is no need for the student to learn HTML, the DOM, CSS, or any other web technology before diving into JavaScript programming. The term "bot code" refers to content the server receives with the intention of compiling it to the sterilized form and executing it inside a virtual machine context.
Bot code must refrain from using
- object-oriented features (`new` and `constructor`)
- exception handling (`throw`, `try`, and `catch`)
- the slash-delimited regular expression literal syntax (`/[a-zA-Z0-9_]*/`)
- invalid identifiers listed in token.js
Programs must define all global variables and require modules by one of...
- identifier, such as `fs`, `path`, ...
- "git file url", which joins a git URL for a repository and a pathname relative to the repository root: `git@github.com:autoclave/fs.js`
- current working directory: `./module.js`
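As a rough illustration of the three forms above, a host could sort a `require` specifier into one of them before deciding how to load it. This is only a sketch: `classifySpecifier` and its return values are hypothetical names, not part of Autoclave, and the patterns are assumptions based on the examples in the list.

```javascript
// Hypothetical host-side helper (not part of Autoclave): classify a
// require() specifier as one of the three forms listed above.
function classifySpecifier(spec) {
  // Current working directory, e.g. "./module.js"
  if (spec.startsWith("./")) return "cwd";
  // "git file url", e.g. "git@github.com:autoclave/fs.js"
  if (/^git@[^:\s]+:.+/.test(spec)) return "git";
  // Bare identifier, e.g. "fs" or "path"
  if (/^[A-Za-z_$][\w$]*$/.test(spec)) return "identifier";
  // Anything else (such as "../x") is treated as unsupported here.
  return "unknown";
}

console.log(classifySpecifier("fs"));
console.log(classifySpecifier("./module.js"));
console.log(classifySpecifier("git@github.com:autoclave/fs.js"));
```

The regular expression literals are fine here because this is host code; the restriction on literals applies only to bot code.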
Both the code transformations and the runtime library functions support the following change in program behavior: with a few exceptions, access to `x[y]` is translated to ``x[`y`]``, that is, the property name is enclosed in backticks. Without this translation (which is invisible to the programmer) it would be necessary to maintain a list of prohibited member names for reading and writing.
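The effect of this translation can be sketched with two small runtime helpers. The names `wrapKey`, `getMember`, and `setMember` are hypothetical, and the real shim may work differently; the point is only that a key wrapped in backticks can never collide with a built-in member such as `constructor`.

```javascript
// Hypothetical sketch (not Autoclave's actual shim): every property name
// used by bot code is wrapped in backticks before it reaches the object.
const wrapKey = (key) => "`" + String(key) + "`";

function getMember(obj, key) {
  // No built-in member name starts with a backtick, so this lookup can
  // only ever see properties that bot code stored itself.
  return obj[wrapKey(key)];
}

function setMember(obj, key, value) {
  obj[wrapKey(key)] = value;
  return value;
}

const bot = {};
setMember(bot, "constructor", 1);
console.log(getMember(bot, "constructor")); // 1
console.log(getMember({}, "constructor"));  // undefined
console.log(typeof bot.constructor);        // "function" (the built-in is untouched)
```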
The host isolates itself from bot code with
- static transformations
- built-in globals are renamed (table.js)
- property names are enclosed in backticks (table.js, tmpl/header.js)
- network and disk access is restricted (tmpl/header.js)
- global variable and property shim (tmpl/header.js)
- route property access
- mock versions of built-in modules
- VM context (not implemented)
Bot code currently runs in the host process.
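For reference, the kind of VM context mentioned in the list could look roughly like the following, using Node's built-in `vm` module. This is a sketch under stated assumptions, not Autoclave's design: `runInSandbox` and the 100 ms default are hypothetical. Note also that the Node.js documentation states that `vm` is not a security mechanism on its own, which is why it appears here as only one layer among the others listed.

```javascript
const vm = require("node:vm");

// Hypothetical helper: run already-translated code in a fresh context that
// exposes only the globals passed in, with a cap on synchronous run time.
function runInSandbox(code, globals, timeoutMs = 100) {
  // Copy the globals so the caller's object is not used as the context.
  const context = vm.createContext({ ...globals });
  return vm.runInContext(code, context, { timeout: timeoutMs });
}

console.log(runInSandbox("a + b", { a: 1, b: 2 }));  // 3
console.log(runInSandbox("typeof require", {}));     // "undefined"
```

A script that exceeds the timeout, such as an infinite loop, causes `runInContext` to throw, so the caller would need to handle that error.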
It is possible for a program to read from and write to the filesystem (see test/helloFile.js). This data is stored in filesys/user/repo, so a path such as "path/to/file" will effectively read and write filesys/user/repo/path/to/file. All reads and writes in translated code must use relative pathnames.
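The path mapping described above can be sketched as follows. `resolveBotPath` is a hypothetical helper, and returning `null` for rejected paths is an assumption made for brevity; the actual host may report errors differently.

```javascript
const path = require("node:path");

// Hypothetical helper: map a bot-supplied relative pathname to its real
// location under filesys/user/repo, or return null if it is not allowed.
function resolveBotPath(user, repo, relative) {
  // Translated code must use relative pathnames only.
  if (path.posix.isAbsolute(relative)) return null;
  const root = path.posix.join("filesys", user, repo);
  // join() also normalizes, so "a/../b" collapses to "b".
  const full = path.posix.join(root, relative);
  // Reject anything that climbs out of the repository directory.
  if (full !== root && !full.startsWith(root + "/")) return null;
  return full;
}

console.log(resolveBotPath("user", "repo", "path/to/file"));
// filesys/user/repo/path/to/file
console.log(resolveBotPath("user", "repo", "../other/secret")); // null
console.log(resolveBotPath("user", "repo", "/etc/passwd"));     // null
```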
Bot code can depend on other code with `require("git@github.com:ghuser/ghrepo/path/to/file.js")`. When using `require` this way, source files are first compiled to the sandbox version and then executed in the Node virtual machine (not yet implemented).
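Before such a file can be compiled, the host has to split the specifier into a repository and a pathname. A minimal sketch, assuming the first two path segments are the user and the repository as in the `ghuser/ghrepo` example; `parseGitFileUrl` is a hypothetical name, and hosts with deeper namespaces would need a different rule.

```javascript
// Hypothetical helper: split a "git file url" into its parts.
// Assumption: the layout is git@<host>:<user>/<repo>[/<path in repo>].
function parseGitFileUrl(spec) {
  const match =
    /^git@([^:\/\s]+):([^\/\s]+)\/([^\/\s]+)(?:\/(.+))?$/.exec(spec);
  if (match === null) return null; // not a git file url
  // file defaults to "" when the specifier names no path inside the repo.
  const [, host, user, repo, file = ""] = match;
  return { host, user, repo, file };
}

console.log(parseGitFileUrl("git@github.com:ghuser/ghrepo/path/to/file.js"));
console.log(parseGitFileUrl("./module.js")); // null
```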