The first four Val Town runtimes
Tom MacWright
The core of Val Town is running untrusted code. We take your TypeScript code and run it on a server, safely, quickly, and flexibly.
The default way to run untrusted code is dangerously. If you set up a server, like Val Town’s, that receives bits of JavaScript code and simply runs all of the submitted code with Node.js, you’d be mining crypto and running a botnet within the hour.
Code in Node.js has unfettered access to your system’s environment variables, filesystem, network, and much more. This access is routinely abused in supply-chain attacks. In response, the Node.js project is adding a permissions system that can allow & deny access to the filesystem and other resources, but it’s still experimental.
To safely run multiple vals, we need to sandbox and contain them:
- Vals need to be isolated from the system they’re running on: they shouldn’t be able to read from the filesystem or learn about environment variables.
- Vals should be isolated from each other: you shouldn’t be able to affect anyone else’s val’s behavior, or read anyone else’s information.

On the other hand, if you were to make every Val run in a full-fledged Docker container, it’d be tremendously slow. You’re booting up a whole Linux machine with all the bells and whistles just to evaluate a few lines of code. It’d be pretty secure, but pretty slow. You could create a pool of Docker containers, but the memory and devops overhead would be significant.
Val Town gives you the flexibility of using modules and APIs from a number of different ecosystems. If you’ve used Node.js, great - your NPM modules should mostly work. Thanks to our current system, a lot of DOM methods work too - you can parse query strings with URLSearchParams and use new APIs like web-standard streams.
A way to square the safety + speed requirements would be to use QuickJS, a standalone JavaScript engine that’s fast to start up and easily sandboxed with WebAssembly. But we’d be miles away from NPM or Node compatibility.
Safety, speed, and flexibility are often in conflict with each other. Some approaches would be secure but limited, like QuickJS. Others would be flexible but bring complexity and performance questions, like Dockerizing every Val. We need to find a system that balances these concerns.
So, those are the properties we’re aiming for. What have we tried so far? Buckle up.
The very first version of Val Town used Node.js’s vm module. Schematically, running a val looked like this:
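const [returnValue, logs, error] = vm.runInContext(`
  console.log = (log) => { logs.push(log); return undefined };
  // ...the val's code runs here...
  [returnValue, logs, error];
`, context); // context: a vm.createContext() object (name assumed for illustration)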
Now, the documentation for the vm module warns that “The node:vm module is not a security mechanism. Do not use it to run untrusted code.”
The Node.js vm module is incredibly convenient to use, but it was dangerous and didn’t check our boxes:
- Infinite loops in user code will take down your server
- It’s easy to escape the sandbox and access sensitive environment variables
- You can’t install NPM modules within the sandbox
- Scripts that run in the Node.js VM sandbox run in script mode, not module mode, so you can’t use import or export.

Due to the security problems with vm, we quickly switched to vm2, a now-deprecated NPM module that, unlike vm, aimed to be a security sandbox. vm2 still used vm “under the hood”, but it formalized all of the security workarounds that people used to secure it. It used Proxies to try and prevent people from escaping the sandbox. A familiar exploit in vm is something like this, taken from the vm2 readme:
vm.runInNewContext('this.constructor.constructor("return process")().exit()');
vm2 prevented this attack by tweaking all of the objects exposed to the VM context.
The vm2 maintainer valiantly fought to make this strategy robust, but in the end it was impossible to add security to an insecure construct, and the sandbox escapes kept coming. Eventually vm2 was deprecated by its maintainer, and isolated-vm is the recommended replacement for Node.js users.
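To make the escape concrete, here is a minimal sketch, run as a plain Node.js script, of what that one-liner allows. The environment variable name and its value are made up for illustration; `vm.runInNewContext` is the only real API involved. Instead of calling `exit()`, the "untrusted" code reads a secret from the host process.

```javascript
const vm = require("node:vm");

// A secret that lives on the host, the way an API key in an
// environment variable would. Name and value are hypothetical.
process.env.HYPOTHETICAL_SECRET = "hunter2";

// "Untrusted" code. It is given no sandbox object and no reference to
// process, but it can still climb from the context's global object to
// the host's Function constructor and evaluate code in the host realm.
const untrusted = `
  const hostFunction = this.constructor.constructor;
  const hostProcess = hostFunction("return process")();
  hostProcess.env.HYPOTHETICAL_SECRET;
`;

// The script's completion value is whatever the last expression produced.
const leaked = vm.runInNewContext(untrusted);
console.log(leaked);
```

Anything reachable from the host's `process` object is available the same way, which is why wrapping or proxying individual objects, as vm2 attempted, is such a hard game to win.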
At this point, we started to learn some lessons. In broad strokes:
- The Node.js vm module isn’t a good base to build on.
- It’d sure be nice to run vals as modules, not scripts.
- Attempts to sandbox JavaScript using JavaScript, like vm2, are probably not going to work out.

Of course, we hadn’t learned these lessons all the way - but we were getting there.
At this point, I had become really interested in Deno. Ever since 2018, when Ryan Dahl gave his excellent presentation on Node.js regrets, the Deno project had prioritized the exact kind of sandboxing and security we were interested in. A lot had changed since 2018 - they had even ported the whole project from Go to Rust - but the permissions system was a consistent focus, and all signs pointed to it being very high-quality.
With Deno you can import modules and approve or disallow, one by one, their requests to access the filesystem or environment. It’s like going from code running as root by default, to a capabilities-based system: iOS asking you whether you want some app to see your location.
Better yet, Deno had a Workers API that mimics the Web Workers API in browsers: you can spin up separate threads, with their own permissions, within an existing Deno process. It’s super fast – much faster than spinning up a new process. I idolize how Cloudflare Workers works - spinning up a V8 isolate per execution instead of a whole process. This looked like the way to do it.
So we implemented a separate Deno server that only handled Val evaluation: giving us three servers total (Remix for the frontend, Express backend, Deno server). This elegantly provided isolation between Deno and Node.js, so Vals were much farther away from environment variables or anything sensitive.
All in all, this was an improvement, but we still had further to go.
- Security problems were far fewer and had much less impact. Deno’s sandboxing was a huge unlock. However, it was still possible to escape the sandbox, by doing things like spawning a worker inside of the worker. We were also still trying to preserve a complex security model in which JavaScript functions had their own environment variables and security isolation.
Our server repeatedly restarting because of memory issues
- However, Workers triggered memory problems that made the evaluation server slower and slower over time until it restarted. We were spawning and killing a lot of workers, and understandably this wasn’t a typical use case.
- Server isolation had a performance cost: for a val that powers a website, for each request we’d receive the request in the Express server, serialize it, send it to Deno, which would deserialize and operate on it, send the response to Express, which would then send it to the user.

Despite these issues, the switch to Deno was a big upgrade, and we were significantly less affected by anything that untrusted code could do.
The current iteration of our runtime learns from the lessons so far:
- You can’t really use JavaScript to build a JavaScript sandbox.
- Deno is a powerful tool for simple containment of functionality.
- Memory leaks are scary.

Right now, Vals run as temporary Deno subprocesses of a Node.js server. The subprocess’s code is tiny: it uses node-deno-vm, a terrific module from Kallyn Gowdy, which works by exposing a WebSocket connection between Deno and Node.js and wrapping it in a postMessage-style API. The “guest” code that wraps vals is also now public, though it’s still an implementation detail and could change at any time.
While this gives far better isolation between user code and system code than our previous vm and Worker-based strategies, it takes a little more time to start up, so we use a pool of pre-initialized processes to avoid that time penalty.
This strategy also gives the ability to have more fine-grained communication with Deno, because it’s based on WebSockets instead of a simple REST HTTP request cycle. We can send multiple messages to control vals while they’re running, and receive multiple messages from them as they run. This is key to how we just cut 100ms from typical val runtimes - more on that in a future post.
Along with Runtime v4 was Val Town v3, which included a slew of improvements, like static import support, JSX, and web-standard JavaScript. It was a vastly expanded set of JavaScript that “just worked,” and worked pretty well.
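The pooling idea can be sketched roughly as follows. This is a hypothetical illustration, not Val Town's code: `WarmPool`, `createWorker`, and the fake worker objects are names invented here. In a real system `createWorker` would presumably start a Deno subprocess; here it only counts how many workers were created, so the example runs on its own.

```javascript
// Hypothetical sketch of a pre-warmed pool. All names are illustrative.
class WarmPool {
  constructor(createWorker, size) {
    this.createWorker = createWorker; // pays the startup cost
    this.size = size;                 // how many idle workers to keep ready
    this.idle = [];
    this.refill();
  }

  // Start workers until the idle list is back at its target size.
  refill() {
    while (this.idle.length < this.size) {
      this.idle.push(this.createWorker());
    }
  }

  // Hand out an already-initialized worker if one is ready; otherwise
  // create one on demand. Then top the pool back up for the next caller.
  acquire() {
    const worker = this.idle.shift() ?? this.createWorker();
    this.refill();
    return worker;
  }
}

// Stand-in for "spawn a subprocess": each worker just gets a serial id.
let started = 0;
const pool = new WarmPool(() => ({ id: ++started }), 3);

const first = pool.acquire();
console.log(first.id, started, pool.idle.length);
```

Because each val runs in a temporary subprocess, this sketch never returns a worker to the pool; it only replaces the one it handed out. A production version would more likely refill asynchronously so that `acquire` does not wait on process startup.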
| | v1 | v2 | v3 | v4 |
|---|---|---|---|---|
| Platform | Node.js | Node.js | Deno | Deno |
| Module support | None | None | Dynamic imports only | Static + dynamic ES Modules |
| Isolation primitive | V8 Context | V8 Context | V8 Context | Process |
| Val syntax | Custom JS | Custom TS+JS | Custom TS+JS | Standard TS+JS |
| Importing vals | @ syntax | @ syntax | @ syntax | ESM-standard import |
As you can tell, we’ve been iterating fast. v4 is not the last version of the Val Town runtime: we still have more ground to cover on all of the dimensions that we care about. It’s heartening to see others contribute to this discussion - Figma’s adoption of QuickJS after using Realms was not secure enough, Amazon open-sourcing Firecracker, one of the components of Lambda, and Deno sharing details of how their Deploy product works.
We want to learn from everyone. Vals should run instantly, securely, and everything should just work. Try out Val Town and see for yourself how an instantly-deployed, zero-configuration script can work.
Oh, and if you read this and you have the answer and the experience and are the one to unlock the next-gen runtime: we’re hiring.