V8 Isolates
You can probably infer from the title that this is about isolated processes. I came across this when I was looking for something to deploy this site on. Initially, I planned to write about the site, but that would be less interesting, so let's talk about what this site runs on: Cloudflare Workers.
What are Cloudflare Workers? Let's just say they are super lightweight serverless runtimes that can run your code at the edge.
Let's have some background before we jump into V8 isolates.
What is V8?
If you're reading this article, I'm gonna assume you've come across "V8" at some point while learning JavaScript or messing around with web browsers. Let's look at JavaScript first. What is it? It is a programming language, so it must have a standard, well-defined behavior. Without a standard to adhere to, every implementation could do just about anything, and nothing would be compatible with what other people build. The standard for JS is ECMAScript (ES for short). It is the spec that defines the behavior of JavaScript code. Anything that runs JavaScript must behave in accordance with ES.
JavaScript is just one of the languages built on ECMA-262. Others include TypeScript (a superset of JS), ActionScript (Adobe's implementation), JScript (by Microsoft) and Google Apps Script.
Let's see the pipeline for standard JS execution, as per ES:
- lexing: our source code is converted to tokens
- parsing: tokens to AST
- evaluation: run the code
The spec does not say how you run the code or what you do behind the scenes to make sure the standard is followed. This is where V8 comes in. V8 is one such engine that runs JavaScript code.
For a good user experience, we want our application to run smoothly and fast. A lot of V8's design is aimed at exactly that.
Let's see how V8 runs our code.
Pipeline for V8 execution:
- Scanning: Source code to tokens
- Parsing: tokens to AST (lazy parsing)
- Ignition: takes the AST and generates bytecode, which it starts interpreting immediately.
- Sparkplug (Baseline Compiler): compiles bytecode to machine code quickly, without spending time on complex optimizations.
- Maglev (Mid-Tier Compiler): kicks in as the code runs more ("warms up"). It does a few optimizations but not the heavy ones.
- TurboFan (Optimizing Compiler): for hot code (code that runs very frequently), this generates highly optimized machine code. That code is built on assumptions about how the function has behaved so far, and running it after an assumption breaks would not be correct as per the spec. In such a case, V8 performs a deopt (deoptimization): it discards the optimized code and falls back to the interpreted version (Ignition) to ensure correctness.
While Ignition is interpreting the bytecode, V8 uses inline caches to collect "type feedback". For example, if a function only ever sees integers, the engine records this pattern. An optimizing compiler like TurboFan then uses that feedback to strip away generic checks and generate machine code specialized for this particular case. The guess may not always hold, which is why the deopt path exists. You could call this speculative compilation.
Isolation
But what does this all have to do with isolated processes?
Cloudflare has a service called Cloudflare Workers that helps in deploying serverless applications. This site is also deployed on Workers. Being such a huge company and handling a major chunk of internet traffic, they have many customers and a lot of locations throughout the world to serve.
If you have your site deployed on a single server, and it gets a request from the opposite side of the globe, the response would be pretty slow. This calls for edge servers: your application is deployed in multiple locations for fast access. To have every application deployed at all those locations, they run into a massive resource constraint. You can't really spin up a million Linux instances for a million customers in over 200 locations throughout the world. These Linux instances would be full-on VMs or somewhat stripped down (still costly). VMs would be "hardware-level" isolation. Each one boots its own tiny OS kernel, which also brings a "cold start" penalty.
If not VMs, why not containers? Although lighter than VMs, they are still costly for edge computing. This would be "process-level" isolation. A single Node.js process might idle at around 30MB of RAM. Make it a thousand and you'll have 30GB+ of idle RAM. Every new process needs its own memory address space, its own threads, etc. The CPU also has to context switch between them, which is costly in such a demanding environment.
But what if we could isolate these different running applications without using process-level isolation? What if it was all within a single process? "Runtime-level isolation"?
Somehow, they were able to achieve this isolation with the V8 engine. Let's look into that.
In the world of V8, the unit of isolation is an Isolate (what a process is to the OS, an isolate is to V8). The V8 pipeline we went through above runs per-isolate. Each isolate has its own heap memory, its own bytecode and its own garbage collector. How does this help? You can spin up a thousand isolates within a single process, and switching between them is as simple as changing a pointer in userspace. What they do share is the engine itself: V8's own machine code, including the interpreter and the JIT compilers, is loaded once per process and used by every isolate, which plays a big part in them being cheap.
JavaScript running in an isolate cannot access memory outside of that isolate's heap, so Isolate A cannot read Isolate B's objects. Now yes, there is still a security concern, since everything lives in one host process that has access to the memory of every isolate. I won't go into the details of that. It is already described in this amazing article.
I won't go through the entire V8 Isolate API in just one article. So, how about we look at some sample code and see how our code runs isolated from others?
Sample Code:
Take a look at this. This is the hello-world sample for V8 isolates. You can open this in split view for your reference. We'll go through this to understand the different parts of the code.
Initialization:
v8::V8::InitializeICUDefaultLocation(argv[0]);
v8::V8::InitializeExternalStartupData(argv[0]);
V8 is not a single binary. It has external dependencies and data files. These lines tell V8 where to look for those files on the disk relative to argv[0].
std::unique_ptr<v8::Platform> platform = v8::platform::NewDefaultPlatform();
v8::V8::InitializePlatform(platform.get());
v8::V8::Initialize();
V8 doesn't want to manage OS threads directly; instead it asks the embedder for a platform. The embedder is the host application (like Chrome or Node.js). NewDefaultPlatform() creates a thread pool and a task scheduler.
Initialize() sets up global static structures like an internal hash table or a source of randomness for Math.random.
Isolate Creation:
v8::Isolate::CreateParams create_params;
create_params.array_buffer_allocator =
v8::ArrayBuffer::Allocator::NewDefaultAllocator();
v8::Isolate* isolate = v8::Isolate::New(create_params);
create_params is a struct, as you can see (stack allocated). When V8 needs raw memory for a buffer, let's say for something like new Uint8Array(1024), it calls an allocator (array_buffer_allocator).
Why so? Because it allows us (the host) to have more control. We can insert a custom allocator here and deny allocation if it exceeds our memory quotas.
Isolate::New() creates the heap. It reserves a large chunk of virtual address space where all the JS objects live. It also initializes V8's garbage collector, Orinoco, which uses concurrent and parallel techniques to minimize pause times. isolate is a pointer to the object that holds all of this state.
Scopes:
v8::Isolate::Scope isolate_scope(isolate);
This is a really nice part. In the host environment, you would have multiple threads running, right? An isolate, however, can only be run by one thread at a time. Does that mean we have to pin the isolate to one thread (say Thread A) forever? NO. Isolates have a special structure called ThreadLocalTop. When Thread A wants to execute in the isolate, it first needs a lock on the isolate. We get that with v8::Locker locker(isolate). It's not in the given code, since the sample is single-threaded.
When Thread A "enters" an isolate, its ThreadLocalTop is brought out of a waiting area (this is within the isolate object itself) and becomes the live per-thread state, which the thread reaches through its Thread Local Storage (TLS). TLS is unique to each thread, so you could say the state is bound to that thread.
When we do v8::Isolate::Scope isolate_scope(isolate), it writes something like "for this specific thread, the active isolate is 0x1234" to the TLS.
When Thread A moves on from the current isolate, V8 puts its ThreadLocalTop struct back into the waiting area. When Thread B comes in, V8 looks into the waiting area and restores the struct that belongs to Thread B.
Why do we do this? We have the constraint that only one thread can enter the isolate at any given time. Every time V8 performs an action, it needs to know which isolate it's currently in. Passing a pointer to the current isolate, to every single function in the codebase is slow and painful. Instead we use the TLS to pin the current isolate to the current thread.
When some internal V8 code needs the isolate, it can just look at a specific offset in the current thread's memory. This is essentially zero-cost compared to a global lookup.
Because V8 allows a single thread to jump between different isolates, it maintains a Thread Id mapping. It uses an internal struct to store thread-specific metadata like execution stack limits, scope handles, scheduled exceptions, etc.
Why single-threaded? How does that help?
V8 is obsessed with performance. With only one thread inside an isolate at a time, the engine's internals don't need locks around the heap (no lock contention), the hot per-thread state is cache friendly (the TLS is accessed very frequently, so it mostly stays in the L1 cache), and context switching is fast (the engine just updates one pointer and suddenly the entire C++ backend knows which heap to work on).
v8::HandleScope handle_scope(isolate);
This is the translation between C++ which manages memory manually and JS which uses a GC. If your C++ pointer points to some object and the GC moves it to a new memory address, your C++ pointer would be pointing to some garbage value. To prevent this, we need some kind of translation layer.
Inside the TLS, there is a pointer to the Handle Stack, which is a dedicated list of memory addresses managed by V8.
When we hold a Local<T> handle in C++, we aren't pointing to the actual memory address of the object; we're pointing to an address (a slot) within this Handle Stack. This slot contains the actual memory address of the object on the heap.
To limit memory fragmentation, the GC moves objects around. Every time it moves an object, it finds the slots in the Handle Stack that refer to it and updates them.
Why is the Handle Stack reached through the TLS? Again, to avoid lock contention.
There is also v8::Persistent<T>, for storing things outside of this stack-based system.
So, when you declare v8::HandleScope handle_scope(isolate), V8 pushes a marker onto this Handle Stack. When the scope ends, everything above the marker is popped, and the GC is free to collect any objects that were only reachable through handles created in that scope.
v8::Local<v8::Context> context = v8::Context::New(isolate);
This sets up our context. What does this mean? It's kinda like setting an ecosystem for the JS code that we're going to run. It allocates a large structure on the heap called NativeContext. This is an array-like object which holds pointers to every built-in function JS requires.
It contains a Global Object (window/global), "Global Proxy" which is the actual object the JS code interacts with, and hidden classes for basic objects like Array, String and Object.
This is expensive, since it has to allocate memory and instantiate built-ins every time we call New().
v8::Context::Scope context_scope(context);
Inside the isolate, there is a field called context_. This line takes the current context pointer and updates that field.
Isolates share nothing; they have separate heaps. Contexts, on the other hand, share a heap: if context A and context B belong to the same isolate, they share the same GC and memory pool. V8 also does "Context Tagging" as a security measure.
Running JS Code
Alright, so we set up the isolate. Now, let's run some JS code.
v8::Local<v8::String> source =
v8::String::NewFromUtf8(isolate, "'Hello' + ', World!'",
v8::NewStringType::kNormal)
.ToLocalChecked();
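Local is the handle type: it refers to a slot in the current HandleScope rather than to the object directly. NewFromUtf8 copies the C string into the V8 heap and returns a MaybeLocal. ToLocalChecked() unwraps it and crashes the process if it is empty, for example when the allocation failed (of course, you wouldn't use this in production; you'd need to handle the errors).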
v8::Local<v8::Script> script =
v8::Script::Compile(context, source).ToLocalChecked();
Compile() converts the source string to an AST and then the AST to bytecode. It allocates a Script object in the heap containing the bytecode.
v8::Local<v8::Value> result = script->Run(context).ToLocalChecked();
The Ignition interpreter executes the bytecode. JS is dynamically typed, so the result comes back as a generic Value.
The second snippet shows the same with WASM code. It goes through a different compiler pipeline. You can check it out if you're interested.
You would've noticed that right after the v8::Isolate* isolate = v8::Isolate::New(create_params), we had a {. After the second wasm code snippet, this was closed with }.
The closing brace triggers the destructors of the scope objects, in reverse order of construction. The current context is cleared. All handles created inside are released, so those GC roots die. The thread exits the isolate.
Cleanup
isolate->Dispose();
Destroys the C++ Isolate object.
v8::V8::Dispose();
v8::V8::ShutdownPlatform();
delete create_params.array_buffer_allocator;
return 0;
This is cleanup. It stops background threads, frees the platform and deletes the allocator we created on the heap.
Hmmm, I hope that was fun to go through. Until next time!
If you find any inconsistencies or something misleading, please reach out so that I can fix it.