Posted on Thu Mar 21 2024

Node.js Multithreading: Beyond Non-Blocking I/O

#nodejs #javascript

Introduction

Node.js is a JavaScript runtime built on the V8 engine. It is single-threaded, non-blocking, and asynchronous, which means it handles many requests on a single thread rather than spawning a thread per request. That works well for I/O, but it also means a single heavy computation can hold everything up, and that's where multi-threading comes into play: running multiple threads concurrently so a program can work on several tasks at the same time.

Note: a thread is a single sequence of execution within a process; it is the unit of work the CPU schedules. A single process can have multiple threads running at the same time.

There are several ways to achieve parallelism in Node.js. In this article, I'll be using the worker_threads module.

Node.js: Evented and Non-Blocking, Not Multithreaded

A common misconception about Node.js is that it's multithreaded because it can handle non-blocking I/O operations. While Node.js does leverage threads under the hood, it remains fundamentally single-threaded when it comes to JavaScript execution. Let's break down this concept.

Single-Threaded JavaScript

JavaScript, by nature, is single-threaded: your program's code and any libraries it includes all run in one thread. If an I/O operation like reading a file or making a network request were performed synchronously, that blocking call would halt the entire thread.

Enter libuv and the Event Loop

Here's where Node.js shines. It uses the libuv library to handle I/O asynchronously: network sockets are serviced through the operating system's non-blocking mechanisms, while operations such as file system access and DNS lookups are dispatched to a small thread pool. These background facilities take care of the file or network requests, freeing the main thread to continue executing JavaScript.

Callbacks and the Event Loop Queue

Once an I/O operation finishes in a background thread, Node.js doesn't directly run the associated callback function in parallel. Instead, it places the callback in a special queue within the event loop. The event loop continuously monitors the call stack (the list of currently executing functions) in the main thread. When the call stack is empty, the event loop pulls callbacks from its queue and pushes them onto the call stack for execution.
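To make that ordering concrete, here's a minimal sketch (my own, not from the original post): a file read is handed off to libuv while synchronous code keeps the call stack busy, and the callback only runs once that stack is empty.

const fs = require("fs");

fs.readFile(__filename, "utf8", () => {
    console.log("3. I/O callback runs only after the call stack is empty");
});

console.log("1. synchronous code keeps running while libuv reads the file");

// A long synchronous loop: the queued callback can't jump in until this finishes
for (let i = 0; i < 100_000_000; i++) {}

console.log("2. synchronous work is done, the event loop can now run queued callbacks");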

In essence:

  • I/O tasks are handled concurrently by background threads (provided by libuv). JavaScript execution remains single-threaded in the main thread.
  • Callbacks for completed I/O operations are queued and executed sequentially in the main thread.
  • This asynchronous approach allows Node.js to handle many concurrent requests without blocking the main thread. It's this event-driven, non-blocking architecture that makes Node.js ideal for I/O-intensive applications.

Notes:

  • While libuv provides a thread pool, its size isn't tied to your CPU count: it defaults to 4 threads and can be changed with the UV_THREADPOOL_SIZE environment variable (see the sketch after these notes).

  • The V8 engine used by Node.js also utilizes separate threads for tasks like garbage collection, but these don't directly affect JavaScript execution.
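As a quick illustration of the first note, here's a small sketch (mine, not from the original article) using crypto.pbkdf2, which runs on the libuv thread pool. With the default pool size of 4, the fifth call has to wait for a free thread; running the same script with UV_THREADPOOL_SIZE=8 set in the environment lets all five hashes finish at roughly the same time.

const crypto = require("crypto");

const start = Date.now();

// Each pbkdf2 call is dispatched to a libuv thread pool thread
for (let i = 1; i <= 5; i++) {
    crypto.pbkdf2("secret", "salt", 100_000, 64, "sha512", () => {
        console.log(`hash ${i} done after ${Date.now() - start}ms`);
    });
}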

When Non-Blocking Isn't Enough

We've established that Node.js excels at handling I/O operations asynchronously, thanks to its event loop and background threads. But what happens when your application encounters CPU-bound tasks, like complex calculations or heavy data processing? These tasks, unlike file reads or network requests, can't be easily offloaded to background threads. If your main thread gets bogged down with CPU-bound work, it can grind your entire application to a halt, negating the benefits of non-blocking I/O.

The Blocking Thread Problem

Imagine you have an Express server that, upon receiving a request, performs a CPU-bound task like image processing, or simply a big fat loop. While that task runs in the main thread, the entire event loop comes to a standstill: no other requests can be processed until it finishes, which leads to slow response times and a poor user experience.

Here's a simplified example of a blocking thread scenario:

const express = require("express");

const app = express();

app.get("/calculate", (req, res) => {
    // Simulate a CPU-bound task (e.g., image processing)
    let result = 0;
    for (let i = 0; i < 10_000_000_000; i++) {
        // Do some calculations
        result += i;
    }
    res.send(`Request processed!, result = ${result}`);
});

// a non-blocking route
app.get("/info", (req, res) => {
    res.send("This should be a fast response!");
});

app.listen(3000, () => console.log("Server listening on port 3000"));

Here's what happens when multiple requests are made:

  • Accessing /info: The response is quick because it doesn't involve heavy computations.

  • Accessing /calculate: The long-running calculation blocks the main thread, delaying any subsequent requests to both routes until the calculation finishes.

Let's run the server and see the blocking behavior in action:

$ node app.js
Server listening on port 3000

Now I'll use curl to send requests to the server, and time to measure how long each response takes.

Let's start by sending a request to the /info route:

$ time curl localhost:3000/info
This should be a fast response!
real    0m0.067s
user    0m0.015s
sys     0m0.031s

As expected, the response is fast. Now let's send a request to the /calculate route and at the same time send a request to the /info route:

$ time curl localhost:3000/calculate
Request processed!, result = 49999999990067860000
real    0m5.949s
user    0m0.000s
sys     0m0.031s
$ time curl http://localhost:3000/info
This should be a fast response!

real    0m5.410s
user    0m0.000s
sys     0m0.031s

As you can see, the request to the /info route was blocked until the request to the /calculate route finished. This is because the CPU-bound task in the /calculate route was running in the main thread, blocking the event loop.

Worker Threads to the Rescue

To address this issue, Node.js offers the worker_threads module. Worker threads let you run JavaScript on separate threads within the same process (unlike child processes, they share a process and can even share memory), so CPU-intensive tasks can run independently without blocking the main thread.

How Worker Threads Work:

  1. Creating a Worker Thread: You use the worker_threads.Worker constructor to create a new thread and specify a JavaScript file for it to execute.

  2. Communication: The main thread and worker threads communicate via message passing. They can send data back and forth to coordinate tasks and receive results.

  3. Isolation: Worker threads have their own isolated memory and execution context, ensuring that tasks in one thread don't interfere with others.
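Before the full Express example, here's a minimal sketch of these three points (the file name worker-demo.js is my own, not from the original post): the same file is executed by both the main thread and the worker, workerData carries the input, and postMessage sends the result back.

// worker-demo.js
const { Worker, isMainThread, parentPort, workerData } = require("worker_threads");

if (isMainThread) {
    // 1. Create a worker that runs this same file and hand it some data
    const worker = new Worker(__filename, { workerData: 21 });

    // 2. Communicate via message passing
    worker.on("message", (result) => console.log("Result from worker:", result)); // 42
} else {
    // 3. The worker runs in its own isolated context and reports back
    parentPort.postMessage(workerData * 2);
}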

Example with Worker Threads:

// index.js
const express = require("express");
const { Worker } = require("worker_threads");

const app = express();

app.get("/calculate", (req, res) => {
    const worker = new Worker("./cpu-task.js"); // Spawn a worker thread

    worker.on("message", (result) => {
        res.send(`Request processed!, result = ${result}`);
    });

    worker.on("error", (error) => {
        console.error("Worker error:", error);
        res.status(500).send("Internal server error");
    });

    worker.postMessage({ data: "Start calculation" }); // Send data to worker
});

app.get("/info", (req, res) => {
    res.send("This should be a fast response!");
});

app.listen(3000, () => console.log("Server listening on port 3000"));

// cpu-task.js
const { parentPort } = require("worker_threads");

// Wait for the main thread's message before starting the heavy work
parentPort.on("message", () => {
    let result = 0;
    for (let i = 0; i < 10_000_000_000; i++) {
        // Do some calculations
        result += i;
    }

    parentPort.postMessage(result); // Send result back to main thread
    parentPort.close(); // Close the port so the worker can exit
});

In this example:

  • The CPU-intensive task is moved to a separate file (cpu-task.js).
  • A new worker thread is created for each "/calculate" request.
  • The main thread sends a message to the worker to initiate the task.
  • When the worker finishes, it sends the result back to the main thread, which then sends the response.

Here's the result:

$ time curl localhost:3000/calculate
Request processed!, result = 49999999990067860000
real    0m5.035s
user    0m0.000s
sys     0m0.031s

Let's send a request to the /info route while the request to the /calculate route is still running:

$ time curl localhost:3000/info
This should be a fast response!
real    0m0.067s
user    0m0.015s
sys     0m0.031s

As you can see, the request to the /info route was processed quickly, even while the CPU-bound task in the /calculate route was still running. This is because the CPU-bound task was offloaded to a worker thread, allowing the main thread to continue processing other requests.

Important Considerations:

  • Worker threads introduce some overhead, so use them judiciously for tasks that truly benefit from multithreading.
  • Communication between threads involves message passing, which can add latency compared to direct function calls.
  • If threads need to share state (for example via a SharedArrayBuffer), manage synchronization carefully to avoid race conditions; a minimal sketch follows this list.
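Here's a minimal sketch of that last point (my own example, assuming you want to share memory rather than copy messages): two workers increment a counter backed by a SharedArrayBuffer, using Atomics so the increments don't race each other.

const { Worker, isMainThread, workerData } = require("worker_threads");

if (isMainThread) {
    const shared = new SharedArrayBuffer(4); // room for one Int32 counter
    const counter = new Int32Array(shared);

    // Spawn two workers that both run this same file and share the buffer
    const workers = [1, 2].map(() => new Worker(__filename, { workerData: shared }));

    let pending = workers.length;
    workers.forEach((w) =>
        w.on("exit", () => {
            if (--pending === 0) {
                console.log("Final counter:", Atomics.load(counter, 0)); // 2000000
            }
        })
    );
} else {
    const counter = new Int32Array(workerData);
    for (let i = 0; i < 1_000_000; i++) {
        Atomics.add(counter, 0, 1); // atomic increment, safe across threads
    }
}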

Conclusion

Node.js is single-threaded by default, but it can leverage worker threads to handle CPU-bound tasks concurrently. By offloading intensive computations to separate threads, you can prevent main thread blockage and maintain the benefits of Node.js's non-blocking I/O model. Worker threads provide a powerful tool for optimizing performance in CPU-intensive applications.
