Applying Sandbox Security to Node.js Unikernels with OpenBSD Pledge and Unveil

We recently added support for both of OpenBSD's pledge and unveil syscalls. The former applies a security policy that disallows certain syscalls, while the latter applies a restricted view of the filesystem. In this tutorial we'll show you how to apply a sandbox to your production workloads with very little effort.

I really enjoyed writing this article as I feel it showcases the potential of what you can do with Nanos and how we view the world. It's interesting that these syscalls were originally written for OpenBSD, yet almost no one actually deploys to OpenBSD - nearly everyone deploys to Linux, and Linux has long suffered from a lack of compelling security modules. Well, with the Nanos unikernel you can run "Linux workloads" but get instant access to superior protection (not to mention performance, ease of use, and half a dozen other benefits) with no code modifications. It isn't enough to be able to add ad-hoc kernel functionality on the fly to an application. You also have to expose an interface to the end-user that enables them to consume it in a sane fashion. Thank you virtualization.

This relatively innocuous chunk of work allows even non-developers to clamp down hard on the ways an application can be exploited. You see, h4x0rs do not care what language your program is written in. All they see is an abstract machine. They don't care what your program is supposed to do - they are only there to take control of the machine.

Why not seccomp/seccomp-bpf et al? Seccomp-bpf suffers from a very brittle design and leaves application developers and devops end-users with few options for actually implementing the policy they want (eg: filters can't dereference pointers, so they can't inspect arguments such as paths). (We might look at Landlock in the future. It's been around for a while and has changed a lot.) You have to realize that most security teams that don't work at billion dollar tech companies are reactive rather than proactive (eg: left to scan for systems that have been hacked or are about to be) - with no real defense whatsoever. It isn't their fault. It's not that they are dumb - they just don't have the proper tools to arm themselves with.

One of the cool things about unikernels is that they allow you to test radical system design changes quickly. This gets magnified when you are using interpreted languages such as JavaScript or Python, as end-users are never compiling those interpreters to begin with. That is, a unikernel package maintainer can simply offer new packages with added kernel functionality and the end-user can seamlessly upgrade to that support just by setting a different build target in their CI/CD loop. How fantastic is that?

Pledge

Today we are going over the pledge and unveil npm packages as direct examples of this.

With the Nanos unikernel you can today, for the first time, as a non-developer, apply real security controls to the applications running in your infrastructure.

Let's take a look at this simple webserver that writes to bob.txt. Seems pretty straight-forward right? A request comes in and we write some content to bob.txt:

var http = require('http');
const fs = require('fs');

http.createServer(function (req, res) {

    const content = 'what about bob?';

    try {
      fs.writeFileSync('bob.txt', content);
    } catch (err) {
      console.error(err);
    }

    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
}).listen(8083, "0.0.0.0");
console.log('Server running at http://127.0.0.1:8083/');

We'll run it using this config:

{
  "BaseVolumeSz": "200m",
  "RunConfig": {
    "Ports": ["8083"]
  },
  "Dirs": ["node_modules"],
  "Files": ["hi.js"],
  "Args": ["hi.js"]
}

eyberg@box:~/p$  ops pkg load eyberg/node:v18.12.1 -p 8383 -c config.json --nanos-version=563425d
booting /home/eyberg/.ops/images/node ...
en1: assigned 10.0.2.15
Server running at http://127.0.0.1:8083/
en1: assigned FE80::B012:53FF:FE96:7593

Ok that works - but what if we don't want anything to be written to the filesystem at all? Perhaps there is no need to write to it, so let's remove the capability altogether to prevent an attacker from writing to it.

We adjust our config to include the sandbox klib with pledge enabled:

{
  "BaseVolumeSz": "200m",
  "RunConfig": {
    "Klibs": ["sandbox"],
    "Ports": ["8083"]
  },
  "ManifestPassthrough": {
    "sandbox": {
      "pledge":{}
    }
  },
  "Dirs": ["node_modules"],
  "Files": ["hi.js"],
  "Args": ["hi.js"]
}
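
The only differences from the first config are the "Klibs" entry and the "ManifestPassthrough" section. If you keep configs in a build script, a small helper can derive the sandboxed config from the base one. This is a minimal sketch: withSandbox is a hypothetical helper of our own, not part of ops or Nanos, and it only reproduces the JSON shape shown above.

```javascript
// Sketch: derive a sandbox-enabled ops config from a base config.
// `withSandbox` is a hypothetical helper, not part of ops or Nanos.
function withSandbox(baseConfig, features) {
  // Deep-copy so the base config is left untouched.
  const config = JSON.parse(JSON.stringify(baseConfig));

  // Add the "sandbox" klib once, keeping any klibs already listed.
  config.RunConfig = config.RunConfig || {};
  const klibs = new Set(config.RunConfig.Klibs || []);
  klibs.add('sandbox');
  config.RunConfig.Klibs = [...klibs];

  // Enable each requested feature with an empty options object,
  // mirroring the "pledge": {} entry in the config above.
  config.ManifestPassthrough = config.ManifestPassthrough || {};
  const sandbox = config.ManifestPassthrough.sandbox || {};
  for (const feature of features) {
    sandbox[feature] = {};
  }
  config.ManifestPassthrough.sandbox = sandbox;
  return config;
}

const base = {
  BaseVolumeSz: '200m',
  RunConfig: { Ports: ['8083'] },
  Dirs: ['node_modules'],
  Files: ['hi.js'],
  Args: ['hi.js']
};

const sandboxed = withSandbox(base, ['pledge']);
console.log(JSON.stringify(sandboxed, null, 2));
```

Writing the result out with fs.writeFileSync('sandbox.json', ...) would give you a file to pass to ops with -c.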

Then we tell Nanos that we are pledging only to call this group of syscalls. Anything else kills the process:

var http = require('http');
const fs = require('fs');

http.createServer(function (req, res) {

    const content = 'what about bob?';

    try {
      fs.writeFileSync('bob.txt', content);
    } catch (err) {
      console.error(err);
    }

    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
}).listen(8083, "0.0.0.0");
console.log('Server running at http://127.0.0.1:8083/');

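const pledge = require('nanos-pledge');
// Pledge stdio, read-only filesystem access, networking and executable mappings.
// Nothing in this list permits writing to the filesystem.
pledge.pledge("stdio rpath inet prot_exec", null);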

Now, the next time someone tries to write to the disk we crash on purpose with a SIGABRT because we explicitly disallowed it:

eyberg@box:~/p$  ops pkg load eyberg/node:v18.12.1 -p 8383 -c sandbox.json --nanos-version=563425d
booting /home/eyberg/.ops/images/node ...
en1: assigned 10.0.2.15
Server running at http://127.0.0.1:8083/
signal 6 (no core generated: limit 0)
en1: assigned FE80::D027:ECFF:FEB9:E5B
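
The promise string is just a space-separated list of names, and the one above leaves out the write-related promises (wpath and cpath in OpenBSD's pledge(2)), which is why the write aborts the process. A typo in that string is easy to make, so here is a minimal sketch of a helper that validates names before joining them. buildPromises and KNOWN_PROMISES are hypothetical names of our own, and the list only covers the promises used in this article plus wpath and cpath; which promises a given Nanos build accepts is an assumption you should check.

```javascript
// Promise names used in this article, plus "wpath" and "cpath" from
// OpenBSD's pledge(2). Which of these a given Nanos build accepts is an
// assumption; check the sandbox klib before relying on this list.
const KNOWN_PROMISES = new Set([
  'stdio', 'rpath', 'wpath', 'cpath', 'inet', 'prot_exec', 'error'
]);

// Hypothetical helper: validate and join promise names into the
// space-separated string that pledge() takes.
function buildPromises(names) {
  const unique = [...new Set(names)];
  const unknown = unique.filter((name) => !KNOWN_PROMISES.has(name));
  if (unknown.length > 0) {
    throw new Error('unknown pledge promise(s): ' + unknown.join(', '));
  }
  return unique.join(' ');
}

const promises = buildPromises(['stdio', 'rpath', 'inet', 'prot_exec']);
console.log(promises); // stdio rpath inet prot_exec
```
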

Now, you might be pointing out that if we want to apply the rule of 'thou shalt not write to disk', why do we even have the write call in the code? This and the other examples here are slightly contrived to make things clear. In the real world, even if you aren't trying to issue writes, an attacker doesn't care. Maybe they find some memory they can write to. Maybe they find some they can execute on. That's all they need. That's also why in Nanos you can't execute on the stack or heap by default.

Remember how I said that using the interpreters can make this super seamless? The code in the npm packages is actually fairly straight-forward and relies on ffi-napi.

var ffi = require("ffi-napi");

// Bind libc's syscall() from the current process (null = no separate library).
var syscallLib = ffi.Library(null, {
    "syscall": ['long', ['long', 'string', 'string']]
});

// Returns 0 on success or a negative errno value on failure.
exports.pledge = function(promises, execpromises) {
    const pledgeSyscallNum = 335;
    var ret = syscallLib.syscall(pledgeSyscallNum, promises, execpromises);
    if (ret == 0)
        return 0;
    else
        return -ffi.errno();
};

exports.errPerm = -1;
exports.errInval = -22;
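
Since the package returns 0 on success and a negative errno otherwise, it is easy to ignore a failed pledge by accident and keep running unsandboxed. Here is a minimal sketch of a wrapper that turns a failure into an exception. pledgeOrThrow is a hypothetical helper, and fakePledge is a made-up stand-in (its "bogus" rule is invented purely for the demo) so the sketch runs without Nanos; on Nanos you would pass the real pledge function from nanos-pledge.

```javascript
// Error codes as exported by the package code above.
const ERR_PERM = -1;
const ERR_INVAL = -22;

// Hypothetical wrapper: `pledgeFn` is any function with the same contract
// as exports.pledge above (0 on success, negative errno on failure).
function pledgeOrThrow(pledgeFn, promises) {
  const ret = pledgeFn(promises, null);
  if (ret === 0) {
    return;
  }
  const reason =
    ret === ERR_PERM ? 'permission denied (EPERM)' :
    ret === ERR_INVAL ? 'invalid promise string (EINVAL)' :
    'errno ' + -ret;
  throw new Error('pledge("' + promises + '") failed: ' + reason);
}

// Stand-in for the real binding so the sketch runs anywhere; on Nanos
// you would pass require('nanos-pledge').pledge instead.
const fakePledge = (promises) => (promises.includes('bogus') ? ERR_INVAL : 0);

pledgeOrThrow(fakePledge, 'stdio rpath inet prot_exec');
console.log('pledge applied');

try {
  pledgeOrThrow(fakePledge, 'stdio bogus');
} catch (err) {
  console.error(err.message);
}
```

Failing loudly at startup is a deliberate choice here: a server that silently runs without its sandbox is worse than one that refuses to start.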

The discerning reader will notice we've been passing null for execpromises. Why? Nanos, unlike a general-purpose operating system (GPOS), doesn't allow you to exec other processes.

We are mostly showing JavaScript examples here to show the ease-of-use, but what about Go? Same thing, but you'll notice that here we make the syscall explicitly. On a side note, one of our users made this example before I even had time to look at the original pull request so I'm just going to re-use their code here (thanks @rinor!).

package main

import (
  "fmt"
  "log"
  "os"
  "syscall"
  "unsafe"
)

func pledgePromises(promises string) error {
  var exptr unsafe.Pointer

  pptr, err := syscall.BytePtrFromString(promises)
  if err != nil {
    return err
  }

  _, _, e := syscall.Syscall(335, uintptr(unsafe.Pointer(pptr)), uintptr(exptr), 0)
  if e != 0 {
    return e
  }

  return nil
}

func main() {
  err := pledgePromises("stdio error rpath")
  if err != nil {
    log.Fatalf("pledgePromises - %q", err)
  }

  log.Print("Readir should work - (rpath - enabled)")
  files, err := os.ReadDir(".")
  if err != nil {
    log.Fatal(err)
  }

  for _, file := range files {
    fmt.Println(file.Name())
  }

  err = pledgePromises("stdio error")
  if err != nil {
    log.Fatalf("pledgePromises - %q", err)
  }

  log.Print("Readir should fail - (rpath - disabled)")
  files, err = os.ReadDir(".")
  if err != nil {
    log.Fatal(err)
  }

  for _, file := range files {
    fmt.Println(file.Name())
  }

}

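As you can see, the readdir against '.' (which is '/' here) works until rpath gets dropped. Because the promise string includes "error", which in OpenBSD's pledge makes a violation return ENOSYS instead of killing the process, the second attempt fails with "function not implemented":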

➜  uz ops run -c config.json --nanos-version=563425d uz
 100% |████████████████████████████████████████|
booting /Users/eyberg/.ops/images/uz ...
2023/03/26 17:41:55 Readir should work - (rpath - enabled)
dev
etc
lib
proc
sys
uz
2023/03/26 17:41:55 Readir should fail - (rpath - disabled)
2023/03/26 17:41:55 open .: function not implemented
exit status 3
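
The Go example calls pledge twice, each time with a smaller promise set. OpenBSD documents that later calls can only reduce abilities and that asking for more fails with EPERM. Here is a small JavaScript model of that rule to make it concrete. PledgeModel is a hypothetical class that never touches the kernel, and that Nanos enforces exactly the same rule is an assumption.

```javascript
// Simplified model of pledge narrowing, following OpenBSD's documented
// behaviour; that Nanos behaves identically is an assumption.
class PledgeModel {
  constructor() {
    this.current = null; // null means nothing has been pledged yet
  }

  // Returns 0 on success or -1 (EPERM) when the request asks for a
  // promise that an earlier call already gave up.
  pledge(promises) {
    const requested = new Set(promises.split(/\s+/).filter(Boolean));
    if (this.current !== null) {
      for (const name of requested) {
        if (!this.current.has(name)) {
          return -1;
        }
      }
    }
    this.current = requested;
    return 0;
  }
}

const model = new PledgeModel();
console.log(model.pledge('stdio error rpath')); // 0
console.log(model.pledge('stdio error'));       // 0
console.log(model.pledge('stdio error rpath')); // -1
```

The practical upshot: do the privileged setup first (read config, open files), then pledge down to the smallest set you need for steady state.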

Unveil

Now let's look at unveil. We start off by installing the nanos-unveil package:

npm install nanos-unveil

Then let's create a basic config where we include node_modules:

{
  "RunConfig": {
    "Ports": ["8083"]
  },
  "Dirs": ["node_modules"],
  "Files": ["sandbox.js"],
  "Args": ["sandbox.js"]
}

In this example we show the list of files in '/'.

var http = require('http');
const path = require('path');
const fs = require('fs');

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});

    out = '';

    filenames = fs.readdirSync('/');
    filenames.forEach(file => {
      out += file + "\n";
    });

    res.end(out);
}).listen(8083, "0.0.0.0");
console.log('Server running at http://127.0.0.1:8083/');

eyberg@box:~/unveil$ ops pkg load eyberg/node:v18.12.1 -p 8383 -c config.json --nanos-version=563425d
booting /home/eyberg/.ops/images/node ...
en1: assigned 10.0.2.15
Server running at http://127.0.0.1:8083/
en1: assigned FE80::CC0C:F8FF:FE9E:500C

We curl our webserver and it responds as we expect:

eyberg@box:~$ curl -XGET http://127.0.0.1:8083/
dev
etc
lib
lib64
node_modules
node_v18.12.1
proc
sandbox.js
sys

Now let's try listing the same directory with unveil applied:

var http = require('http');
const path = require('path');
const fs = require('fs');

http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});

    out = '';

    filenames = fs.readdirSync('/');
    filenames.forEach(file => {
      out += file + "\n";
    });

    res.end(out);
}).listen(8083, "0.0.0.0");
console.log('Server running at http://127.0.0.1:8083/');

const unveil = require('nanos-unveil');
// Expose only /node_modules, read-only; the rest of the filesystem is hidden.
unveil.unveil("/node_modules", "r");

Then we use the unveil configuration:

{
  "RunConfig": {
    "Klibs": ["sandbox"],
    "Ports": ["8083"]
  },
  "ManifestPassthrough": {
    "sandbox": {
      "unveil":{}
    }
  },
  "Dirs": ["node_modules"],
  "Files": ["sandbox.js"],
  "Args": ["sandbox.js"]
}

However, this time when we hit it with curl we find we can't scan the directory any more:

eyberg@box:~/unveil$ ops pkg load eyberg/node:v18.12.1 -p 8383 -c unveil.json --nanos-version=563425d
booting /home/eyberg/.ops/images/node ...
en1: assigned 10.0.2.15
Server running at http://127.0.0.1:8083/
en1: assigned FE80::CCA:69FF:FE48:137D
node:internal/fs/utils:348
    throw err;
    ^

Error: ENOENT: no such file or directory, scandir '/'
    at Object.readdirSync (node:fs:1451:3)
    at Server.<anonymous> (/sandbox.js:10:20)
    at Server.emit (node:events:513:28)
    at parserOnIncoming (node:_http_server:1068:12)
    at HTTPParser.parserOnHeadersComplete (node:_http_common:117:17) {
  errno: -2,
  syscall: 'scandir',
  code: 'ENOENT',
  path: '/'
}

Node.js v18.12.1
exit status 3
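
Why ENOENT for '/'? After unveil("/node_modules", "r") only that subtree is part of the filesystem view, so as far as the process is concerned the root directory can no longer be listed. A rough way to reason about which paths stay visible is sketched below. isVisible is a hypothetical helper and a deliberately simplified model (it ignores permissions and symlinks); it is not how the kernel implements unveil.

```javascript
const path = require('path');

// Simplified model, not the kernel's algorithm: a path is visible when it
// is an unveiled path or sits underneath one.
function isVisible(unveiledPaths, target) {
  const resolved = path.posix.resolve('/', target);
  return unveiledPaths.some((root) => {
    const base = path.posix.resolve('/', root);
    if (base === '/') {
      return true;
    }
    return resolved === base || resolved.startsWith(base + '/');
  });
}

const unveiled = ['/node_modules'];
console.log(isVisible(unveiled, '/'));                          // false
console.log(isVisible(unveiled, '/node_modules'));              // true
console.log(isVisible(unveiled, '/node_modules/nanos-unveil')); // true
console.log(isVisible(unveiled, '/node_modules_old'));          // false
console.log(isVisible(unveiled, '/etc'));                       // false
```

Note the check against base + '/' rather than a plain prefix match, so a sibling such as /node_modules_old doesn't slip through.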

I hope this piques your curiosity in exploring how you can leverage pledge and unveil in your applications and infrastructure. I also hope it highlights the significant super powers that unikernels can bring to your devops/SRE toolkit.
