Introducing Console Drivers to Stream to GCP Logging and AWS CloudWatch

There are about a billion ways to get logs out of your nanos instance. By default, when run locally via 'ops run' or 'ops pkg load', stdout/stderr go straight out through the serial port, and that is fine for most local use-cases. If deployed to prod, the simplest and most straightforward method would be to have your application itself ship everything over via syslog, as described in logging go unikernels to papertrail. Most shops that care about logs already have a preferred logging solution such as ElasticSearch, which can be run as its own unikernel.

If you are a polyglot shop (and most organizations are at scale) you might wish to use the syslog klib using this syntax:

{
  "ManifestPassthrough": {
    "syslog": {
      "server": "10.0.2.2"
    }
  },
  "RunConfig": {
    "Klibs": ["syslog"]
  }
}

This allows you to add a simple common klib to your unikernels, and it won't matter whether the API team is writing Go or the data team is using Python. To date, this is probably the most common and popular method to handle logs.

There are many other methods though. You can of course simply copy them out if running locally via this method:

ops image cp  /var/log/mylog .

If you are debugging a unikernel for performance reasons (and as we all know, "unikernels are completely undebuggable"™; ignore the hundreds of GitHub tickets where someone reports a bug, it is debugged and fixed, and the ticket is closed), you might want to use the net console feature and sidestep the performance hit from using serial. This is a great method for performance tuning.

Now let us introduce you to two more ways of handling your logs. These new features come packaged as 'klibs', which you can think of as plugins to your system; they are how nanos embraces the 'library operating system' idea. The idea is to pick what you need for your deployment and then make it as simple as possible.

GCP Cloud Logging

Since we had several people request it, we implemented a new console driver that ships output to GCP Cloud Logging. Essentially the way it works is that when the instance boots up it checks into the metadata server at 169.254.169.254 to obtain an access token and then it can interact with the server at logging.googleapis.com.
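To make that handshake a bit more concrete, here is a rough userspace sketch in Go of the token step. This is not the klib's code (that is C inside the kernel); it only illustrates the publicly documented metadata endpoint and the Metadata-Flavor header it requires. The function names parseToken and fetchToken are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// metadataBase is the link-local metadata server address mentioned above.
const metadataBase = "http://169.254.169.254/computeMetadata/v1"

// token mirrors the JSON document returned by the service-account token endpoint.
type token struct {
	AccessToken string `json:"access_token"`
	ExpiresIn   int    `json:"expires_in"`
	TokenType   string `json:"token_type"`
}

// parseToken (hypothetical helper) decodes a token response and rejects empty tokens.
func parseToken(body []byte) (token, error) {
	var t token
	if err := json.Unmarshal(body, &t); err != nil {
		return token{}, err
	}
	if t.AccessToken == "" {
		return token{}, fmt.Errorf("metadata response has no access_token")
	}
	return t, nil
}

// fetchToken (hypothetical helper) asks the metadata server for the default
// service account's access token.
func fetchToken(client *http.Client) (token, error) {
	url := metadataBase + "/instance/service-accounts/default/token"
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return token{}, err
	}
	// The metadata server refuses requests that lack this header.
	req.Header.Set("Metadata-Flavor", "Google")
	resp, err := client.Do(req)
	if err != nil {
		return token{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return token{}, fmt.Errorf("metadata server returned %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return token{}, err
	}
	return parseToken(body)
}

func main() {
	t, err := fetchToken(&http.Client{Timeout: 2 * time.Second})
	if err != nil {
		// Outside of a GCP instance the metadata server is not reachable.
		fmt.Println("token fetch failed:", err)
		return
	}
	fmt.Printf("got %s token, expires in %ds\n", t.TokenType, t.ExpiresIn)
}
```

The token then goes into an Authorization: Bearer header on requests to logging.googleapis.com.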

Hostname, project_id, and access_token are taken from the metadata server and we actually have code, as shown below, to wait for it to be accessible because nanos guests boot a lot faster than linux guests do. (see this post where we reboot the same GCP instance with a new memory layout on each new request)

timestamp retry_backoff = bound(retry_backoff);
/* Do not print error messages for transient issues in the metadata server which can be seen
 * right after instance startup (e.g. HTTP 500 response with "Failed to authenticate request
 * (Type 0)". */
if (retry_backoff > seconds(1))
    msg_err("setup failed: %v\n", s);
timm_dealloc(s);
if (retry_backoff < seconds(3600))
    bound(retry_backoff) <<= 1;
struct timer t = {0};
init_timer(&t);
timer_handler setup_retry = closure(gcp.h, gcp_log_setup_retry, t);

For the next two examples I'll use this Go webserver that I use in many of our examples and videos. We simply spin up a webserver to listen on 8080, increment the count for each request and log out the request count line:

package main

import (
  "fmt"
  "net/http"
)

func main() {
  fmt.Println("just a test")

  icnt := 0

  http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
    icnt++

    fmt.Printf("log %d\n", icnt)
    fmt.Fprintf(w, "Welcome to my website!")
  })

  fs := http.FileServer(http.Dir("static/"))
  http.Handle("/static/", http.StripPrefix("/static/", fs))

  http.ListenAndServe(":8080", nil)
}

In this example I'll show you how to use the GCP CLI toolset to tail your logs. If this is the first time you've tried to deploy your unikernel to Google Cloud you'll want to read through the docs or watch a YouTube video. We first need to install the alpha tail tools by following along with the instructions.

gcloud components install alpha
gcloud alpha logging tail
sudo pip3 install grpcio
export CLOUDSDK_PYTHON_SITEPACKAGES=1

Here is the config I used. You'll want to include both the gcp and tls klibs. You'll also want to associate a service account, as the instance will need to talk to the metadata service at 169.254.169.254.

If 'log_id' is set in the logging tuple it will be used in the path of the log name; otherwise the instance hostname is used. The rest of this configuration is pretty vanilla for GCP. GCP requires a project ID plus the requested zone and bucket to use.

{
  "RunConfig": {
    "Klibs": ["gcp", "tls"],
    "Ports": ["8080"]
  },
  "CloudConfig": {
    "ProjectID": "myproject-1234",
    "Zone": "us-west2-a",
    "BucketName": "nanos-test",
    "InstanceProfile": "default"
  },
  "ManifestPassthrough": {
    "gcp": {
      "logging": {
        "log_id": "my_log"
      }
    }
  }
}

To try this out you'll want to use the nightly build like so:

ops image create -t gcp -c gcp.config.json -n my_program
ops instance create -t gcp -c gcp.config.json -p 8080 my_program

(Note: we keep builds of every PR for 20 days, and they can be used simply by specifying the commit hash, e.g. `ops run --nanos-version c98bbdc`.)

Now we can tail our logs using the gcloud log tool. In this example I've opted to have the smallest buffer window in exchange for potentially out of order logging and I'm only interested in the actual log lines being produced versus all the metadata that can be returned:

gcloud alpha logging tail --buffer-window=0s --format="value(textPayload)"

Bam. Now you can tell your boss that you've got your unikernels hooked into the GCP Cloud Logging pipeline.

AWS CloudWatch

Now let's move on to AWS CloudWatch. We already had existing CloudWatch support on AWS in the form of memory metrics, but now we have full logging support as well.

As you can see, just like the GCP integration, we attach a console driver through the aptly named 'attach_console_driver' function:

void attach_console_driver(struct console_driver *driver)
{
    spin_lock(&write_lock);
    list_insert_before(driver->disabled ? list_end(&console_drivers) : list_begin(&console_drivers),
            &driver->l);
    spin_unlock(&write_lock);
}

At this point, if you haven't looked at the code you might be wondering how we query the metadata server itself. For instance, how do we issue HTTP requests and parse the response? Good question: we have HTTP helper functions available via https://github.com/nanovms/nanos/blob/master/src/http/http.c.

In particular allocate_http_parser has a definition that returns a closure to parse the data:

buffer_handler allocate_http_parser(heap h, value_handler each);

and http_request creates the request:

status http_request(heap h, buffer_handler bh, http_method method, tuple headers, buffer body)
{
    buffer b = allocate_buffer(h, 100);
    buffer url = get(headers, sym(url));
    bprintf(b, "%s %b HTTP/1.1\r\n", http_request_methods[method], url);
    if (body) {
        buffer content_len = little_stack_buffer(16);
        bprintf(content_len, "%ld", buffer_length(body));
        set(headers, sym(Content-Length), content_len);
    }
    http_header(b, headers);
    status s = apply(bh, b);
    if (!is_ok(s)) {
        deallocate_buffer(b);
        return timm_up(s, "result", "%s failed to send", __func__);
    }
    if (body)
        s = apply(bh, body);
    return s;
}
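If you don't read kernel C every day, here is what http_request boils down to, sketched in Go: write the request line, add a Content-Length header when there is a body, write the headers, then the body. The buildRequest helper is hypothetical, it takes the URL as a separate argument rather than pulling it out of the headers tuple, and the header sorting is only there to make the output deterministic.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"strconv"
)

// buildRequest (hypothetical helper) serializes an HTTP/1.1 request.
func buildRequest(method, url string, headers map[string]string, body []byte) []byte {
	var b bytes.Buffer
	fmt.Fprintf(&b, "%s %s HTTP/1.1\r\n", method, url)

	// Copy the headers so the caller's map is left untouched.
	h := make(map[string]string, len(headers)+1)
	for k, v := range headers {
		h[k] = v
	}
	// Only requests with a body get a Content-Length header.
	if body != nil {
		h["Content-Length"] = strconv.Itoa(len(body))
	}

	// Go map iteration order is random, so sort for stable output.
	keys := make([]string, 0, len(h))
	for k := range h {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Fprintf(&b, "%s: %s\r\n", k, h[k])
	}

	// A blank line separates the headers from the body.
	b.WriteString("\r\n")
	b.Write(body)
	return b.Bytes()
}

func main() {
	req := buildRequest("POST", "/v2/entries:write",
		map[string]string{"Host": "logging.googleapis.com"},
		[]byte(`{"entries":[]}`))
	fmt.Printf("%q\n", req)
}
```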

The AWS implementation leans on mbedtls for crypto primitives.

Just like the Google Cloud example, AWS needs an IAM role to chat to the metadata server too.

In this example you can configure both the log_group and log_stream that the logs are sent to. The log_group defaults to the IMAGE_NAME env var that ops injects when building your image, and log_stream defaults to the internal instance identifier. Both are auto-created if not specified, providing a truly seamless experience.

{
  "RunConfig": {
    "Klibs": ["cloudwatch", "tls"],
    "Ports": ["8080"]
  },
  "CloudConfig": {
    "BucketName": "nanos-test",
    "InstanceProfile": "CloudWatchAgentServerRole"
  },
  "ManifestPassthrough": {
    "cloudwatch": {
      "mem_metrics_interval": "5",
      "logging": {"log_group":"ian-test","log_stream":"my_log_stream"}
    }
  }
}

Now we can create our image and spin the instance up:

ops image create -t aws -c aws.config.json -z us-west-1 -n my_program
ops instance create -t aws -c aws.config.json -z us-west-1 my_program

Finally we can tail our logs and see that we are getting the output we expect:

➜ aws logs tail ian-test --follow
2023-01-21T19:43:06.213000+00:00 my_log_stream just a test
2023-01-21T19:43:08.133000+00:00 my_log_stream en1: assigned FE80::4E6:E6FF:FE9E:52F9
2023-01-21T19:43:48.866000+00:00 my_log_stream log 1
2023-01-21T19:43:50.859000+00:00 my_log_stream log 2
2023-01-21T19:43:51.292000+00:00 my_log_stream log 3
2023-01-21T19:43:51.619000+00:00 my_log_stream log 4
2023-01-21T19:43:51.977000+00:00 my_log_stream log 5
2023-01-21T19:43:52.354000+00:00 my_log_stream log 6

Now you've got your unikernels deployed to both AWS and GCP, with their respective logging solutions hooked up like a boss: rolling your own sky computing.

Do you learn better via video versus written text? We have a very large set of video tutorials on using the nanos unikernel. Let us know if you'd like to see something in particular.

Also, have a favorite service that works but you think we need tighter integration with? We take both pull requests and feature development requests.
