A while ago we wanted a pluggable way to instrument various stats in Nanos, and we wanted to be able to apply this functionality to arbitrary applications without having to rewrite the code. So we took a page from the library operating system playbook and produced an initial draft of klibs in Nanos. Only a month or two later we produced another klib, this time to address the missing cloud init feature that Azure deployments needed in order to tell the internal Azure metadata service that the deploy had completed.
Azure Cloud Init
Before, you could deploy Nanos unikernels to Azure, but the deployment operation itself would time out, confusing users into suspecting their deploy had not finished when it actually had. Now, when you deploy to Azure using ops, we automatically include an Azure klib in each deploy.
Azure runs a metadata service listening on 168.63.129.16. After the instance boots, it expects an XML health report like the one below to be sent to /machine?comp=health.
<Health xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"
xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">
<GoalStateIncarnation>1</GoalStateIncarnation>
<Container>
<ContainerId>%s</ContainerId>
<RoleInstanceList>
<Role>
<InstanceId>%s</InstanceId>
<Health>
<State>Ready</State>
</Health>
</Role>
</RoleInstanceList>
</Container>
</Health>
This service provides many other things, such as DHCP, heartbeats, and DNS, but health reporting is the main one for us. Every cloud has something similar; Azure was simply the first cloud where we really needed to do something about it.
One of the reasons we went down this route is that we wanted to add some extensibility, but, as many of our users would agree, we don't want to force these extensions on anyone, even if the code would sit inactive in the kernel. In other words, we don't want this code inside the kernel at all unless the user chooses it to be there. A lot of Linux kernel modules provide support for various hardware devices. That's not really the case for Nanos, as Nanos is only intended to run as a virtualized payload on top of the hypervisor of your choice. The virtualization requirement is actually a very important architectural aspect of Nanos itself; without it we couldn't provide many of its other properties. In fact, if you look at the nearly 30M lines of code in Linux, half of it is drivers. That makes a lot of sense, as Linux has been around for 30 years and is specifically intended to run on real hardware. In that sense you can view Nanos as a preferred replacement for Linux VMs, but not for Linux on hardware.
Diving into Klibs
Anyway, keeping these klibs as base modules that can be specified at unikernel build time allows us to easily consume them from OPS. Indeed, all it takes is one line of code and you can load your own klibs:
c.RunConfig.Klibs = append(c.RunConfig.Klibs, "my_klib")
Finally, it gives an alternate path for third parties to extend the system in ways they choose. For instance, many APM providers have ways of instrumenting applications from inside the application, but many also have agents that run outside of the app. That makes a lot of sense on something like Linux, where your base distro might have a hundred processes running even before you start your application, but it doesn't make much sense inside a unikernel. That doesn't mean there aren't other non-application-specific metrics one would want to collect and measure, and klibs give a mechanism to do so. Similarly, many different users might wish to extend Nanos in ways we'd never support in the base kernel. Just spitballing here, but imagine someone wanted to stream a framebuffer so they could have a virtual application like a browser instantiated for each new session from an end user. The entire VDI industry/ecosystem could greatly use something like this, as right now they run full-blown desktops to do it when that probably shouldn't be necessary. This would be an example where someone could write their own klib and utilize Nanos without touching the rest of the code.
Each klib has an init function and is currently tied to a specific kernel release (we probably won't bend on that anytime soon). It should also be noted that this interface is subject to heavy change in the future as we experiment and get feedback from other developers. Here's the init for the Azure cloud init klib:
int init(void *md, klib_get_sym get_sym, klib_add_sym add_sym)
{
    /* resolve the kernel symbols we need; klibs are not linked
       against the kernel, so everything goes through get_sym */
    void *(*get_kernel_heaps)(void) = get_sym("get_kernel_heaps");
    boolean (*first_boot)(void) = get_sym("first_boot");
    if (!get_kernel_heaps || !first_boot)
        return KLIB_INIT_FAILED;
    heap h = heap_general(get_kernel_heaps());

    /* cloud init work only needs to happen on the instance's first boot */
    if (first_boot()) {
        enum cloud c = cloud_detect(get_sym);
        switch (c) {
        case CLOUD_ERROR:
            return KLIB_INIT_FAILED;
        case CLOUD_AZURE:
            if (!azure_cloud_init(h, get_sym))
                return KLIB_INIT_FAILED;
            break;
        default:
            break;
        }
    }
    return KLIB_INIT_OK;
}
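Writing your own klib follows the same shape: resolve the kernel functions you need through get_sym, fail the init if they're missing, and return KLIB_INIT_OK. Below is a hypothetical minimal skeleton; the typedefs and constant values are placeholders standing in for the real kernel header definitions, and the body is a stub:

/* Hypothetical minimal klib init. The typedefs and constants below
 * are placeholders for the real kernel header definitions; only the
 * init signature and get_sym usage mirror the Azure example above. */
typedef void *(*klib_get_sym)(const char *name);                      /* assumed shape */
typedef void *(*klib_add_sym)(void *md, const char *name, void *sym); /* assumed shape */
#define KLIB_INIT_OK     0  /* placeholder value */
#define KLIB_INIT_FAILED 1  /* placeholder value */

int init(void *md, klib_get_sym get_sym, klib_add_sym add_sym)
{
    /* resolve a kernel function by name; bail out if the running
       kernel doesn't provide it */
    void *(*get_kernel_heaps)(void) = get_sym("get_kernel_heaps");
    if (!get_kernel_heaps)
        return KLIB_INIT_FAILED;

    /* ... set up timers, probes, or whatever the klib does ... */

    return KLIB_INIT_OK;
}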
Klibs shouldn't be confused with Linux kernel modules. They serve separate purposes and are implemented quite differently: klibs don't export symbols to be called directly the way Linux kernel modules do. There is also no equivalent of insmod or rmmod, for a few reasons. First, we don't want the capability to inject these at run time; supporting something like that is just asking for trouble. Every klib must be present in the manifest upon load or the entire system pukes. Second, there is no 'interactive' shell in unikernels by design, as that implies executing multiple different processes, which is definitely not the case and not something we will ever support. One of the unikernel arguments is that we are now at a point in time where it is becoming increasingly untenable to ssh into random systems to poke around, simply because of the scale of most software deployments today. We firmly believe edge compute is going to solidify this position. Even if these logins/agents are done in an automated fashion, as with traditional configuration management software, it doesn't matter. Finally, this "management interface" is the number one prize for an attacker. Attackers don't care at all about the software they are attacking; they just want access to the system to run their own code, be it a cryptominer or mysqldump.
A more proper way of viewing klibs is as plugins.
Here we show a sample Nanos manifest. If you run the nanos dump command on this image, you'll note that klibs auto-added by ops live on a separate partition from the application, one that is not readable by the end-user application. Here bootfs is supplied instead of rootfs. Technically you can change the partition if you wish, but that might be enforced in the future.
(
    boot:(
        children:(
            klib:(children:(test:(contents:(host:output/klib/bin/test))))
        )
    )
    children:(
        klibs:(contents:(host:output/test/runtime/bin/klibs))
        etc:(children:(ld.so.cache:(contents:(host:/etc/ld.so.cache))))
        lib:(children:(x86_64-linux-gnu:(children:(libc.so.6:(contents:(host:/lib/x86_64-linux-gnu/libc.so.6))))))
        lib64:(children:(ld-linux-x86-64.so.2:(contents:(host:/lib64/ld-linux-x86-64.so.2))))
    )
    program:/klibs
    fault:t
    klibs:bootfs
    arguments:[klibs poppy]
    environment:(USER:bobby PWD:/)
)
As you can see, only a few extra stanzas are needed in the manifest. For those of you new to Nanos: almost no one actually edits this besides core kernel developers; it's all auto-generated by tools like OPS. We are considering adding more CLI-specific commands to work with klibs as use cases arise.
That's a brief intro to running and working with klibs in the Nanos unikernel.