Oxidised eBPF I: Building a toolchain

This is the first of two posts about ingraind and the Rust BPF library that powers it, RedBPF. In this post I’m going to give you an overview of what ingraind is, and how it led to the development of RedBPF. Then in the next post, I’m going to get a little more technical and show how we compile Rust code to BPF binary code.

In case you missed it, at the end of last year I already blogged about RedBPF, explaining what its main components are and giving a simple example of how it can be used.

Why Ingraind

ingraind is the open-source security monitoring agent developed by RedSift. It comes with pre-built probes to monitor file and network activity, including in-depth analysis of DNS and TLS data, syscalls, process execution and more. It can consume and produce StatsD metrics, and it integrates with osquery. In addition to producing StatsD metrics, it can provide output using its own custom data format via HTTP or on Amazon S3.

The agent can be used to monitor traditional servers, Docker containers or entire Kubernetes clusters. Finally — thanks to RedBPF — it is dead easy to extend ingraind with your own observability modules.

Traditional security monitoring agents use a combination of techniques to detect potentially malicious activity, from periodically scanning the file system, to continuously parsing log files to periodically running commands and checking their output (for example listing the running processes and looking for well known malicious programs).

Ingraind instead uses the Linux BPF API. BPF provides hundreds of observability hook points inside the kernel, giving visibility to pretty much anything that happens on a Linux system. Thanks to BPF, ingraind is very fast, and it has near-zero overhead since it implements a push-based model where processing only happens in response to events produced by BPF code running in the kernel.

Why RedBPF

ingraind deployments can range from a single server to clusters of hundreds or thousands of nodes. High performance, low overhead and ease of deployment have always been primary goals. Therefore when the project started, Peter chose Rust as the programming language to develop the agent with.

At the time, BCC was the de-facto standard toolkit to work with BPF (and in many respects, it still is). BCC includes a clang frontend that allows you to write BPF code in a restricted subset of C, a user-space C API to load and interact with BPF code, along with first- and third-party bindings for virtually every popular language to the user-space API.

Here’s an example of a BPF program written with BCC. This particular example uses the Python user space bindings to load BPF code written in C. When executed, the program compiles the BPF C code on the fly, then loads it. This on the fly compilation is at the same very convenient during development, and a burden during deployment. Having to install llvm, libclang, kernel headers and all the other required dependencies on each node of a potentially large cluster was deemed unacceptable for ingraind.

Therefore instead of developing Rust bindings for BCC, RedSift decided to develop RedBPF, a Rust library and toolchain that would provide a build-load-run workflow that would allow Rust programs to build BPF tools and run them without having to compile anything on target machines.

RedBPF – the early days

In the early days, the API comprised of two main parts: a build API and a loader API, both written in Rust. The build API allowed you to build BPF code written in C, and to save the compiled output as an ELF object file. Then the load API allowed you to load those ELF object files in the kernel.

The layout of the ELF object files produced by RedBPF was inspired by gobpf, and compatibility with gobpf is still maintained today.

In addition to compiling the BPF code, the build API also allowed to generate Rust bindings for C data structures (using bindgen), so that BPF C code and the user space Rust code in ingraind could exchange data through BPF maps and perf events.

The BPF C code for the Files probe looked very different at this stage. If you skim quickly through the code, you’ll see that it’s a mix of regular C, plus some somewhat obscure stuff like that SEC macro (used to specify ELF sections) and strange bpf_probe_read calls.

RedBPF – today

The “Rust with C for the BPF code” version of RedBPF worked pretty well. There were however a couple of pain points.

Generating the Rust bindings for the data to be shared between the C code and the Rust code was cumbersome. The resulting bindings were necessarily not idiomatic, so we often ended up having to convert between the generated bindings and something less painful to work with from Rust.

Most importantly, having to write this kind of BPF code wasn’t fun:

struct path path;
struct inode *inode;
umode_t mode;
int check = 0;

check |= bpf_probe_read(&path, sizeof(path), (void *)&file->f_path);
check |= bpf_probe_read(&inode, sizeof(inode), (void *)&file->f_inode);
check |= bpf_probe_read(&mode, sizeof(mode), (void *)&inode->i_mode);
if (check != 0) {
    return 0;
}

This kind of error checking was error-prone, and having to use bpf_probe_read to access struct fields was annoying. To mitigate this and other quirks of the BPF platform, BCC comes with a custom clang plugin that, among other things, is often (but not always) able to automatically insert bpf_probe_read calls.

The network parsing code wasn’t much better either:

dns = buffer + sizeof(struct ethhdr)
    + sizeof(struct udphdr)
    + (ip->ihl * 4);
if (dns + 12 > data_end) {
    return -5;
}

query->id = *(u16 *) dns;
dns += 2;

if (*((u8*) dns) >> 3 != 0x10) {
    return -4;
}

There’s nothing particularly wrong with the code above, it’s how parsing in C works: you have a buffer and do pointer math as you scan it. Being used to Rust tho, we were never really satisfied with it.

The idea of writing eBPF programs in Rust has been floating around for a while, and we got just the right toolchain as a stepping stone to get there. The question remained: How would idiomatic Rust code for eBPF actually work?

At first we thought we’d create some kind of Rust based DSL, using a number of macros or even writing a custom parser. We took two weeks to experiment with ideas, at the end of which we concluded that yes, we could feasibly rewrite the BPF kernel code from C to Rust.

On September 5 last year I committed the first version of ingraind-probes. At that point the code was pretty rough, but it did show promise. And the problem of generating C to Rust bindings was gone: we could just pass #[repr(C)] structs around.

As we kept iterating on both ingraind and RedBPF, the code got better and better, until we realized that not only we could make it work, but we would eventually end up with perfectly idiomatic Rust code.

Here’s today’s Rust equivalent of the file parsing code above:

let path = file.f_path()?;
let inode = file.f_inode()?;
let mode = inode.i_mode()?;

And here’s the DNS code:

let transport = ctx.transport()?;
let data = ctx.data()?;
// DNS is at least 12 bytes
let header = data.slice(12)?;
if header[2] >> 3 & 0xF != 0u8 {
    return Ok(XdpAction::Pass);
}

A few weeks before we released ingraind 1.0, I removed the last bits of C code from the repository. Everything is written in Rust now, and extending ingraind by writing new BPF components is easier than ever. RedBPF went from being a somewhat crazy idea, to something that we now believe has the potential to become a central component of the whole BPF ecosystem.

Conclusion

Rust was the perfect language to write ingraind with. Earlier versions of the agent used Rust for the user space components, and C for the BPF kernel code. The BPF code was hard to develop and maintain, so we developed a toolchain to write BPF code in Rust. This allowed us to ship ingraind 1.0 containing only idiomatic Rust code that is easy to maintain and extend. In the next post I’ll show you some cool hacks we do to compile the Rust code to BPF binary code.

PUBLISHED BY

alessandro

21 May. 2020

SHARE ARTICLE:

Categories

Recent Posts

VIEW ALL
News

Winter wins: Red Sift OnDMARC wraps up 2024 as a G2 DMARC…

Francesca Rünger-Field

The season of giving has brought us another reason to celebrate! Red Sift OnDMARC continues its winning streak in G2’s Winter 2025 report, earning Leader status in the DMARC category for another consecutive season. This recognition reflects our strong market presence and the unwavering satisfaction of our customers. Cheers to wrapping up 2024 on…

Read more
AI

Text classification in the age of LLMs

Phong Nguyen

As natural language processing (NLP) advances, text classification remains a foundational task with applications in spam detection, sentiment analysis, topic categorization, and more. Traditionally, this task depended on rule-based systems and classical machine learning algorithms. However, the emergence of deep learning, transformer architectures, and Large Language Models (LLMs) has transformed text classification, allowing for…

Read more
Security

How to drive cybersecurity as a top business priority

Jack Lilley

Everyone has a role to play in protecting the enterprise. Whether you’re shaping strategy or implementing solutions, aligning efforts to mitigate critical risks ensures a stronger, more resilient enterprise. If you missed Red Sift’s recent webinar on “From Data to Buy-In: Driving Cybersecurity as a Top Business Priority” we’ve got you covered. The session…

Read more
DMARC

BreakSPF: How to mitigate the attack

Red Sift

BreakSPF is a newly identified attack framework that exploits misconfigurations in the Sender Policy Framework (SPF) a widely used email authentication protocol. A common misconfiguration involves overly permissive IP ranges, where SPF records allow large blocks of IP addresses to send emails on behalf of a domain. These ranges often include shared infrastructures like…

Read more