
eBPF, ingrained in Rust

by Peter Parkanyi
September 25, 2018 · Filed under: Labs, Work at Red Sift

Today we are releasing RedBPF and ingraind, our eBPF toolkit that integrates with StatsD and S3. We want to gather feedback and see where others in the Rust community might take this framework. If you are looking to up your company's monitoring game, gather more data about your Raspberry Pi cluster at home, or just have a strong academic interest in Rust and low-level bit shepherding, you might want to read on.


At Red Sift we run our own Kubernetes cluster and our own data processing pipeline on top of it. We’re dealing with a whole lot of email and wanted a more in-depth understanding of the communication patterns of our own systems. This type of data ultimately helps us develop a baseline and notice any irregularities that might indicate a security incident or malfunction.

RedBPF is a Rust library for eBPF, a flexible VM subsystem inside the Linux kernel. Using RedBPF, we built ingraind, a metric collector agent we can distribute as a single binary and a system-specific config file.

Existing eBPF solutions tend to focus primarily on performance and debugging, but we wanted to see if eBPF can help us understand our security profile better.

We started off with four primary things we wanted to know about: DNS activity, TLS connection details without having to resort to port filtering, UDP/TCP traffic volume per process, and file access patterns.

Since we’re already using Datadog for performance monitoring, it seemed logical to expand our dashboards with security-related metrics using a relatively lightweight agent that plugs easily into a StatsD backend.
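To illustrate why a StatsD backend is such a low bar to clear, here is a minimal sketch of what emitting a metric amounts to; the metric name and agent address below are illustrative, not ingraind's actual output:

[rust]
use std::net::UdpSocket;

// StatsD is a plain-text protocol over UDP; 8125 is the conventional port.
fn main() -> std::io::Result<()> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    // "<name>:<value>|c" increments a counter; "|g" would set a gauge.
    socket.send_to(b"ingraind.file.access:1|c", "127.0.0.1:8125")?;
    Ok(())
}
[/rust]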

Which processes accessed /etc/passwd in the past day and when?

eBPF

A relatively new subsystem in the Linux kernel, eBPF is making waves across the industry, and for good reason. It's at the peculiar point on the hype curve where it's simultaneously new enough and mature enough to be taken seriously.

eBPF is a virtual machine within the kernel that allows hooking into network interfaces, syscalls, or even specific in-kernel functions behind the syscall gate. At the time of writing, there are some 200-odd calls that we can monitor using tiny programs to get deep intelligence from the Linux kernel.

Due to its flexibility, people are using eBPF modules for various things: packet filtering, software-defined networking, performance monitoring, or generally extracting real-time diagnostics to better understand a system.

Because space and attention spans in a technical blog post are finite, for a deeper dive into eBPF check out the wiki page we just opened up, or the amazing write-up by Cilium.

Enter Rust

Most of the existing eBPF ecosystem is based on BCC, which has convenient Python and Go bindings. However, we wanted something that doesn't need a full-blown compiler toolchain and kernel sources on the target box, so the traditional BCC workflow was out of the question, and its convenient bindings along with it.

Our self-imposed limitation left us with one option: GoBPF has support for bringing your own bytecode, and some great software already uses it.

Due to Go's threading model, however, it can't match Rust's performance doing C FFI. Interacting with eBPF modules generally involves a lot of low-level memory management, and this mattered: we wanted to run ingraind in production, so it shouldn't consume more resources than absolutely necessary. Building our own Rust BPF framework looked like a solid choice.

RedBPF

When dealing with eBPF modules in the kernel, there are a lot of intricacies to pay attention to, and one needs to dig through a lot of code to figure out how things are done. BCC is still the most comprehensive suite for dealing with eBPF. GoBPF and a few other libraries have diverged from it, maintaining some resemblance while adding features that can't be found elsewhere.

I consider this a strong anti-pattern: because these projects are not straight-up forks but semi-ports, fixes and new features are increasingly difficult to integrate from the original code base. While rule #1 of Linux kernel development is "You don't break userland", in practice this just means that legacy interfaces are kept around while new, preferred ways of achieving the same results are added.

To get the best of both worlds, we chose a mixed approach: re-use whatever we can from BCC's libbpf component, an amazing library that can be statically linked and deals exclusively with the runtime management of eBPF modules, and implement high-level bindings, ELF parsing, and other workflow-specific code in Rust.

The end result is RedBPF, a library focused on providing low-friction, idiomatic bindings for libbpf and for consuming perf events, along with a build-load-run workflow for eBPF that stays relatively unopinionated. Let's walk through the details!

Building modules

If the build feature is enabled, RedBPF provides a toolkit for building C code into eBPF modules, generating Rust bindings from C header files through Bindgen, and using a build cache to speed things up when all of this is used through build.rs.

[rust]
use std::env;
use std::path::PathBuf;

use redbpf::build::{build, generate_bindings, cache::BuildCache, headers::headers};

// `Error`, `flags`, `bindgen_flags`, and `source_files` come from the
// flag-generation setup that is skipped for brevity below.
fn main() -> Result<(), Error> {
    let out_dir = PathBuf::from(env::var("OUT_DIR")?);

    // generate the build & bindgen flags here. skipped for brevity.

    let mut cache = BuildCache::new(&out_dir);

    for file in source_files("./bpf", "c")? {
        if cache.file_changed(&file) {
            build(&flags[..], &out_dir, &file).expect("Failed building BPF plugin!");
        }
    }
    for file in source_files("./bpf", "h")? {
        if cache.file_changed(&file) {
            generate_bindings(&bindgen_flags[..], &out_dir, &file)
                .expect("Failed generating data bindings!");
        }
    }

    cache.save();
    Ok(())
}
[/rust]

Generating the Rust bindings for in-kernel data structures currently depends on structs having a definition matching the regex struct _data_[^{}]*, which is not incredibly sophisticated, but has worked well enough so far. One advantage of this approach is that data structures sent to userland are clearly marked in the C BPF code at every use.

[c]
// the data structure is originally defined in `connection.h`
// the build step generates `$OUT_DIR/connection.rs`

struct _data_connect {
    u64 id;
    u64 ts;
    char comm[TASK_COMM_LEN];
    u32 saddr;
    u32 daddr;
    u16 dport;
    u16 sport;
};
[/c]
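For illustration, here is a hedged sketch of how such a marker match behaves, using the regex crate standalone rather than the actual build step:

[rust]
use regex::Regex;

fn main() {
    // Only structs whose definition matches the marker pattern get bindings.
    let marker = Regex::new(r"struct _data_[^{}]*").unwrap();
    let header = "struct _data_connect { u64 id; };\nstruct internal_state { int x; };";
    for m in marker.find_iter(header) {
        println!("will generate bindings for: {}", m.as_str());
    }
}
[/rust]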

In ingraind we read the raw data from a perf event stream and convert it to a safe, high-level structure. Technically, there's still a fair bit of repetition in this process, something that could be optimized further with a few clever procedural macros.

[rust]
use std::net::Ipv4Addr;

use serde::{Deserialize, Serialize};

include!(concat!(env!("OUT_DIR"), "/connection.rs"));

#[derive(Debug, Serialize, Deserialize)]
pub struct Connection {
    pub task_id: u64,
    pub name: String,
    pub destination_ip: Ipv4Addr,
    pub destination_port: u16,
    pub source_ip: Ipv4Addr,
    pub source_port: u16,
    pub proto: String,
}

// to_string, to_ipv4 and to_le are small helper functions defined
// elsewhere in ingraind.
impl From<_data_connect> for Connection {
    fn from(data: _data_connect) -> Connection {
        Connection {
            task_id: data.id,
            // comm is a fixed-size i8 buffer filled in by the kernel;
            // reinterpret it as bytes before conversion
            name: to_string(unsafe { &*(&data.comm as *const [i8] as *const [u8]) }),
            source_ip: to_ipv4(data.saddr),
            destination_ip: to_ipv4(data.daddr),
            destination_port: to_le(data.dport),
            source_port: to_le(data.sport),
            proto: "".to_string(),
        }
    }
}
[/rust]

While there could be multiple ways of distributing the resulting eBPF bytecode, we chose to include it in the main binary for simplicity of distribution.

[rust]
impl EBPFGrain<'static> for UDP {
    fn code() -> &'static [u8] {
        include_bytes!(concat!(env!("OUT_DIR"), "/udp.elf"))
    }
    // ...
}
[/rust]

One thing that’s important to note here is that the resulting binary may be kernel-dependent. Although the kernel ABI is unstable, certain parts of it change rarely. Another consideration is RANDSTRUCT, or struct layout randomization. This is a hardening feature that mangles the layout of marked structs inside the kernel to make exploitation more difficult. Distribution kernels often have this disabled, but if you build your own kernel, make sure you aren’t surprised.
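As a concrete illustration, here is a hedged sketch of the kind of sanity check shipping pre-built bytecode calls for; the version-string comparison is made up, and only libc's uname(2) wrapper is assumed:

[rust]
use std::ffi::CStr;

fn kernel_release() -> Option<String> {
    // Zero the struct, then let uname(2) fill it in.
    let mut utsname: libc::utsname = unsafe { std::mem::zeroed() };
    if unsafe { libc::uname(&mut utsname) } != 0 {
        return None;
    }
    let release = unsafe { CStr::from_ptr(utsname.release.as_ptr()) };
    release.to_str().ok().map(String::from)
}

fn main() {
    // The version a module was built against would normally be baked in at build time.
    let built_for = "4.15";
    match kernel_release() {
        Some(r) if r.starts_with(built_for) => println!("kernel {} looks compatible", r),
        Some(r) => eprintln!("warning: running on {}, module built for {}", r, built_for),
        None => eprintln!("could not determine kernel release"),
    }
}
[/rust]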

Loading and maps

Without the build feature, RedBPF is runtime-only. It reads and parses an ELF object containing eBPF bytecode, returns the list of components in the object file, and provides bindings to manage programs, use BPF maps, and consume perf events, along with a few related utilities where we did not want to introduce another dependency, like calling uname or parsing the IDs of online processors.
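To give a flavour of the latter: parsing online processor IDs boils down to reading a range list from sysfs. A minimal sketch, independent of RedBPF's actual implementation:

[rust]
use std::fs;

// /sys/devices/system/cpu/online holds ranges like "0-3,5"
fn online_cpus() -> std::io::Result<Vec<u32>> {
    let raw = fs::read_to_string("/sys/devices/system/cpu/online")?;
    let mut cpus = Vec::new();
    for part in raw.trim().split(',') {
        match part.split_once('-') {
            Some((lo, hi)) => {
                let (lo, hi): (u32, u32) = (lo.parse().unwrap(), hi.parse().unwrap());
                cpus.extend(lo..=hi);
            }
            None => cpus.push(part.parse().unwrap()),
        }
    }
    Ok(cpus)
}

fn main() {
    println!("online CPUs: {:?}", online_cpus().unwrap());
}
[/rust]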

After loading a module, all BPF maps are automatically initialized, so we only need to deal with programs.

[rust]
use redbpf::Module;

let mut module = Module::parse(Self::code())?;
for prog in module.programs.iter_mut() {
    prog.load(module.version, module.license.clone()).unwrap();
}
[/rust]

Because a BPF program can be of many types, attaching these correctly is up to the user. One way of attaching all kprobes and kretprobes would be like so. Notice the unwrap: it deliberately crashes the program if attaching fails for any reason.

[rust]
use redbpf::ProgramKind::*;

for prog in self
    .module
    .programs
    .iter_mut()
    .filter(|p| p.kind == Kprobe || p.kind == Kretprobe)
{
    println!("Program: {}, {:?}", prog.name, prog.kind);
    prog.attach_probe().unwrap();
}

self.bind_perf(backends)
[/rust]

Perf events

Transferring a large amount of trace data from the kernel can be done in primarily two ways: aggregate it in kernel maps and periodically dump them to userspace, or use perf events to stream it in real time from the kernel through a ring buffer. Reading perf events does not require a syscall, so if we want to aggregate data based on arbitrary properties, which we do, it is easier to pump the raw event stream out to userland and only then do the aggregation. This has the peculiar result of seemingly high CPU use in top, but it comes with a very low instruction count per cycle, which means that our program is mostly waiting on memory access, as explained in Brendan Gregg's blog post.

The perf event interface in RedBPF was designed so that we can use a single epoll loop to listen to every perf event source. This also means that for improved throughput, we could even start an epoll thread for every CPU.

[rust]
fn bind_perf(&mut self, backends: &[BackendHandler]) -> Vec<Box<PerfHandler>> {
    let online_cpus = cpus::get_online().unwrap();
    let mut output: Vec<Box<PerfHandler>> = vec![];

    // map kind 4 is BPF_MAP_TYPE_PERF_EVENT_ARRAY
    for ref mut m in self.module.maps.iter_mut().filter(|m| m.kind == 4) {
        for cpuid in online_cpus.iter() {
            let pm = PerfMap::bind(m, -1, *cpuid, 16, -1, 0).unwrap();
            output.push(Box::new(PerfHandler {
                name: m.name.clone(),
                perfmap: pm,
                callback: T::get_handler(m.name.as_str()),
                backends: backends.to_vec(),
            }));
        }
    }

    output
}
[/rust]

The above code sample will generate a PerfMap object for every online CPU, since in the kernel we also organize data on a per-CPU basis. In fact, PerfHandler can be found in ingraind, not RedBPF, and is a simple wrapper around reading the ring buffer. The interesting bit looks like this:

[rust]
use redbpf::Event;

while let Some(ev) = self.perfmap.read() {
    match ev {
        Event::Lost(lost) => {
            println!("Possibly lost {} samples for {}", lost.count, &self.name);
        }
        Event::Sample(sample) => {
            let msg = unsafe {
                (self.callback)(slice::from_raw_parts(
                    sample.data.as_ptr(),
                    sample.size as usize,
                ))
            };
            msg.and_then(|m| Some(send_to(&self.backends, m)));
        }
    }
}
[/rust]

Overall, RedBPF tries to strike a balance: low-level enough that you can grasp what's going on, but high-level enough to get out of the way and let you focus on what you need.

ingraind

Along with the RedBPF library, today we are releasing ingraind, the agent we built around the library. And because we love data pipelines (who doesn’t?!), ingraind is built around processing the raw metric data that’s coming from the BPF probes.

The backbone of this pipeline is the great actix library. Raw data originating from the BPF probes is picked up by an event handler, turned into a safe data structure, and then passed into the actor event loop. It travels through a pipe of what we call “aggregators” and ends up at a “backend”.
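To show the shape of one such stage, here is a minimal, self-contained actix sketch; the Measurement and Whitelist types are hypothetical stand-ins, not ingraind's actual API:

[rust]
use actix::prelude::*;

// A hypothetical measurement message, loosely modelled on the pipeline above.
#[derive(Message, Debug)]
#[rtype(result = "usize")]
struct Measurement {
    process: String,
    value: u64,
}

// An "aggregator" step that counts (and would normally forward) whitelisted measurements.
struct Whitelist {
    allow: Vec<String>,
    passed: usize,
}

impl Actor for Whitelist {
    type Context = Context<Self>;
}

impl Handler<Measurement> for Whitelist {
    type Result = usize;

    fn handle(&mut self, msg: Measurement, _ctx: &mut Context<Self>) -> usize {
        // In the real agent this would hand the message to the next actor in the pipe.
        if self.allow.contains(&msg.process) {
            self.passed += 1;
        }
        self.passed
    }
}

#[actix::main]
async fn main() {
    let step = Whitelist { allow: vec!["nginx".into()], passed: 0 }.start();
    let passed = step
        .send(Measurement { process: "nginx".into(), value: 1 })
        .await
        .unwrap();
    println!("measurements forwarded so far: {}", passed);
    System::current().stop();
}
[/rust]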

All of these are implemented as actors along the lines of the sketch above, and the chaining of them can be defined in the configuration file. The following example tracks TLS handshakes across all IPv4 ports, and file access everywhere on the root filesystem. It whitelists the process, s_ip, d_port, and path tags, which may or may not make sense for every measurement, then replaces the dynamic process names generated by Docker's userland proxy with a uniform one. The destination is an S3 bucket, and the AWS credentials are taken from the command line.

[ini]
[[probe]]
pipelines = ["s3"]
[probe.config]
type = "Files"
monitor_dirs = ["/"]

[[probe]]
pipelines = ["s3"]
[probe.config]
type = "TLS"

[pipeline.s3.config]
backend = "S3"

[[pipeline.s3.steps]]
type = "Whitelist"
allow = ["process", "s_ip", "d_port", "path"]

[[pipeline.s3.steps]]
type = "Regex"
patterns = [
    { key = "process", regex = "conn\\d+", replace_with = "docker_conn_proxy" },
]
[/ini]

Conclusion

We believe that the architecture and unique features of RedBPF and ingraind make them a powerful combination for security monitoring. There is definitely room for improvement around the ergonomics of sharing data structures between BPF code and Rust, and there's always more room for optimization.

To make monitoring containerized environments even easier, we will be adding container awareness to the agent shortly.

We are excited to give something back to the amazing Rust community, and we look forward to feedback, issues, and contributions on the way to a one-stop solution for production security monitoring. So feel free to get in touch with us! Happy hacking!
