eBPF Malware Techniques Part 1 - Introduction
eBPF Malware Techniques Series:
- eBPF Malware Techniques Part 1 - Introduction
- eBPF Malware Techniques Part 2 - Setting Appropriate Hooks
- eBPF Malware Techniques Part 3 - Hiding BPF Traces
- eBPF Malware Techniques Part 4 - Hiding Processes
1. Background
As a cybersecurity researcher, understanding both traditional and modern methods of kernel-level code execution is crucial. Two prominent techniques you’ll encounter are LKM (Loadable Kernel Module) rootkits and eBPF-based applications.
eBPF (extended Berkeley Packet Filter) is a Linux kernel technology that allows safe, user-defined code to run in kernel context. Originally intended for packet filtering, eBPF has evolved into a generic in-kernel virtual machine — enabling dynamic tracing, monitoring, networking, and even security enforcement.
Unlike traditional rootkits that load kernel modules (LKMs), eBPF programs are loaded from user space and executed within a sandboxed, verified environment in the kernel.
2. Portability: Why eBPF Wins Over LKMs
One of the biggest advantages of eBPF over LKM-based approaches is portability. Traditional rootkits written as LKMs often rely on specific internal kernel symbols and structures that vary between kernel versions, making them fragile and version-locked.
eBPF solves this through:
- CO-RE (Compile Once, Run Everywhere): With the help of BTF (BPF Type Format), an eBPF binary can automatically adapt to the target kernel’s data structures at load time.
- Stable Kernel Interfaces: eBPF relies on stable hooks like tracepoints, kprobes, and cgroups, avoiding the need to patch or hook private kernel symbols.
- User-space Loaders: eBPF programs are loaded via syscalls (e.g., bpf()), removing the need for privileged LKM loading mechanisms like insmod or modprobe.
3. Limitations of eBPF
Despite its flexibility and growing popularity, eBPF is not a silver bullet for kernel-level control. It is tightly restricted by design:
- Instruction Limit: Modern Linux kernels (5.2+) let the verifier process up to 1 million instructions per program, whereas earlier versions had a hard limit of just 4096. This constraint exists to prevent DoS conditions from long-running programs.
- No Loops (until recently): Older kernels rejected any loops to avoid non-terminating programs. Newer kernels (5.3+) allow bounded loops, but they must pass static verification.
- No Arbitrary Memory Access: You can’t just dereference random kernel pointers. All memory accesses must be checked or derived from known safe helpers.
- Strict Verifier Checks: The eBPF verifier enforces constraints like:
  - No null pointer dereferencing
  - Stack bounds checking
  - No uninitialized memory use
  - Guaranteed program termination
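To make the version-dependent limits concrete, here is a minimal sketch that maps a `uname -r` style release string to the limits listed above. The function names are hypothetical, and the 5.2 cut-over for the larger instruction budget is an assumption of this sketch; distribution kernels with backports may behave differently.

```python
import re

# Thresholds taken from the list above. The 5.2 cut-over for the larger
# instruction budget is an assumption of this sketch.
LARGE_INSN_LIMIT_SINCE = (5, 2)
BOUNDED_LOOPS_SINCE = (5, 3)


def parse_release(release):
    """Extract (major, minor) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def verifier_limits(release):
    """Return the limits a program should expect on the given kernel."""
    version = parse_release(release)
    return {
        "max_instructions": 1_000_000 if version >= LARGE_INSN_LIMIT_SINCE else 4096,
        "bounded_loops": version >= BOUNDED_LOOPS_SINCE,
    }


print(verifier_limits("6.8.0-45-generic"))
print(verifier_limits("4.15.0-20-generic"))
```

A loader could run a check like this before choosing between a loop-based program and an unrolled fallback, although probing the running kernel directly is generally more reliable than comparing version numbers.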
4. Frameworks
When creating eBPF applications, there are three major frameworks to consider. Depending on your usage, you may prefer one over the others. The table below summarizes the points to consider for eBPF development.
| Framework | Language | Kernel Requirement | Package Needed | Best For |
|---|---|---|---|---|
| BCC | Python/C++ | ≥ 4.1 (≥ 4.9 ideal) | bcc, python3-bcc | Quick development, scripting |
| bpftrace | Custom DSL | ≥ 4.9 (≥ 5.5 ideal) | bpftrace | One-liners, dynamic tracing |
| libbpf | C | ≥ 4.18 (BTF needed) | libbpf-dev, Clang, LLVM | Performance, production systems |
4.1 BCC (BPF Compiler Collection)
BCC is a toolkit for creating efficient kernel tracing and manipulation programs, and includes several useful tools and examples. It makes use of eBPF, a new feature that was first added to Linux 3.15. Much of what BCC uses requires Linux 4.1 and above.
BCC makes BPF programs easier to write, with kernel instrumentation in C (and includes a C wrapper around LLVM), and front-ends in Python and lua. It is suited for many tasks, including performance analysis and network traffic control.
To get started, we need to first get its dependencies.
In my case, I am using Ubuntu 24.04 and therefore, I will need the following packages:
# python3-pyelftools needed for following example
$ sudo apt install bpfcc-tools python3-bcc python3-pyelftools linux-headers-$(uname -r)
BCC Packages for Ubuntu
However, if you are using another Linux distro, please refer to the official installation guide. Once this step is done, we can test out many of the provided tools written by iovisor and many of its contributors.
I am simply going to choose bashreadline.py as my example.
Looking at the code, it searches for the symbol readline_internal_teardown() in either /bin/bash or /lib/libreadline.so.
If the --shared option is not specified, then it defaults to /bin/bash.
...
parser = argparse.ArgumentParser(
    description="Print entered bash commands from all running shells",
    formatter_class=argparse.RawDescriptionHelpFormatter)
parser.add_argument("-s", "--shared", nargs="?",
    const="/lib/libreadline.so", type=str,
    help="specify the location of libreadline.so library.\
    Default is /lib/libreadline.so")
args = parser.parse_args()
name = args.shared if args.shared else "/bin/bash"
...
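The `nargs="?"` plus `const` combination is easy to misread, so here is a small standalone sketch of how the target binary ends up being chosen. `resolve_target` is a hypothetical helper that only mirrors the option handling from the listing above, and the explicit library path in the last call is an illustrative value rather than one taken from the tool.

```python
import argparse


def resolve_target(argv):
    """Mirror the option handling above and return the binary to probe."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", "--shared", nargs="?",
                        const="/lib/libreadline.so", type=str)
    args = parser.parse_args(argv)
    # Without -s, args.shared is None, so we fall back to /bin/bash.
    return args.shared if args.shared else "/bin/bash"


print(resolve_target([]))        # /bin/bash
print(resolve_target(["-s"]))    # /lib/libreadline.so
# Hypothetical explicit path, shown only to illustrate the third case.
print(resolve_target(["-s", "/usr/lib/x86_64-linux-gnu/libreadline.so.8"]))
```

So the flag has three states: absent (probe bash itself), present without a value (probe the default library path), and present with a value (probe that path).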
It then defines the eBPF program itself in C, stored in the bpf_text string. The program records the PID of the process returning from readline_internal_teardown() and copies the string that the function returned. If the process that triggers this is “bash”, it sends that data from kernel space to user space via the perf ring buffer.
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>
struct str_t {
    u32 pid;
    char str[80];
};
BPF_PERF_OUTPUT(events);
int printret(struct pt_regs *ctx) {
    struct str_t data = {};
    char comm[TASK_COMM_LEN] = {};
    if (!PT_REGS_RC(ctx))
        return 0;
    data.pid = bpf_get_current_pid_tgid() >> 32;
    bpf_probe_read_user(&data.str, sizeof(data.str), (void *)PT_REGS_RC(ctx));
    bpf_get_current_comm(&comm, sizeof(comm));
    if (comm[0] == 'b' && comm[1] == 'a' && comm[2] == 's' && comm[3] == 'h' && comm[4] == 0 ) {
        events.perf_submit(ctx,&data,sizeof(data));
    }
    return 0;
};
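The `>> 32` on the result of bpf_get_current_pid_tgid() deserves a short illustration. The helper returns one 64-bit value with the thread group ID (what user space calls the PID) in the upper 32 bits and the thread ID in the lower 32 bits. The sketch below reproduces that arithmetic in plain Python; `split_pid_tgid` and the sample numbers are made up for illustration.

```python
def split_pid_tgid(pid_tgid):
    """Split the 64-bit helper result into (tgid, tid)."""
    tgid = pid_tgid >> 32          # what user space calls the PID
    tid = pid_tgid & 0xFFFFFFFF    # kernel-level thread ID
    return tgid, tid


# A made-up value: process 4242, thread 4250.
value = (4242 << 32) | 4250
print(split_pid_tgid(value))  # (4242, 4250)
```

Shifting right by 32 therefore yields the process-level ID, which is why every thread of the same bash process reports the same data.pid.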
Lastly, it attaches a uretprobe to the return of the readline_internal_teardown() function and registers printret(), defined in bpf_text, as the handler.
...
b = BPF(text=bpf_text)
b.attach_uretprobe(name=name, sym=sym, fn_name="printret")
...
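On the user-space side, each perf event arrives as raw bytes laid out like struct str_t. BCC normally handles the decoding for you, but a minimal sketch of the idea, assuming the 4-byte pid followed by an 80-byte string from the listing above, could look like this. `StrT`, `decode_event` and the sample bytes are hypothetical and exist only for illustration.

```python
import ctypes
import struct


class StrT(ctypes.Structure):
    """User-space mirror of `struct str_t` from bpf_text."""
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("str", ctypes.c_char * 80),
    ]


def decode_event(raw):
    """Turn the raw bytes of one event into (pid, command)."""
    event = StrT.from_buffer_copy(raw)
    # A c_char array field is returned as bytes cut at the first NUL.
    return event.pid, event.str.decode("utf-8", "replace")


# Fabricated event: PID 1234 typed "ls -la".
raw = struct.pack("I80s", 1234, b"ls -la")
print(decode_event(raw))  # (1234, 'ls -la')
```

The fixed 80-byte buffer also explains why very long command lines show up truncated in the output.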
Alright, enough babbling. Let’s see this in action.
$ sudo python3 bashreadline.py
4.2 bpftrace
bpftrace is a high-level tracing language for Linux. bpftrace uses LLVM as a backend to compile scripts to eBPF-bytecode and makes use of libbpf and BCC for interacting with the Linux BPF subsystem, as well as existing Linux tracing capabilities: kernel dynamic tracing (kprobes), user-level dynamic tracing (uprobes), tracepoints, etc. The bpftrace language is inspired by awk, C, and predecessor tracers such as DTrace and SystemTap. bpftrace was created by Alastair Robertson.
There are several ways to install bpftrace. Depending on your Linux distro, you may want to refer to the installation guide.
In my case, I will install bpftrace with apt.
$ sudo apt-get install bpftrace
Alternatively, you may also get a statically compiled bpftrace via the latest release.
To use this tool properly, we can either supply it with a one-liner program or a file.
# One-Liner Usage
$ bpftrace [options] -e 'program'
# Example
$ bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
# File Usage
$ bpftrace [options] filename.bt
# Example
$ bpftrace bashreadline.bt
Relating back to the previous example where we used BCC to read the user’s bash terminal inputs, we can also do so with bpftrace! Using the bashreadline.bt example provided by the bpftrace repository, we can achieve the same results as before.
However, let’s take a look at what the code is doing first.
- After this pull request, bpftrace now supports automatic resolution of library paths and you no longer need to provide absolute or relative paths.
...
uretprobe:/bin/bash:readline,
uretprobe:libreadline:readline
/comm == "bash"/
{
    time("%H:%M:%S ");
    printf("%-6d %s\n", pid, str(retval));
}
The above snippet shows that it uses uretprobe to attach to the return of the readline() function in both /bin/bash and libreadline.so.*. It then prints the time, PID and the input command if the process name is “bash”.
This is almost exactly the same implementation as BCC bashreadline.py. Let’s test this out and see it for ourselves!
$ sudo bpftrace bashreadline.bt
bpftrace Demo - bashreadline.bt
4.3 libbpf
Libbpf supports building CO-RE (Compile Once, Run Everywhere) eBPF applications which, in contrast to BCC, do not require a Clang/LLVM runtime to be deployed on target servers and do not rely on kernel-devel headers being available.
It does rely on the kernel being built with BTF type information, though. Some major Linux distributions come with kernel BTF already built in:
- Fedora 31+
- RHEL 8.2+
- OpenSUSE Tumbleweed (in the next release, as of 2020-06-04)
- Arch Linux (from kernel 5.7.1.arch1-1)
- Manjaro (from kernel 5.4 if compiled after 2021-06-18)
- Ubuntu 20.10
- Debian 11 (amd64/arm64)
You can check whether your kernel has BTF built in by looking for the /sys/kernel/btf/vmlinux file.
$ ls -la /sys/kernel/btf/vmlinux
-r--r--r--. 1 root root 3541561 Jun 2 18:16 /sys/kernel/btf/vmlinux
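The same check can be scripted, for example as a pre-flight step before launching a CO-RE binary. This is only a sketch: `has_kernel_btf` is a hypothetical helper, and it merely tests that the file exists and is readable, not that its contents are valid BTF.

```python
import os


def has_kernel_btf(path="/sys/kernel/btf/vmlinux"):
    """Return True if the kernel exposes BTF type information at `path`."""
    return os.path.isfile(path) and os.access(path, os.R_OK)


if has_kernel_btf():
    print("Kernel BTF found, CO-RE programs should be loadable")
else:
    print("No kernel BTF found, an external BTF file would be needed")
```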
If you are running on an older kernel version and absolutely want to run a CO-RE eBPF application on your system, you can override the BTF file path with the BTF_FILE environment variable. You may retrieve the appropriate BTF file from the BTF Archive by looking up your distro and kernel version.
The easiest way to get started with this framework is to build on the examples provided by libbpf-bootstrap.
$ git clone --recurse-submodules https://github.com/libbpf/libbpf-bootstrap
For simplicity’s sake (not because I’m lazy 🤭), I am only going to go through the minimal example written in C.
When developing an eBPF application using libbpf, every application requires a .c and a corresponding .bpf.c file.
You can think of it as whatever we write in .c is in user-space while the rest of the logic in .bpf.c resides in kernel space.
Let’s start by analyzing the user-space portion of our eBPF application starting from the headers.
...
#include <bpf/libbpf.h>
#include "minimal.skel.h"
...
Looking at the listing above, libbpf.h refers to the user-space API for interacting with the BPF syscall while minimal.skel.h is an auto-generated skeleton header from the eBPF program (minimal.bpf.c) when compiling with make. This simplifies the loading/attaching lifecycle.
...
/* Open BPF application */
skel = minimal_bpf__open();
if (!skel) {
    fprintf(stderr, "Failed to open BPF skeleton\n");
    return 1;
}
...
Next, the call to minimal_bpf__open() uses the BPF skeleton API, which libbpf provides to simplify the boilerplate required for loading and managing BPF programs. When you run bpftool gen skeleton on the compiled BPF object file, it creates a .skel.h file that wraps up all maps, programs, and sections into a single C interface.
Note that this call only prepares the skeleton in memory, maps out the BPF sections, but does not yet load or attach anything to the kernel.
Once the skeleton is opened, we can configure its memory-mapped .bss section — a region that is shared between the user-space process and the BPF program in the kernel.
...
/* ensure BPF program only handles write() syscalls from our process */
skel->bss->my_pid = getpid();
...
In this case, we’re telling the BPF program: “Only handle events related to my PID.”
This is a common trick to reduce noise, especially in minimal test setups. The BPF code (shown later in this section) checks my_pid and only acts on syscall events that come from this process.
Once that is done, we will proceed to load our eBPF program.
...
/* Load & verify BPF programs */
err = minimal_bpf__load(skel);
if (err) {
    fprintf(stderr, "Failed to load and verify BPF skeleton\n");
    goto cleanup;
}
...
This is where the eBPF bytecode is actually injected into the kernel. libbpf invokes the bpf() syscall to:
- Load maps
- Load programs
- Run the verifier
If the verifier rejects your eBPF program due to complexity, memory safety, or access issues, this step will fail. All eBPF programs must pass this strict validation step before they are allowed to run in kernel space.
Once loaded, we attach the BPF program to a kernel event.
...
/* Attach tracepoint handler */
err = minimal_bpf__attach(skel);
if (err) {
    fprintf(stderr, "Failed to attach BPF skeleton\n");
    goto cleanup;
}
...
As to what is going on in this program example, we must take a look at minimal.bpf.c. The source code for it isn’t too lengthy and here it is in its full glory:
// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
/* Copyright (c) 2020 Facebook */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
char LICENSE[] SEC("license") = "Dual BSD/GPL";
int my_pid = 0;
SEC("tp/syscalls/sys_enter_write")
int handle_tp(void *ctx)
{
    int pid = bpf_get_current_pid_tgid() >> 32;
    if (pid != my_pid)
        return 0;
    bpf_printk("BPF triggered from PID %d.\n", pid);
    return 0;
}
Remember our my_pid variable that resides in the .bss section? It ends up in that section because it is declared here as a zero-initialized global variable. The file also declares a BPF program that is attached to the sys_enter_write tracepoint, using the SEC() macro to place the function into the proper ELF section.
This means that every time any process on the system invokes write(), this handler will be called. Tracepoints are stable instrumentation hooks provided by the kernel — great for observability without worrying about kernel version breakage like raw kprobes might cause.
I’ve talked quite a lot at this point so let’s just compile this and finally get to it.
- For static builds, edit the Makefile:
CFLAGS := -g -Wall -static
ALL_LDFLAGS := $(LDFLAGS) $(EXTRA_LDFLAGS) -static -lelf -lz -lzstd
# Get dependencies
$ sudo apt-get install clang llvm build-essential libelf1 libelf-dev zlib1g-dev libzstd-dev
# Compile minimal eBPF application
$ cd examples/c
$ make minimal
Once done, we just run it and open a separate terminal to look at the trace_pipe file.
$ sudo ./minimal
# Open another terminal
$ sudo cat /sys/kernel/debug/tracing/trace_pipe
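If you would rather post-process the trace output than read it by eye, a few lines of Python are enough to pull the PID out of each message. This is a rough sketch: `extract_pids` is a hypothetical helper, it relies only on the message text from the bpf_printk() call above, and the sample lines are fabricated, so the surrounding trace_pipe fields on your system may look different.

```python
import re

# Matches the message emitted by bpf_printk() in minimal.bpf.c.
PID_PATTERN = re.compile(r"BPF triggered from PID (\d+)\.")


def extract_pids(lines):
    """Yield the PID from every line produced by the minimal example."""
    for line in lines:
        match = PID_PATTERN.search(line)
        if match:
            yield int(match.group(1))


# Fabricated sample lines, for illustration only.
sample = [
    "minimal-4321 [002] .... 100.000001: bpf_trace_printk: BPF triggered from PID 4321.",
    "an unrelated trace line",
]
print(list(extract_pids(sample)))  # [4321]
```

To use it against live output, the same function could be fed an open file object for trace_pipe, which requires root just like the cat command above.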
5. Conclusion
There’s more to come in part 2 of this series! In the meantime, stay frosty and stay safe!