Zoom has been under scrutiny lately, for a lot of good reasons. Since their product has quickly escalated to being a public critical infrastructure, we decided to play around with our observability stack to see how the Zoom Linux client actually works, and how the different pieces help us with analysis. This write-up is about using eBPF for research and blackbox testing, and provides hands-on examples using ingraind and RedBPF. Our intent was to see how far we can push eBPF in this domain, while also demonstrating some of the amazing engineering behind Zoom.
ingraind and RedBPF
At Red Sift, we developed ingraind, our security observability agent for our cloud, based on Rust and eBPF using RedBPF. This means we can gather run-time information about arbitrary system processes and containers to check whether they are doing anything nefarious in our cloud, who they talk to, and which files they access. However, the cloud is just a fancy word for someone else’s computer, and ingraind runs perfectly fine on my laptop, too.
So we fired up ingraind to monitor our daily Zoom “standup” meeting, and decided to analyse the results in this blog post. We then deviated a little, and ended up writing some Rust code that helps us decrypt TLS traffic by instrumenting the binary using eBPF uprobes.
Let’s dig in: first we look at the binary, then at the results. At the end of the post, I share the configuration file that I used.
To get a basic idea of what to expect, I looked at the zoom binary using strings, and quickly found some pinned certificates for *.zoom.us, and mentions of xmpp.zoomgov.com. This is great!
The binary is stripped; however, some debug symbols remain for the dependency libraries, though not for the proprietary code. The directory also contains the Qt library, and a few other standard open source bits and pieces.
Interestingly, the package comes with a wrapper shell script, zoom.sh, that manages core dumps.
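For readers who want to script this first look, here is a minimal sketch of what strings does, followed by a filter for interesting mentions. The function names printable_strings and filter_mentions are made up for this example, and the real strings tool has more options than this approximates.

```rust
/// Return every run of printable ASCII bytes that is at least `min_len` long.
/// This roughly approximates the default behaviour of the `strings` tool.
fn printable_strings(data: &[u8], min_len: usize) -> Vec<String> {
    let mut found = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for &byte in data {
        if byte == b'\t' || (0x20..=0x7e).contains(&byte) {
            current.push(byte);
        } else {
            // A non-printable byte ends the current run.
            if current.len() >= min_len {
                found.push(String::from_utf8_lossy(&current).into_owned());
            }
            current.clear();
        }
    }
    // Do not forget a run that reaches the end of the input.
    if current.len() >= min_len {
        found.push(String::from_utf8_lossy(&current).into_owned());
    }
    found
}

/// Keep only the strings that mention at least one of the given needles.
fn filter_mentions(strings: &[String], needles: &[&str]) -> Vec<String> {
    strings
        .iter()
        .filter(|s| needles.iter().any(|n| s.contains(n)))
        .cloned()
        .collect()
}
```

To try it on a real file, read the bytes with std::fs::read and pass them in with a minimum length of 4, which is also the default that strings uses.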
Network traffic analysis
The most interesting thing I wanted to see was how Zoom handles network traffic. I expected at least a partially peer-to-peer setup for calls with many participants. Instead, I found that only a handful of IP ranges were used during any call. Most interestingly, one call about two weeks ago showed some amount of TCP traffic hitting a broadcast IP address that ends with .255. My recent tests showed a larger number of individual IPs within a few different IP blocks, but the client version was the same.
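A side note on the .255 address: whether such an address really is a broadcast address depends on the prefix length of the network it sits in, which I did not record. The sketch below shows the check, assuming you know the prefix length; the function names are made up for this example and the addresses in the usage note come from documentation ranges, not from the capture.

```rust
use std::net::Ipv4Addr;

/// Directed broadcast address of the network that contains `addr`,
/// given the network's prefix length (0 to 32).
fn broadcast_for(addr: Ipv4Addr, prefix_len: u32) -> Ipv4Addr {
    assert!(prefix_len <= 32, "IPv4 prefix length must be at most 32");
    // All host bits set to one; a /32 has no host bits at all.
    let host_mask = if prefix_len == 32 { 0 } else { u32::MAX >> prefix_len };
    Ipv4Addr::from(u32::from(addr) | host_mask)
}

/// True if `addr` is the directed broadcast address of its network.
/// /31 and /32 networks are treated as having no broadcast address.
fn is_directed_broadcast(addr: Ipv4Addr, prefix_len: u32) -> bool {
    prefix_len < 31 && broadcast_for(addr, prefix_len) == addr
}
```

For example, 198.51.100.255 is the broadcast address of a /24, but inside a /22 it is an ordinary host address, so the trailing .255 alone does not settle the question.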
This is how I found out that Zoom actually operates their own ISP, which strikes me as an ingenious way of managing the amount of traffic they have to deal with.
Zoom reaches out to Google’s CDN to fetch the profile images of users who are logged in with Google’s SSO. I suspect the same is true for participants who use Facebook logins, but I didn’t see any activity towards Facebook’s ranges.
Another thing that’s interesting to see is that there is a dedicated thread for networking. I can only hope that this means there is at least some sort of privilege separation, but I did not do syscall analysis this time.
File access patterns
The list of files Zoom touched during my call was nothing surprising. Apart from the usual X libraries, and the dependencies they ship, they maintain a local configuration file, and access PulseAudio-related resources on the file system. Stripping out the boring bits leaves us with this:
$ rg '"process_str":"zoom.*"' zoom_file.log |jq '.tags.path_str' |sort |uniq
...
"etc/ca-certificates/extracted/tls-ca-bundle.pem"
...
"p2501/.config/zoomus.conf"
"p2501/.local/share/fonts/.uuid"
"p2501/.Xauthority"
"p2501/.zoom/data/conf_avatar_045a3a19053421428ff"
"p2501/.zoom/data/conf_avatar_763f5840c57564bca16"
"p2501/.zoom/data/conf_avatar_7cdb80036953ea86e83"
"p2501/.zoom/data/conf_avatar_9426e77c9128d50079d"
"p2501/.zoom/data/conf_avatar_aa2a71a3e0a424e451b"
"p2501/.zoom/data/conf_avatar_b46ff5ad22374cdd56d"
"p2501/.zoom/data/conf_avatar_b61879be31e3fce14ee"
"p2501/.zoom/data/conf_avatar_c29cc2bde8058cb0093"
"p2501/.zoom/data/conf_avatar_e3ef3d0218f29d518dd"
"p2501/.zoom/data/conf_avatar_e8dc9d76cae1c2a5f3e"
"p2501/.zoom/data/conf_avatar_ef7b17310c83f908b39"
"p2501/.zoom/data/conf_avatar_f26b44b634fceb21d7e"
"p2501/.zoom/data/conf_avatar_f28df485f9132b47c75"
"p2501/.zoom/data/zoommeeting.db"
"p2501/.zoom/data/zoomus.db"
"p2501/.zoom/data/zoomus.tmp.db"
...
The local cache of avatars is certainly a good call. As we’ve seen above, Zoom downloads the avatars from Google if the user is logged in through the single sign-on service, and it makes sense to maintain a local cache of these. The files are PNGs and JPEGs, and I have found pictures of people I do not recognise, so it looks like there’s no automatic cleanup.
The local databases are more interesting. They are SQLite databases that seem to contain very little information. There is no local copy of conversations or chats.
However, the access to the TLS CA bundle is a bit baffling given the pinned certificates in the binary, but I suspect one of the linked libraries might be auto-loading this store.
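The rg and jq pipeline used earlier in this section can also be approximated in a few lines of Rust. This is only a sketch under assumptions: it relies on each log line being a JSON object with "process_str" and "path_str" string fields written without spaces after the colon (as the rg pattern suggests), it does not handle escaped quotes, and the helper names are made up. A real tool would use a proper JSON parser.

```rust
use std::collections::BTreeSet;

/// Naively pull the value of a `"key":"value"` string field out of one line.
/// Assumes no whitespace around the colon and no escaped quotes in the value.
fn string_field<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let marker = format!("\"{}\":\"", key);
    let start = line.find(&marker)? + marker.len();
    let end = line[start..].find('"')? + start;
    Some(&line[start..end])
}

/// Collect the sorted, de-duplicated paths touched by processes whose name
/// starts with `process_prefix`, similar to `rg ... | jq ... | sort | uniq`.
fn unique_paths(log: &str, process_prefix: &str) -> BTreeSet<String> {
    log.lines()
        .filter(|line| {
            string_field(line, "process_str")
                .map_or(false, |name| name.starts_with(process_prefix))
        })
        .filter_map(|line| string_field(line, "path_str"))
        .map(str::to_owned)
        .collect()
}
```

A BTreeSet keeps the paths sorted and unique in one step, which is why there is no separate sort or dedup pass.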
Accessing the unencrypted data
For this, we had to bring out the big guns, uprobes, so I’ll hand it over to Alessandro.
As discovered by Peter looking at the network connections logged by ingraind, Zoom uses Transport Layer Security (TLS) to secure some of its connections.
There are several libraries that can be used by applications to implement TLS, including the popular OpenSSL, LibreSSL, BoringSSL, NSS etc. Having recently implemented uprobes support for RedBPF, I thought it could be fun to try and use it to hook into whatever library Zoom uses to implement TLS and intercept the unencrypted data.
Uprobes allow you to instrument user space code by attaching to arbitrary addresses inside a running program. I’m planning to talk about uprobes in a separate post; for the moment, it suffices to know that the API to attach custom code to a running program is the following:
pub fn attach_uprobe(
    &mut self,
    fn_name: Option<&str>,
    offset: u64,
    target: &str,
    pid: Option<pid_t>,
) -> Result<()>;
attach_uprobe() parses the target binary or library, finds the function fn_name, and injects the BPF code at its address. If offset is not zero, its value is added to the address of fn_name. If fn_name is None, offset is interpreted as an offset from the start of the target’s .text section. Finally, if a pid is given, the custom code will only run for the target loaded by the program with the given pid.
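To make the address rules concrete, here is a small model of how the probe address could be derived from fn_name and offset. It is not RedBPF’s implementation: TargetInfo and resolve_probe_address are hypothetical names, the symbol table is a plain map standing in for the parsed ELF file, and I am assuming that a missing fn_name means the offset is relative to the start of .text.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for what a loader learns from parsing the target.
struct TargetInfo {
    /// Function name to address, as found in the symbol table.
    symbols: HashMap<String, u64>,
    /// Address of the start of the .text section.
    text_start: u64,
}

/// Work out where the probe would be placed.
/// Returns None when a function name is given but cannot be found.
fn resolve_probe_address(
    target: &TargetInfo,
    fn_name: Option<&str>,
    offset: u64,
) -> Option<u64> {
    match fn_name {
        // Named function: its address, plus the optional offset into it.
        Some(name) => target.symbols.get(name).map(|addr| addr + offset),
        // No name (assumption): offset counts from the start of .text.
        None => Some(target.text_start + offset),
    }
}
```

The second form is the useful one for stripped binaries, where the function you care about has no symbol and you only know its location from disassembly.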
While doing this is certainly possible on a running process, it is a bit of a rabbit hole. Instrumenting Zoom to access decrypted data turned out to be a challenge, but ad-hoc attaching to existing processes within your control should be easily possible using this infrastructure.
Based on the data we’ve been able to collect with ingraind, we have found no glaring issues. It is certainly good to see the Zoom app uses pinned certificates, and that they do not keep logs of the messaging history, even as a side-effect.
Using a Qt-based app is a great way to balance performance and security for a cross-platform audience. On top of that, it was really interesting to see how the infrastructure works in action, with connections going to over 100 target IPs in a few blocks, during a single call, being routed through Zoom’s ISP.
I highlighted that they keep a cache of profile pictures from Google accounts. The cache seems futile, as Zoom still hits Google’s CDN whenever somebody with a Google account joins.
To make sure we were looking at the right traffic and only picked up on Zoom’s DNS queries, I made sure that nothing else was running during the call but the Zoom Linux client. We used the following ingraind config to monitor the system during a Zoom call.
[[probe]]
pipelines = ["console"]
[probe.config]
type = "Network"

[[probe]]
pipelines = ["console"]
[probe.config]
type = "DNS"
interface = "wlp61s0"

[[probe]]
pipelines = ["console"]
[probe.config]
type = "Files"
monitor_dirs = ["/usr/bin"]

[[probe]]
pipelines = ["console"]
[probe.config]
type = "TLS"
interface = "wlp61s0"

[[pipeline.console.steps]]
type = "Container"

[[pipeline.console.steps]]
type = "Buffer"
interval_s = 1
enable_histograms = false

[pipeline.console.config]
backend = "Console"
Using this config, we can redirect all the output into a file and use command line tools like jq to process the results for a superficial look. Let’s take a look at the config file piece by piece.
[probe.config]
type = "Network"

The Network probe enables network traffic analysis. This means we get low-level information on every read and write on a network socket, whether TCP, UDP, v4 or v6.
[probe.config]
type = "DNS"
interface = "wlp61s0"

The DNS probe collects incoming DNS traffic data. Incoming DNS traffic also includes DNS answers, so we will know exactly what our queries are and what they resolve to, even if an application chooses to craft the packets itself and bypass the gethostbyname(3) libc call. Due to an implementation detail, we need to specify the network interface, whose name, under systemd, looks a bit ugly.
[probe.config]
type = "Files"
monitor_dirs = ["/"]

The file probe gives us information about file read/write events that happen in a directory. For monitoring Zoom, I decided I want to see all filesystem activity, so anything that happens under / will show up in my logs. Most applications tend to show fairly conservative access patterns after the loader caters for all the dynamic dependencies, and this is what I expect here, too.
[probe.config]
type = "TLS"
interface = "wlp61s0"

I want to see the details of TLS connections. Since we know Zoom uses certificate pinning, it will be interesting to see the properties and cipher suites the connections actually use.
[[pipeline.console.steps]]
type = "Container"

This is a long shot, but we would be able to pick up cgroup information using the Container post-processor. I didn’t pick up any cgroup information, though, so there’s no network-facing sandbox.
[[pipeline.console.steps]]
type = "Buffer"
interval_s = 1
enable_histograms = false

[pipeline.console.config]
backend = "Console"

And finally, we aggregate events every second to reduce the amount of data we have to analyse, then print the result to the console.
Enabling aggregation is a good idea if you want to load the results into your preferred analytics stack as I did, because a raw event dump gets very large very quickly.
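To illustrate why aggregation shrinks the output, here is a minimal sketch of interval-based bucketing. It is not ingraind’s Buffer implementation: the Event struct, its fields and the aggregate function are made up for this example, and the only idea taken from the config is that events are grouped into windows of interval_s seconds.

```rust
use std::collections::BTreeMap;

/// Hypothetical measurement: a timestamp, an identifying key, and a value.
struct Event {
    timestamp_ms: u64,
    key: String,
    value: u64,
}

/// Sum the values of events that share a key and fall into the same
/// `interval_s`-second window. The map key is (window index, event key).
fn aggregate(events: &[Event], interval_s: u64) -> BTreeMap<(u64, String), u64> {
    assert!(interval_s > 0, "interval must be at least one second");
    let mut buckets = BTreeMap::new();
    for event in events {
        let window = event.timestamp_ms / (interval_s * 1000);
        *buckets.entry((window, event.key.clone())).or_insert(0) += event.value;
    }
    buckets
}
```

However many raw events arrive, each key produces at most one record per window, which is what keeps the log manageable during a long call.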
This was a quick-ish and fun exercise to see how far we can push our tools in security research. It’s great seeing the amount of effort the community puts into securing critical infrastructure, and in these unprecedented times, privacy and security of video conferencing is definitely at the top of the list as companies are still figuring out how to transition to more sustainable remote environments.
More importantly, it shows that programs are just programs. It doesn’t matter whether we call them containers or desktop applications; the same methodology applies to monitoring them.
A large benefit of deploying an observability layer that doesn’t require opt-in from the applications is the immediate increase in coverage, whether it’s for tracing, or security-related work. The layers of data can be aggregated using ingraind’s powerful tagging system, which allows in-depth analysis across the different abstraction layers that make up the environment.
If you’d like to find out more information about ingraind, visit our information page below. Happy hacking in your sandboxes!