ipcdump: tracing interprocess communication (IPC) on Linux

ipcdump is a tool for tracing interprocess communication (IPC) on Linux. It covers most of the common IPC mechanisms: pipes, FIFOs, signals, Unix sockets, loopback-based networking, and pseudoterminals. It's a useful tool for debugging multi-process applications, and it's also a simple way to understand how the different moving parts in your system communicate with one another.

ipcdump can trace both the metadata and the contents of this communication, and it's particularly well-suited to tracing IPC between short-lived processes, which can be difficult with traditional debugging tools like strace or gdb. It also has some basic filtering capabilities to help you sift through large quantities of events.

Most of the information ipcdump collects comes from BPF hooks placed on kprobes and tracepoints at key functions in the kernel, although it also fills in some bookkeeping from the /proc filesystem. To this end, ipcdump makes heavy use of gobpf, which provides Go bindings for the bcc framework.

Features

  • Support for pipes and FIFOs
  • Loopback IPC
  • Signals (regular and real-time)
  • Unix streams and datagrams
  • Pseudoterminal-based IPC
  • Event filtering based on process PID or name
  • Human-friendly or JSON-formatted output

Design

ipcdump is built from a series of collectors, each of which is in charge of a particular type of IPC event, for example IPC_EVENT_LOOPBACK_SOCK_UDP or IPC_EVENT_SIGNAL.

In practice, all of the collectors are built using BPF hooks attached to kprobes and tracepoints. Their implementations are entirely separate, though; there's no particular reason to assume our information will always come from BPF. That said, the different collectors do have to share a single BPF module because they rely on some common code. To this end, we share a single BpfBuilder (which is essentially a wrapper around concatenating strings of bcc code) and each collector registers its own code with that builder. The full bcc script is then loaded with gobpf, and each collector places the hooks it needs.
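
The registration pattern described above can be sketched roughly as follows. This is not ipcdump's actual BpfBuilder: the type, method names, and the duplicate-name check are assumptions made for illustration, and the sketch only shows the "concatenate named fragments of bcc code" idea without loading anything into the kernel.

```go
package main

import (
	"fmt"
	"strings"
)

// bpfBuilder is a hypothetical stand-in for a shared builder: it collects
// named fragments of bcc (C) source and joins them in registration order.
type bpfBuilder struct {
	names   map[string]bool
	sources []string
}

func newBpfBuilder() *bpfBuilder {
	return &bpfBuilder{names: make(map[string]bool)}
}

// addSource registers one fragment under a name. Registering the same name
// twice is treated as an error so two collectors cannot silently define the
// same code twice (an assumption of this sketch).
func (b *bpfBuilder) addSource(name, src string) error {
	if b.names[name] {
		return fmt.Errorf("source %q already registered", name)
	}
	b.names[name] = true
	b.sources = append(b.sources,
		fmt.Sprintf("// --- %s ---\n%s", name, strings.TrimSpace(src)))
	return nil
}

// script returns the full program text that would be handed to the BPF loader.
func (b *bpfBuilder) script() string {
	return strings.Join(b.sources, "\n\n") + "\n"
}
```

In this sketch, shared definitions would be registered first, each collector would then call addSource with its own hooks, and the result of script() would be compiled once as a single module.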

There are currently two kinds of bookkeeping that are shared between IPC collectors:

  • SocketIdentifier (internal/collection/sock_id.go): maps between kernel struct sock* and the processes that use them.
  • CommIdentifier (internal/collection/comm_id.go): maps between PIDs and the corresponding process name (/proc/<pid>/comm).

The bookkeeping done in each of these is particularly important for short-lived processes; while this information can be filled in later in usermode by parsing /proc, the relevant process will often have disappeared by the time the event hits the handler. That said, we do sometimes fill in information from /proc. This happens mostly for processes that existed before ipcdump was run, since we won't have caught events like process naming for them. SocketIdentifier and CommIdentifier try to abstract this duality between bcc code and /proc parsing behind a single API, although the result is not especially clean. In recent versions of Linux (5.8 and later), BPF iterators can entirely replace this bookkeeping, although for backwards compatibility we should probably stick to the hooks-and-procfs paradigm for now.
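
As an illustration of the "hooks first, /proc as a fallback" idea, here is a minimal sketch of a PID-to-name cache. It is not the real CommIdentifier: the commCache type, its methods, and the configurable procRoot are hypothetical, and the kernel-side hook that would call record is left out.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"sync"
)

// commCache is a hypothetical PID -> process name cache.
type commCache struct {
	mu       sync.Mutex
	procRoot string // normally "/proc"; configurable so it can be tested
	comms    map[int]string
}

func newCommCache(procRoot string) *commCache {
	return &commCache{procRoot: procRoot, comms: make(map[int]string)}
}

// record stores a name reported by a kernel hook. Names recorded this way
// survive even if the process exits before its events are handled.
func (c *commCache) record(pid int, comm string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.comms[pid] = comm
}

// lookup prefers names recorded from hooks and falls back to reading
// <procRoot>/<pid>/comm, which only works while the process still exists.
func (c *commCache) lookup(pid int) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if comm, ok := c.comms[pid]; ok {
		return comm, true
	}
	data, err := os.ReadFile(fmt.Sprintf("%s/%d/comm", c.procRoot, pid))
	if err != nil {
		return "", false
	}
	comm := strings.TrimSuffix(string(data), "\n")
	c.comms[pid] = comm
	return comm, true
}
```

The fallback path is what covers processes that were already running before tracing started; for a process that has already exited and was never recorded, lookup simply reports that the name is unknown.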

Event output is done through the common EmitIpcEvent() function, which takes a standard event format (source process, destination process, metadata key-value pairs, and contents) and outputs it in a unified format. To save event bandwidth, collectors typically don't output IPC contents unless the -x flag is specified. This is handled by some preprocessing in internal/collection/ipc_bytes.go.
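
To make the event shape concrete, the following sketch models an event with the four parts listed above and renders it as either a text line or JSON. The struct layout, field names, and both output formats are assumptions for illustration; they are not taken from EmitIpcEvent() and ipcdump's real output may look different.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// proc identifies one end of an IPC event (hypothetical layout).
type proc struct {
	Pid  int    `json:"pid"`
	Comm string `json:"comm"`
}

// metaPair keeps metadata as an ordered list so output is deterministic.
type metaPair struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// ipcEvent is a hypothetical unified event: type, source, destination,
// metadata key-value pairs, and optional contents.
type ipcEvent struct {
	Type     string     `json:"type"`
	Src      proc       `json:"src"`
	Dst      proc       `json:"dst"`
	Meta     []metaPair `json:"meta"`
	Contents []byte     `json:"contents,omitempty"` // base64 in JSON
}

// formatText renders a one-line, human-friendly form of the event.
func formatText(e ipcEvent) string {
	line := fmt.Sprintf("%s %d(%s) > %d(%s)",
		e.Type, e.Src.Pid, e.Src.Comm, e.Dst.Pid, e.Dst.Comm)
	for _, m := range e.Meta {
		line += fmt.Sprintf(" %s=%s", m.Key, m.Value)
	}
	if len(e.Contents) > 0 {
		line += fmt.Sprintf(" contents=%q", e.Contents)
	}
	return line
}

// formatJSON renders the same event as a single JSON object.
func formatJSON(e ipcEvent) (string, error) {
	out, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(out), nil
}
```

Leaving Contents empty mirrors the case where contents are not collected: the text form drops the contents field and the JSON form omits the key entirely.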

Install & Use

Copyright (c) 2020, Guardicore
All rights reserved.