Strelka: real-time file scanning system

Strelka is a real-time file scanning system used for threat hunting, threat detection, and incident response. Based on the design established by Lockheed Martin's Laika BOSS and similar projects (see: related projects), Strelka's purpose is to perform file extraction and metadata collection at huge scale.

Strelka differs from its sibling projects in a few significant ways:

  • Codebase is Python 3 (minimum supported version is 3.6)
  • Designed for non-interactive, distributed systems (network security monitoring sensors, live response scripts, disk/memory extraction, etc.)
  • Supports direct and remote file requests (Amazon S3, Google Cloud Storage, etc.) with optional encryption and authentication
  • Uses widely supported networking, messaging, and data libraries/formats (ZeroMQ, protocol buffers, YAML, JSON)
  • Built-in scan result logging and log management (compatible with Filebeat/ElasticStack, Splunk, etc.)

Architecture

Overview

Strelka's architecture allows file submitters ("clients") to send file requests to a single intake server ("broker"), which distributes the requests as tasks to multiple processing servers ("workers"). A set of workers connected to a broker forms a "cluster." During file processing, files pass through a series of metadata and file extraction modules ("scanners") selected by a user-defined distribution system ("tastes" and "flavors"); file scan results are logged to disk and can be sent to downstream analytics platforms (e.g. ElasticStack, Splunk).
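The taste-and-flavor idea can be sketched in a few lines of Python. This is a hypothetical illustration of flavor-based scanner selection, not Strelka's actual API; the function names, flavor strings, and scanner mapping are assumptions made for the example.

```python
def taste(data: bytes) -> list:
    """Assign flavors based on simple content checks (illustrative only)."""
    flavors = []
    if data.startswith(b"MZ"):
        flavors.append("mz_file")
    if data.startswith(b"%PDF"):
        flavors.append("pdf_file")
    return flavors

# Hypothetical user-defined mapping of flavors to scanner names.
SCANNER_MAP = {
    "mz_file": ["ScanPe"],
    "pdf_file": ["ScanPdf"],
}

def select_scanners(data: bytes) -> list:
    """Return the scanners whose flavors match the file's content."""
    scanners = []
    for flavor in taste(data):
        scanners.extend(SCANNER_MAP.get(flavor, []))
    return scanners
```

Under this scheme, adding support for a new file type means adding a taste check and a flavor-to-scanner mapping entry, without touching the processing pipeline itself.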

This architecture makes the following deployments possible:

  • 1-to-1 cluster (one client to one worker)
  • 1-to-N cluster (one client to N workers)
  • N-to-1 cluster (N clients to one worker)
  • N-to-N cluster (N clients to N workers)

The most practical deployment is an N-to-N cluster, which creates a fully scalable deployment that can be modified in place without requiring cluster downtime.

Networking

Clients, brokers, and workers communicate over TCP sockets provided by the ZeroMQ (ZMQ) networking library.
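A minimal sketch of a client-to-broker round trip over ZMQ TCP sockets is shown below. This illustrates the socket pattern only, not Strelka's actual wire protocol or message format; it assumes the pyzmq package is installed and that port 5558 is free.

```python
import threading

import zmq  # pyzmq package


def broker(endpoint: str) -> None:
    """Reply to a single client request, then exit (illustrative broker)."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.REP)      # broker replies to client requests
    sock.bind(endpoint)
    request = sock.recv()           # receive one file request
    sock.send(b"ack:" + request)    # acknowledge it
    sock.close()


endpoint = "tcp://127.0.0.1:5558"
t = threading.Thread(target=broker, args=(endpoint,))
t.start()

ctx = zmq.Context.instance()
client = ctx.socket(zmq.REQ)        # client sends a request, awaits a reply
client.connect(endpoint)
client.send(b"file-request")
reply = client.recv()
client.close()
t.join()
print(reply)  # b'ack:file-request'
```

A real cluster would use ZMQ patterns suited to fan-out (e.g. ROUTER/DEALER) so one broker can distribute tasks across many workers; the REQ/REP pair above is just the simplest round trip.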

Messaging

File requests are encoded as protocol buffers (protobufs). Protobufs have a maximum message size of 2GB; any attempt to send a file request larger than this will fail, and we have observed inconsistent behavior with direct file requests larger than 1.5GB. We do not recommend scanning extremely large files (>1GB), but if you must, then we suggest using remote file requests to do so.

Data

Configuration files are written in YAML format. Internal file metadata is written in JSON format according to Google’s JSON style guide. Timestamp metadata is formatted according to ISO 8601 (UTC in seconds).
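The conventions above can be shown with a small sketch: camelCase JSON keys per Google's JSON style guide, and an ISO 8601 UTC timestamp truncated to whole seconds. The field names here are illustrative, not Strelka's actual schema.

```python
import json
from datetime import datetime, timezone


def scan_event(filename: str) -> str:
    """Serialize a metadata event in the document's stated formats (sketch)."""
    # ISO 8601, UTC, whole seconds (e.g. "2018-01-01T00:00:00+00:00")
    ts = datetime.now(timezone.utc).replace(microsecond=0).isoformat()
    event = {"fileName": filename, "scanTime": ts}  # camelCase field names
    return json.dumps(event)
```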


Copyright (c) 2018 Target Brands, Inc.
