C0deVari4nt: scans codebases for similar vulnerabilities
C0deVari4nt
C0deVari4nt is a variant analysis and visualisation tool that inspects codebases for similar vulnerabilities. It leverages CodeQL, a semantic code analysis engine, to query code based on user-controlled CodeQL query templates and passes the results to Neo4j for further exploration and visualisation. This enables quick and comprehensive variant analysis based on previous vulnerability reports. The Neo4j visualisation feature provides additional insight for developers into vulnerable code paths and allows them to effectively triage potential variants.
Tool Components
C0deVari4nt is built using Python, CodeQL, and Neo4j to create an interactive GUI application that takes user input and showcases relationships between different vulnerable code paths.
C0deVari4nt consists of the following three main components:
- Variant-Inputs: This module takes in a target CodeQL database zip file and user inputs of known source and sink functions to generate CodeQL queries to find bug variants. Alternatively, the module can generate queries from all source functions to Microsoft’s Security Development Lifecycle’s Banned String Copy functions.
- Variant-Query: This module builds CodeQL query files from a set of predefined query templates using the supplied sources and sinks. The query files are then run against the provided target database to produce a SARIF file of potentially vulnerable code paths.
- Parse2Neo: This module parses the extracted code paths to create nodes and relationships within the Neo4j interactive graph database. Nodes are categorised into their respective paths, sources and sinks through labels. The module identifies steps that recur across multiple code paths and links them together to show where the paths overlap.
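As a rough illustration of the Parse2Neo idea, the Python sketch below reads code paths from a SARIF 2.1.0 log and merges recurring steps into shared nodes. It is not the tool's actual implementation: the function names `extract_paths` and `build_graph`, the labels `Source`, `Step` and `Sink`, and the choice to identify a step by file, line and column are all assumptions made for this example. Only the SARIF layout (`runs`, `results`, `codeFlows`, `threadFlows`, `locations`) comes from the SARIF specification.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a step is identified by (file URI, start line, start column),
# so the same location reached from two different paths maps to one node.
StepKey = Tuple[str, int, int]


def extract_paths(sarif: dict) -> List[List[dict]]:
    """Return one list of steps per threadFlow found in a SARIF 2.1.0 log."""
    paths = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for code_flow in result.get("codeFlows", []):
                for thread_flow in code_flow.get("threadFlows", []):
                    steps = []
                    for entry in thread_flow.get("locations", []):
                        loc = entry.get("location", {})
                        phys = loc.get("physicalLocation", {})
                        region = phys.get("region", {})
                        steps.append({
                            "uri": phys.get("artifactLocation", {}).get("uri", ""),
                            "line": region.get("startLine", 0),
                            "column": region.get("startColumn", 0),
                            "text": loc.get("message", {}).get("text", ""),
                        })
                    if steps:
                        paths.append(steps)
    return paths


def build_graph(
    paths: List[List[dict]],
) -> Tuple[Dict[StepKey, dict], Set[Tuple[StepKey, StepKey]]]:
    """Merge recurring steps into shared nodes and collect directed edges."""
    nodes: Dict[StepKey, dict] = {}
    edges: Set[Tuple[StepKey, StepKey]] = set()
    for path_id, steps in enumerate(paths):
        keys = [(s["uri"], s["line"], s["column"]) for s in steps]
        last = len(steps) - 1
        for i, (key, step) in enumerate(zip(keys, steps)):
            # setdefault reuses the node if an earlier path already visited it.
            node = nodes.setdefault(
                key, {"text": step["text"], "labels": set(), "paths": set()}
            )
            node["paths"].add(path_id)
            # Hypothetical labels: first step is the source, last is the sink.
            if i == 0:
                node["labels"].add("Source")
            if i == last:
                node["labels"].add("Sink")
            if 0 < i < last:
                node["labels"].add("Step")
        # Consecutive steps in a path become one directed edge each.
        edges.update(zip(keys, keys[1:]))
    return nodes, edges
```

Calling `build_graph(extract_paths(sarif))` on a loaded SARIF dictionary yields a node map and an edge set. A node whose `paths` set holds more than one path id is a step shared by several code paths. When writing to Neo4j, one option would be to create each node with a Cypher `MERGE` on its location so that shared steps are not duplicated.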
Additional Features
Currently, our tool takes user input through a command-line interface built with Python. Since Neo4j supports integration with user interfaces, we plan to migrate the project to a desktop GUI application that adopts Neo4j’s graph database view and can also query custom path perspectives relevant to CodeQL. In this respect, our tool will work similarly to BloodHound, an Active Directory attack path management solution also built upon Neo4j.
C0deVari4nt will be available as an open-source project so that the community can contribute additional plugins and query variant templates.