semgrep v1.5.5 releases: Fast and syntax-aware semantic code pattern search
Semgrep
Semgrep is a command-line tool for offline static analysis. Use pre-built or custom rules to enforce code and security standards in your codebase. You can try it now with our interactive live editor.
Semgrep combines the convenient and iterative style of grep with the powerful features of an Abstract Syntax Tree (AST) matcher and limited dataflow. Easily find function calls, class or method definitions, and more without having to understand ASTs or wrestle with regexes.
Motivation
Semgrep exists because:
- Insecure code is easy to write
- The future of security involves automatically guiding developers towards a “paved road” made of default-safe frameworks (e.g. React or Object-relational Mappers)
- grep isn’t expressive enough and traditional static analysis tools (SAST) are too complicated/slow for paved road automation
The AppSec, Developer, and DevOps communities deserve a static analysis tool that is fast, easy to use, code-aware, multi-lingual and open source!
Overview
Semgrep is optimized for:
- Speed: Fast enough to run on every build, commit, or file save
- Finding bugs that matter: Run your own specialized rules or choose OWASP Top 10 checks from the Semgrep Registry. Rules match source code at the Abstract Syntax Tree (AST) level, unlike regexes that match strings and aren’t semantically aware.
- Ease of customization: Rules look like the code you’re searching for, no static analysis Ph.D. required. They don’t require compiled code, only source, reducing iteration time.
- Ease of integration: Highly portable, and many CI and git-hook integrations already exist. Emit results with `--json` and pipe them into your existing systems.
- Polyglot environments: Don’t learn and maintain multiple tools for your polyglot environment (e.g. ESLint, find-sec-bugs, RuboCop, Gosec). Use the same syntax and concepts independent of language.
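As a rough illustration of piping `--json` results into another system, the sketch below turns a report into one line per finding. The sample report is invented for this example (the rule IDs, paths, and messages are hypothetical), and it assumes the `results`, `check_id`, `path`, `start.line`, and `extra.message` fields of Semgrep's JSON output; check the output of your own Semgrep version before relying on it.

```python
import json

# Hypothetical sample shaped like the output of `semgrep --json`.
# Rule IDs, paths, and messages are invented for illustration.
SAMPLE_OUTPUT = """
{
  "results": [
    {"check_id": "rules.ban-exec", "path": "app/run.py",
     "start": {"line": 12, "col": 5}, "end": {"line": 12, "col": 21},
     "extra": {"message": "Avoid exec", "severity": "ERROR"}},
    {"check_id": "rules.flask-secure-cookie", "path": "app/web.py",
     "start": {"line": 40, "col": 9}, "end": {"line": 40, "col": 38},
     "extra": {"message": "Set secure=True on cookies", "severity": "WARNING"}}
  ],
  "errors": []
}
"""


def summarize(raw: str) -> list[str]:
    """Turn a JSON report into one 'path:line: [rule] message' string per finding."""
    report = json.loads(raw)
    return [
        f'{r["path"]}:{r["start"]["line"]}: [{r["check_id"]}] {r["extra"]["message"]}'
        for r in report.get("results", [])
    ]


for line in summarize(SAMPLE_OUTPUT):
    print(line)
```

In practice the string passed to `summarize` would come from Semgrep's standard output rather than a hard-coded sample.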
Language Support
Go · Java · JavaScript · JSX · JSON · Python · Ruby · TypeScript · TSX
Pattern Syntax Teaser
One of the most useful things about Semgrep is how easy it is to write and iterate on queries.
The goal is to make it as easy as possible to go from an idea in your head to finding the code patterns you intend to.
Example: Say you want to find all calls to a function named exec, and you don’t care about the arguments. With Semgrep, you could simply supply the pattern `exec(...)` and you’d match every call to exec, whatever arguments it is given.
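To make the difference from plain text search concrete, here is a minimal sketch that approximates what the `exec(...)` pattern is meant to find, using Python's standard `ast` module. It is not how Semgrep is implemented; `find_calls` and the sample source are hypothetical names and data chosen for illustration.

```python
import ast

# Sample source: three real calls to exec, plus a comment and a string
# that merely contain the text "exec(".
SOURCE = '''
exec("ls")
exec(some_var)
exec (
    foo
)
# exec(commented_out)
print("exec(in_a_string)")
'''


def find_calls(source: str, name: str) -> list[int]:
    """Return the line numbers of calls to a bare function called `name`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


EXEC_LINES = find_calls(SOURCE, "exec")
print(EXEC_LINES)  # [2, 3, 4]
```

A text search for `exec(` would miss the call split across lines and would flag the comment and the string literal. Working on the syntax tree avoids both problems, which is the behavior the pattern syntax aims for.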
Use case | Semgrep rule |
---|---|
Ban dangerous APIs | Prevent use of exec |
Search routes and authentication | Extract Spring routes |
Enforce the use of secure defaults | Securely set Flask cookies |
Tainted data flowing into sinks | ExpressJS dataflow into sandbox.run |
Enforce project best-practices | Use assertEqual for == checks, Always check subprocess calls |
Codify project-specific knowledge | Verify transactions before making them |
Audit security hotspots | Finding XSS in Apache Airflow, Hardcoded credentials |
Audit configuration files | Find S3 ARN uses |
Migrate from deprecated APIs | DES is deprecated, Deprecated Flask APIs, Deprecated Bokeh APIs |
Apply automatic fixes | Use listenAndServeTLS |
Changelog v1.5.1
Added
- Python: Constant propagation will now recognize the idiom `cond and X or Y`, as well as `True and X` and `False or X`. So e.g. `cond and "a" or "b"` will be identified as a constant string. (gh-6079)
- Julia: Julia is now experimental (pa-2366)
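For readers unfamiliar with the `cond and X or Y` idiom named in the Python entry above, this short sketch shows how it evaluates in plain Python and why the result can be treated as a constant string when both `X` and `Y` are string literals. The function name `pick` is hypothetical.

```python
def pick(cond: bool) -> str:
    # Pre-ternary idiom: yields "a" when cond is truthy, otherwise "b".
    # Either way the result is one of two string literals.
    return cond and "a" or "b"


print(pick(True))    # a
print(pick(False))   # b
print(True and "x")  # x
print(False or "x")  # x
```

Note that the idiom only behaves like a conditional expression when `X` is truthy: `cond and "" or "b"` evaluates to `"b"` even when `cond` is true.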
Changed
- DeepSemgrep is now Semgrep PRO! To install the Semgrep PRO engine run: `semgrep install-semgrep-pro`. This engine is still invoked using the `--deep` flag, but please expect changes to the CLI in the near future. The new Semgrep PRO engine adds support for Apex! (pa-2389)
Fixed
- New ‘transform:’ field in extract mode rules, with ‘concat_json_string_array’ option useful to extract Python code from Jupyter notebooks. Thanks to Jose Selvi for his contribution! (gh-4477)
- Java: Fixed regression introduced in 0.123.0 that could cause a private class attribute to be incorrectly regarded as a constant. (gh-6793)
- Make `$F(x)` match `eval(x)`. Previously, `eval` was special-cased and metavariable function call patterns would not match it. (gh-6877)
- DeepSemgrep: Enabled `--dataflow-traces` by default when `--deep` is specified (pa-2274)
- In rare situations, mainly in DeepSemgrep and related to naming bugs, the use of symbolic propagation could make Semgrep fall into an infinite loop during matching. This has been fixed by bounding the number of times that Semgrep can follow symbolically-propagated values. (pa-2324)
- CLI: Made the error message shown when two autofix matches overlap more helpful, and it is now displayed as a debug message. (pa-2393)
- CLI: Made the warning message when using Semgrep Pro more friendly when logged in (pa-2396)
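The ‘concat_json_string_array’ option in the list above relates to how Jupyter notebooks store code: each cell's `source` is typically a JSON array of strings, one per line. The sketch below shows the general idea of concatenating such an array back into source text. It assumes that this is what the option does and does not reproduce Semgrep's implementation; the notebook content and the `code_cells` helper are hypothetical.

```python
import json

# Hypothetical, heavily trimmed notebook. Real .ipynb files carry more
# metadata, but cells with a "source" array of strings look like this.
NOTEBOOK = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n"]},
        {"cell_type": "code", "source": ["import os\n", "os.system(cmd)\n"]},
        {"cell_type": "code", "source": ["print('done')"]},
    ]
})


def code_cells(notebook_json: str) -> list[str]:
    """Concatenate each code cell's array of strings into one source string."""
    notebook = json.loads(notebook_json)
    return [
        "".join(cell["source"])
        for cell in notebook["cells"]
        if cell["cell_type"] == "code"
    ]


for source in code_cells(NOTEBOOK):
    print(repr(source))
```

Once the strings are joined, each cell is ordinary Python source that a code-aware tool can parse.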
Install & Use
Copyright (C) 2020 returntocorp