How Facebook’s Pysa Static Analyzer Secures Millions of Python Lines
Facebook's open-source Pysa tool statically scans Python code to detect data-flow vulnerabilities such as XSS and SQL injection. It builds on Pyre and on techniques from Zoncolan, analyzes millions of lines of code in 30 minutes to a few hours, and accounted for 44% of the security issues detected in Instagram's server-side Python code in the first half of 2020.
Facebook has announced Pysa, an open-source static analysis tool. It began as an internal tool used on Instagram to detect and fix errors in large Python codebases, automatically identifying vulnerable code written by Facebook engineers before it is integrated into the social network's systems.
Pysa analyzes source code statically, that is, without running it, searching for known error patterns and flagging potential issues for developers.
Facebook says that Pysa has matured through continuous improvement; in the first half of 2020, the tool accounted for 44% of the security vulnerabilities detected in Instagram's server-side Python code.
Pysa stands for Python Static Analyzer and is built on the open‑source Pyre project. It can analyze data flow in Python applications and detect common web‑app security issues such as XSS and SQL injection.
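To make the SQL injection case concrete, here is a minimal sketch of the kind of flaw a data-flow analyzer is meant to catch, next to the parameterized fix. The function names, table, and in-memory SQLite database are illustrative assumptions and do not come from Facebook's code or from Pysa itself.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user-controlled `name` is interpolated into the SQL text,
    # so data flows from an untrusted source straight into a query sink.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the value is passed as a bound parameter, never as SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With a payload such as `x' OR '1'='1`, the unsafe version returns every row in the table, while the parameterized version treats the payload as an ordinary string and matches nothing.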
The development of Pysa draws on the experience of Facebook's earlier static analyzer, Zoncolan. Both tools use the same static analysis algorithms and share some code. Like Zoncolan, Pysa can track program data flow. Zoncolan, which Facebook publicly detailed in August 2019, targets the Hack language, a dialect of PHP.
Both Pysa and Zoncolan locate “sources” (where data enters the program) and “sinks” (where data ends up) in code and track how data moves between them. Dangerous sinks are functions that can execute code or retrieve sensitive user data. When a connection between a source and a dangerous sink is found, the tools alert developers for investigation.
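The source-to-sink idea can be sketched as a reachability search over a graph of data flows. This is a deliberately simplified toy, not Pysa's actual algorithm, which analyzes real Python code across function boundaries; the graph, node names, and `find_tainted_flows` helper are all hypothetical.

```python
from collections import deque


def find_tainted_flows(flows, sources, sinks):
    """Return (source, sink, path) triples where source data can reach a sink.

    `flows` maps each value to the values it is copied or passed into.
    """
    findings = []
    for source in sorted(sources):
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                # A path from an untrusted source to a dangerous sink.
                findings.append((source, node, path))
                continue
            for nxt in flows.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return findings


# Hypothetical flow graph: request data ends up in a SQL call.
FLOWS = {
    "request.GET": ["username"],
    "username": ["query_string", "log_line"],
    "query_string": ["cursor.execute"],
    "config.DEBUG": ["log_line"],
}
SOURCES = {"request.GET"}
SINKS = {"cursor.execute"}

for source, sink, path in find_tainted_flows(FLOWS, SOURCES, SINKS):
    print(f"{source} reaches {sink} via {' -> '.join(path)}")
```

In this toy graph the only finding is the path from `request.GET` to `cursor.execute`; the flow into `log_line` is not reported because `log_line` is not declared as a sink.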
Pysa is also built for speed; it can process millions of lines of code in 30 minutes to a few hours.
A key feature is extensibility. Facebook security engineer Graham Bleaney noted that because Facebook’s own products use open‑source Python server frameworks such as Django and Tornado, Pysa can start finding security issues in projects using those frameworks from its first run.
Applying Pysa to frameworks not yet covered generally only requires adding a few lines of configuration to tell Pysa where data enters the server.
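As a rough illustration of what those few lines might look like, the fragment below follows the stub-like model syntax described in Pysa's documentation. `my_framework` and its `Request` class are hypothetical stand-ins for an uncovered framework, and the sketch assumes that the `UserControlled` source and `RemoteCodeExecution` sink names are declared in the project's taint configuration. It is a model file that Pysa reads, not a script to run with Python.

```python
# models.pysa (hypothetical file name)

# Assumption: my_framework.Request.get_param returns data supplied by the user,
# so its return value is marked as a taint source.
def my_framework.Request.get_param(self, name) -> TaintSource[UserControlled]: ...

# Assumption: the argument of os.system is a dangerous sink, because it is
# executed as a shell command.
def os.system(command: TaintSink[RemoteCodeExecution]): ...
```

With models like these in place, a flow from `get_param` into `os.system` would match a rule that pairs the two taint kinds, and that rule is what produces the report for developers.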
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"