A conceptual view of this hybrid social engineering attack | Image: Okta
A new generation of cyber threats has emerged, transforming the clunky art of voice phishing (vishing) into a sleek, real-time performance. Okta Threat Intelligence has dissected a wave of custom phishing kits designed specifically to weaponize the phone call, allowing attackers to “drive” a victim’s browser session with high precision.
These aren’t your standard, static login pages. These are sophisticated, “as-a-service” platforms that empower attackers to synchronize their verbal deception with digital manipulation, targeting users of Google, Microsoft, Okta, and cryptocurrency platforms.
The core innovation of these kits is real-time session orchestration. In traditional phishing, an attacker might send a link and hope for the best. In this evolved landscape, the attacker is on the phone with the victim, acting as technical support, while simultaneously controlling exactly what the victim sees on their screen.
Moussa Diallo, a threat researcher at Okta Threat Intelligence, describes the visceral power of these tools: “Once you get into the driver’s seat of one of these tools, you can immediately see why we are observing higher volumes of voice-based social engineering.”
This “driver’s seat” allows the attacker to adapt on the fly. As the victim navigates the fake site, the attacker receives their credentials in real time and enters them into the legitimate service. When the real service asks for MFA (a push notification, an SMS code, or a number match), the attacker instantly updates the victim’s phishing page to mirror that exact challenge.
“Using these kits, an attacker on the phone to a targeted user can control the authentication flow as that user interacts with credential phishing pages,” Diallo explains. “They can control what pages the target sees in their browser in perfect synchronization with the instructions they are providing on the call.”
The implications for Multi-Factor Authentication (MFA) are stark. Many organizations rely on “number matching” (where a user must tap the correct number on their phone) to prevent fatigue attacks. However, these new kits render that defense moot. Because the attacker is on the phone, they can simply tell the victim which number to press.
The report notes that “Push with number matching/challenge is not phishing-resistant by definition, as a social engineer interacting on the phone with a targeted user can simply request a user to choose or enter a specific number.”
The kits bridge the gap between verbal instruction and digital feedback. In the report’s words, “The threat actor can use this synchronization to defeat any form of MFA that is not phishing-resistant.”
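For defenders, the practical question is which enrolled factors a caller could talk a user through. The sketch below is a minimal illustration, not part of Okta's report: the factor labels and function names are hypothetical, and treating WebAuthn and passkeys as the phishing-resistant set is an assumption based on how FIDO2 binds a login to the genuine site's origin. The relayable set contains only the three challenge types named above.

```python
# Hypothetical factor labels for illustration; not taken from any vendor's API.
# Assumption: WebAuthn/passkey credentials are origin-bound, so a code or
# approval cannot be relayed through a look-alike page.
PHISHING_RESISTANT = {"webauthn", "passkey"}

# The three challenge types the article says the kits can mirror.
RELAYABLE = {"push", "sms_otp", "push_number_match"}


def classify(factor: str) -> str:
    """Label a single factor as phishing-resistant, relayable, or unknown."""
    if factor in PHISHING_RESISTANT:
        return "phishing-resistant"
    if factor in RELAYABLE:
        return "relayable"
    return "unknown"


def relayable_factors(enrolled: list[str]) -> list[str]:
    """Return the enrolled factors that a caller on the phone could coach a user through."""
    return sorted(f for f in enrolled if f in RELAYABLE)


if __name__ == "__main__":
    enrolled = ["webauthn", "push_number_match", "sms_otp"]
    for factor in enrolled:
        print(f"{factor}: {classify(factor)}")
    print("Still exposed to vishing:", relayable_factors(enrolled))
```

An account is only as resistant as its weakest permitted factor, so a check like this is most useful when run against what a policy allows at sign-in rather than what a user happens to prefer.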
This level of sophistication signals a maturing black market. Just as legitimate software has moved to SaaS (Software-as-a-Service), cybercrime is following suit. We are moving away from generic, one-size-fits-all kits toward bespoke panels tailored for specific services.
“Vishing is becoming such an in-demand area of expertise that, much like access to these kits, that expertise is also sold on an as-a-service basis,” said Diallo.
The report predicts we are only at the beginning of this wave. As threat actors successfully monetize these attacks, they are reinvesting in tools that provide even greater control, moving from generic templates to “bespoke panels for each targeted service.”