OpenAI Is Now Scanning Conversations and Reporting Harmful Content to Police

Do Son · August 28, 2025

ChatGPT has become the fastest-growing application in history and remains the most widely used AI-powered app to date. Yet, an increasing number of reports now link AI chatbots to cases of self-harm, delusions, hospitalizations, arrests, and even suicides.

In its latest blog post, OpenAI acknowledged certain shortcomings in how the company has handled user mental health crises. It also revealed that it now scans user conversations for specific categories of harmful content and, when deemed necessary, reports some cases to law enforcement.

More specifically, when the system detects dialogue suggesting potential harm to others, those conversations are routed into a dedicated channel for review by a specially trained internal team.

This small team holds the authority to take action against user accounts, including immediate suspension. In cases where human reviewers conclude that a conversation involves an imminent threat of serious physical harm, the information may be escalated directly to law enforcement.

Although OpenAI has not disclosed the finer details of this review system, it did state that ChatGPT must not be used to inflict harm on oneself or others. Prohibited uses also include promoting suicide or self-harm, developing or deploying weapons, causing injury to individuals, or damaging property.

At present, only conversations involving threats to others are referred to law enforcement. OpenAI said that, given the uniquely private nature of ChatGPT interactions, it does not currently share self-harm disclosures with authorities.

The company also admitted awareness of a troubling reliability issue: its safety safeguards tend to weaken during extended conversations. For example, if a user declares suicidal intent at the beginning of a session, ChatGPT typically responds with appropriate guidance, such as hotline numbers and encouragement to seek professional help.

However, if the user expresses suicidal intent again after many conversational turns, ChatGPT may fail to maintain those safeguards and produce an inappropriate response. OpenAI said it is actively working to improve reliability and prevent such lapses, so that the model delivers consistently safe and responsible guidance in crisis scenarios.

Written by

@DdoS · Security Researcher

Do Son

Do Son is the Founder and Editor of SecurityOnline.info. Working in cybersecurity since 2013, he reports on vulnerabilities, malware, and emerging threats, providing timely analysis to help organizations and individuals stay ahead of evolving risks.
