Do AI Text Detectors Work?
As AI-generated text proliferates, so do the tools claiming to detect it. Learn which tools can help reliably separate AI-generated text from human content.
As the amount of AI-generated content (and suspect content) on the internet has increased, the demand for AI detectors has grown in tandem. Everyone, from teachers hoping to catch cheating students to publishers looking for lazy writers, wants to ensure the work before them is human-generated. And on the other side of that coin are the GenAI enthusiasts who want help disguising their work from detectors.
Somewhat ironically, the chief weapon in the detection of AI is AI itself. Trained on examples of human- and machine-generated content, AI detectors aim to discern which content came from their GenAI cousins.
How do AI detectors work?
While there is no shortage of theories about how to detect AI — think too many em dashes or a plethora of emojis — actual AI detectors base their decisions on a deeper statistical analysis. “They’re built to answer one critical question: does this text have the statistical fingerprint of a machine?” explains NaturalWrite.com.
AI detectors deploy machine learning and statistical analysis to identify the subtle traces that AI leaves behind. According to the California Learning Resource Network, AI tends to repeat phrases, can’t replicate “the nuanced variations and stylistic flourishes characteristic of human writing,” lacks depth and emotion, and often produces text that is incoherent overall even when individual sentences are grammatically correct. Altogether, the CLRN post lists a dozen or so tell-tale signs that can reveal the work of an LLM.
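To make that concrete, here is a minimal sketch in Python of the kind of surface statistics a detector might compute, using two of the signals above: repeated phrases and how much sentence length varies. The features, thresholds, and example text are invented for illustration; real detectors rely on trained machine-learning models over far richer signals.

```python
import re
from collections import Counter
from statistics import mean, pstdev

def surface_stats(text: str) -> dict:
    """Toy 'statistical fingerprint' features of the kind detectors examine:
    phrase repetition and sentence-length variation. Illustrative only."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())

    # Repeated three-word phrases: machine-generated text tends to reuse phrasing.
    trigrams = Counter(zip(words, words[1:], words[2:]))
    repeated_trigrams = sum(1 for count in trigrams.values() if count > 1)

    # Sentence-length variation: human writing usually mixes short and long
    # sentences more than machine output does.
    lengths = [len(s.split()) for s in sentences]
    variation = pstdev(lengths) / mean(lengths) if len(lengths) > 1 else 0.0

    return {"repeated_trigrams": repeated_trigrams,
            "sentence_length_variation": round(variation, 2)}

print(surface_stats("The model is useful. The model is useful in many "
                    "settings. The model is useful for teams."))
```

A real detector would feed features like these (and many more) into a trained classifier rather than eyeballing raw counts.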
Can you trust AI detectors?
But do the detectors work? Early models certainly left a lot to be desired. Even OpenAI quickly realized that its original attempt to create an AI detection tool lacked the necessary accuracy for success. Digital Ocean reports that the tool could only identify a quarter of AI-generated examples, while it turned up false positives about 9% of the time.
In fact, false positives are the most common problem afflicting AI detectors. According to Digital Ocean, “These false positives occur because the detectors rely on statistical patterns and word frequency analysis, flagging text that contains specific phrases or structures common in AI writing—words like ‘delves,’ ‘showcasing,’ and ‘crucial’ that appear much more frequently in machine-generated text.”
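To see why that kind of check misfires, consider a deliberately naive version of it: flag any text that uses enough of those marker words. The word list and threshold below are illustrative assumptions, but the failure mode is the one Digital Ocean describes, because human writers use “delves,” “showcasing,” and “crucial” too.

```python
# Naive marker-word check, to illustrate how word-frequency-based flagging
# produces false positives. Word list and threshold are illustrative only.
AI_MARKER_WORDS = {"delves", "showcasing", "crucial"}

def naive_flag(text: str, threshold: int = 2) -> bool:
    """Flag text as 'AI-like' if it uses enough marker words."""
    words = text.lower().split()
    hits = sum(1 for w in words if w.strip(".,;:!?") in AI_MARKER_WORDS)
    return hits >= threshold

# A human-written sentence gets flagged simply for using common words.
human_text = ("This chapter delves into why timing is crucial, "
              "showcasing three case studies.")
print(naive_flag(human_text))  # True -> a false positive
```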
To make matters worse, detectors have historically been less reliable on shorter writing samples. So while a detector may be able to tell you whether a book is AI-generated, it may have a hard time telling you whether a social media caption came from a human or a machine.
This problem is exacerbated by the fact that most detectors are only free for small chunks of text, and many users break up larger texts to circumvent the word or character limits on free tools, which undermines the reliability of the assessment.
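As a rough illustration of that practice, the sketch below splits a long text into pieces that fit under a hypothetical free-tier character limit. The limit and helper name are assumptions; the point is simply that each resulting chunk is a much shorter sample, which is exactly where detectors struggle.

```python
def split_for_free_tier(text: str, char_limit: int = 1500) -> list[str]:
    """Split a long text into chunks that fit under a (hypothetical)
    free-tier character limit, breaking on paragraph boundaries.
    A single paragraph longer than the limit would still need a hard cut."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and len(current) + len(para) + 2 > char_limit:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

pages = split_for_free_tier(
    "First paragraph.\n\nSecond paragraph.\n\nThird paragraph.", char_limit=40)
print(len(pages))  # 2 with this limit: the first two paragraphs fit together
```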
But things are improving, and reliable detectors are now on the market for users prepared to pay for a subscription.
Testing the detectors
Determining whether a detector is reliable means going beyond a few casual trials and taking a systematic approach. In July 2025, David Gewirtz, Senior Contributing Editor at ZDNet, repeated a set of tests he had carried out a couple of years earlier. He found significant improvement, with some products delivering usable performance.
“To test the AI detectors, I'm using five blocks of text. Two were written by me and three were written by ChatGPT,” wrote Gewirtz. “To test a content detector, I feed each block to the detector separately and record the result. If the detector is correct, I consider the test passed; if it's wrong, I consider it failed.”
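That protocol is straightforward to reproduce. The sketch below assumes a hypothetical detect(text) function standing in for whatever detector is under test, and the sample texts are placeholders; the pass/fail scoring mirrors Gewirtz’s description.

```python
from typing import Callable

# (is_ai, text) pairs: two human blocks and three ChatGPT blocks, mirroring
# Gewirtz's setup. The texts here are placeholders, not his actual samples.
SAMPLES = [
    (False, "First human-written block..."),
    (False, "Second human-written block..."),
    (True,  "First ChatGPT-generated block..."),
    (True,  "Second ChatGPT-generated block..."),
    (True,  "Third ChatGPT-generated block..."),
]

def score_detector(detect: Callable[[str], bool]) -> float:
    """Feed each block to the detector separately; a test passes when the
    detector's verdict matches the block's true origin."""
    passed = sum(1 for is_ai, text in SAMPLES if detect(text) == is_ai)
    return passed / len(SAMPLES)  # 1.0 means the detector got all five right

# Example with a dummy detector that calls everything human-written:
print(score_detector(lambda text: False))  # 0.4 -> passes only the two human blocks
```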
The best AI text detectors
At the end of his test of 10 detectors, Gewirtz found several that correctly identified AI text 100% of the time.
So, yes, there are AI detectors out there that work and can be used reliably, though Gewirtz advises caution:
“While there have been some perfect scores, I don't recommend relying solely on these tools to validate human-written content … I would advocate caution before relying on the results of any -- or all -- of these tools.”