Every survey result you have ever read carries a hidden assumption: that the people who answered were real, were paying attention, and were telling the truth. Over the last two decades, that assumption has quietly broken down. Panel fraud, answer farms, and increasingly capable bots have turned a meaningful share of survey responses into noise dressed up as signal.
Cipher is our answer to that problem. It is the detection engine inside Surbee, and on our internal benchmark it correctly separates legitimate responses from fraudulent and low-effort ones 99.4% of the time. This post explains how we got there, what that number actually means, and where it stops applying.
A detection accuracy number is meaningless without the test set behind it. We built ours from three sources: a large corpus of verified-human responses collected through identity-checked panels, a library of known-fraud submissions gathered from answer farms and automated agents, and a set of adversarial examples we generated ourselves to probe the edges of the model.
We deliberately weighted the test set toward hard cases. Easy fraud, like a bot that answers a fifty-question survey in nine seconds, is trivial to catch and inflates accuracy. The interesting question is whether you can catch a paid human click-worker who is racing through a study, or a language model prompted to behave like a thoughtful respondent. Those are the cases our benchmark is built around.
Cipher does not rely on any single tell. Fraud that defeats one check usually leaves fingerprints on another, so we score every response across several independent signal families and combine them:
- Behavioral timing. How long each answer took, how that compares to the question's complexity, and whether the rhythm across a session looks human or mechanical.
- Linguistic consistency. Whether open-text answers are internally coherent, on-topic, and free of the statistical fingerprints left by generated text.
- Attention and consistency checks. Embedded validations and cross-question logic that a careful human passes and a careless or automated respondent does not.
- Environmental and network signals. Device, session, and origin patterns that surface coordinated panel fraud without relying on any single fragile identifier.
- Population-level anomaly detection. Patterns that are invisible in one response but obvious across a batch, like clusters of submissions that are suspiciously similar.
No individual signal is trusted on its own. A fast response is not fraud; a fast response with generated open-text and a failed attention check almost certainly is.
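To make the idea concrete, here is a minimal sketch of how per-family scores could be combined so that no single signal decides the outcome. The signal names, the weights, the bias, and the logistic combination are all illustrative assumptions, not Cipher's actual model.

```python
import math

# Hypothetical per-family weights; illustrative only, not Cipher's real parameters.
WEIGHTS = {
    "timing": 1.5,
    "linguistic": 2.0,
    "attention": 2.0,
    "environment": 1.5,
    "population": 1.5,
}
BIAS = -4.0  # assumed offset so that one signal on its own stays well below 0.5


def fraud_probability(signals: dict[str, float]) -> float:
    """Combine per-family suspicion scores (each in [0, 1]) into one probability."""
    z = BIAS + sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


# A fast response with nothing else wrong.
fast_only = fraud_probability({"timing": 1.0})

# A fast response with generated open-text and a failed attention check.
fast_generated_failed = fraud_probability(
    {"timing": 1.0, "linguistic": 1.0, "attention": 1.0}
)
```

With these assumed weights, the fast-only response scores under 0.1, while the response that trips three families scores above 0.8. The exact numbers are arbitrary; the point is that the evidence accumulates across families.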
We treated detection as a calibration problem, not just a classification one. It is not enough for Cipher to label a response fraudulent; it has to express how confident it is, so that researchers can set their own thresholds for how aggressive they want to be.
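One common way to check calibration is to bin predictions by confidence and compare each bin's average predicted probability with the fraud rate actually observed in it. The sketch below shows that idea under stated assumptions: the function names, the bin count, and the toy data are hypothetical, and this is not necessarily the metric Cipher uses internally.

```python
def reliability_table(probs, labels, n_bins=5):
    """Bin predictions by confidence; return (mean predicted, observed rate, count) per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_p, observed, len(members)))
    return rows


def expected_calibration_error(probs, labels, n_bins=5):
    """Average gap between predicted and observed fraud rates, weighted by bin size."""
    total = len(probs)
    return sum(
        abs(mean_p - observed) * n / total
        for mean_p, observed, n in reliability_table(probs, labels, n_bins)
    )


# Toy data: a model that says 0.9 and is right 9 times in 10 is well calibrated.
calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])

# The same confidence with only 5 real frauds in 10 is overconfident by 0.4.
overconfident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

A well-calibrated score is what makes researcher-chosen thresholds meaningful: a cutoff of 0.9 only means "about nine in ten of these are fraud" if the gap measured here is small.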
We validated the model with k-fold cross-validation across time-separated cohorts, with each test fold drawn from a later period than the data the model was trained on, so that the model was always scored on a period it had never seen. This matters because fraud evolves. A model that memorizes last quarter's bot behavior will look excellent in a naive test and fail the moment the tactics shift. Holding out future cohorts is the closest we can get to measuring real-world durability.
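A simple way to hold out future cohorts is forward chaining: train on everything up to a point in time, test on the next cohort, then roll forward. The sketch below illustrates that scheme; the function name and the quarterly cohort labels are hypothetical, and the exact fold layout we use may differ.

```python
from typing import Iterator, Sequence


def forward_chaining_splits(
    cohorts: Sequence[str],
) -> Iterator[tuple[list[str], str]]:
    """Yield (training cohorts, held-out cohort) pairs in time order.

    Each held-out cohort is strictly later than every cohort in its training set,
    so the model is never scored on a period it has already seen.
    """
    for i in range(1, len(cohorts)):
        yield list(cohorts[:i]), cohorts[i]


# Hypothetical cohort labels, assumed to be sorted oldest to newest.
cohorts = ["2024-Q1", "2024-Q2", "2024-Q3", "2024-Q4"]
splits = list(forward_chaining_splits(cohorts))

for train, test in splits:
    print(f"train on {train}, test on {test}")
```

The contrast is with a shuffled k-fold split, where responses from the same period can land on both sides of the split and leak that period's fraud tactics into training.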
We also ran the model against a continuously refreshed adversarial set, including responses produced by the latest publicly available language models prompted specifically to evade detection. The 99.4% figure is the model's performance on that combined, deliberately difficult set.
Accuracy alone can hide as much as it reveals, so here is the fuller picture. On our benchmark, Cipher's false-positive rate, the share of genuine responses it wrongly flags, sits well below one percent. That number matters more than the headline accuracy, because a fraud detector that discards real data is its own kind of failure.
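A small worked example shows why accuracy and false-positive rate have to be read together. The counts below are hypothetical and chosen only for illustration; they are not Cipher's benchmark data.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute headline metrics from confusion-matrix counts.

    tp: fraud correctly flagged      fp: genuine responses wrongly flagged
    tn: genuine correctly accepted   fn: fraud that slipped through
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical batch: 1,000 responses, 50 of them fraudulent.
# Detector A flags nothing at all.
lazy = detection_metrics(tp=0, fp=0, tn=950, fn=50)

# Detector B catches every fraud but also discards 50 genuine responses.
harsh = detection_metrics(tp=50, fp=50, tn=900, fn=0)
```

Both hypothetical detectors reach 95% accuracy, yet one misses all the fraud and the other throws away roughly 5% of the real data. The headline number cannot tell them apart; the error rates can.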
We tune Cipher to be conservative by default: when a response is genuinely ambiguous, it is surfaced for review rather than silently dropped. You always keep the ability to see what was flagged and why, and to override it. The goal is not a black box that deletes data on your behalf. It is a transparent second opinion you can audit.
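In code, that policy amounts to a three-way decision with a review band in the middle and a human override on top. This is a minimal sketch: the `Decision` enum, the function names, and both default thresholds are hypothetical, not Surbee's API.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "accept"
    REVIEW = "review"
    FLAG = "flag"


def triage(
    fraud_prob: float, accept_below: float = 0.2, flag_above: float = 0.9
) -> Decision:
    """Map a fraud probability to an action. Both thresholds are assumed defaults."""
    if not 0.0 <= fraud_prob <= 1.0:
        raise ValueError("fraud_prob must be between 0 and 1")
    if fraud_prob < accept_below:
        return Decision.ACCEPT
    if fraud_prob > flag_above:
        return Decision.FLAG
    # Everything in between is ambiguous: surface it for a human, never drop it.
    return Decision.REVIEW


def final_decision(
    automatic: Decision, override: Optional[Decision] = None
) -> Decision:
    """A researcher's override, when present, always wins over the automatic call."""
    return override if override is not None else automatic
```

Widening the gap between the two thresholds sends more responses to review and fewer to automatic handling, which is the conservative direction.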
We are deliberate about the limits. The 99.4% figure is a benchmark result on a known, if difficult, test set; live performance varies with survey design, incentive structure, and audience. High-incentive studies attract more sophisticated fraud, and rejection rates there are higher because the underlying fraud rate is higher. Cipher also does not verify legal identity or claim to read intent; it measures whether a response behaves like genuine, attentive human input.
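The link between fraud rate and rejection rate is simple arithmetic: the share of responses flagged is the fraud rate times the detection rate, plus the genuine share times the false-positive rate. The sketch below uses hypothetical rates to show the effect; none of these numbers are measured Cipher figures.

```python
def expected_flag_rate(
    fraud_rate: float, true_positive_rate: float, false_positive_rate: float
) -> float:
    """Share of all responses a detector flags, given the underlying fraud rate."""
    return (
        fraud_rate * true_positive_rate
        + (1.0 - fraud_rate) * false_positive_rate
    )


# Same hypothetical detector in both studies: catches 99% of fraud,
# wrongly flags 0.5% of genuine responses.
low_incentive = expected_flag_rate(
    fraud_rate=0.05, true_positive_rate=0.99, false_positive_rate=0.005
)
high_incentive = expected_flag_rate(
    fraud_rate=0.30, true_positive_rate=0.99, false_positive_rate=0.005
)
```

With these assumed inputs the detector flags about 5% of responses in the low-incentive study and about 30% in the high-incentive one, even though the detector itself has not changed. A higher rejection rate on its own says more about the audience than about the model.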
Every flagged pattern and every refined check makes Cipher sharper for every team that uses Surbee, and we publish our methodology notes as we go. We think a detection engine that asks for your trust should be willing to show its work. This post is the first of those notes, and there will be more.
If you want to go deeper, talk to our research team. We are happy to walk through the benchmark, the signal families, and how Cipher would perform on the kind of studies you run.