QUALITY EVALUATION FRAMEWORK

The 17-criteria moderation framework.

The industry's only documented standard for AI moderation quality. Every interview is scored against 17 criteria before it counts toward your study. Underperforming interviews get re-run or excluded automatically.

Why this layer exists.

“Most AI research products treat the interview as the deliverable. We treat it as the input — every interview is scored against 17 criteria before it earns its place in the synthesis.”

— Kane Callaghan, PhD · VP of Research, GetWhy

THE 17 CRITERIA

Every criterion, every interview.

Authenticity
Real human, not a bot or coached respondent.

Profile match
Demographic and target-profile match.

Engagement quality
Genuine attention, not speed-running for incentive.

Language clarity
Parseable, complete answers.

Comprehension
Participant understood the question.

Specificity
Concrete examples, not vague platitudes.

Coherence
Consistent answers, no internal contradictions.

Time on task
Appropriate depth, not rushed.

Bias check
No leading or confirmation-skewing flow.

Probing depth
AI followed up on nuanced answers.

Probing follow-through
Every probe yielded substance.

Moderation control
Conversation stayed on-topic.

Stimulus interaction
Every stimulus addressed appropriately.

Topic coverage
Every study question addressed.

Emotional signal
Tone, hesitation, contradiction recorded.

Multimedia signal
Video and audio cues captured.

Synthesis-ready
Yields enough substance for narrative analysis.

Frequently asked questions.

What happens when an interview fails the framework?

It depends on which criteria slipped. Moderation or probing issues trigger an automatic re-run on a fresh participant from the same panel. Participant-signal failures (fraud, profile mismatch, disengagement) flag the interview for exclusion from the synthesis. The threshold per criterion is calibrated against what is recoverable, not what is theoretically possible. Either way, you only see interviews that earned their place in the study.
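As a rough illustration only, the routing described above could be sketched like this. The criterion identifiers, the two groupings, the tie-break when both kinds of failure occur, and the "review" fallback are all assumptions made for the example, not GetWhy's actual implementation.

```python
from enum import Enum


class Outcome(Enum):
    KEEP = "keep"        # passed every criterion
    RERUN = "rerun"      # re-run on a fresh participant from the same panel
    EXCLUDE = "exclude"  # flagged for exclusion from the synthesis
    REVIEW = "review"    # hypothetical: hand off to a researcher


# Hypothetical groupings, inferred from the examples named in the answer above.
PARTICIPANT_SIGNAL = {"authenticity", "profile_match", "engagement_quality"}
MODERATION_OR_PROBING = {
    "bias_check",
    "probing_depth",
    "probing_follow_through",
    "moderation_control",
}


def route_interview(failed: set[str]) -> Outcome:
    """Map the set of failed criteria to an outcome."""
    if not failed:
        return Outcome.KEEP
    # Assumption: a participant-signal failure takes precedence,
    # since re-asking the same participant would not fix it.
    if failed & PARTICIPANT_SIGNAL:
        return Outcome.EXCLUDE
    if failed & MODERATION_OR_PROBING:
        return Outcome.RERUN
    # Assumption: failures outside both groups go to human review.
    return Outcome.REVIEW
```

For example, `route_interview({"probing_depth"})` returns `Outcome.RERUN`, while `route_interview({"authenticity", "probing_depth"})` returns `Outcome.EXCLUDE` under the precedence assumed here.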

How were the 17 criteria chosen?

They come from eight years of moderating qualitative interviews, first human and then AI-assisted, and cataloguing what separated a useful interview from a wasted one. Each criterion maps to a failure mode our Senior Researchers caught repeatedly in legacy work. The framework is reviewed every quarter; criteria are added or retired as new failure modes emerge across our 1M+ interview library.

How is this different from human-moderation QC?

Human QC happens after the fact: a Senior Researcher reviews a sample of interviews, decides which ones held up, and re-runs the gaps. The 17-criteria framework runs in real time, on every interview, against the same standards a Senior Researcher would apply — with our researchers staying involved on the edge cases the model flags. The framework does not replace human judgment; it scales it across every interview in the study.

SEE IT IN ACTION

Run a study on your real brief.

Book a 30-minute walkthrough. Bring a real research question. Walk out with a working study setup.