Every score on the AI Consensus Index is the product of a structured, repeatable process: four AI models, one standardised prompt, nine scored dimensions, and a human editorial layer that verifies facts without touching numbers. This page documents that process in full. If you are evaluating whether to trust our rankings, this is where to start.
Our methodology was designed around a single constraint: the scores had to be structurally protected from commercial influence. That meant the people responsible for editorial content could not be the same people producing the scores. The solution was to make the scoring machine-generated and the process transparent enough that any reader could audit whether it had been followed.
The process runs in five sequential steps. Each step is described in detail in the sections below.
The prompt below is submitted verbatim to each AI model for every platform evaluation. The only variable that changes between evaluations is the product name in the "Review this ATS product" line. Nothing else is altered: not the framing, not the dimension order, not the instructions on scoring distribution. This is the complete prompt as used in the March 2026 index cycle.
Publishing the prompt is not standard practice in the review industry. We do it because it is the most direct way to demonstrate that no vendor receives preferential framing. Any reader can submit this prompt to any of the four models themselves and compare the output to what we have published. Discrepancies are grounds for a legitimate correction request.
You are a senior HR technology research analyst writing an independent review for the AI Consensus Index, a multi-model AI evaluation platform. Your review will be published alongside reviews from other AI models and compared directly. Write with authority, precision, and without marketing language. Be willing to criticize where warranted.
Base the analysis on publicly available information about the product, industry knowledge, and typical ATS capabilities. If specific details are uncertain, state the assumption rather than inventing features.
Review this ATS product: [Product Name]
Use exactly this structure:
OVERVIEW
A 3–4 sentence introduction covering what this ATS is, who makes it, and what market segment it targets.
BEST FOR
One clear sentence. Who is the ideal user or company for this product?
PRICING SUMMARY
Summarize the pricing tiers, contract requirements, and overall value positioning. Note any pricing transparency issues.
STANDOUT FEATURES
3 to 5 features that genuinely differentiate this product from competitors. Be specific, not generic.
SCORED DIMENSIONS
Score each dimension out of 10 with 2–3 sentences of justification per score. Do not round all scores to similar numbers — differentiate clearly based on actual product strengths and weaknesses.
Ease of Use: X/10
AI & Automation Features: X/10
Integrations: X/10
Pricing & Value: X/10
Customer Support: X/10
Scalability: X/10
Reporting & Analytics: X/10
Compliance: X/10
Performance / Time to Hire Impact: X/10
OVERALL SCORE: X/10
The arithmetic mean of the 9 dimension scores above.
PROS
4 to 6 bullet points. Specific and evidence-based, not generic praise.
CONS
3 to 5 bullet points. Be direct. Do not soften legitimate weaknesses.
VERDICT
A 4–5 sentence closing recommendation. State clearly who should buy this, who should avoid it, and whether the product represents good value in the current ATS market. End with one sentence on its outlook for 2026 and beyond.
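For illustration, the sketch below shows how that single product-name substitution would be made and the identical prompt submitted to each of the four models. The helper function, template variable, and model identifiers are placeholders, not the actual clients or API calls used in the index pipeline.

```python
# Illustrative sketch only: the helper and model identifiers below are placeholders,
# not the actual API clients used by the index.

# The full prompt is published above; only the product name changes between evaluations.
PROMPT_TEMPLATE = "...\nReview this ATS product: [Product Name]\n..."

MODELS = ["model_a", "model_b", "model_c", "model_d"]  # four models per index cycle

def build_prompt(product_name: str) -> str:
    """Substitute the product name; no other part of the prompt is altered."""
    return PROMPT_TEMPLATE.replace("[Product Name]", product_name)

def collect_reviews(product_name: str, send_to_model) -> dict:
    """Submit the identical prompt to each model and return the raw review text."""
    prompt = build_prompt(product_name)
    return {model: send_to_model(model, prompt) for model in MODELS}
```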
The dimensions were chosen to reflect the decision criteria most relevant to our target buyer: HR Directors, founders, and operational leads at startups and SMBs making a first or second ATS purchase, often without a dedicated procurement function. Each dimension is described below so that readers understand what is and is not being measured.
The prompt explicitly instructs models not to cluster scores. A platform that is strong across most dimensions but has a material weakness in one area should receive a low score in that dimension — not a softened 7.0. Readers can compare dimension scores to identify a platform's specific strengths and weaknesses, not just its overall position.
Model selection was governed by two criteria: public availability at the time of evaluation, and demonstrated capability on structured analytical tasks. The four models used in the current index cycle are listed below. Each model weights different types of evidence differently, which is part of why aggregating across four models produces a more reliable output than any single model alone.
No single AI model has complete or perfectly balanced knowledge of the ATS market. Each reflects the distribution of information available in its training data. Aggregating across four independent models reduces the impact of any single model's blind spots, overconfidence, or training data gaps. Where all four models agree, the score is robust. Where they diverge significantly, that variance is itself informative — it usually reflects genuine ambiguity about a platform's positioning or a recent product change that some models have more exposure to than others.
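The index does not publish a specific divergence metric, but the sketch below shows one simple way such agreement could be summarised for a single platform. The threshold and scores are invented purely for illustration.

```python
from statistics import mean, pstdev

def divergence_report(model_scores: dict[str, float], threshold: float = 1.0) -> dict:
    """Summarise agreement between the four per-model scores for one platform.

    The threshold is an arbitrary illustration of "significant" divergence;
    the index itself does not define a numeric cut-off.
    """
    scores = list(model_scores.values())
    spread = max(scores) - min(scores)
    return {
        "mean": mean(scores),
        "spread": spread,
        "std_dev": pstdev(scores),
        "divergent": spread > threshold,
    }

# Invented scores: three models broadly agree, one diverges.
print(divergence_report({"model_a": 8.1, "model_b": 7.9, "model_c": 8.0, "model_d": 6.4}))
```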
The Consensus Score for each platform is the arithmetic mean of the four individual model scores, each of which is itself the arithmetic mean of that model's nine dimension scores. The calculation is straightforward by design: no dimension is weighted above any other, and no model's output is weighted above any other.
Scores are displayed to two decimal places. No rounding to the nearest half-point is applied. A platform scoring 7.87 scores 7.87 — not 7.9 or 8.0. This precision matters when comparing platforms whose consensus scores sit close together in the rankings.
The index is sorted in descending order by Consensus Score. Where two platforms share an identical score to two decimal places, they are listed alphabetically. Rank positions are recalculated after every re-evaluation cycle.
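A minimal sketch of the calculation, display precision, and tie-breaking described above is given below. The platform names and numbers are invented for illustration; this is not the production pipeline.

```python
from statistics import mean

def consensus_score(dimension_scores_by_model: dict[str, list[float]]) -> float:
    """Mean of the four per-model scores, each itself the mean of that model's
    nine dimension scores. No dimension or model is weighted above any other."""
    return mean(mean(dims) for dims in dimension_scores_by_model.values())

def ranked_index(consensus_by_platform: dict[str, float]) -> list[tuple[str, str]]:
    """Sort descending by the score displayed to two decimal places; platforms
    that tie at two decimals are listed alphabetically."""
    rows = [(name, f"{score:.2f}") for name, score in consensus_by_platform.items()]
    return sorted(rows, key=lambda row: (-float(row[1]), row[0]))

# Invented numbers for illustration only.
example = {
    "Platform B": 7.87,
    "Platform A": 7.87,   # ties with Platform B at two decimals
    "Platform C": 8.12,
}
# Platform C ranks first; Platform A precedes Platform B on the two-decimal tie.
print(ranked_index(example))
```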
Human editorial oversight exists to protect the accuracy of factual claims — not to influence scores. The boundary between what editors can and cannot do is precise and structural.
Editors are responsible for:
Editors cannot:
If a score is considered incorrect — because of a factual error in the underlying data, a material product change since evaluation, or a documented prompt flaw — the correct response is to re-run the evaluation with an improved prompt and publish both the old and new scores with a change note. Manual adjustment of a published score is not permitted under any circumstance.
Scores are not permanent. The ATS market moves quickly — pricing structures change, AI features are added, compliance certifications lapse or are obtained, and acquisitions alter product trajectories. Our policy is to re-evaluate the full index on a rolling cycle and to queue individual platforms for early re-evaluation when a material change is confirmed.
Triggers for an out-of-cycle re-evaluation include:
Each review page carries a "Reviewed" date in the subtitle. This date reflects the most recent evaluation cycle for that platform. The index page carries a separate "Updated" date that reflects the most recent full-index re-run.
We document our methodology's limitations because we believe informed scepticism from readers makes the index more credible, not less.
Use the Consensus Score as an efficient first filter, not a final decision. Use the dimension scores to identify which platforms align with your specific priorities. Then request demonstrations, conduct your own reference checks, and negotiate commercial terms before committing. No ranking methodology — including ours — substitutes for direct vendor evaluation at the procurement stage.
We maintain a corrections policy because publishing at this scale produces errors, and how a publication handles errors is as much a trust signal as the errors themselves.
A correction is warranted when a published review contains a demonstrably incorrect statement of fact — a wrong pricing figure, a mischaracterised feature, an inaccurate compliance status — that can be verified against publicly available documentation. Corrections are processed as follows:
Correction requests can be submitted via the contact address in the site footer. We respond to all substantive correction requests within 10 business days and publish a decision either way.