How CAIS™ is examined.
How the cut score is set.
How we would defend it.
An examination is only as credible as the methodology behind it. This document is the public, citable specification for how every CAIS™ examination is constructed, administered, scored, and audited — from the Common Body of Knowledge down to the individual item. It is designed to be held up to the standards an accreditation body would apply under ISO/IEC 17024 §9.2, and to be referenced in regulatory filings, employer due-diligence dossiers, and judicial review. Candidates, employers, regulators, and accreditation bodies read the same blueprint.
The blueprint is public. The methodology is citable. The cut score is not a negotiation.
Six principles this blueprint is built to satisfy.
Defensibility over cleverness. Every choice below is designed to survive audit.
01 · Criterion-referenced, not norm-referenced
Candidates are measured against a fixed competency standard derived from the CAIS Common Body of Knowledge, not against the performance of other candidates. A candidate passes because they demonstrated the required competency — not because they outperformed a cohort.
02 · Competency-based and job-task anchored
Every domain weight, item, scenario, and Build Task is traceable to a documented job task in the CAIS Practice Analysis. The Practice Analysis is refreshed on a three-year cycle under Standards Council oversight and is published in the Standards Library.
03 · Public blueprint, private item bank
Domain weights, item-type distributions, scoring methodology, and cut-score methodology are public. Individual items, the item bank, SME rating sheets, and candidate response data are not. The public part is what makes the credential citable; the private part is what makes it secure.
04 · Standards-aligned construction
Test construction follows principles derived from ISO/IEC 17024 §9.2 (examination) and the Standards for Educational and Psychological Testing (AERA/APA/NCME) where applicable. Divergences are documented in the Document Control section of this instrument.
05 · Fairness review before every administration
Every form is reviewed for content fairness and cultural sensitivity by a Fairness Subcommittee drawn from the Ethics Review Board. Post-administration, items are reviewed for statistical fairness via Differential Item Functioning (DIF) analysis.
The attestation is the authoritative record of administration. The Registry is the human-readable mirror. If the two disagree, the attestation prevails.
One standard. Three examinations.
Each credential in the CAIS™ pathway measures a distinct zone of professional competency. Each has its own blueprint, its own cut score, its own security regime. Specialty credentials (CAIS-Sec™, CAIS-Gov™) are examined by request under the same regime.
| Credential | What it measures | Written exam | Evidence component | Review | Pass mark |
|---|---|---|---|---|---|
| CAIS AI Essentials™ (Foundation) | AI literacy, responsible use, applied prompting in a working context | 30 items (25 MCQ + 5 scenario) | Light applied capstone: documented AI-assisted task | Async rubric review (no panel) | 75% |
| CAIS AI Practitioner™ (Professional core) | Applied professional AI competency against the GAISB™ standard | 75 items · 3 hr seat time · MCQ + scenarios | None (exam-only) | None | 65% |
| CAIS AI Team Architect™ (Advanced applied) | Installing practical AI systems and workflows in real organizational contexts | 75 items · 3 hr seat time · MCQ + scenarios | Applied team-scale capstone: deployed workflow installation + governance | Four-reviewer async panel + recorded walkthrough | 65% |
Pass marks are criterion-referenced cut scores. Candidates must meet or exceed the published pass mark to earn the credential.
Specialty credentials (CAIS-Sec™, CAIS-Gov™) are available by institutional or qualified-practitioner request. Their blueprints follow the same MCQ + scenario written-exam regime plus a domain-specific capstone.
What each credential weighs. And why.
Weights are derived from the CAIS™ Practice Analysis. They are not marketing choices; they are the statistical centre of gravity of each credential’s observed job tasks.
CAIS AI Essentials™
| CBK Domain | Domain code | Weight | Items (~) |
|---|---|---|---|
| Strategic Mindset & the Age of AI | AIS-101 | 15% | 5 |
| Foundations of Generative AI | AIS-120 | 30% | 9 |
| Prompt Engineering & LLM System Design | AIS-130 | 25% | 7 |
| AI Agents & Agentic Workflows | AIS-140 | 10% | 3 |
| Ethics, Data & Responsible AI | AIS-160 | 15% | 5 |
| Strategy, Transformation & Business Innovation | AIS-210 | 5% | 1 |
| Total | | 100% | 30 |
Foundation credential. Concentrates on literacy, responsible use, and safe applied prompting.
CAIS AI Practitioner™
| CBK Domain | Domain code | Weight | Items (~) |
|---|---|---|---|
| Strategic Mindset & the Age of AI | AIS-101 | 10% | 8 |
| Foundations of Generative AI | AIS-120 | 20% | 15 |
| Prompt Engineering & LLM System Design | AIS-130 | 25% | 19 |
| AI Agents & Agentic Workflows | AIS-140 | 20% | 15 |
| Ethics, Data & Responsible AI | AIS-160 | 15% | 11 |
| Strategy, Transformation & Business Innovation | AIS-210 | 5% | 4 |
| Innovation & Applied AI Foundations | AIS-230 | 5% | 3 |
| Total | | 100% | 75 |
CAIS AI Practitioner is exam-only — there is no capstone and no panel review. The credential is earned by passing the 75-question exam (65%+) and agreeing to the Certification Agreement.
CAIS AI Team Architect™
| CBK Domain | Domain code | Weight | Items (~) |
|---|---|---|---|
| Strategic Mindset & the Age of AI | AIS-101 | 10% | 8 |
| Foundations of Generative AI | AIS-120 | 10% | 7 |
| Prompt Engineering & LLM System Design | AIS-130 | 15% | 11 |
| AI Agents & Agentic Workflows | AIS-140 | 20% | 15 |
| Ethics, Data & Responsible AI | AIS-160 | 10% | 8 |
| Strategy, Transformation & Business Innovation | AIS-210 | 20% | 15 |
| Innovation & Applied AI Foundations | AIS-230 | 15% | 11 |
| Total | | 100% | 75 |
The AI Team Architect written exam is paired with a team-scale applied capstone reviewed by a four-reviewer async panel. The credential is earned via the 5-step process (Self-Guided Course → Live Cohort → Exam Bootcamp → Applied Capstone → Pass the exam and agree to the Certification Agreement). Both the written exam and the capstone must be passed independently.
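The three domain tables above can be checked mechanically. The sketch below copies the published weights and approximate item counts and confirms that each table sums to 100% and that every domain's item count sits within one item of its proportional target. The one-item tolerance and the function name `check_blueprint` are assumptions made for illustration; the blueprint does not state a rounding rule.

```python
# Published (weight %, approximate item count) per domain, copied from the tables above.
BLUEPRINTS = {
    "Essentials": {
        "AIS-101": (15, 5), "AIS-120": (30, 9), "AIS-130": (25, 7),
        "AIS-140": (10, 3), "AIS-160": (15, 5), "AIS-210": (5, 1),
    },
    "Practitioner": {
        "AIS-101": (10, 8), "AIS-120": (20, 15), "AIS-130": (25, 19),
        "AIS-140": (20, 15), "AIS-160": (15, 11), "AIS-210": (5, 4),
        "AIS-230": (5, 3),
    },
    "Team Architect": {
        "AIS-101": (10, 8), "AIS-120": (10, 7), "AIS-130": (15, 11),
        "AIS-140": (20, 15), "AIS-160": (10, 8), "AIS-210": (20, 15),
        "AIS-230": (15, 11),
    },
}


def check_blueprint(domains, tolerance=1.0):
    """Return a list of problems; an empty list means the table is self-consistent."""
    problems = []
    total_weight = sum(weight for weight, _ in domains.values())
    total_items = sum(items for _, items in domains.values())
    if total_weight != 100:
        problems.append(f"weights sum to {total_weight}, not 100")
    for code, (weight, items) in domains.items():
        target = weight / 100 * total_items  # proportional share of the form
        if abs(items - target) > tolerance:  # assumed tolerance: one item
            problems.append(f"{code}: {items} items vs target {target:.2f}")
    return problems


for name, domains in BLUEPRINTS.items():
    print(name, check_blueprint(domains) or "consistent")
```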
Two proctored item types. One panel-reviewed component.
Only CAIS AI Team Architect uses a four-reviewer async SME panel with a recorded on-camera walkthrough. CAIS AI Essentials uses a light single-reviewer rubric review. CAIS AI Practitioner has no capstone (exam-only).
Proctored written-exam items
| Item type | Format | What it measures | Essentials | Practitioner | AI Team Architect |
|---|---|---|---|---|---|
| MCQ-SBA (single-best-answer) | 1 correct of 4 options, all plausible | Recall, recognition, concept discrimination | 75% | 70% | 55% |
| Scenario (multi-stage case) | Stem + 3–5 linked sub-items | Applied reasoning under constraint | 25% | 30% | 45% |
| Total of proctored written exam | | | 100% | 100% | 100% |
Sandboxed performance items have been retired from the CAIS™ examination. Skill execution is assessed instead through panel-reviewed Capstones (see below). Scenario weight rises with credential to reflect the increasing share of applied reasoning required at the Architect level.
Panel-reviewed component — Applied Capstones
Capstones are not proctored exam items. They are deliverables submitted asynchronously inside Prompt Atlas and scored against the published rubric.
| Credential | In-course builds | Credential-bearing component | Defense format |
|---|---|---|---|
| CAIS AI Essentials™ | Formative practice tasks | Applied AI-assisted task capstone scored by SME reviewer against CAP-2026-01 rubric | Async single-reviewer rubric review — no walkthrough. |
| CAIS AI Practitioner™ | In-course builds (formative) | None — exam-only (75 items · 3 hr · 65%+) | None. |
| CAIS AI Team Architect™ | Live-Cohort and Exam-Bootcamp exercises | Team-scale applied capstone: deployed workflow installation + governance documentation + change-management artifact | Recorded walkthrough (10–15 min) reviewed by four-reviewer panel |
See Capstone Specification & Rubric (CAP-2026-01) for scoring procedures, defense protocols, and inter-rater reliability requirements.
Item-writing standards
- Every MCQ-SBA item has exactly one correct answer defensible in the published literature or in named GAISB™ standards. No "best-of-bad" items.
- Distractors are plausible misconceptions drawn from observed candidate errors in piloting, not fabricated nonsense.
- Stems are positive-phrased unless negative phrasing is the pedagogical point; "EXCEPT" and "NOT" items are bolded and limited to ≤10% of any form.
- No trick items. No tricks with formatting. No tricks with grammar. Cognitive load is domain-competency load, never language load.
- Every item carries a CBK domain tag, a sub-domain tag, a Bloom-level tag, and a Practice Analysis job-task reference.
- Every item passes through: author draft → peer SME review → Fairness Subcommittee review → pilot administration → psychometric screen (difficulty, discrimination, DIF) → bank admission.
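Two of the standards above are easy to check automatically: the four mandatory tags on every item and the 10% cap on negatively phrased stems. The following is a minimal sketch, assuming a hypothetical `Item` record; the field names and the sample tag values are illustrative and are not the item bank's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical item record carrying the four tags the standards require."""
    item_id: str
    domain: str        # CBK domain tag, e.g. "AIS-130"
    sub_domain: str    # sub-domain tag
    bloom_level: str   # Bloom-level tag
    job_task_ref: str  # Practice Analysis job-task reference
    negative_stem: bool = False  # True for "EXCEPT" / "NOT" items


REQUIRED_TAGS = ("domain", "sub_domain", "bloom_level", "job_task_ref")


def form_problems(items, max_negative_share=0.10):
    """Return a list of rule violations for one form; empty means none found."""
    problems = []
    for item in items:
        missing = [tag for tag in REQUIRED_TAGS if not getattr(item, tag)]
        if missing:
            problems.append(f"{item.item_id}: missing tags {missing}")
    negative = sum(item.negative_stem for item in items)
    if items and negative / len(items) > max_negative_share:
        problems.append(f"{negative} of {len(items)} stems are negative (cap is 10%)")
    return problems
```

A check like this would run at bank admission or form assembly; it covers only the mechanical rules, not the judgment calls (plausible distractors, no trick items) that the SME and Fairness reviews exist for.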
Modified Angoff.
SEM-adjusted. Publicly defensible.
The cut score is not a number we choose. It is a number we compute and commit to.
Modified Angoff is the standard statistical method for setting the passing mark on a professional exam. A panel of nine subject-matter experts reviews every question and estimates what percentage of minimally competent candidates would answer it correctly. Those estimates, averaged and adjusted for statistical uncertainty (the Standard Error of Measurement, or SEM), become the passing score. It is the same family of methods used by medical-licensure, engineering, and accounting exams worldwide.
For each examination form, the passing standard is established via a Modified Angoff standard-setting study conducted before the form goes live. The methodology is criterion-referenced, panel-based, and replicable. This section is written so that an auditor or a regulator can reconstruct the process from public record.
Panel composition
A standing Standard-Setting Panel of 9 SMEs is convened for each examination cycle. Panel composition is documented in the Council Register and is constructed for balance across: CBK domain expertise, industry sector (technology, financial services, public sector, media, healthcare, education), geography (minimum three regions), and career stage (minimum two early-career, two mid-career, two senior). Panelists complete conflict-of-interest declarations. No panelist has authored items for the form under review.
The minimally-competent candidate (MCC)
Before rating any item, the panel reaches consensus on a written description of the minimally-competent candidate (MCC) at the credential under review: a hypothetical candidate whose performance is exactly at the boundary of acceptable competency. In plain English: the panel agrees on the profile of the weakest candidate who should still pass, and uses that agreed profile as the reference point for every question they rate. This description is referenced repeatedly during rating and is published in the form’s Standard-Setting Report.
Three rounds of rating
Round 1 (independent): Each panelist reviews each item independently and records the probability, expressed as a decimal from 0.00 to 1.00, that the MCC would answer that item correctly. Panelists do not see each other's ratings or any item performance data.
Round 2 (calibration): The panel discusses items with substantial rater disagreement (typically σ ≥ 0.15), then re-rates all items independently. Item difficulty and discrimination data from piloting are provided. The goal is convergence, not forced consensus.
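The Round 2 discussion list can be derived directly from the Round 1 ratings. The sketch below flags items whose panelist ratings have a standard deviation at or above the 0.15 threshold. It assumes the population standard deviation is meant by σ; the function name and the sample ratings are illustrative only.

```python
from statistics import pstdev


def items_to_discuss(ratings, threshold=0.15):
    """ratings maps item id -> list of panelist probabilities (0.00 to 1.00).

    Returns the ids whose rating spread meets or exceeds the threshold.
    """
    return [item for item, values in ratings.items() if pstdev(values) >= threshold]


# Illustrative Round 1 ratings from a nine-member panel (not real data).
round1 = {
    "Q1": [0.60, 0.62, 0.58, 0.61, 0.59, 0.60, 0.63, 0.57, 0.60],
    "Q2": [0.30, 0.80, 0.45, 0.70, 0.35, 0.75, 0.50, 0.65, 0.40],
}
print(items_to_discuss(round1))
```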
Round 3 (confirmation): Panelists are shown their own Round 2 ratings alongside the panel mean and the impact data (projected pass rate at each candidate cut score). They may adjust their ratings one final time. The Round 3 mean, summed across items, is the unadjusted cut score.
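The arithmetic behind the unadjusted cut score is short enough to show in full: average the Round 3 ratings for each item, then sum those averages across the form. A minimal sketch with made-up ratings follows; the function name is an assumption.

```python
from statistics import mean


def unadjusted_cut(round3):
    """round3 maps item id -> list of Round 3 panelist ratings.

    Returns (raw cut score in items, cut score as proportion correct).
    """
    item_means = [mean(values) for values in round3.values()]
    raw_cut = sum(item_means)  # expected number of items the MCC answers correctly
    return raw_cut, raw_cut / len(item_means)


# Illustrative three-item, two-panelist example (not real data).
raw, proportion = unadjusted_cut({
    "Q1": [0.5, 0.7],
    "Q2": [0.8, 0.8],
    "Q3": [0.4, 0.6],
})
```

Here the item means are 0.6, 0.8, and 0.5, so the raw cut is 1.9 of 3 items.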
Standard-error adjustment
The unadjusted cut score is adjusted downward by one conditional Standard Error of Measurement (SEM) at the cut score. The effect is to grant the candidate the benefit of measurement error at the decision boundary. The adjusted cut score is the operative passing standard.
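As a sketch of the adjustment, the example below subtracts one conditional SEM from an unadjusted raw cut score. The blueprint does not say how the conditional SEM is estimated; this example assumes Lord's binomial error model, one common classical choice, and the input value of 52.5 on a 75-item form is purely illustrative.

```python
import math


def binomial_csem(raw_score, n_items):
    """Conditional SEM at a raw score under Lord's binomial error model (assumed)."""
    return math.sqrt(raw_score * (n_items - raw_score) / (n_items - 1))


def adjusted_cut(unadjusted_raw, n_items):
    """Lower the unadjusted cut by one conditional SEM evaluated at the cut."""
    csem = binomial_csem(unadjusted_raw, n_items)
    return unadjusted_raw - csem, csem


# Illustrative input only: an unadjusted cut of 52.5 raw points on 75 items.
cut, csem = adjusted_cut(52.5, 75)
```

Under a different SEM estimator (for example, an IRT-based one) the size of the adjustment would differ, but the direction is the same: the operative standard sits one SEM below the panel's unadjusted value.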
Published: the MCC description, the panel's round-by-round pass-rate projections, the unadjusted and adjusted cut scores (as proportion-correct values), the SEM at the cut score, and the demographic composition of the panel. Not published: individual panelist ratings, individual item identities, or item-level rater data.
Cut-score expression
The cut score is reported in three forms: (a) the raw proportion-correct value; (b) the scaled score equivalent on the form-invariant 200–800 reporting scale; (c) a pass/fail decision with a Standard Error band disclosed to the candidate. Candidates within ±1 SEM of the cut score are noted as "cut-score band" results in the Standard-Setting Report; the pass/fail decision is nevertheless final.
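The decision rule and the band flag are independent of each other, which a few lines make explicit. This is a minimal sketch; the function and key names are hypothetical, and the numbers in the usage lines are illustrative.

```python
def classify(raw_score, cut_raw, sem):
    """Return the pass/fail decision plus a flag for results within one SEM of the cut."""
    passed = raw_score >= cut_raw                   # meet or exceed the pass mark
    in_band = abs(raw_score - cut_raw) <= sem       # noted in the report only
    return {"decision": "pass" if passed else "fail", "cut_score_band": in_band}


near_pass = classify(49, 48.5, 4.0)
clear_fail = classify(40, 48.5, 4.0)
```

The band flag never changes the decision: a candidate just below the cut is still a fail, and one just above is still a pass.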
Build + Capstone scoring
Panel-reviewed Build Tasks and Capstone deliverables are scored against the published rubric by a panel of four SME reviewers, with adjudication by a fifth reviewer when dimension scores diverge by more than one rubric level. Inter-rater reliability is computed quarterly and published. See CAP-2026-01 for Capstone-specific procedures.
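The adjudication trigger can be read as a spread test per rubric dimension. The sketch below assumes integer rubric levels and interprets "diverge by more than one rubric level" as the gap between the highest and lowest reviewer score on a dimension; the dimension names are hypothetical and CAP-2026-01 remains the authority on the actual rule.

```python
def needs_adjudication(scores_by_dimension, max_spread=1):
    """scores_by_dimension maps rubric dimension -> the four reviewers' levels.

    Returns the dimensions whose highest and lowest scores differ by more
    than max_spread levels.
    """
    return [
        dimension
        for dimension, levels in scores_by_dimension.items()
        if max(levels) - min(levels) > max_spread
    ]


# Illustrative scores on hypothetical dimensions.
flagged = needs_adjudication({
    "workflow_design": [3, 3, 4, 3],
    "governance_docs": [2, 4, 3, 3],
})
```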
Different forms. Same standard.
A candidate who sat a harder form should not be penalized for it. A candidate who sat an easier one should not benefit. Equating is how we enforce that.
We publish multiple versions of the exam so not everyone sees the same questions. To make that fair, we use a statistical method called equating that corrects for small differences in difficulty between versions. Every candidate is held to the same standard regardless of which version they sat.
Multiple forms of each credential’s examination are in active rotation for security and scheduling reasons. Scores across forms are made comparable through statistical equating. The procedure is documented here and re-documented in each form’s Equating Report.
Common-item non-equivalent groups (CINEG) design
CINEG is the name of the equating design used by most major professional certification programs. Each new form carries an anchor set of 20–30 items drawn from a stable internal-anchor bank. Anchor items are domain-balanced, represent the full difficulty range, and do not contribute to candidate scores. Anchor-item performance is used to place the new form onto the reference scale of the prior form via the Tucker linear-equating method (during the classical era of the bank) and subsequently via Item Response Theory (IRT) characteristic-curve equating once item-bank stability permits.
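For readers who want to see what Tucker linear equating computes, here is a compact sketch using the textbook formulas for the CINEG design. It assumes synthetic-population weights proportional to sample size and population (not sample) variances; both are choices the blueprint does not specify, and the function name and data are illustrative.

```python
from statistics import fmean, pvariance


def _cov(a, b):
    """Population covariance of two equally long score lists."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def tucker_equate(x, v1, y, v2):
    """Return a function mapping new-form scores (X) onto the old-form scale (Y).

    x, v1: total and anchor scores for the group that sat the new form.
    y, v2: total and anchor scores for the group that sat the old form.
    """
    n1, n2 = len(x), len(y)
    w1, w2 = n1 / (n1 + n2), n2 / (n1 + n2)  # assumed: weights by sample size
    g1 = _cov(x, v1) / pvariance(v1)          # regression slope of X on V, group 1
    g2 = _cov(y, v2) / pvariance(v2)          # regression slope of Y on V, group 2
    d_mean = fmean(v1) - fmean(v2)
    d_var = pvariance(v1) - pvariance(v2)
    # Synthetic-population means and variances for each form.
    mu_x = fmean(x) - w2 * g1 * d_mean
    mu_y = fmean(y) + w1 * g2 * d_mean
    var_x = pvariance(x) - w2 * g1**2 * d_var + w1 * w2 * g1**2 * d_mean**2
    var_y = pvariance(y) + w1 * g2**2 * d_var + w1 * w2 * g2**2 * d_mean**2
    slope = (var_y / var_x) ** 0.5
    return lambda score: slope * (score - mu_x) + mu_y


# Illustrative data: equal anchor performance, old form 5 points easier.
new_total, anchors = [40, 50, 60, 55, 45], [10, 14, 18, 16, 12]
old_total = [score + 5 for score in new_total]
to_old_scale = tucker_equate(new_total, anchors, old_total, anchors)
```

In this toy case the two groups perform identically on the anchors, so the whole 5-point difference is attributed to form difficulty and a new-form score of 50 maps to 55 on the old form's scale.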
Transition to IRT
As the CAIS item bank matures, equating will transition to an Item Response Theory (IRT) framework, the family of statistical models used to score exams like the GMAT and GRE. The planned models are a 3-parameter logistic (3PL) model for multiple-choice items and a Partial Credit Model (PCM) for multi-stage scenario items. The transition plan, including the minimum-sample-size thresholds and the back-equating procedure, is published in the Equating Transition Memorandum (EQT-2026-01, planned Q4 2026).
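The 3PL model named above gives the probability of a correct answer as a function of candidate ability and three item parameters. A minimal sketch follows, using the plain logistic form; some programs include a scaling constant of 1.7 in the exponent, and whether CAIS will do so is not stated. The parameter values are illustrative.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: candidate ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Illustrative four-option item: guessing floor 0.25, ability equal to difficulty.
p = p_correct_3pl(theta=0.0, a=1.2, b=0.0, c=0.25)
```

When ability equals difficulty the probability is halfway between the guessing floor and 1, which is 0.625 for this item.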
Scaled scoring
Reported scaled scores use a 200–800 range with a fixed cut score of 500 (i.e., the scale is anchored so that the passing standard on every form maps to 500). This convention decouples reporting from raw-score volatility across forms.
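To illustrate what anchoring the cut at 500 means, the sketch below maps raw scores to the 200–800 scale with a piecewise-linear transformation: zero maps to 200, the form's cut score to 500, and a perfect score to 800. This particular transformation is an assumption made for illustration; the operational conversion comes from the equating procedure and is not published in this blueprint.

```python
def scaled_score(raw, cut_raw, n_items, lo=200, cut=500, hi=800):
    """Piecewise-linear raw-to-scale conversion with the cut fixed at 500 (assumed form)."""
    if raw <= cut_raw:
        value = lo + (cut - lo) * raw / cut_raw
    else:
        value = cut + (hi - cut) * (raw - cut_raw) / (n_items - cut_raw)
    return round(value)


# Two illustrative forms with different raw cuts report the same scaled cut.
easier_form = scaled_score(48.5, cut_raw=48.5, n_items=75)
harder_form = scaled_score(46.0, cut_raw=46.0, n_items=75)
```

Whatever conversion is used, the property that matters is the one shown here: a candidate exactly at the passing standard reports 500 regardless of the form sat.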
Fairness is a process, not a claim.
No examination can be proven unbiased. It can be subjected to a documented, defensible fairness regime.
After every exam, we check whether any question was systematically harder for one group of candidates than for another group of equal competency. This check is called Differential Item Functioning (DIF). Questions that fail the check are pulled and reviewed. This is the same fairness procedure used by major licensing exams in medicine, law, and engineering.
Pre-administration: content fairness review
Every candidate form passes through the Fairness Subcommittee of the Ethics Review Board (ERB) before going live. The Subcommittee reviews items for: cultural specificity that is job-task-irrelevant, gendered or region-coded language, scenarios that privilege a narrow cultural frame, and illustrations or contexts that depend on assumed background unrelated to the competency tested. Flagged items are revised or retired.
Post-administration: Differential Item Functioning (DIF)
For every form administration that reaches the minimum sample threshold, items are screened for DIF using the Mantel-Haenszel procedure — a widely adopted statistical test for item bias — across candidate sub-groups defined by self-reported demographic and geographic categories. Items classified as Category C (large and significant DIF) are removed from the form and returned to the Fairness Subcommittee for review. Category B items (moderate DIF) are flagged for subject-matter-expert content review.
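The core of the Mantel-Haenszel screen is a common odds ratio pooled across matched ability strata, usually reported on the ETS delta scale as MH D-DIF = −2.35 ln(α). The sketch below computes that statistic and applies magnitude-only thresholds (1.0 and 1.5). The full A/B/C classification also requires significance tests, which are omitted here, so treat the category function as a simplification; the names and counts are illustrative.

```python
import math


def mh_d_dif(strata):
    """Mantel-Haenszel D-DIF for one item.

    strata: one tuple per matched score level, in the order
    (reference correct, reference wrong, focal correct, focal wrong).
    Negative values indicate the item is harder for the focal group.
    """
    numerator = denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    alpha = numerator / denominator  # common odds ratio
    return -2.35 * math.log(alpha)


def magnitude_category(d_dif):
    """Magnitude-only approximation of the A/B/C flags (significance tests omitted)."""
    size = abs(d_dif)
    if size < 1.0:
        return "A"
    return "B" if size < 1.5 else "C"


# Illustrative counts: equal odds in both strata, then a strongly biased item.
no_dif = mh_d_dif([(40, 10, 20, 5), (30, 30, 15, 15)])
large_dif = mh_d_dif([(45, 5, 25, 25)])
```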
Published fairness metrics
The annual CAIS Psychometric Report publishes, at minimum: form-level pass rates by sub-group, item-level DIF flag counts, Subcommittee review outcomes, form-level Cronbach’s α and conditional SEMs, and any retired-item counts. The underlying micro-data is not published; the aggregate reporting is sufficient for audit.
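Cronbach's α, one of the form-level statistics listed above, can be computed from a candidates-by-items score matrix. A minimal sketch using population variances follows; the data are illustrative.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """responses: one list of item scores per candidate, all of equal length."""
    k = len(responses[0])
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Illustrative extreme case: every item ranks candidates identically.
alpha = cronbach_alpha([[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]])
```

Perfectly consistent items, as in this toy matrix, give α = 1; real forms fall below that.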
Accommodations
Testing accommodations are available on documented request and are administered to protect the validity of the competency inference — extended time, alternative formats, adaptive interface settings, and private administration are supported. Accommodation requests are adjudicated by the CAIS Accommodations Panel under published criteria.
A credential that can't be verified isn't worth earning.
A credential that can be forged isn't worth verifying.
Security is layered from identity through item bank through administration through attestation.
Candidate identity
- Pre-enrollment identity verification via government-issued ID matched to candidate profile.
- Live proctor face-match and environment scan at session start.
- Continuous proctor monitoring throughout administration.
Item bank protection
- Item bank is never published. A version fingerprint of the active bank is published in the Standards Library on each version update; this is the integrity anchor, not the content.
- Item exposure is tracked per item per administration. Items exceeding exposure thresholds are rotated out.
- Authors, reviewers, and psychometric staff sign bank-access agreements; all access is audit-logged.
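One way to publish an integrity anchor without publishing content is a cryptographic hash over a canonical serialisation of the bank. The blueprint does not say how the version fingerprint is computed, so the sketch below is an assumption: SHA-256 over key-sorted JSON, with a hypothetical function name and toy items.

```python
import hashlib
import json


def bank_fingerprint(items):
    """items maps item id -> item content (any JSON-serialisable structure).

    Returns a hex digest that changes if any item changes but reveals no content.
    """
    canonical = json.dumps(items, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Toy bank: insertion order must not affect the fingerprint.
bank_a = {"Q1": {"stem": "example stem", "key": "B"}, "Q2": {"stem": "another stem", "key": "D"}}
bank_b = {"Q2": {"stem": "another stem", "key": "D"}, "Q1": {"stem": "example stem", "key": "B"}}
```

An auditor given the same bank under NDA could recompute the digest and compare it with the published value.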
Administration
- All administrations are delivered inside the Prompt Atlas secure examination environment.
- Browser lockdown, clipboard isolation, network policy enforcement, keystroke monitoring.
- Session video retained for Council review for 24 months. Retention policy disclosed to candidate at enrollment.
- Prohibited-conduct flags (multiple persons in frame, secondary device, external communication) trigger immediate session suspension and ERB referral.
Public Registry attestation
Upon session completion, a structured attestation record is committed to the Public Verification Registry via the GAISB™ Standards Council. It does not contain item-level responses or video material. The attestation signature is the authoritative record; the human-readable Registry is the mirror.
A public registry record is permanent and independently verifiable. Committing administration records in the Public Registry provides three properties no hosted database can: independent third-party verifiability, institutional permanence that survives GAISB™ ceasing operations, and verifiable proof-of-administration that is useful in evidentiary and regulatory contexts.
Suspected misconduct
Suspected misconduct is referred to the Ethics Review Board for adjudication under the published Sanction Guidelines Matrix. Established misconduct (impersonation, item exfiltration, coordinated cheating) results in credential revocation (if issued), a public revocation record with reason code, and a re-sit ban, typically five years and longer for impersonation. The revocation is permanent; the registry retains both issuance and revocation records in perpetuity.
When a candidate fails. What happens next.
Failure is a data point, not a verdict.
- Waiting period: 90 days between attempts, regardless of credential.
- Attempt cap: A maximum of three attempts per 12-month rolling window.
- After three failed attempts: mandatory remedial pathway inside Prompt Atlas — directed CBK review, Faculty-reviewed practice Builds, and a documented readiness sign-off before a fourth attempt is permitted.
- Form refresh: A candidate will not sit the same form twice; form rotation is enforced at administration.
- Code of Conduct bans: Where misconduct is established, re-sit is barred for the full ban period (typically five years for cheating; ten years or permanent for impersonation). See Code of Professional Conduct.
- Fee relief: Candidates whose first failed attempt falls within the cut-score band may be offered a discounted re-sit fee. This is at the Standards Council's discretion; it is not a right.
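The waiting period and attempt cap above combine into a simple eligibility check. The sketch below treats the 12-month rolling window as 365 days, which is an assumption, and it models only the two timing rules: the remedial-pathway sign-off and any conduct bans would be separate checks. The function name and dates are illustrative.

```python
from datetime import date, timedelta


def can_attempt(previous, on, wait_days=90, cap=3, window_days=365):
    """Return True if a new attempt on date `on` satisfies both timing rules.

    previous: list of dates of earlier attempts.
    """
    if previous and (on - max(previous)).days < wait_days:
        return False  # still inside the 90-day waiting period
    window_start = on - timedelta(days=window_days)
    recent = [attempt for attempt in previous if attempt > window_start]
    return len(recent) < cap  # at most three attempts per rolling window


# Illustrative history: three attempts in 2026.
history = [date(2026, 1, 10), date(2026, 4, 15), date(2026, 7, 20)]
blocked = can_attempt(history, date(2026, 10, 25))   # cap reached
allowed = can_attempt(history, date(2027, 1, 11))    # first attempt has aged out
```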
How this blueprint can be challenged. How it can be audited.
A published methodology you can't contest is a press release, not a standard.
Public comment
This blueprint is open for structured public comment for 180 days from the publication of each revision. Comments are submitted through the Standards Library public-comment form, received on the public record, and disposed of by the Examination Working Group with a published disposition matrix (accepted, accepted-with-modification, rejected-with-reason, deferred). Material changes trigger a new 180-day window.
Regulator audit access
National competent authorities and recognized accreditation bodies may request Regulator Audit Access under the Regulator Engagement Office charter. Access includes, under NDA: the full Standard-Setting Report for specified forms, panel composition with conflict-of-interest declarations, item-bank integrity artefacts (not items), Fairness Subcommittee minutes, and the Psychometric Report appendices. Access is free of charge. See Audit Access.
Employer due diligence
Employers conducting due diligence on the credential may request a Blueprint Briefing through the Employer Recognition Network. Briefings cover the public blueprint in Q&A format and map CAIS™ credentials to hiring bands. See For Employers.
Document Control
A methodology your regulator can cite.
A cut score your auditor can reconstruct.
Every CAIS examination is scored against a Modified Angoff cut score set by a 9-member Standard-Setting Panel. Every administration is proctored inside Prompt Atlas and attested in the Public Verification Registry. Every form is equated. Every item is screened for fairness. The blueprint is public. The item bank is not.
Authored by GAISB™ · Earned inside Prompt Atlas · Proven by Real Builds