Forty terms that come up in AI training contracting, defined in plain language. Bookmark for reference.
Foundational
RLHF (Reinforcement Learning from Human Feedback): Technique where models learn from human preferences over their outputs. The contractor's role is to provide those preferences.
Evaluation (eval): Scoring an AI model's output against criteria. Different from training — eval measures, training updates the model.
Rubric: The structured criteria evaluators use to score outputs. Good rubrics produce consistent scores across raters.
Calibration: The process of aligning your judgment with platform standards via practice tasks scored against reference answers.
Tier: Your performance level on a platform (entry, mid, senior). Determines your pay rate and task pool access.
Tasks and roles
Coding evaluator: Reviews code for correctness, edge cases, style. Most common technical role.
RLHF annotator: Provides preferences over model outputs, typically pairwise (choosing whether A or B is better).
Domain expert: Credentialed evaluator (MD, JD, PhD, CFA) for specialty work.
Agent task evaluator: Scores AI agents that take multi-step actions (browse, code, call APIs).
Long-context evaluator: Evaluates how well a model uses long documents (50+ pages) when answering questions.
Red teamer: Tries to make models fail or produce harmful outputs.
Reference solution writer: Writes ideal answers used as training ground truth.
Quality and scoring
Quality score: Numerical rating (typically 0–1) of your evaluation accuracy on a task.
Inter-rater agreement: How often two independent raters reach similar conclusions on the same task. Higher = better rubric.
Justification: Written explanation of your scoring rationale. A major scoring dimension on most platforms.
Rolling weighted average: Your quality score computed across your most recent N tasks (typically 80–120). Determines tier eligibility.
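Two of the metrics above can be sketched in a few lines of Python. Platforms don't publish their exact formulas, so both functions are illustrative assumptions: agreement is shown as simple percent agreement (real platforms may use Cohen's kappa or a similar chance-corrected statistic), and the rolling average assumes linear recency weights over the last N tasks.

```python
def agreement_rate(rater_a, rater_b):
    """Fraction of tasks where two independent raters reached the same
    verdict. A stand-in for a platform's (unpublished) agreement metric."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def rolling_weighted_average(scores, window=100):
    """Recency-weighted average of the last `window` quality scores.
    Assumes linear weights: the most recent task counts the most."""
    recent = scores[-window:]
    weights = range(1, len(recent) + 1)  # 1 = oldest, len(recent) = newest
    return sum(w * s for w, s in zip(weights, recent)) / sum(weights)

# Two raters agreeing on 2 of 3 pairwise preferences:
print(agreement_rate(["A", "B", "A"], ["A", "B", "B"]))

# A contractor whose recent scores trend upward: the weighted average
# sits closer to the newer 0.95 scores than a plain mean would.
print(round(rolling_weighted_average([0.80] * 50 + [0.95] * 50), 3))
```

The recency weighting is why a run of strong recent tasks can lift your tier eligibility faster than the same scores spread evenly across your history.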
Platform mechanics
Specialty pool: Gated task pool for specific work types (framework, niche language, domain). Pays premium rates.
Task drop: When a batch of tasks becomes available on the platform.
Pay cycle: Frequency of contractor payouts (Outlier weekly, Mercor bi-weekly).
Onboarding: Initial period combining application + calibration + first paid tasks.
Routing priority: How quickly you see new task drops; affected by tier and Outlier+ status.
Application process
Coding sample: Take-home programming test during application.
Work sample: A combined written and coding test used by Mercor and Surge; longer than a coding sample alone.
AI screener: Mercor's AI-conducted video interview.
Vetting interview: Turing's human-conducted technical interview.
Model concepts
Hallucination: Confident model output unsupported by source/knowledge/reasoning.
Pattern completion: Failure mode where a model treats similar-looking inputs as identical, completing a familiar pattern instead of reasoning about the case at hand.
Constitutional AI: Approach where models are trained against an explicit set of principles. A specialty work category on some platforms.
Multi-step reasoning: Tasks where models must chain multiple inferences.
Long-context: Model context windows of 1M+ tokens, roughly a small library of text.
Agent: AI model wrapped in a runtime that can take actions (browse, code, call APIs).
Compensation
Effective hourly rate: Gross earnings ÷ actual hours worked, accounting for unpaid wait time.
Specialty premium: Higher rate for specific specialty work (typically 10–40% above base).
Bounty: One-time payment for specific findings (red team programs).
Outlier+: Outlier's $39/month premium tier for priority routing.
Tax and payment
1099: US tax form for self-employed income.
Schedule C: US tax form for self-employment business income.
SEP-IRA / Solo 401(k): Self-employed retirement accounts with high contribution caps.
FIRC: Foreign Inward Remittance Certificate, the document Indian banks issue as proof of foreign income received.
Wise / Payoneer: Cross-border payment platforms commonly used by AI training contractors.
Career
Tier-up: Promotion to a higher tier (entry → mid → senior).
Specialty calibration: Calibration tasks specific to a specialty pool, gating that pool's access.
Quality coach: Senior contractor who provides feedback in some specialty pools.
Program manager: Platform employee who runs specific contractor programs; can negotiate rates.
Bottom line
The vocabulary of AI training contracting is specific and worth learning. New contractors who use these terms accurately in applications and interviews signal experience. Senior contractors should be able to define each of these without thinking.