// Roles glossary

Know what you're applying for.

Every AI training contract type, defined in plain English. What you do, what you need to know, what platforms hire for it, and what it pays in 2026.

RLHF Annotator: $35–$95/hr · Coding Evaluator: $40–$95/hr · Math Reasoning Expert: $60–$150/hr · Domain Expert: $55–$200/hr · Multilingual Annotator: $22–$65/hr · Senior Software Engineer: $100–$220/hr · AI Research Contractor: $90–$180/hr · Creative Writing Trainer: $28–$75/hr

RLHF Annotator

Reinforcement Learning from Human Feedback

$35–$95/hr

What it is. RLHF Annotators rank, score, and write feedback on responses generated by large language models. Your work directly shapes how the next versions of GPT, Claude, Gemini, and other models reason.

What you actually do. Open a task. Read 2–4 model responses to a prompt. Pick the best one and explain why. Sometimes write the "ideal" response yourself. Repeat 30–80 times per session. Sessions are 30–90 minutes; you control the schedule.

What you need to know. Strong written English (or target language). Ability to spot subtle reasoning errors, factual mistakes, and unhelpful framings. No coding required for general RLHF; coding-specific RLHF (see Coding Evaluator) pays more.

How to get hired. Mercor is the most accessible entry point: quick application, sample task, decision in 1–3 days. Outlier requires more screening but pays slightly more for senior-tier work.

Find RLHF jobs on joblet.ai →

Top platforms

Mercor · Outlier · Surge AI

Hours/week

10–25 typical

Skills required

Strong written English · attention to detail

Application time

15–60 min · sample task

Time to first paycheck

5–14 days

Coding Evaluator

Code review for AI training data

$40–$95/hr

What it is. Coding Evaluators review code generated by AI models, rate correctness and quality, fix mistakes, and produce reference solutions. The work trains the next generation of code-completion and code-generation systems.

What you actually do. Read a coding prompt and 1–4 candidate solutions. Run them mentally (or in a sandbox), find bugs, evaluate code quality, and write a corrected reference solution if all candidates fail. Submit feedback. Move on.

What you need to know. Working professional fluency in at least one language. Python is most in demand, followed by TypeScript/JavaScript, then Rust, C++, Go, and Java. Reading comprehension matters more than typing speed; you spend more time evaluating than writing.

How to get hired. Outlier is the volume leader, with the fastest application and the most consistent work, and it pays slightly more at the senior tier. Mercor is best for niche languages (Rust, C++), where rates climb 30–50%.

Find coding eval jobs on joblet.ai →

Top platforms

Outlier · Mercor · Turing

Hours/week

10–25 typical

Languages in demand

Python · TS/JS · Rust · C++ · Go

Application time

30–90 min · coding sample

Time to first paycheck

5–10 days

Math Reasoning Expert

PhD-level mathematics for AI training

$60–$150/hr

What it is. Write rigorous step-by-step solutions to advanced math problems. Evaluate AI-generated proofs and rank by mathematical correctness. Help models learn what airtight mathematical reasoning looks like.

What you actually do. Open a problem (calculus, linear algebra, real analysis, abstract algebra, number theory, topology). Either write a clean proof yourself, or evaluate 2–4 model attempts and explain exactly where each goes wrong. Rate by correctness, completeness, and clarity.

What you need to know. PhD or in-progress PhD in mathematics, applied math, statistics, or theoretical CS. Comfort with proofs at undergraduate-textbook-and-above level. Some platforms accept MS + research experience.

How to get hired. Outlier's math expert program is the highest-paying and also has the higher volume. Turing is best for long-form proof writing where you can spend an hour on a single problem.

Find math reasoning jobs on joblet.ai →

Top platforms

Outlier · Turing

Hours/week

5–15 typical

Education required

PhD or in-progress · math/CS/stats

Application time

2–4 hrs · proof samples + interview

Time to first paycheck

14–30 days

Domain Expert

Medical, Legal, Finance, Engineering specialists

$55–$200/hr

What it is. Domain Experts evaluate AI responses on subject-matter accuracy in their specialty — medical, legal, finance, engineering, or scientific research. The bar is your expertise; the work is verifying that AI doesn't hallucinate when stakes are real.

What you actually do. Read a model response to a domain-specific question. Mark factual errors. Flag malpractice (medical), misinformation (legal), or material inaccuracies (finance). Either rewrite the response correctly or explain exactly what's wrong and why.

What you need to know. Active credentials in your field. Medical: MD/DO + active license + board certification pays $120–$200/hr. Legal: JD + 5+ yrs practice. Finance: CPA, CFA, or equivalent. Engineering: a PE license adds 30% to the rate.

How to get hired. Surge AI pays the highest for medical (premium for active practice). Mercor is best for legal — they have the largest law-domain inventory. Micro1 for finance. Most domain expert applications include credential verification.

Find domain expert jobs on joblet.ai →

Top platforms

Surge AI · Mercor · Micro1

Hours/week

3–15 typical

Credentials required

Active license/cert · 3+ yrs practice

Application time

1–3 hrs · credential check + interview

Time to first paycheck

14–45 days

Multilingual Annotator

Translation, evaluation, and cultural QA

$22–$65/hr

What it is. Native or near-native speakers of a language other than English evaluate AI responses for accuracy, fluency, and cultural appropriateness. Translation, idiom verification, and cultural-context flagging.

What you actually do. Read AI-generated text in your language. Score for grammatical correctness, natural fluency, cultural appropriateness, and factual accuracy. Sometimes translate or transcribe; sometimes write reference responses.

What you need to know. Native or near-native fluency in the target language plus working English. Some platforms require formal translation training; most don't. Low-resource languages with fewer available annotators (Yoruba, Tamil, Vietnamese, Thai) command higher rates due to scarcity.

How to get hired. Outlier has the largest volume across most languages and pays the most for low-resource languages. Surge AI pays a premium for CJK (Chinese, Japanese, Korean). Toloka is the most accessible entry point.

Find multilingual jobs on joblet.ai →

Top platforms

Outlier · Surge AI · Toloka

Hours/week

10–30 typical

Skills required

Native-level target language + working English

Application time

30–90 min · language fluency test

Time to first paycheck

5–14 days

Senior Software Engineer · Contract

Full-time-equivalent contract engineering

$100–$220/hr

What it is. This isn't AI training work. These are full-fledged contract engineering roles, often hands-on work building production systems for AI-first companies, training-infrastructure teams, or research labs that need senior engineers without committing to a full-time hire.

What you actually do. Real engineering. Ship features, build infrastructure, debug production issues. Engagements are typically 3–6 months at 20–40 hrs/week. You're embedded with a team via Slack/Linear, attend standups, do code review.

What you need to know. 5–8+ years of professional engineering experience. Modern stack proficiency. Strong system design. The bar is closer to a Staff Engineer interview at a Series B than a typical contractor screen.

How to get hired. Turing has the most volume and the most rigorous screening (think 3-stage interview). Mercor pays a premium and has shorter engagements. Micro1 specializes in ML engineering contracts where pay tops $200/hr for hands-on training infra work.

Find senior engineer contracts on joblet.ai →

Top platforms

Turing · Mercor · Micro1

Hours/week

20–40 typical

Experience required

5–8+ yrs · production engineering

Application time

3–6 hrs · multi-stage interview

Time to first paycheck

21–45 days

AI Research Contractor

PhD-level research contributions

$90–$180/hr

What it is. Contribute to AI research projects on a contract basis — paper reviews, dataset curation, evaluation methodology, ablation studies, sometimes co-authorship. Adjacent to traditional research-engineer roles but contract-shaped.

What you actually do. Varies wildly by project. Could be: review 20 papers and synthesize a survey. Design evaluation criteria for a new benchmark. Run ablation experiments on a published model. Help draft a paper. Engagements are typically 4–12 weeks.

What you need to know. PhD (in progress or completed) in ML, CS, statistics, or related field. Publication history at top-tier venues (NeurIPS, ICML, ACL, CVPR, ICLR) makes you 2–3x more attractive. Strong written communication for paper-shaped work.

How to get hired. Mercor and Turing are the two most active. Mercor screens for publications; Turing screens for hands-on experimentation experience. Both pay similarly at the senior tier; Turing tends to have a higher volume of work.

Find AI research contracts on joblet.ai →

Top platforms

Mercor · Turing

Hours/week

10–25 typical

Education required

PhD (in-progress or completed) + publications

Application time

2–4 hrs · CV + interview

Time to first paycheck

21–45 days

Creative Writing Trainer

Fiction, marketing, long-form narrative

$28–$75/hr

What it is. Train language models to write better creative content. Evaluate AI-generated stories, marketing copy, scripts, or long-form essays. Sometimes write reference pieces yourself. The output trains models like ChatGPT and Claude on what good prose actually looks like.

What you actually do. Read AI-generated creative work. Score for craft elements: voice, pacing, character, plot consistency, originality, tone. Either fix the work in place or write a corrected reference. Long-form work (10K+ words) pays a premium.

What you need to know. Demonstrable writing portfolio. Published fiction or an MFA helps. Marketing-copy work cares about portfolio over credentials. Long-form work requires actual stamina to sustain a voice and structure across many thousands of words.

How to get hired. Outlier has the largest volume and is the most accessible for marketing-focused creative work. Surge AI pays best for long-form (and screens hardest). Portfolio review is universal, so be ready to share 2–3 polished pieces.

Find creative writing jobs on joblet.ai →

Top platforms

Outlier · Surge AI

Hours/week

10–20 typical

Skills required

Portfolio · published work or MFA preferred

Application time

1–2 hrs · portfolio review + writing sample

Time to first paycheck

10–21 days

Ready to find your role?

All open roles across every platform live on joblet.ai.

Find your job →