Find your job
// Roles glossary

Know what you're applying for.

Every AI training contract type, defined in plain English: what you do, what you need to know, which platforms hire for it, and what it pays in 2026.

RLHF Annotator

Reinforcement Learning from Human Feedback

$35–$95/hr

What it is. RLHF Annotators rank, score, and write feedback on responses generated by large language models. Your work directly shapes how the next versions of models such as GPT, Claude, and Gemini reason.

What you actually do. Open a task. Read 2–4 model responses to a prompt. Pick the best one and explain why. Sometimes write the "ideal" response yourself. Repeat 30–80 times per session. Sessions are 30–90 minutes; you control the schedule.

What you need to know. Strong written English (or target language). Ability to spot subtle reasoning errors, factual mistakes, and unhelpful framings. No coding required for general RLHF; coding-specific RLHF (see Coding Evaluator) pays more.

How to get hired. Mercor and Scale AI are the most accessible entry points: quick application, sample task, decision in 1–3 days. Outlier requires more screening but pays slightly more for senior-tier work.

Find RLHF jobs on joblet.ai →
Top platforms: Mercor · Scale AI · Outlier · Surge AI
Hours/week: 10–25 typical
Skills required: Strong written English · attention to detail
Application time: 15–60 min · sample task
Time to first paycheck: 5–14 days

Coding Evaluator

Code review for AI training data

$40–$95/hr

What it is. Coding Evaluators review code generated by AI models, rate correctness and quality, fix mistakes, and produce reference solutions. The work trains the next generation of code-completion and code-generation systems.

What you actually do. Read a coding prompt and 1–4 candidate solutions. Run them mentally (or in a sandbox), find bugs, evaluate code quality, write a corrected reference solution if all candidates fail. Submit feedback. Move on.
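
To make the review loop above concrete, here is a minimal sketch in Python. The prompt ("return the second-largest distinct value in a list"), both function names, and the test cases are invented for illustration; they are not taken from any platform's task set.

```python
from typing import Optional


def candidate_second_largest(nums: list[int]) -> Optional[int]:
    """Hypothetical model-generated answer: plausible, but ignores duplicates."""
    if len(nums) < 2:
        return None
    return sorted(nums)[-2]


def reference_second_largest(nums: list[int]) -> Optional[int]:
    """Corrected reference solution: deduplicate before ranking."""
    distinct = sorted(set(nums))
    if len(distinct) < 2:
        return None
    return distinct[-2]


# Evaluator-style checks as (input, expected) pairs, including duplicate-heavy inputs.
CASES = [([1, 2, 3], 2), ([5, 5, 3], 3), ([7], None), ([4, 4], None)]


def failing_cases(fn):
    """Return (input, expected, actual) for every case the function gets wrong."""
    return [(nums, expected, fn(nums)) for nums, expected in CASES if fn(nums) != expected]


if __name__ == "__main__":
    print("candidate failures:", failing_cases(candidate_second_largest))
    print("reference failures:", failing_cases(reference_second_largest))
```

In this made-up task, the feedback would note that the candidate returns 5 for [5, 5, 3] because it never removes duplicates, and the reference solution would be submitted alongside that explanation.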

What you need to know. Working professional fluency in at least one language. Python is most in demand, followed by TypeScript/JavaScript, then Rust, C++, Go, and Java. Reading comprehension matters more than typing speed; you spend more time evaluating than writing.

How to get hired. Outlier is the volume leader — fastest application, most consistent work. Scale AI pays slightly more at the senior tier. Mercor is best for niche languages (Rust, C++) where rates climb 30–50%.

Find coding eval jobs on joblet.ai →
Top platforms: Outlier · Scale AI · Mercor · Turing
Hours/week: 10–25 typical
Languages in demand: Python · TS/JS · Rust · C++ · Go
Application time: 30–90 min · coding sample
Time to first paycheck: 5–10 days

Math Reasoning Expert

PhD-level mathematics for AI training

$60–$150/hr

What it is. Math Reasoning Experts write rigorous step-by-step solutions to advanced math problems, evaluate AI-generated proofs, and rank them by mathematical correctness. The work helps models learn what airtight mathematical reasoning looks like.

What you actually do. Open a problem (calculus, linear algebra, real analysis, abstract algebra, number theory, topology). Either write a clean proof yourself, or evaluate 2–4 model attempts and explain exactly where each goes wrong. Rate by correctness, completeness, and clarity.
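
As a hypothetical illustration of explaining exactly where an attempt goes wrong, consider the exchange below. Both the model attempt and the evaluator note are invented for this example, not drawn from a real task.

```latex
% Model attempt (hypothetical)
\textbf{Claim.} Since $\lim_{n\to\infty} \tfrac{1}{n} = 0$, the series
$\sum_{n=1}^{\infty} \tfrac{1}{n}$ converges.

% Evaluator note
\textbf{Error.} Terms tending to zero is necessary for convergence, not sufficient.
Grouping the terms after $n = 1$ into blocks of length $2^{j-1}$ gives, for every $k \ge 1$,
\[
  \sum_{n=1}^{2^{k}} \frac{1}{n}
  \;\ge\; 1 + \sum_{j=1}^{k} 2^{j-1} \cdot \frac{1}{2^{j}}
  \;=\; 1 + \frac{k}{2},
\]
so the partial sums are unbounded and the series diverges.
```

A note like this pinpoints the invalid inference and supplies a short counter-argument, rather than only marking the attempt as incorrect.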

What you need to know. PhD or in-progress PhD in mathematics, applied math, statistics, or theoretical CS. Comfort with proofs at undergraduate-textbook-and-above level. Some platforms accept MS + research experience.

How to get hired. Scale AI's math expert program is the highest-paying. Outlier has higher volume. Turing is best for long-form proof writing where you can spend an hour on a single problem.

Find math reasoning jobs on joblet.ai →
Top platforms: Scale AI · Outlier · Turing
Hours/week: 5–15 typical
Education required: PhD or in-progress · math/CS/stats
Application time: 2–4 hrs · proof samples + interview
Time to first paycheck: 14–30 days

Domain Expert

Medical, Legal, Finance, Engineering specialists

$55–$200/hr

What it is. Domain Experts evaluate AI responses on subject-matter accuracy in their specialty — medical, legal, finance, engineering, or scientific research. The bar is your expertise; the work is verifying that AI doesn't hallucinate when stakes are real.

What you actually do. Read a model response to a domain-specific question. Mark factual errors. Flag advice that would amount to malpractice (medical), misstatements of law (legal), or material inaccuracies (finance). Either rewrite the response correctly or explain exactly what's wrong and why.

What you need to know. Active credentials in your field. Medical: MD/DO + active license + board certification pays $120–$200/hr. Legal: JD + 5+ yrs practice. Finance: CPA, CFA, or equivalent. Engineering: a PE license adds 30% to the rate.

How to get hired. Surge AI pays the most for medical work, with a premium for active practice. Mercor is best for legal because it has the largest law-domain inventory. For finance, try Micro1. Most domain expert applications include credential verification.

Find domain expert jobs on joblet.ai →
Top platforms: Surge AI · Mercor · Micro1
Hours/week: 3–15 typical
Credentials required: Active license/cert · 3+ yrs practice
Application time: 1–3 hrs · credential check + interview
Time to first paycheck: 14–45 days

Multilingual Annotator

Translation, evaluation, and cultural QA

$22–$65/hr

What it is. Native or near-native speakers of a language other than English evaluate AI responses for accuracy, fluency, and cultural appropriateness. The work includes translation, idiom verification, and cultural-context flagging.

What you actually do. Read AI-generated text in your language. Score for grammatical correctness, natural fluency, cultural appropriateness, and factual accuracy. Sometimes translate or transcribe; sometimes write reference responses.

What you need to know. Native or near-native fluency in the target language plus working English. Some platforms require formal translation training; most don't. Lower-resource languages (Yoruba, Tamil, Vietnamese, Thai), where qualified annotators are harder to find, command higher rates.

How to get hired. Scale AI has the largest volume across most languages. Surge AI pays a premium for CJK (Chinese, Japanese, Korean). Outlier pays the most for low-resource languages. Toloka is the most accessible entry point.

Find multilingual jobs on joblet.ai →
Top platforms: Scale AI · Outlier · Surge AI · Toloka
Hours/week: 10–30 typical
Skills required: Native-level target language + working English
Application time: 30–90 min · language fluency test
Time to first paycheck: 5–14 days

Senior Software Engineer · Contract

FTE-equivalent contract engineering

$100–$220/hr

What it is. This isn't AI training work. These are full-fledged contract engineering roles, often hands-on work building production systems for AI-first companies, training-infrastructure teams, or research labs that need senior engineers without committing to a full-time hire.

What you actually do. Real engineering. Ship features, build infrastructure, debug production issues. Engagements are typically 3–6 months at 20–40 hrs/week. You're embedded with a team via Slack/Linear, attend standups, do code review.

What you need to know. 5–8+ years of professional engineering experience. Modern stack proficiency. Strong system design. The bar is closer to a Staff Engineer interview at a Series B startup than to a typical contractor screen.

How to get hired. Turing has the most volume and the most rigorous screening (think 3-stage interview). Mercor pays a premium and has shorter engagements. Micro1 specializes in ML engineering contracts where pay tops $200/hr for hands-on training infra work.

Find senior engineer contracts on joblet.ai →
Top platforms: Turing · Mercor · Micro1
Hours/week: 20–40 typical
Experience required: 5–8+ yrs · production engineering
Application time: 3–6 hrs · multi-stage interview
Time to first paycheck: 21–45 days

AI Research Contractor

PhD-level research contributions

$90–$180/hr

What it is. Contribute to AI research projects on a contract basis — paper reviews, dataset curation, evaluation methodology, ablation studies, sometimes co-authorship. Adjacent to traditional research-engineer roles but contract-shaped.

What you actually do. It varies widely by project. You might review 20 papers and synthesize a survey, design evaluation criteria for a new benchmark, run ablation experiments on a published model, or help draft a paper. Engagements are typically 4–12 weeks.

What you need to know. PhD (in progress or completed) in ML, CS, statistics, or a related field. A publication history at top-tier venues (NeurIPS, ICML, ACL, CVPR, ICLR) makes you 2–3x more attractive. Strong written communication for paper-shaped work.

How to get hired. Mercor and Turing are the two most active. Mercor screens for publications; Turing screens for hands-on experimentation experience. Both pay similarly at the senior tier; Turing tends to have a higher volume of work.

Find AI research contracts on joblet.ai →
Top platforms: Mercor · Turing
Hours/week: 10–25 typical
Education required: PhD (in-progress or completed) + publications
Application time: 2–4 hrs · CV + interview
Time to first paycheck: 21–45 days

Creative Writing Trainer

Fiction, marketing, long-form narrative

$28–$75/hr

What it is. Train language models to write better creative content. Evaluate AI-generated stories, marketing copy, scripts, or long-form essays. Sometimes write reference pieces yourself. The output trains models like ChatGPT and Claude on what good prose actually looks like.

What you actually do. Read AI-generated creative work. Score for craft elements: voice, pacing, character, plot consistency, originality, tone. Either fix the work in place or write a corrected reference. Long-form work (10K+ words) pays a premium.

What you need to know. Demonstrable writing portfolio. Published fiction or an MFA helps. Marketing-copy work cares about portfolio over credentials. Long-form work requires actual stamina to sustain a voice and structure across many thousands of words.

How to get hired. Outlier has the largest volume. Surge AI pays best for long-form (and screens hardest). Scale AI is most accessible for marketing-focused creative work. Portfolio review is universal — be ready to share 2–3 polished pieces.

Find creative writing jobs on joblet.ai →
Top platforms: Outlier · Scale AI · Surge AI
Hours/week: 10–20 typical
Skills required: Portfolio · published work or MFA preferred
Application time: 1–2 hrs · portfolio review + writing sample
Time to first paycheck: 10–21 days

Ready to find your role?

All open roles across every platform live on joblet.ai.
