aiagentrank.io
Maven · Curated · Editor's pick

Mastering LLMs: Evaluation (Hamel Husain & Shreya Shankar)

Reviewed by AI Agent Rank editors · Last verified 2026-05-24

Our take

The cohort that defined the modern AI evaluation playbook. Hamel and Shreya teach you how to build eval sets, run experiments, ship dashboards, and avoid the LLM-as-judge traps that fool most teams. It costs $1,995 for 4 weeks, with live sessions and a capstone. Worth every dollar if you're shipping production LLM features, because the cost of getting evaluation wrong is bigger than the tuition. Seats sell out within hours of every run.

About the instructors

Hamel Husain & Shreya Shankar
Independent AI consultants; ex-Airbnb, GitHub

Hamel built ML evaluation systems at GitHub and Airbnb. Shreya researches LLM evaluation at UC Berkeley. Their course is the most-cited on production AI evaluation in 2026.

Pros

  • Defined the modern AI evals playbook; most-cited course in the space
  • Live cohort + capstone: a meaningful commitment device
  • Hamel and Shreya are both practicing experts, not generalists

Cons

  • $1,995 is steep; you need to ship a real eval project to recoup the cost
  • Sells out fast; runs only 4-6 times/year
  • Heavy pre-reqs: Python + production LLM shipping experience

Best for

  • Engineers shipping LLM features in production who lack eval infrastructure
  • Senior IC or staff engineers tasked with "make our AI more reliable"

Not ideal for

  • Anyone not actively building production LLM systems
  • Beginners: the pre-reqs are real

Ready to enroll?

$1,995 one-time on Maven · 4 weeks (~6-8 h/week of live sessions plus coursework)

