Complete Beginner's Course on AI Evaluations in 50 Minutes (2025) | Aman Khan

Published at: 23 Dec 2025

Today, I want to share a new episode with Aman Khan.

The best way to learn about AI evaluations is to watch two PMs build them live from scratch. In our new episode, Aman and I walk through creating evals for an AI customer support agent, from labeling a golden dataset to aligning LLM judges. This is the complete beginner's AI eval course you've been waiting for.

Aman and I talked about:
(00:00) What are AI evals and how to get good at them
(02:52) The 4 types of AI evaluations everyone should know
(06:08) Live demo: Building evals for a customer support agent
(10:29) Using Anthropic's console to generate great prompts
(15:13) Creating the evaluation criteria
(17:40) Adding human labels to the golden dataset
(31:05) Scaling evals with LLM-judge prompts
(38:21) How to align LLM judges with human judgment

Get the takeaways: https://creatoreconomy.so/p/complete-beginner-course-on-ai-evaluations-aman-khan

Where to find Aman:
LinkedIn: https://www.linkedin.com/in/amanberkeley/
Website: https://arize.com/

📌 Subscribe to this channel – more interviews coming soon!