AUROC and AUPRC: What They Actually Tell You About Your Ranking Model
Five ways to understand AUROC, why AUPRC matters more for imbalanced recsys tasks, and practical guidance for interpreting these metrics in production.