RL for Recommender Systems, Part 1: Foundations
Why reinforcement learning matters for recommendations, a refresher on MDP fundamentals, and understanding the spectrum from bandits to full RL.
Why reinforcement learning matters for recommendations, a refresher on MDP fundamentals, and understanding the spectrum from bandits to full RL.
A deep dive into YouTube's REINFORCE system and Google's Actor-Critic extension, including the critical Top-K correction and off-policy learning.
Deep dives into Netflix's slate evaluation, Kuaishou's retention optimization, and Spotify's cold-start solution.
Pinterest's latest RL systems, a decision framework for choosing your approach, and a practical implementation roadmap.