Generative Slate Recommendation with Reinforcement Learning

Romain Deffayet, Thibaut Thonet, Jean-Michel Renders, Maarten de Rijke
Published at WSDM'23
Paper | Code


Learning to recommend slates with RL is hard because the action space is combinatorial: the agent has to choose a whole list of items at once. Instead of making assumptions about user behavior, we take actions in the latent space of a VAE, whose decoder then turns each latent action into a slate.
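To make the idea concrete, here is a minimal sketch of what acting in a latent space can look like. It is not the paper's implementation: the catalog size, slate size, latent dimension, the helper names `num_ordered_slates` and `make_toy_decoder`, and the random linear decoder are all assumptions chosen for illustration. A trained VAE decoder would replace the random weights.

```python
import random

# Hypothetical sizes, chosen only for illustration.
NUM_ITEMS = 1000   # catalog size
SLATE_SIZE = 10    # number of slots in a slate
LATENT_DIM = 16    # dimension of the latent action


def num_ordered_slates(num_items, slate_size):
    """Count ordered slates without repeated items: n * (n-1) * ... * (n-k+1)."""
    count = 1
    for i in range(slate_size):
        count *= num_items - i
    return count


def make_toy_decoder(num_items, slate_size, latent_dim, seed=0):
    """Build a stand-in decoder: one random linear scorer per slot.

    This only mimics the interface of a trained decoder (latent vector in,
    slate out). The weights are random, not learned.
    """
    rng = random.Random(seed)
    weights = [
        [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(num_items)]
        for _ in range(slate_size)
    ]

    def decode(z):
        slate = []
        for slot_weights in weights:
            # Score every item for this slot, then keep the best-scoring one.
            scores = [sum(w * x for w, x in zip(item_w, z)) for item_w in slot_weights]
            slate.append(max(range(num_items), key=scores.__getitem__))
        # Note: per-slot argmax does not prevent the same item appearing twice.
        return slate

    return decode


decode = make_toy_decoder(NUM_ITEMS, SLATE_SIZE, LATENT_DIM)

# The RL agent's action is a continuous vector, not a slate index.
action_rng = random.Random(1)
z = [action_rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
slate = decode(z)

print("discrete slates to choose from:", num_ordered_slates(NUM_ITEMS, SLATE_SIZE))
print("latent action dimension:", LATENT_DIM)
print("decoded slate:", slate)
```

Under these assumed sizes, the agent outputs a 16-dimensional continuous vector instead of selecting one slate among roughly 10^30 ordered candidates, which is what makes standard continuous-control RL algorithms applicable.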