rlxf.ai is a blog and research site maintained by Shixiang Shane Gu, a pondering AI researcher. Shane has pioneered work across reinforcement learning, generative modeling, and large language model reasoning — including co-inventing Gumbel-Softmax, developing Zero-Shot Chain-of-Thought prompting ("Let's think step by step"), and building sample-efficient deep RL algorithms. This site collects his thoughts, writing, and research updates.

