rlxf.ai is a blog and research site maintained by Shixiang Shane Gu, an AI researcher at Google DeepMind, currently working on Gemini and Omni Thinking. Previously he led the Multilinguality team for Gemini Post-Training (Gemini 1.5–2.5 production models). Shane is recognized for pioneering contributions across generative modeling, reinforcement learning, and LLM reasoning — including co-inventing Gumbel-Softmax, Zero-Shot Chain-of-Thought prompting ("Let's think step by step"), and showing that LLMs can self-improve. This site collects his thoughts, writing, and research updates.

Selected Publications

Generative Modeling

Categorical Reparameterization with Gumbel-Softmax

Eric Jang, Shixiang Gu, Ben Poole

ICLR 2017

MuProp: Unbiased Backpropagation for Stochastic Neural Networks

Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

ICLR 2016

Robotics

Deep RL for Robotic Manipulation with Asynchronous Off-Policy Updates

Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine

ICRA 2017

Blocks Assemble! Learning to Assemble with Large-Scale Structured RL

Seyed Kamyar Seyed Ghasemipour, Daniel Freeman, Byron David, Shixiang Shane Gu, Satoshi Kataoka, Igor Mordatch

ICML 2022

RL

Continuous Deep Q-Learning with Model-based Acceleration

Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

ICML 2016

A Divergence Minimization Perspective on Imitation Learning Methods

Best Paper Award

Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu

CoRL 2019

LLM

Large Language Models are Zero-Shot Reasoners

Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa

NeurIPS 2022

Large Language Models Can Self-Improve

Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han

EMNLP 2023

World Model

Video Models are Zero-Shot Learners and Reasoners

Thaddäus Wiedemer, Yuxuan Li, Paul Vicol, Shixiang Shane Gu, Nick Matarese, Kevin Swersky, Been Kim, Priyank Jaini, Robert Geirhos

arXiv 2025

Mind's Eye: Grounded Language Model Reasoning through Simulation

Ruibo Liu, Jason Wei, Shixiang Shane Gu, Te-Yen Wu, Soroush Vosoughi, Claire Cui, Denny Zhou, Andrew M. Dai

ICLR 2023
