
A curated list of Diffusion Model in RL resources (continually updated)


opendilab/awesome-diffusion-model-in-rl


Awesome Diffusion Model in RL


This is a collection of research papers on diffusion models in RL. The repository will be continuously updated to track the frontier of diffusion RL.

Welcome to follow and star!


Overview of Diffusion Model in RL

Diffusion models were introduced to RL by “Planning with Diffusion for Flexible Behavior Synthesis” by Janner, Michael, et al. The paper casts trajectory optimization as a diffusion probabilistic model that plans by iteratively refining trajectories.
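
To make "plans by iteratively refining trajectories" concrete, here is a minimal toy sketch, not the paper's implementation. It assumes a 1D state, and `toy_denoiser` is a hypothetical hand-written smoothing rule standing in for a learned denoising network. The function names, the noise scale, and the step count are all illustrative assumptions. The known start and goal states are re-imposed at every step, which is one simple way to condition the plan.

```python
import random

def toy_denoiser(traj):
    """Hypothetical stand-in for a learned denoising network.

    Pulls each interior state toward the mean of its neighbours,
    which acts as a simple smoothness prior over trajectories.
    """
    out = list(traj)
    for i in range(1, len(traj) - 1):
        out[i] = 0.5 * (traj[i - 1] + traj[i + 1])
    return out

def plan(start, goal, horizon=16, steps=200, seed=0):
    """Refine a whole trajectory from noise instead of rolling it out step by step."""
    rng = random.Random(seed)
    # Begin from pure Gaussian noise over every state in the trajectory.
    traj = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
    for k in range(steps, 0, -1):
        # Conditioning: overwrite the known states before each refinement step.
        traj[0], traj[-1] = start, goal
        denoised = toy_denoiser(traj)
        # Injected noise shrinks to zero as refinement proceeds.
        sigma = 0.1 * (k - 1) / steps
        traj = [x + sigma * rng.gauss(0.0, 1.0) for x in denoised]
    traj[0], traj[-1] = start, goal
    return traj

if __name__ == "__main__":
    print([round(x, 2) for x in plan(0.0, 1.0)])
```

All states in the plan are updated jointly at every step, so the trajectory is never built by chaining one-step predictions.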


A second approach uses the diffusion model as the policy itself: "Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning" by Wang, Z., et al. applies diffusion models to policy optimization in offline RL. Specifically, Diffusion-QL represents the policy as a conditional diffusion model with the state as the condition.
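
As a rough illustration of a policy that is "a conditional diffusion model with states as the condition", the sketch below samples a scalar action by running a standard DDPM-style reverse process in which the noise predictor receives the state as an extra input. It is a toy under stated assumptions, not the paper's code. `toy_eps_model` is a hypothetical closed-form stand-in for a trained network that behaves as if the desired action for state `s` were `-s`, and the schedule values are illustrative. The training objective, which the paper combines with Q-learning, is not shown.

```python
import math
import random

T = 50  # number of diffusion steps (illustrative)
# Linear noise schedule; the endpoints are assumptions for this toy.
BETAS = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
_prod = 1.0
for _a in ALPHAS:
    _prod *= _a
    ALPHA_BARS.append(_prod)

def toy_eps_model(action, state, t):
    """Hypothetical noise predictor eps(a_t, s, t), conditioned on the state.

    Returns the exact noise for the case where the clean action is -state.
    """
    target = -state
    return (action - math.sqrt(ALPHA_BARS[t]) * target) / math.sqrt(1.0 - ALPHA_BARS[t])

def sample_action(state, seed=0):
    """Draw an action by denoising Gaussian noise, conditioned on `state`."""
    rng = random.Random(seed)
    action = rng.gauss(0.0, 1.0)
    for t in reversed(range(T)):
        eps = toy_eps_model(action, state, t)
        # DDPM posterior mean computed from the predicted noise.
        mean = (action - BETAS[t] / math.sqrt(1.0 - ALPHA_BARS[t]) * eps) / math.sqrt(ALPHAS[t])
        # No noise is added at the final step.
        noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0
        action = mean + math.sqrt(BETAS[t]) * noise
    return action

if __name__ == "__main__":
    print(sample_action(0.5))
```

Changing the state changes what the reverse process converges to, which is what makes the diffusion model usable as a state-conditioned policy.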


Advantages

  1. Bypass the need for bootstrapping in long-term credit assignment.
  2. Avoid undesirable short-sighted behaviors caused by discounting future rewards.
  3. Benefit from diffusion models already widely used in language and vision, which are easy to scale and adapt to multi-modal data.

Papers

format:
- [title](paper link) [links]
  - author1, author2, and author3...
  - publisher
  - key 
  - code 
  - experiment environment

Arxiv

NeurIPS 2024

  • Adversarial Environment Design via Regret-Guided Diffusion Models

    • Hojun Chung, Junseo Lee, Minsoo Kim, Dohyeong Kim, Songhwai Oh
    • Key: Reinforcement Learning, Unsupervised Environment Design, Diffusion Models
    • ExpEnv: Minigrid, Partially Observable Maze Navigation, 2D Bipedal Locomotion
  • Graph Diffusion Policy Optimization

    • Yijing Liu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Wei Chen
    • Key: Graph Generation, Diffusion Models, Reinforcement Learning
    • ExpEnv: Drug Design, Graph Generation Tasks
    • Code: official
  • PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference

    • Kendong Liu, Zhiyu Zhu, Chuanhao Li, Hui Liu, Huanqiang Zeng, Junhui Hou
    • Key: Image Inpainting, Diffusion Models, Reinforcement Learning, Human Preference Alignment
    • ExpEnv: Image inpainting comparison, image extension, 3D reconstruction
    • Code: official
  • Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

    • Sangwoong Yoon, Himchan Hwang, Dohyun Kwon, Yung-Kyun Noh, Frank C. Park
    • Key: Diffusion Models, Maximum Entropy Inverse Reinforcement Learning (IRL), Energy-Based Models (EBM), Anomaly Detection
    • ExpEnv: Empirical studies on generative modeling and anomaly detection tasks
  • Text-Aware Diffusion for Policy Learning

    • Calvin Luo, Mandy He, Zilai Zeng, Chen Sun
    • Key: Reinforcement Learning, Text-Conditioned Diffusion, Zero-Shot Reward, Policy Learning
    • ExpEnv: Humanoid, Dog environments, Meta-World
  • Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient

    • Zechu Li, Rickmer Krohn, Tao Chen, Anurag Ajay, Pulkit Agrawal, Georgia Chalvatzaki
    • Key: Reinforcement Learning, Multimodal Learning, Diffusion Models, Actor-Critic Algorithm
    • ExpEnv: High-dimensional continuous control tasks, Maze navigation with unseen obstacles
  • Model-Based Diffusion for Trajectory Optimization

    • Chaoyi Pan, Zeji Yi, Guanya Shi, Guannan Qu
    • Key: Model-Based Diffusion, Trajectory Optimization, Diffusion Models
    • ExpEnv: Contact-rich Tasks, High-dimensional Humanoids
  • Diffusion for World Modeling: Visual Details Matter in Atari

    • Eloi Alonso, Adam Jelley, Vincent Micheli, Anssi Kanervisto, Amos Storkey, Tim Pearce, François Fleuret
    • Key: Reinforcement Learning, Diffusion Models, World Modeling, Visual Details
    • ExpEnv: Atari 100k Benchmark, Counter-Strike: Global Offensive
  • MADiff: Offline Multi-agent Learning with Diffusion Models

    • Zhengbang Zhu, Minghuan Liu, Liyuan Mao, Bingyi Kang, Minkai Xu, Yong Yu, Stefano Ermon, Weinan Zhang
    • Key: Offline Reinforcement Learning, Multi-agent Learning, Diffusion Models, Coordination
    • ExpEnv: Multi-agent Learning Tasks
    • Code: official
  • Amortizing Intractable Inference in Diffusion Models for Vision, Language, and Control

    • Siddarth Venkatraman, Moksh Jain, Luca Scimeca, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yoshua Bengio, Glen Berseth, Nikolay Malkin
    • Key: Diffusion Models, Amortized Inference, Reinforcement Learning, Vision, Language, Multimodal Data
    • ExpEnv: Vision (Classifier Guidance), Language (Infilling under Discrete Diffusion LLM), Multimodal (Text-to-Image Generation), Offline RL Benchmarks
  • Diffusion Actor-Critic with Entropy Regulator

    • Yinuo Wang, Likun Wang, Yuxuan Jiang, Wenjun Zou, Tong Liu, Xujie Song, Wenxuan Wang, Liming Xiao, Jiang Wu, Jingliang Duan, Shengbo Eben Li
    • Key: Reinforcement Learning, Diffusion Models, Entropy Regulation, Multimodal Policy
    • ExpEnv: MuJoCo Benchmarks, Multimodal Tasks
  • Diffusion Spectral Representation for Reinforcement Learning

    • Dmitry Shribak, Chen-Xiao Gao, Yitong Li, Chenjun Xiao, Bo Dai
    • Key: Reinforcement Learning, Diffusion Models, Representation Learning, Markov Decision Processes (MDP), Partially Observable Markov Decision Processes (POMDP)
    • ExpEnv: Various RL Benchmarks (Fully and Partially Observable Settings)

ICML 2024

CVPR 2024

ICLR 2024

NeurIPS 2023

ICML 2023

ICLR 2023

ICRA 2023

NeurIPS 2022

ICML 2022

Codebase

  • GenerativeRL

    • Jinouwen Zhang, Rongkun Xue, Yazhe Niu, Yun Chen, Xinyan Chen, Ruiheng Wang, Yu Liu
    • Publisher: GitHub
    • Key: Reinforcement Learning, Generative Model, Diffusion Model, Flow Model
    • Code: official
  • CleanDiffuser

    • Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yi Ma, Pengyi Li, Yan Zheng
    • Publisher: GitHub
    • Key: Reinforcement Learning, Generative Model, Diffusion Model, Flow Model
    • Code: official

Contributing

We welcome contributions that make this repo better. If you are interested in contributing, please refer to HERE for contribution instructions.

License

Awesome Diffusion Model in RL is released under the Apache 2.0 license.