Multi-agent constrained policy optimisation S Gu, JG Kuba, M Wen, R Chen, Z Wang, Z Tian, J Wang, A Knoll, Y Yang arXiv preprint arXiv:2110.02793, 2021 | 44 | 2021 |
Sauté RL: Almost surely safe reinforcement learning using state augmentation A Sootla, AI Cowen-Rivers, T Jafferjee, Z Wang, D Mguni, J Wang, ... ICML 2022, 2022 | 41 | 2022 |
ChessGPT: Bridging Policy Learning and Language Modeling X Feng, Y Luo, Z Wang, H Tang, M Yang, K Shao, D Mguni, Y Du, J Wang NeurIPS 2023, 2023 | 12 | 2023 |
MACCA: Offline Multi-agent Reinforcement Learning with Causal Credit Assignment Z Wang, Y Du, Y Zhang, M Fang, B Huang arXiv preprint arXiv:2312.03644, 2023 | 9 | 2023 |
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach Y Zhang, Y Du, B Huang, Z Wang, J Wang, M Fang, M Pechenizkiy NeurIPS 2023, 2023 | 5* | 2023 |
DESTA: A framework for safe reinforcement learning with Markov games of intervention D Mguni, U Islam, Y Sun, X Zhang, J Jennings, A Sootla, C Yu, Z Wang, ... arXiv preprint arXiv:2110.14468, 2021 | 5 | 2021 |
Natural Language Reinforcement Learning X Feng, Z Wan, M Yang, Z Wang, GA Koushiks, Y Du, Y Wen, J Wang arXiv preprint arXiv:2402.07157, 2024 | | 2024 |
Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models X Lou, J Zhang, Z Wang, K Huang, Y Du AAMAS 2024, 2024 | | 2024 |