[1] Zhu, Zhuangdi, et al. "Off-policy imitation learning from observations." NeurIPS, 2020.
[2] Torabi, Faraz, Garrett Warnell, and Peter Stone. "Behavioral cloning from observation." IJCAI, 2018.
[3] Kim, Geon-Hyeong, et al. "DemoDICE: Offline imitation learning with supplementary imperfect demonstrations." ICLR, 2022.
[4] Garg, Divyansh, et al. "IQ-Learn: Inverse soft-Q Learning for Imitation." NeurIPS, 2021.
[5] Eysenbach, Ben, Sergey Levine, and Russ R. Salakhutdinov. "Replacing rewards with examples: Example-based policy search via recursive classification." NeurIPS, 2021.
[6] Kim, Geon-Hyeong, et al. "LobsDICE: Offline Learning from Observation via Stationary Distribution Correction Estimation." NeurIPS, 2022.