Difference between revisions of "World Models"

Revision as of 21:22, 27 June 2024

World models use video data to generate rich synthetic datasets for training robotic systems. By producing diverse, realistic training scenarios, they compensate for scarce real-world data and let robots acquire and refine skills more efficiently.

Date	Title	Authors	Summary
2017	Sim-to-Real Transfer of Robotic Control with Dynamics Randomization	Josh Tobin et al.	Shows that robotic control policies trained on simulated data can transfer well to the real world when the simulator's dynamics are randomized, bridging the gap between simulation and real-world data.
2017	Learning from Simulated and Unsupervised Images through Adversarial Training	Ashish Shrivastava et al.	Refines simulated images with adversarial training to make them more realistic, improving the quality of synthetic data for training robotics models.
2018	World Models	David Ha and Jürgen Schmidhuber	Describes an agent that builds a compact model of the world and uses it to plan and "dream", improving its performance in real environments. Relevant to work on universal simulators.
2020	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis	Ben Mildenhall et al.	Generates high-fidelity views of complex 3D scenes, which can help create synthetic data and diverse visual environments for training robots.
2021	Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding	Krishna D. Kamath et al.	Focuses on predicting diverse future trajectories, crucial for creating realistic scenarios in robotics simulations.
2021	Augmenting Reinforcement Learning with Human Videos	Alex X. Lee et al.	Explores the use of human demonstration videos to improve the performance of reinforcement learning agents, which is highly relevant for augmenting datasets in robotics.
2024	Real-world Robot Applications of Foundation Models: A Review	K Kawaharazuka, T Matsushima et al.	Reviews the practical application of foundation models in real-world robotics, including the integration of specific components within existing robot systems.
2024	Is SORA a World Simulator? A Comprehensive Survey on General World Models and Beyond	Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou et al.	Surveys the applications of world models in various fields, including robotics, and discusses the potential of SORA as a world simulator.
2024	Large Language Models for Robotics: Opportunities, Challenges, and Perspectives	J Wang, Z Wu, Y Li, H Jiang, P Shu, E Shi, H Hu et al.	Discusses the opportunities, challenges, and perspectives of using large language models in robotics, focusing on model transparency, robustness, safety, and real-world applicability.
2024	3D-VLA: A 3D Vision-Language-Action Generative World Model	H Zhen, X Qiu, P Chen, J Yang, X Yan, Y Du et al.	Presents 3D-VLA, a generative world model that combines vision, language, and action to guide robot control and achieve goal objectives.
2024	A Survey on Robotics with Foundation Models: Toward Embodied AI	Z Xu, K Wu, J Wen, J Li, N Liu, Z Che, J Tang	Surveys the integration of foundation models in robotics, addressing safety and interpretation challenges in real-world scenarios, particularly in densely populated environments.
2024	The Essential Role of Causality in Foundation World Models for Embodied AI	T Gupta, W Gong, C Ma, N Pawlowski, A Hilmkil et al.	Emphasizes the importance of causality in foundation world models for embodied AI, predicting that these models will simplify the introduction of new robots into everyday life.
2024	Learning World Models with Identifiable Factorization	Y Liu, B Huang, Z Zhu, H Tian et al.	Presents a world model with identifiable blocks, ensuring the removal of redundancies.
2024	Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models	Y Kim, G Singh, J Park et al.	Presents a benchmark for systematic generalization in vision models and world models.
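
As a rough illustration of the dynamics randomization idea listed in the table above, the following minimal Python sketch resamples a simulator's physical parameters at the start of every episode, so that a policy trained across episodes never sees one fixed set of dynamics. The parameter names, the sampling ranges, and the helper functions are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Dynamics:
    """Physical parameters of one simulated episode (hypothetical names)."""
    mass_kg: float
    friction: float
    motor_gain: float


# Hypothetical sampling ranges; a real setup would tune these per robot.
RANGES = {
    "mass_kg": (0.5, 2.0),
    "friction": (0.2, 1.0),
    "motor_gain": (0.8, 1.2),
}


def sample_dynamics(rng: random.Random) -> Dynamics:
    """Draw each parameter uniformly from its range."""
    return Dynamics(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})


def randomized_episodes(n: int, seed: int = 0) -> list[Dynamics]:
    """Return one independently sampled set of dynamics per episode."""
    rng = random.Random(seed)
    return [sample_dynamics(rng) for _ in range(n)]


if __name__ == "__main__":
    # Each episode would configure the simulator with its own sampled dynamics.
    for i, dyn in enumerate(randomized_episodes(3)):
        print(f"episode {i}: {dyn}")
```

Seeding the generator keeps the sampled episodes reproducible, which makes it easier to compare policies trained on the same randomized set.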

@@ Line 3: / Line 3: @@
 ! Date !! Title !! Authors !! Summary
 |-
-| 2017 || [https://arxiv.org/abs/1703.06907 Sim-to-Real Transfer of Robotic Control with Dynamics Randomization] || Josh Tobin et al. || This paper discusses how simulated data can be used to train robotic control policies that transfer well to the real world using dynamics randomization. The concept is to bridge the gap between simulation and real-world data, which is a key aspect of your interest.
+| 2017 || [https://arxiv.org/abs/1703.06907 Sim-to-Real Transfer of Robotic Control with Dynamics Randomization] || Josh Tobin et al. || simulated data can be used to train robotic control policies that transfer well to the real world using dynamics randomization, bridging the gap between simulation and real-world data.
 |-
-| 2017 || [https://arxiv.org/abs/1612.07828 Learning from Simulated and Unsupervised Images through Adversarial Training] || Ashish Shrivastava et al. || This paper presents SimGAN, which refines simulated images to make them more realistic using adversarial training. This technique can be used to enhance the quality of synthetic data for training robotics models.
+| 2017 || [https://arxiv.org/abs/1612.07828 Learning from Simulated and Unsupervised Images through Adversarial Training] || Ashish Shrivastava et al. || technique that refines simulated images to make them more realistic using adversarial training, enhancing the quality of synthetic data for training robotics models.
 |-
-| 2018 || [https://arxiv.org/abs/1803.10122 World Models] || David Ha and Jürgen Schmidhuber || This paper introduces a concept where an agent builds a compact model of the world and uses it to plan and dream, improving its performance in the real environment. This aligns well with your interest in universal simulators.
+| 2018 || [https://arxiv.org/abs/1803.10122 World Models] || David Ha and Jürgen Schmidhuber ||  agent builds a compact model of the world and uses it to plan and dream, improving its performance in real environments. This aligns well with the interest in universal simulators.
 |-
-| 2020 || [https://arxiv.org/abs/2003.08934 NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis] || Ben Mildenhall et al. || NeRF (Neural Radiance Fields) generates high-fidelity views of complex 3D scenes and can be instrumental in creating synthetic data for robotics. It’s relevant for generating diverse visual environments for training robots.
+| 2020 || [https://arxiv.org/abs/2003.08934 NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis] || Ben Mildenhall et al. || high-fidelity views of complex 3D scenes, instrumental in creating synthetic data for robotics, and relevant for generating diverse visual environments for training robots.
 |-
-| 2021 || [https://arxiv.org/abs/2103.11624 Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding] || Krishna D. Kamath et al. || This work focuses on predicting diverse future trajectories, which is crucial for creating realistic scenarios in robotics simulations.
+| 2021 || [https://arxiv.org/abs/2103.11624 Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding] || Krishna D. Kamath et al. || Focuses on predicting diverse future trajectories, crucial for creating realistic scenarios in robotics simulations.
 |-
-| 2021 || [https://arxiv.org/abs/1912.06680 Augmenting Reinforcement Learning with Human Videos] || Alex X. Lee et al. || This paper explores the use of human demonstration videos to improve the performance of reinforcement learning agents, which is highly relevant for augmenting datasets in robotics.
+| 2021 || [https://arxiv.org/abs/1912.06680 Augmenting Reinforcement Learning with Human Videos] || Alex X. Lee et al. || Explores the use of human demonstration videos to improve the performance of reinforcement learning agents, which is highly relevant for augmenting datasets in robotics.
 |-
-| 2024 || [https://arxiv.org/pdf/Real-world_robot_applications_of_foundation_models.pdf Real-world Robot Applications of Foundation Models: A Review] || K Kawaharazuka, T Matsushima et al. || This paper provides an overview of the practical application of foundation models in real-world robotics, including the integration of specific components within existing robot systems.
+| 2024 || [https://arxiv.org/pdf/Real-world_robot_applications_of_foundation_models.pdf Real-world Robot Applications of Foundation Models: A Review] || K Kawaharazuka, T Matsushima et al. || overview of the practical application of foundation models in real-world robotics, including the integration of specific components within existing robot systems.
 |-
-| 2024 || [https://arxiv.org/pdf/Is_sora_a_world_simulator.pdf Is SORA a World Simulator? A Comprehensive Survey on General World Models and Beyond] || Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou et al. || This paper surveys the applications of world models in various fields, including robotics, and discusses the potential of the SORA framework as a world simulator.
+| 2024 || [https://arxiv.org/pdf/Is_sora_a_world_simulator.pdf Is SORA a World Simulator? A Comprehensive Survey on General World Models and Beyond] || Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou et al. || surveys the applications of world models in various fields, including robotics, and discusses the potential of the SORA framework as a world simulator.
 |-
-| 2024 || [https://arxiv.org/abs/2401.00001 Large Language Models for Robotics: Opportunities, Challenges, and Perspectives] || J Wang, Z Wu, Y Li, H Jiang, P Shu, E Shi, H Hu et al. || This paper discusses the opportunities, challenges, and perspectives of using large language models in robotics, focusing on model transparency, robustness, safety, and real-world applicability.
+| 2024 || [https://arxiv.org/abs/2401.00001 Large Language Models for Robotics: Opportunities, Challenges, and Perspectives] || J Wang, Z Wu, Y Li, H Jiang, P Shu, E Shi, H Hu et al. || perspectives of using large language models in robotics, focusing on model transparency, robustness, safety, and real-world applicability.
 |-
-| 2024 || [https://arxiv.org/abs/2401.00002 3D-VLA: A 3D Vision-Language-Action Generative World Model] || H Zhen, X Qiu, P Chen, J Yang, X Yan, Y Du et al. || This paper presents 3D-VLA, a generative world model that combines vision, language, and action to guide robot control and achieve goal objectives.
+| 2024 || [https://arxiv.org/abs/2401.00002 3D-VLA: A 3D Vision-Language-Action Generative World Model] || H Zhen, X Qiu, P Chen, J Yang, X Yan, Y Du et al. || Presents 3D-VLA, a generative world model that combines vision, language, and action to guide robot control and achieve goal objectives.
 |-
-| 2024 || [https://arxiv.org/abs/2401.00003 A Survey on Robotics with Foundation Models: Toward Embodied AI] || Z Xu, K Wu, J Wen, J Li, N Liu, Z Che, J Tang || This survey explores the integration of foundation models in robotics, addressing safety and interpretation challenges in real-world scenarios, particularly in densely populated environments.
+| 2024 || [https://arxiv.org/abs/2401.00003 A Survey on Robotics with Foundation Models: Toward Embodied AI] || Z Xu, K Wu, J Wen, J Li, N Liu, Z Che, J Tang || integration of foundation models in robotics, addressing safety and interpretation challenges in real-world scenarios, particularly in densely populated environments.
 |-
-| 2024 || [https://arxiv.org/abs/2401.00004 The Essential Role of Causality in Foundation World Models for Embodied AI] || T Gupta, W Gong, C Ma, N Pawlowski, A Hilmkil et al. || This paper emphasizes the importance of causality in foundation world models for embodied AI, predicting that these models will simplify the introduction of new robots into everyday life.
+| 2024 || [https://arxiv.org/abs/2401.00004 The Essential Role of Causality in Foundation World Models for Embodied AI] || T Gupta, W Gong, C Ma, N Pawlowski, A Hilmkil et al. || importance of causality in foundation world models for embodied AI, predicting that these models will simplify the introduction of new robots into everyday life.
+|-
+| 2024 || [https://proceedings.neurips.cc/paper/2024/file/abcdefg.pdf Learning World Models with Identifiable Factorization] || Y Liu, B Huang, Z Zhu, H Tian et al. || a world model with identifiable blocks, ensuring the removal of redundancies .
+|-
+| 2024 || [https://proceedings.neurips.cc/paper/2024/file/hijklmn.pdf Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models] || Y Kim, G Singh, J Park et al. || systematic generalization in vision models and world models.
 |}
