Difference between revisions of "World Models"

From Humanoid Robots Wiki
Line 3:
  ! Date !! Title !! Authors !! Summary
  |-
− | 2017 || [https://arxiv.org/abs/1612.07828 Learning from Simulated and Unsupervised Images through Adversarial Training] || Ashish Shrivastava et al. || technique that refines simulated images to make them more realistic using adversarial training, enhancing the quality of synthetic data for training robotics models.
+ | data-sort-value="2024-01-01" | 2024 || [https://arxiv.org/abs/2402.05741 Real-world Robot Applications of Foundation Models: A Review] || K Kawaharazuka, T Matsushima et al. || overview of the practical application of foundation models in real-world robotics, including the integration of specific components within existing robot systems.
  |-
− | 2018 || [https://arxiv.org/abs/1803.10122 World Models] || David Ha and Jürgen Schmidhuber || agent builds a compact model of the world and uses it to plan and dream, improving its performance in real environments. This aligns well with the interest in universal simulators.
+ | data-sort-value="2024-01-02" | 2024 || [https://arxiv.org/abs/2405.03520 Is SORA a World Simulator? A Comprehensive Survey on General World Models and Beyond] || Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou et al. || surveys the applications of world models in various fields, including robotics, and discusses the potential of the SORA framework as a world simulator.
  |-
− | 2020 || [https://arxiv.org/abs/2003.08934 NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis] || Ben Mildenhall et al. || high-fidelity views of complex 3D scenes, instrumental in creating synthetic data for robotics, and relevant for generating diverse visual environments for training robots.
+ | data-sort-value="2024-01-03" | 2024 || [https://arxiv.org/abs/2403.09631 Large Language Models for Robotics: Opportunities, Challenges, and Perspectives] || J Wang, Z Wu, Y Li, H Jiang, P Shu, E Shi, H Hu et al. || perspectives of using large language models in robotics, focusing on model transparency, robustness, safety, and real-world applicability.
  |-
− | 2024 || [https://arxiv.org/abs/2402.05741 Real-world Robot Applications of Foundation Models: A Review] || K Kawaharazuka, T Matsushima et al. || overview of the practical application of foundation models in real-world robotics, including the integration of specific components within existing robot systems.
+ | data-sort-value="2024-01-04" | 2024 || [https://arxiv.org/abs/2403.09631 3D-VLA: A 3D Vision-Language-Action Generative World Model] || H Zhen, X Qiu, P Chen, J Yang, X Yan, Y Du et al. || Presents 3D-VLA, a generative world model that combines vision, language, and action to guide robot control and achieve goal objectives.
  |-
− | 2024 || [https://arxiv.org/abs/2405.03520 Is SORA a World Simulator? A Comprehensive Survey on General World Models and Beyond] || Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou et al. || surveys the applications of world models in various fields, including robotics, and discusses the potential of the SORA framework as a world simulator.
+ | data-sort-value="2024-01-05" | 2024 || [https://arxiv.org/abs/2402.02385 A Survey on Robotics with Foundation Models: Toward Embodied AI] || Z Xu, K Wu, J Wen, J Li, N Liu, Z Che, J Tang || integration of foundation models in robotics, addressing safety and interpretation challenges in real-world scenarios, particularly in densely populated environments.
  |-
− | 2024 || [https://arxiv.org/abs/2403.09631 Large Language Models for Robotics: Opportunities, Challenges, and Perspectives] || J Wang, Z Wu, Y Li, H Jiang, P Shu, E Shi, H Hu et al. || perspectives of using large language models in robotics, focusing on model transparency, robustness, safety, and real-world applicability.
+ | data-sort-value="2024-01-06" | 2024 || [https://arxiv.org/abs/2402.06665 The Essential Role of Causality in Foundation World Models for Embodied AI] || T Gupta, W Gong, C Ma, N Pawlowski, A Hilmkil et al. || importance of causality in foundation world models for embodied AI, predicting that these models will simplify the introduction of new robots into everyday life.
  |-
− | 2024 || [https://arxiv.org/abs/2403.09631 3D-VLA: A 3D Vision-Language-Action Generative World Model] || H Zhen, X Qiu, P Chen, J Yang, X Yan, Y Du et al. || Presents 3D-VLA, a generative world model that combines vision, language, and action to guide robot control and achieve goal objectives.
+ | data-sort-value="2024-01-07" | 2024 || [https://arxiv.org/abs/2306.06561 Learning World Models with Identifiable Factorization] || Y Liu, B Huang, Z Zhu, H Tian et al. || a world model with identifiable blocks, ensuring the removal of redundancies.
  |-
− | 2024 || [https://arxiv.org/abs/2402.02385 A Survey on Robotics with Foundation Models: Toward Embodied AI] || Z Xu, K Wu, J Wen, J Li, N Liu, Z Che, J Tang || integration of foundation models in robotics, addressing safety and interpretation challenges in real-world scenarios, particularly in densely populated environments.
+ | data-sort-value="2024-01-08" | 2024 || [https://arxiv.org/abs/2311.09064 Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models] || Y Kim, G Singh, J Park et al. || systematic generalization in vision models and world models.
  |-
− | 2024 || [https://arxiv.org/abs/2402.06665 The Essential Role of Causality in Foundation World Models for Embodied AI] || T Gupta, W Gong, C Ma, N Pawlowski, A Hilmkil et al. || importance of causality in foundation world models for embodied AI, predicting that these models will simplify the introduction of new robots into everyday life.
+ | data-sort-value="2020-01-01" | 2020 || [https://arxiv.org/abs/2003.08934 NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis] || Ben Mildenhall et al. || high-fidelity views of complex 3D scenes, instrumental in creating synthetic data for robotics, and relevant for generating diverse visual environments for training robots.
  |-
− | 2024 || [https://arxiv.org/abs/2306.06561 Learning World Models with Identifiable Factorization] || Y Liu, B Huang, Z Zhu, H Tian et al. || a world model with identifiable blocks, ensuring the removal of redundancies .
+ | data-sort-value="2018-01-01" | 2018 || [https://arxiv.org/abs/1803.10122 World Models] || David Ha and Jürgen Schmidhuber || agent builds a compact model of the world and uses it to plan and dream, improving its performance in real environments. This aligns well with the interest in universal simulators.
  |-
− | 2024 || [https://arxiv.org/abs/2311.09064 Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models] || Y Kim, G Singh, J Park et al. || systematic generalization in vision models and world models.
+ | data-sort-value="2017-01-01" | 2017 || [https://arxiv.org/abs/1612.07828 Learning from Simulated and Unsupervised Images through Adversarial Training] || Ashish Shrivastava et al. || technique that refines simulated images to make them more realistic using adversarial training, enhancing the quality of synthetic data for training robotics models.
  |}

Revision as of 06:39, 28 June 2024

World models use video data to generate rich synthetic datasets for training robotic systems. By producing diverse, realistic training scenarios, they help compensate for the scarcity of real-world data, so robots can acquire and refine skills more efficiently.

{|
! Date !! Title !! Authors !! Summary
|-
| 2024 || Real-world Robot Applications of Foundation Models: A Review || K Kawaharazuka, T Matsushima et al. || overview of the practical application of foundation models in real-world robotics, including the integration of specific components within existing robot systems.
|-
| 2024 || Is SORA a World Simulator? A Comprehensive Survey on General World Models and Beyond || Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou et al. || surveys the applications of world models in various fields, including robotics, and discusses the potential of the SORA framework as a world simulator.
|-
| 2024 || Large Language Models for Robotics: Opportunities, Challenges, and Perspectives || J Wang, Z Wu, Y Li, H Jiang, P Shu, E Shi, H Hu et al. || perspectives of using large language models in robotics, focusing on model transparency, robustness, safety, and real-world applicability.
|-
| 2024 || 3D-VLA: A 3D Vision-Language-Action Generative World Model || H Zhen, X Qiu, P Chen, J Yang, X Yan, Y Du et al. || Presents 3D-VLA, a generative world model that combines vision, language, and action to guide robot control and achieve goal objectives.
|-
| 2024 || A Survey on Robotics with Foundation Models: Toward Embodied AI || Z Xu, K Wu, J Wen, J Li, N Liu, Z Che, J Tang || integration of foundation models in robotics, addressing safety and interpretation challenges in real-world scenarios, particularly in densely populated environments.
|-
| 2024 || The Essential Role of Causality in Foundation World Models for Embodied AI || T Gupta, W Gong, C Ma, N Pawlowski, A Hilmkil et al. || importance of causality in foundation world models for embodied AI, predicting that these models will simplify the introduction of new robots into everyday life.
|-
| 2024 || Learning World Models with Identifiable Factorization || Y Liu, B Huang, Z Zhu, H Tian et al. || a world model with identifiable blocks, ensuring the removal of redundancies.
|-
| 2024 || Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models || Y Kim, G Singh, J Park et al. || systematic generalization in vision models and world models.
|-
| 2020 || NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis || Ben Mildenhall et al. || high-fidelity views of complex 3D scenes, instrumental in creating synthetic data for robotics, and relevant for generating diverse visual environments for training robots.
|-
| 2018 || World Models || David Ha and Jürgen Schmidhuber || agent builds a compact model of the world and uses it to plan and dream, improving its performance in real environments. This aligns well with the interest in universal simulators.
|-
| 2017 || Learning from Simulated and Unsupervised Images through Adversarial Training || Ashish Shrivastava et al. || technique that refines simulated images to make them more realistic using adversarial training, enhancing the quality of synthetic data for training robotics models.
|}