Awesome-robotics-llm

Survey

  • Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI, arXiv, 2407.06886, arxiv, pdf, citation: 2

    Yang Liu, Weixing Chen, Yongjie Bai, Guanbin Li, Wen Gao, Liang Lin · (embodied_ai_paper_list - hcplab-sysu) Star

  • A Survey on Integration of Large Language Models with Intelligent Robots, arXiv, 2404.09228, arxiv, pdf, citation: -1

    Yeseung Kim, Dohyun Kim, Jieun Choi, Jisang Park, Nayoung Oh, Daehyung Park

Papers

  • Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning, arXiv, 2407.20798, arxiv, pdf, citation: -1

    Norman Di Palo, Leonard Hasenclever, Jan Humplik, Arunkumar Byravan · (sites.google) · (mp.weixin.qq)

  • ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning, arXiv, 2406.19741, arxiv, pdf, citation: -1

    Christopher E. Mower, Yuhui Wan, Hongzhan Yu, Antoine Grosnit, Jonas Gonzalez-Billandon, Matthieu Zimmer, Jinlong Wang, Xinyu Zhang, Yao Zhao, Anbang Zhai

    · (HEBO - huawei-noah) Star

  • LLaRA: Supercharging Robot Learning Data for Vision-Language Policy, arXiv, 2406.20095, arxiv, pdf, citation: -1

    Xiang Li, Cristina Mata, Jongwoo Park, Kumara Kahatapitiya, Yoo Sung Jang, Jinghuan Shang, Kanchana Ranasinghe, Ryan Burgert, Mu Cai, Yong Jae Lee

    · (LLaRA - LostXine) Star

  • PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators, arXiv, 2406.20083, arxiv, pdf, citation: -1

    Kuo-Hao Zeng, Zichen Zhang, Kiana Ehsani, Rose Hendrix, Jordi Salvador, Alvaro Herrasti, Ross Girshick, Aniruddha Kembhavi, Luca Weihs

  • OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents, arXiv, 2407.00114, arxiv, pdf, citation: -1

    Zihao Wang, Shaofei Cai, Zhancun Mu, Haowei Lin, Ceyao Zhang, Xuejie Liu, Qing Li, Anji Liu, Xiaojian Ma, Yitao Liang

    · (OmniJarvis - CraftJarvis) Star · (omnijarvis.github)

  • Octo-planner: On-device Language Model for Planner-Action Agents, arXiv, 2406.18082, arxiv, pdf, citation: -1

    Wei Chen, Zhiyuan Li, Zhen Guo, Yikang Shen · (huggingface) · (nexa4ai)

  • OpenVLA: An Open-Source Vision-Language-Action Model, arXiv, 2406.09246, arxiv, pdf, citation: -1

    Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan Foster, Grace Lam, Pannag Sanketi · (openvla.github) · (openvla - openvla) Star

  • Octo: An Open-Source Generalist Robot Policy, arXiv, 2405.12213, arxiv, pdf, citation: 26

    Octo Model Team, Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu · (octo-models.github) · (octo - octo-models) Star

  • From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control, arXiv, 2405.04798, arxiv, pdf, citation: -1

    Yide Shentu, Philipp Wu, Aravind Rajeswaran, Pieter Abbeel

  • LEGENT: Open Platform for Embodied Agents, arXiv, 2404.18243, arxiv, pdf, citation: -1

    Zhili Cheng, Zhitong Wang, Jinyi Hu, Shengding Hu, An Liu, Yuge Tu, Pengkai Li, Lei Shi, Zhiyuan Liu, Maosong Sun

  • DrEureka | Language Model Guided Sim-To-Real Transfer

    · (eureka-research.github) · (DrEureka - eureka-research) Star

  • OpenEQA: Embodied Question Answering in the Era of Foundation Models

    · (open-eqa.github) · (open-eqa - facebookresearch) Star

  • Yell At Your Robot: Improving On-the-Fly from Language Corrections, arXiv, 2403.12910, arxiv, pdf, citation: -1

    Lucy Xiaoyang Shi, Zheyuan Hu, Tony Z. Zhao, Archit Sharma, Karl Pertsch, Jianlan Luo, Sergey Levine, Chelsea Finn · (qbitai)

  • ShapeGrasp: Zero-Shot Task-Oriented Grasping with Large Language Models through Geometric Decomposition, arXiv, 2403.18062, arxiv, pdf, citation: -1

    Samuel Li, Sarthak Bhagat, Joseph Campbell, Yaqi Xie, Woojun Kim, Katia Sycara, Simon Stepputtis · (shapegrasp.github)

  • 3D-VLA: A 3D Vision-Language-Action Generative World Model, arXiv, 2403.09631, arxiv, pdf, citation: -1

    Haoyu Zhen, Xiaowen Qiu, Peihao Chen, Jincheng Yang, Xin Yan, Yilun Du, Yining Hong, Chuang Gan

    · (mp.weixin.qq)

  • EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents, arXiv, 2403.12014, arxiv, pdf, citation: -1

    Abhay Zala, Jaemin Cho, Han Lin, Jaehong Yoon, Mohit Bansal · (envgen-llm.github)

  • CoPa

  • SIMA generalist AI agent for 3D virtual environments - Google DeepMind

    · (storage.googleapis)

  • RT-H: Action Hierarchies Using Language, arXiv, 2403.01823, arxiv, pdf, citation: -1

    Suneel Belkhale, Tianli Ding, Ted Xiao, Pierre Sermanet, Quon Vuong, Jonathan Tompson, Yevgen Chebotar, Debidatta Dwibedi, Dorsa Sadigh · (rt-hierarchy.github)

  • Empowering Large Language Model Agents through Action Learning, arXiv, 2402.15809, arxiv, pdf, citation: -1

    Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

  • Learning to Learn Faster from Human Feedback with Language Model Predictive Control, arXiv, 2402.11450, arxiv, pdf, citation: -1

    Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauza, Matthew Bennice, Alex Bewley, Adil Dostmohamed · (robot-teaching.github)

  • Generative Expressive Robot Behaviors using Large Language Models, arXiv, 2401.14673, arxiv, pdf, citation: -1

    Karthik Mahadevan, Jonathan Chien, Noah Brown, Zhuo Xu, Carolina Parada, Fei Xia, Andy Zeng, Leila Takayama, Dorsa Sadigh · (generative-expressive-motion.github)

  • Adaptive Mobile Manipulation for Articulated Objects In the Open World, arXiv, 2401.14403, arxiv, pdf, citation: -1

    Haoyu Xiong, Russell Mendonca, Kenneth Shaw, Deepak Pathak · (open-world-mobilemanip.github)

  • OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics, arXiv, 2401.12202, arxiv, pdf, citation: -1

    Peiqi Liu, Yaswanth Orru, Chris Paxton, Nur Muhammad Mahi Shafiullah, Lerrel Pinto · (ok-robot.github)

  • AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

    · (auto-rt.github)

  • Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis, arXiv, 2312.08782, arxiv, pdf, citation: -1

    Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao

  • ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent, arXiv, 2312.10003, arxiv, pdf, citation: -1

    Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan

  • ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation, arXiv, 2312.16217, arxiv, pdf, citation: 1

    Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, Hao Dong · (qbitai) · (sites.google)

  • Vision-Language Models as a Source of Rewards, arXiv, 2312.09187, arxiv, pdf, citation: -1

    Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin

  • Foundation Models in Robotics: Applications, Challenges, and the Future, arXiv, 2312.07843, arxiv, pdf, citation: -1

    Roya Firoozi, Johnathan Tucker, Stephen Tian, Anirudha Majumdar, Jiankai Sun, Weiyu Liu, Yuke Zhu, Shuran Song, Ashish Kapoor, Karol Hausman

    · (Awesome-Robotics-Foundation-Models - robotics-survey) Star

  • SAGE: Bridging Semantic and Actionable Parts for GEneralizable Articulated-Object Manipulation under Language Instructions, arXiv, 2312.01307, arxiv, pdf, citation: -1

    Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, Leonidas Guibas · (geometry.stanford) · (SAGE - geng-haoran) Star

  • Agent as Cerebrum, Controller as Cerebellum: Implementing an Embodied LMM-based Agent on Drones, arXiv, 2311.15033, arxiv, pdf, citation: -1

    Haoran Zhao, Fengxing Pan, Huqiuyue Ping, Yaoming Zhou · (qbitai)

  • From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3", arXiv, 2312.06571, arxiv, pdf, citation: -1

    Takahide Yoshida, Atsushi Masumori, Takashi Ikegami · (tnoinkwms.github)

  • Controllable Human-Object Interaction Synthesis, arXiv, 2312.03913, arxiv, pdf, citation: -1

    Jiaman Li, Alexander Clegg, Roozbeh Mottaghi, Jiajun Wu, Xavier Puig, C. Karen Liu

  • Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia, arXiv, 2312.03664, arxiv, pdf, citation: -1

    Alexander Sasha Vezhnevets, John P. Agapiou, Avia Aharon, Ron Ziv, Jayd Matyas, Edgar A. Duéñez-Guzmán, William A. Cunningham, Simon Osindero, Danny Karmon, Joel Z. Leibo

    · (concordia - google-deepmind) Star

  • Vision-Language Foundation Models as Effective Robot Imitators, arXiv, 2311.01378, arxiv, pdf, citation: -1

    Xinghang Li, Minghuan Liu, Hanbo Zhang, Cunjun Yu, Jie Xu, Hongtao Wu, Chilam Cheang, Ya Jing, Weinan Zhang, Huaping Liu · (RoboFlamingo - RoboFlamingo) Star · (jiqizhixin)

  • GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration, arXiv, 2311.12015, arxiv, pdf, citation: -1

    Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi · (microsoft.github)

    · (jiqizhixin)

  • Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections, arXiv, 2311.10678, arxiv, pdf, citation: -1

    Lihan Zha, Yuchen Cui, Li-Heng Lin, Minae Kwon, Montserrat Gonzalez Arenas, Andy Zeng, Fei Xia, Dorsa Sadigh

  • GOAT: GO to Any Thing, arXiv, 2311.06430, arxiv, pdf, citation: -1

    Matthew Chang, Theophile Gervet, Mukul Khanna, Sriram Yenamandra, Dhruv Shah, So Yeon Min, Kavit Shah, Chris Paxton, Saurabh Gupta, Dhruv Batra

  • LLaMA Rider: Spurring Large Language Models to Explore the Open World, arXiv, 2310.08922, arxiv, pdf, citation: -1

    Yicheng Feng, Yuxuan Wang, Jiazheng Liu, Sipeng Zheng, Zongqing Lu · [jiqizhixin]

  • RoboVQA: Multimodal Long-Horizon Reasoning for Robotics, arXiv, 2311.00899, arxiv, pdf, citation: -1

    Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi

  • Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning, arXiv, 2310.20587, arxiv, pdf, citation: -1

    Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu

  • Large Language Models as Generalizable Policies for Embodied Tasks, arXiv, 2310.17722, arxiv, pdf, citation: -1

    Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev

  • Creative Robot Tool Use with Large Language Models, arXiv, 2310.13065, arxiv, pdf, citation: -1

    Mengdi Xu, Peide Huang, Wenhao Yu, Shiqi Liu, Xilun Zhang, Yaru Niu, Tingnan Zhang, Fei Xia, Jie Tan, Ding Zhao

  • Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning, arXiv, 2310.12921, arxiv, pdf, citation: -1

    Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, David Lindner

  • Eureka: Human-Level Reward Design via Coding Large Language Models, arXiv, 2310.12931, arxiv, pdf, citation: 1

    Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar · [Eureka - eureka-research] Star · [qbitai]

  • Interactive Task Planning with Language Models, arXiv, 2310.10645, arxiv, pdf, citation: -1

    Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik

  • Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control, arXiv, 2307.00117, arxiv, pdf, citation: 3

    Vivek Myers, Andre He, Kuan Fang, Homer Walke, Philippe Hansen-Estruch, Ching-An Cheng, Mihai Jalobeanu, Andrey Kolobov, Anca Dragan, Sergey Levine · [bair.berkeley]

  • Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency, arXiv, 2309.17382, arxiv, pdf, citation: -1

    Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang · [RAFA_code - agentification] Star

  • Video Language Planning, arXiv, 2310.10625, arxiv, pdf, citation: -1

    Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum

  • FireAct: Toward Language Agent Fine-tuning, arXiv, 2310.05915, arxiv, pdf, citation: -1

    Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik Narasimhan, Shunyu Yao

  • LangNav: Language as a Perceptual Representation for Navigation, arXiv, 2310.07889, arxiv, pdf, citation: -1

    Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim

  • Learning Interactive Real-World Simulators, arXiv, 2310.06114, arxiv, pdf, citation: -1

    Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Dale Schuurmans, Pieter Abbeel · [universal-simulator.github]

  • GenSim: Generating Robotic Simulation Tasks via Large Language Models, arXiv, 2310.01361, arxiv, pdf, citation: -1

    Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang

  • A Data Source for Reasoning Embodied Agents, AAAI, 2023, arxiv, pdf, citation: 1

    Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam

  • Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping, CoRL, 2023, arxiv, pdf, citation: 3

    Adam Rashid, Satvik Sharma, Chung Min Kim, Justin Kerr, Lawrence Chen, Angjoo Kanazawa, Ken Goldberg

  • Thought Cloning: Learning to Think while Acting by Imitating Human Thinking, arXiv, 2306.00323, arxiv, pdf, citation: 4

    Shengran Hu, Jeff Clune

  • Physically Grounded Vision-Language Models for Robotic Manipulation, arXiv, 2309.02561, arxiv, pdf, citation: 1

    Jensen Gao, Bidipta Sarkar, Fei Xia, Ted Xiao, Jiajun Wu, Brian Ichter, Anirudha Majumdar, Dorsa Sadigh

  • BEVBert: Multimodal Map Pre-training for Language-guided Navigation, arXiv, 2212.04385, arxiv, pdf, citation: 11

    Dong An, Yuankai Qi, Yangguang Li, Yan Huang, Liang Wang, Tieniu Tan, Jing Shao · [vln-bevbert - marsaki] Star

  • Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation, arXiv, 2308.07931, arxiv, pdf, citation: 3

    William Shen, Ge Yang, Alan Yu, Jansen Wong, Leslie Pack Kaelbling, Phillip Isola · [qbitai] · [jiqizhixin]

  • Foundation Model based Open Vocabulary Task Planning and Executive System for General Purpose Service Robots, arXiv, 2308.03357, arxiv, pdf, citation: 1

    Yoshiki Obinata, Naoaki Kanazawa, Kento Kawaharazuka, Iori Yanokura, Soonhyo Kim, Kei Okada, Masayuki Inaba

  • Learning to Model the World with Language, arXiv, 2308.01399, arxiv, pdf, citation: 4

    Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan · [mp.weixin.qq]

  • Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI, arXiv, 2308.05221, arxiv, pdf, citation: 1

    Hangjie Shi, Leslie Ball, Govind Thattai, Desheng Zhang, Lucy Hu, Qiaozi Gao, Suhaila Shakiah, Xiaofeng Gao, Aishwarya Padmakumar, Bofei Yang

  • Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition, arXiv, 2307.14535, arxiv, pdf, citation: 9

    Huy Ha, Pete Florence, Shuran Song

  • RT-2: Vision-Language-Action Models

    · [qbitai] · [robotics-transformer2.github]

  • Towards A Unified Agent with Foundation Models, Workshop on Reincarnating Reinforcement Learning at ICLR 2023, 2023, arxiv, pdf, citation: 9

    Norman Di Palo, Arunkumar Byravan, Leonard Hasenclever, Markus Wulfmeier, Nicolas Heess, Martin Riedmiller

  • SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning, arXiv, 2307.06135, arxiv, pdf, citation: 13

    Krishan Rana, Jesse Haviland, Sourav Garg, Jad Abou-Chakra, Ian Reid, Niko Suenderhauf

  • VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models, arXiv, 2307.05973, arxiv, pdf, citation: 35

    Wenlong Huang, Chen Wang, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Li Fei-Fei · [voxposer.github] · [mp.weixin.qq]

  • Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation, arXiv, 2307.03659, arxiv, pdf, citation: 3

    Annie Xie, Lisa Lee, Ted Xiao, Chelsea Finn

  • Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners, arXiv, 2307.01928, arxiv, pdf, citation: 24

    Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley

  • Building Cooperative Embodied Agents Modularly with Large Language Models, arXiv, 2307.02485, arxiv, pdf, citation: 5

    Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan

  • ChatGPT for Robotics: Design Principles and Model Abilities, Microsoft Auton. Syst. Robot. Res., 2023, arxiv, pdf, citation: 111

    Sai Vemprala, Rogerio Bonatti, Arthur Bucker, Ashish Kapoor · [PromptCraft-Robotics - microsoft] Star

  • Statler: State-Maintaining Language Models for Embodied Reasoning, arXiv, 2306.17840, arxiv, pdf, citation: 5

    Takuma Yoneda, Jiading Fang, Peng Li, Huanyu Zhang, Tianchong Jiang, Shengjie Lin, Ben Picker, David Yunis, Hongyuan Mei, Matthew R. Walter

  • REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction, arXiv, 2306.15724, arxiv, pdf, citation: 13

    Zeyi Liu, Arpit Bahety, Shuran Song

  • ViNT: A Foundation Model for Visual Navigation, arXiv, 2306.14846, arxiv, pdf, citation: 10

    Dhruv Shah, Ajay Sridhar, Nitish Dashora, Kyle Stachowicz, Kevin Black, Noriaki Hirose, Sergey Levine · [visualnav-transformer.github]

  • HomeRobot: Open-Vocabulary Mobile Manipulation, arXiv, 2306.11565, arxiv, pdf, citation: 5

    Sriram Yenamandra, Arun Ramachandran, Karmesh Yadav, Austin Wang, Mukul Khanna, Theophile Gervet, Tsung-Yen Yang, Vidhi Jain, Alexander William Clegg, John Turner

  • Language to Rewards for Robotic Skill Synthesis, arXiv, 2306.08647, arxiv, pdf, citation: 27

    Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik

  • SayTap: Language to Quadrupedal Locomotion, arXiv, 2306.07580, arxiv, pdf, citation: 4

    Yujin Tang, Wenhao Yu, Jie Tan, Heiga Zen, Aleksandra Faust, Tatsuya Harada

  • ChessGPT: Bridging Policy Learning and Language Modeling, arXiv, 2306.09200, arxiv, pdf, citation: 2

    Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, Jun Wang

  • Embodied Executable Policy Learning with Language-based Scene Summarization, arXiv, 2306.05696, arxiv, pdf, citation: -1

    Jielin Qiu, Mengdi Xu, William Han, Seungwhan Moon, Ding Zhao

  • GPT Models Meet Robotic Applications: Co-Speech Gesturing Chat System, arXiv, 2306.01741, arxiv, pdf, citation: -1

    Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

  • ReAct: Synergizing Reasoning and Acting in Language Models, arXiv, 2210.03629, arxiv, pdf, citation: 293

    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao

Robotics

  • Berkeley Humanoid: A Research Platform for Learning-based Control, arXiv, 2407.21781, arxiv, pdf, citation: -1

    Qiayuan Liao, Bike Zhang, Xuanyu Huang, Xiaoyu Huang, Zhongyu Li, Koushil Sreenath · (berkeley-humanoid)

  • Theia: Distilling Diverse Vision Foundation Models for Robot Learning, arXiv, 2407.20179, arxiv, pdf, citation: -1

    Jinghuan Shang, Karl Schmeckpeper, Brandon B. May, Maria Vittoria Minniti, Tarik Kelestemur, David Watkins, Laura Herlant · (theia.theaiinstitute)

  • Achieving Human Level Competitive Robot Table Tennis, arXiv, 2408.03906, arxiv, pdf, citation: -1

    David B. D'Ambrosio, Saminda Abeyruwan, Laura Graesser, Atil Iscen, Heni Ben Amor, Alex Bewley, Barney J. Reed, Krista Reymann, Leila Takayama, Yuval Tassa · (sites.google)

  • Cross Anything: General Quadruped Robot Navigation through Complex Terrains, arXiv, 2407.16412, arxiv, pdf, citation: -1

    Shaoting Zhu, Derun Li, Yong Liu, Ningyi Xu, Hang Zhao · (cross-anything.github)

  • RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots (robocasa_rss24.pdf)

    · (robocasa)

  • RoboDreamer: Learning Compositional World Models for Robot Imagination, arXiv, 2404.12377, arxiv, pdf, citation: -1

    Siyuan Zhou, Yilun Du, Jiaben Chen, Yandong Li, Dit-Yan Yeung, Chuang Gan · (robovideo.github)

  • TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction, arXiv, 2405.10315, arxiv, pdf, citation: -1

    Yunfan Jiang, Chen Wang, Ruohan Zhang, Jiajun Wu, Li Fei-Fei · (transic-robot.github)

  • lerobot - huggingface Star

  • Learning H-Infinity Locomotion Control, arXiv, 2404.14405, arxiv, pdf, citation: -1

    Junfeng Long, Wenye Yu, Quanyi Li, Zirui Wang, Dahua Lin, Jiangmiao Pang

  • Body Design and Gait Generation of Chair-Type Asymmetrical Tripedal Low-rigidity Robot, arXiv, 2404.05932, arxiv, pdf, citation: -1

    Shintaro Inoue, Kento Kawaharazuka, Kei Okada, Masayuki Inaba · (shin0805.github)

  • Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers

    · (vid2robot.github)

  • DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset, arXiv, 2403.12945, arxiv, pdf, citation: -1

    Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis · (droid-dataset.github)

  • DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation, arXiv, 2403.07788, arxiv, pdf, citation: -1

    Chen Wang, Haochen Shi, Weizhuo Wang, Ruohan Zhang, Li Fei-Fei, C. Karen Liu · (DexCap - j96w) Star

  • Learning Generalizable Feature Fields for Mobile Manipulation, arXiv, 2403.07563, arxiv, pdf, citation: -1

    Ri-Zhao Qiu, Yafei Hu, Ge Yang, Yuchen Song, Yang Fu, Jianglong Ye, Jiteng Mu, Ruihan Yang, Nikolay Atanasov, Sebastian Scherer

  • Humanoid Locomotion as Next Token Prediction, arXiv, 2402.19469, arxiv, pdf, citation: 1

    Ilija Radosavovic, Bike Zhang, Baifeng Shi, Jathushan Rajasegaran, Sarthak Kamat, Trevor Darrell, Koushil Sreenath, Jitendra Malik · (mp.weixin.qq)

Demos

Projects

  • Co-LLM-Agents - UMass-Foundation-Model Star

    Source codes for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" · [qbitai]

Other

Reference