
Embodied AI - Fundamentals and Applications

SLAI AP0001

This course introduces the fundamentals of algorithms, data, and systems for autonomous intelligent systems, a term that usually refers to embodied AI and robotics applications. With the rapid advances in AI, a pivotal question is how to use learning-based, data-driven approaches to build applications that improve human life. We will address key challenges in this domain, such as:

  • (i) How can we design a system that is general, intelligent, and reliable?
  • (ii) How should we balance the data distribution between simulation and real-world data?
  • (iii) Is the scaling law the only pathway towards high-level AGI?

We will introduce the concepts, principles, and know-how needed to build embodied intelligent systems. The fundamentals will be covered in detail in the lectures, supported by tutorials and hands-on training sessions. The important topics, such as imitation learning and reinforcement learning, will all be covered. A highlight of the course is a series of guest lectures by renowned external speakers from both industry and academia, who will address the latest advances in the field. The hands-on sessions resemble tutorials or hackathons, in which students quickly learn the relevant techniques from scratch. These features complement the main lectures and help students prepare for the final group presentation.

The course includes 13 lectures (given by the instructor or guest lecturers) and 4 tutorial/hands-on sessions (led by the TAs). Homework includes both written exercises and programming.

Basic programming skills are required, e.g. Python and C++. Students should have prior knowledge of undergraduate linear algebra, statistics, and probability. A background in machine learning and computer vision may help you better appreciate certain aspects of the course material, although you do not need all of it at once.

The course is open to research postgraduates (RPg: MPhil, PhD). If you are unsure whether you would benefit from this course, contact the instructor for details.

We do not have a fixed textbook. The following resources are for reference.

Textbook:

Online courses (some of our course content is used by courtesy of them):

Conferences (where you get the latest advances in this field):

  • Robotics: Science and Systems (RSS)
  • Conference on Robot Learning (CoRL)
  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • Annual Conference on Neural Information Processing Systems (NeurIPS)
  • Course website: https://opendrivelab.com/AP0001
    • We will generally also use Piazza to post lecture materials, homework, data, code examples, etc.
  • Computing tools: We will use Python and (a little) C++, with Google Colab environments for many of the in-lecture demos, examples, and homework. We recommend Hugging Face for hosting datasets. Cloud services will be introduced for running some GPU experiments.

The grading policy of this course is as follows:

  • Participation (10%)
  • Quiz (10%)
  • Assignment (20%): There will be about 2 homework assignments during the first 14 weeks. Each homework contains both a written part and a programming part.
  • Group Project (60%): You can work on a topic of your choice. Open-source data will be provided for your reference. Each team must submit a midterm project proposal and give a 10-minute presentation. The proposal and presentation should include an adequate literature survey of related topics and clearly motivate the proposed ideas. Each team will deliver a short (15-minute) talk in the last week, together with a project report and code.
    • The final presentation may take the form of a symposium or mini-conference with posters and panels, depending on future arrangements.
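
As a quick illustration of how the weights above combine, here is a minimal Python sketch. The weights come from the grading policy; the function name `final_score`, the sample scores, and the assumption that each component is marked on a 0 to 100 scale are hypothetical and not stated in the syllabus.

```python
# Weights taken from the grading policy above; they sum to 100%.
WEIGHTS = {
    "participation": 0.10,
    "quiz": 0.10,
    "assignment": 0.20,
    "group_project": 0.60,
}


def final_score(scores):
    """Weighted course score; assumes each component is on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, for illustration only.
example = {"participation": 100, "quiz": 80, "assignment": 90, "group_project": 85}
print(round(final_score(example), 2))  # prints 87.0
```

Because the group project carries 60% of the total, a 10-point change in the project score moves the final score by 6 points, more than any other component.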

Date | Lecture | Topic | Note

2026/3/5 | Course Introduction [BASIC]
  • Introduction to Embodied AI
  • DL and ML Basics

2026/3/12 | CV, Robotics and Coding Mixer [BASIC]
  • Computer Vision Fundamentals for Embodied AI
  • Core robotics basics
  • Integration practice of CV and robotics

2026/3/19 | Reinforcement Learning [BASIC]
  • Fundamentals of Reinforcement Learning
  • Key challenges of RL in Embodied AI
  • Imitation learning combined with RL
  Note: HW1

2026/3/26 | World Models and Foundation Models in Embodied AI [BASIC]
  • Core concepts and construction of world models
  • Foundation model adaptation for embodied AI

2026/4/2 | Manipulation and VLAs [ALGORITHM]
  • Core problems of robotic manipulation in embodied AI
  • Design and training of VLA models
  • VLA model deployment for typical robotic manipulation tasks
  Note: Attendance required

2026/4/9 | Navigation and VLNs [ALGORITHM]
  • Core challenges of embodied navigation tasks
  • Fundamentals and key technologies of VLN models
  • VLN model optimization for unknown navigation scenarios

2026/4/16 | Simulation [DATA]
  • High-fidelity simulation engines
  • Sim-to-real gap mitigation
  • Scenario and task design
  Note: HW1 due; RSS AC Meeting Trip

2026/4/23 | Data Collection System / MoCap / Data Pyramid [DATA]
  • Motion capture fundamentals
  • End-to-end data collection pipeline
  • The data pyramid for embodied AI
  Note: ICLR Day 1

2026/4/30 | Ego-centric Data Learning [DATA]
  • Egocentric perception
  • Representation learning from ego-centric data

2026/5/7 | Locomotion on Humanoids [ALGORITHM]
  • Humanoid locomotion fundamentals
  • Learning-based locomotion
  • Adaptive and robust locomotion
  Note: HW2

2026/5/14 | Whole-body Control on Humanoids [ALGORITHM]
  • Whole-body control theory
  • Integration of locomotion and manipulation

2026/5/21 | DexHand: A Hardware Perspective [HARDWARE]
  • DexHand architecture deep dive
  • Hardware-software co-design
  • Calibration and maintenance

2026/5/28 | Hands-on Experiments: LeRobot / ToddlerBot / MMHand [HARDWARE]
  • Platform familiarization
  • Low-level control interface
  • Algorithm deployment
  Note: CoRL submission

2026/6/4 | Building Humanoids from Scratch or World Models [HARDWARE]
  • Full humanoid system integration
  • "From scratch" vs. "model-centric" design
  • World models as virtual hardware
  Note: HW2 due; CVPR week

2026/6/11 | How to Make Research Impact: Writing, Rebuttal and Presentation in Embodied AI [BASIC]
  • Impactful paper writing
  • Response to reviews (rebuttal)
  • High-impact presentation
  • Ethics and open science
  Note: Quiz in class

2026/6/18 | Final Project Symposium I [PRESENTATION]
  • Oral presentation sessions
  • Peer review and moderation

2026/6/25 | Final Project Symposium II [PRESENTATION]
  • Oral presentation sessions
  • Peer review and moderation

INSTRUCTOR

TAs

  • Zihao Zhang
  • Haitao Jiang
  • Tianyu Zhang