
Embodied AI - Fundamentals and Applications
SLAI AP0001
This course introduces the fundamentals of algorithms, data, and systems for autonomous intelligent systems, a term that often refers to embodied AI and robotics applications. With the rapid advances in AI, how to use learning-based, data-driven approaches to improve applications for a better human life has become pivotal. We will address the key challenges in this domain, such as:
- (i) How can we formulate a system that offers generalization, intelligence, and reliability?
- (ii) How can we balance the data distribution between simulation and real-world data?
- (iii) Is the scaling law the only pathway towards high-level AGI?
We will introduce the concepts, principles, and know-how needed to build embodied intelligent systems. The fundamentals will be detailed in the lectures, complemented by tutorials and hands-on training sessions. Important topics such as imitation learning and reinforcement learning will be covered. A highlight of this course is a series of guest lectures by renowned external speakers from both industry and academia on the latest advances in this field. The hands-on sessions are akin to tutorials or hackathons, where students quickly learn the recipes behind the technologies from scratch. These features complement the main lectures and facilitate the final group presentation.
The course includes 13 lectures (given by the instructor or guest lecturers) and 4 tutorial/hands-on sessions (led by the TAs). Homework includes both written exercises and programming.
Basic programming skills are needed, e.g., Python and C++. We require students to have prior knowledge of undergraduate linear algebra, statistics, and probability. A background in machine learning and computer vision may help you better appreciate certain aspects of the course material, but it is not necessarily needed all at once.
The course is open to research postgraduates (Rpg, MPhil, PhD). If you're curious about whether you would benefit from this course, contact the instructor for details.
We do not have a fixed textbook. The following resources are for reference.
Textbooks:
- Foundations of Computer Vision, by Antonio Torralba et al.
- Understanding Deep Learning, by Simon Prince
Online courses (some content in our course is courtesy of them):
- Introduction to Robot Learning, by Guanya Shi
- Deep Learning for Robotics, by Deepak Pathak
Conferences (where you can find the latest advances in this field):
- Robotics: Science and Systems (RSS)
- Conference on Robot Learning (CoRL)
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- Annual Conference on Neural Information Processing Systems (NeurIPS)
- Course website: https://opendrivelab.com/AP0001
- We will generally also use Piazza to post lecture materials, homework, data, code examples, etc.
- Computing tools: We will use Python and (a little bit of) C++, with Google Colab environments for many of the in-lecture demos, examples, and homework. Hugging Face is recommended for hosting datasets. Cloud services will be introduced for running some GPU experiments.
The grading policy of this course is as follows:
- Participation (10%)
- Quiz (10%)
- Assignment (20%): There will be about two homework assignments during the first 14 weeks. Each homework contains both a written part and a programming part.
- Group Project (60%): You can work on a topic of your choice. Open-source data will be provided for your reference. Each team needs to submit a midterm project proposal and give a 10-minute presentation. The proposal and presentation should show an adequate literature survey of related topics and provide good motivation for the ideas. Each team will deliver a short (15-minute) talk in the last week, together with a project report and code.
- The final presentation might take the form of a symposium or mini-conference with posters and panels, depending on future arrangements.
| Date | Lecture / Topic | Note |
|---|---|---|
| 2026/3/5 | Course Introduction [BASIC] | |
| 2026/3/12 | CV, Robotics and Coding Mixer [BASIC] | |
| 2026/3/19 | Reinforcement Learning [BASIC] | HW1 |
| 2026/3/26 | World Models and Foundation Models in Embodied AI [BASIC] | |
| 2026/4/2 | Manipulation and VLAs [ALGORITHM] | Attendance required |
| 2026/4/9 | Navigation and VLNs [ALGORITHM] | |
| 2026/4/16 | Simulation [DATA] | HW1 due; RSS AC Meeting Trip |
| 2026/4/23 | Data Collection System / MoCap / Data Pyramid [DATA] | ICLR Day 1 |
| 2026/4/30 | Ego-centric Data Learning [DATA] | |
| 2026/5/7 | Locomotion on Humanoids [ALGORITHM] | HW2 |
| 2026/5/14 | Whole-body Control on Humanoids [ALGORITHM] | |
| 2026/5/21 | DexHand: A Hardware Perspective [HARDWARE] | |
| 2026/5/28 | Hands-on Experiments: LeRobot / TodderBot / MMHand [HARDWARE] | CoRL submission |
| 2026/6/4 | Building Humanoids from Scratch or World Models [HARDWARE] | HW2 due; CVPR week |
| 2026/6/11 | How to Make Research Impact: Writing, Rebuttal and Presentation in Embodied AI [BASIC] | Quiz in class |
| 2026/6/18 | Final Project Symposium I [PRESENTATION] | |
| 2026/6/25 | Final Project Symposium II [PRESENTATION] | |
INSTRUCTOR
TAs
- Zihao Zhang
- Haitao Jiang
- Tianyu Zhang
