Embodied AI - Fundamentals and Applications

SLAI AP0001

Description

This course introduces the fundamentals of algorithms, data and systems for autonomous intelligent systems, a term that often refers to embodied and robotics applications. With the rapid advances in AI, how to use learning-based, data-driven approaches to improve applications for a better human life has become pivotal. We will address the key challenges in this domain, such as:

(i) How to formulate a system that combines generalization, intelligence and reliability?
(ii) How to balance the data distribution between simulation and real-world data?
(iii) Are scaling laws the only pathway towards high-level AGI?

We will introduce the concepts, principles and know-how needed to build embodied intelligent systems. The fundamentals will be detailed in the lectures, with tutorials and hands-on training sessions. Important topics such as imitation learning and reinforcement learning will be covered. A highlight of this course is a series of guest lectures by renowned external speakers from both industry and academia, who will address the latest advances in this field. The hands-on sessions are akin to tutorials or hackathons, where students quickly learn the recipes behind the technologies from scratch. These features complement the main lectures and facilitate the final group presentation.

The course includes 13 lectures (by the instructor or guest lecturers) and 4 tutorial/hands-on sessions (by TAs). Homework includes both written exercises and programming.

Prerequisite

Basic programming skills are needed, e.g. Python and C++. We require students to have prior knowledge of undergraduate linear algebra, statistics, and probability. A background in machine learning and computer vision may help you better appreciate certain aspects of the course material, though you do not need all of this background at once.

The course is open to research postgraduates (Rpg, MPhil, PhD). If you're curious about whether you would benefit from this course, contact the instructor for details.

Textbook and References

We do not have a fixed textbook. The following resources are provided for reference.

Textbooks:

Foundations of Computer Vision, by Antonio Torralba et al.
Understanding Deep Learning, by Simon Prince

Online courses (some content in our course is used courtesy of them):

Introduction to Robot Learning, by Guanya Shi
Deep Learning for Robotics, by Deepak Pathak

Conferences (where you can find the latest advances in this field):

Robotics: Science and Systems (RSS)
Conference on Robot Learning (CoRL)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Annual Conference on Neural Information Processing Systems (NeurIPS)

Website and Computing Tools

Course website: https://opendrivelab.com/AP0001
- We will also generally use Piazza to post lecture materials, homework, data, code examples, etc.
Computing tools: We will use Python and (a little) C++, along with Google Colab environments, for many of the in-lecture demos, examples and homework. We recommend Hugging Face for hosting datasets. Cloud services will be introduced for running some GPU experiments.

Grading Policy

The grading policy of this course is as follows:

Participation (10%)
Quiz (10%)
Assignment (20%): There will be about 2 homework assignments over the first 14 weeks. Each homework contains both a written part and a programming part.
Group Project (60%): You can work on a topic of your choice. Open-source data will be provided for your reference. Each team needs to submit a midterm project proposal and give a 10-min presentation. The proposal and presentation should include an adequate literature survey of related topics and provide good motivation for the ideas. Each team will deliver a short (15 min) talk in the last week, together with a project report and code.
- The final presentation may take the form of a symposium or mini-conference with posters and panels, depending on future arrangements.
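
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final score is a plain weighted sum; the syllabus does not state either assumption, and the names `WEIGHTS` and `final_score` are hypothetical.

```python
# Component weights copied from the grading policy above (they sum to 100%).
WEIGHTS = {
    "participation": 0.10,
    "quiz": 0.10,
    "assignment": 0.20,
    "group_project": 0.60,
}


def final_score(component_scores):
    """Weighted sum of per-component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


# Made-up scores, for illustration only.
example = {"participation": 100, "quiz": 80, "assignment": 90, "group_project": 85}
print(round(final_score(example), 2))  # 87.0
```

Because the group project carries 60% of the weight, a 10-point change in the project score moves the total by 6 points under this assumption, while the same change in the quiz moves it by only 1 point.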

Schedule

Date	Lecture	Topic	Note
2026/3/5	Course Introduction [BASIC]	Introduction to Embodied AI; DL and ML Basics
2026/3/12	CV, Robotics and Coding Mixer [BASIC]	Computer Vision Fundamentals for Embodied AI; Core robotics basics; Integration practice of CV and robotics
2026/3/19	Reinforcement Learning [BASIC]	Fundamentals of Reinforcement Learning; Key challenges of RL in Embodied AI; Imitation learning combined with RL	HW1
2026/3/26	World Models and Foundation Models in Embodied AI [BASIC]	Core concepts and construction of world models; Foundation model adaptation for embodied AI
2026/4/2	Manipulation and VLAs [ALGORITHM]	Core problems of robotic manipulation in embodied AI; Design and training of VLA models; VLA model deployment for typical robotic manipulation tasks	Attendance required
2026/4/9	Navigation and VLNs [ALGORITHM]	Core challenges of embodied navigation tasks; Fundamentals and key technologies of VLN models; VLN model optimization for unknown navigation scenarios
2026/4/16	Simulation [DATA]	High-fidelity simulation engines; Sim-to-real gap mitigation; Scenario and task design	HW1 due; RSS AC Meeting Trip
2026/4/23	Data Collection System / MoCap / Data Pyramid [DATA]	Motion capture fundamentals; End-to-end data collection pipeline; The data pyramid for embodied AI	ICLR Day 1
2026/4/30	Ego-centric Data Learning [DATA]	Ego-centric perception; Representation learning from ego-centric data
2026/5/7	Locomotion on Humanoids [ALGORITHM]	Humanoid locomotion fundamentals; Learning-based locomotion; Adaptive and robust locomotion	HW2
2026/5/14	Whole-body Control on Humanoids [ALGORITHM]	Whole-body control theory; Integration of locomotion and manipulation
2026/5/21	DexHand: A Hardware Perspective [HARDWARE]	DexHand architecture deep dive; Hardware-software co-design; Calibration and maintenance
2026/5/28	Hands-on Experiments: LeRobot / ToddlerBot / MMHand [HARDWARE]	Platform familiarization; Low-level control interface; Algorithm deployment	CoRL submission
2026/6/4	Building Humanoids from Scratch or World Models [HARDWARE]	Full humanoid system integration; "From scratch" vs. "model-centric" design; World models as virtual hardware	HW2 due; CVPR week
2026/6/11	How to Make Research Impact: Writing, Rebuttal and Presentation in Embodied AI [BASIC]	Impactful paper writing; Response to reviews (rebuttal); High-impact presentation; Ethics and open science	Quiz in class
2026/6/18	Final Project Symposium I [PRESENTATION]	Oral presentation sessions; Peer review and moderation
2026/6/25	Final Project Symposium II [PRESENTATION]	Oral presentation sessions; Peer review and moderation

Instructors

INSTRUCTOR

Hongyang Li

TAs

Zihao Zhang
Haitao Jiang
Tianyu Zhang