Robotics 101:
An Odyssey from A Vision Perspective

CVPR 2025 Tutorial
202B, June 12, Nashville, USA

Introduction

In recent years, there has been growing interest in robotics in the vision community. Getting started in robotics research is challenging because it requires a wide range of knowledge. This comprehensive full-day tutorial covers robotics broadly and provides the background needed to understand its different aspects. We will also explore technical advancements, key challenges, and potential avenues for future research.

Tentative Schedule

  • Hongyang Li
    The University of Hong Kong, China
    Opening Remarks
  • Chonghao Sima
    OpenDriveLab, China
    When Autonomous Driving meets Test-time Computing
    Biography

    Chonghao Sima, a second-year Ph.D. student at The University of Hong Kong, has published 10+ first-author/co-first-author papers in top-tier conferences and journals including NeurIPS, CVPR, ICCV, ECCV, and PAMI. His collaborative work on end-to-end autonomous driving, UniAD, received the Best Paper Award at IEEE CVPR 2023. His contribution to BEVFormer was recognized in the 2022 Top 100 Most Influential AI Papers list. Honored as an Outstanding Reviewer for CVPR 2023, he led two award-winning projects: PersFormer (2022) and DriveLM (2024), both selected for oral presentations at ECCV. DriveLM ranked #9 among the most influential papers of ECCV 2024. His publications have accumulated 3,400+ citations with 4,000+ GitHub stars across repositories. His research interests focus on 3D perception, multimodal models, end-to-end autonomous driving systems, and upper-body humanoids.

  • Dhruv Shah
    Google DeepMind, USA
    Theme TBD
    Biography

    Dhruv Shah is a Senior Research Scientist at Google DeepMind, working on foundation models of and for robotics. Previously, he obtained his PhD in EECS at UC Berkeley, where he was advised by Sergey Levine. His research was supported by the Berkeley Fellowship for Graduate Study and has been nominated for (and received) several Best Paper Awards at leading robotics conferences, including RSS and ICRA.

    Earlier, he graduated with honors from IIT Bombay, where he received the Undergraduate Research Award and the Institute Academic Prize. He has also been fortunate to spend time at Meta AI (FAIR), Google DeepMind (Brain Robotics), Carnegie Mellon University, Imperial College London, and the University of Sydney.

  • Coffee Break
  • Guanya Shi
    CMU, USA
    Theme TBD
    Biography

    Guanya Shi is an Assistant Professor in the Robotics Institute and the School of Computer Science at Carnegie Mellon University (CMU). He leads the LeCAR (Learning and Control for Agile Robotics) Lab. He also holds courtesy faculty positions in the Electrical and Computer Engineering and Mechanical Engineering Departments.

    He completed his Ph.D. at Caltech in 2022, advised by Soon-Jo Chung and Yisong Yue. He received a B.E. from Tsinghua University in 2017. From 2022 to 2023, he was a postdoctoral scholar in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, working with Byron Boots.

  • Lunch Break
  • Davide Scaramuzza
    University of Zurich, Switzerland
    Theme TBD
    Biography

    Davide Scaramuzza is a Professor of Robotics and Perception at the University of Zurich. He did his Ph.D. at ETH Zurich, a postdoc at the University of Pennsylvania, and was a visiting professor at Stanford University. His research focuses on autonomous, agile navigation of micro drones using standard and event-based cameras. He pioneered autonomous, vision-based navigation of micro drones, which inspired the navigation algorithm of the NASA Mars helicopter and many drone companies. In 2022, his team demonstrated that an AI-powered drone could outperform the world champions of drone racing, a result published in Nature and featured on the magazine cover. This result marks the first time an AI defeated a human in the physical world (previous AI wins against humans at chess, Go, StarCraft, and Gran Turismo were in board games or video games). His research contributed significantly to visual-inertial state estimation, vision-based agile navigation of micro drones, and low-latency, robust perception with event cameras. His results have been transferred to many products, from drones to automobiles, cameras, AR/VR headsets, and mobile devices. He counts several entrepreneurial achievements: In 2015, he co-founded Zurich-Eye, which became Facebook-Meta Zurich and developed the world-leading virtual-reality headset Meta Quest. In 2020, he co-founded SUIND, which builds autonomous drones for precision agriculture. For his research contributions and tech transfer, he has won many awards, including a Kiyo-Tomiyasu IEEE Technical Field Award (won by only two other roboticists in the 20 years of existence of this award), an IEEE Robotics and Automation Society Early Career Award, a European Research Council Consolidator Grant, and many paper awards, including the 2023 IROS Best Paper Award, the 2022 IEEE Robotics and Automation Letters Best Paper Award, and the 2018 IEEE Transactions on Robotics Best Paper Award.

    He co-authored the book "Introduction to Autonomous Mobile Robots," published by MIT Press, which has sold over 10 thousand copies worldwide and is among the most used textbooks for teaching mobile robotics. He has consulted for the United Nations on disaster response, the Fukushima Action Plan, disarmament, and AI for good. Many aspects of his research have been featured in the media, such as The New York Times, The Economist, The Guardian, and Forbes.

  • Boris Ivanovic
    NVIDIA, USA
    Theme TBD
    Biography

    Boris is currently a Senior Research Scientist and Manager in NVIDIA's Autonomous Vehicle Research Group. His research interests include novel end-to-end AV architectures, sensor and traffic simulation, AI safety, and the thoughtful integration of foundation models in AV development. Prior to joining NVIDIA, he received his Ph.D. in Aeronautics and Astronautics under the supervision of Marco Pavone in 2021 and an M.S. in Computer Science in 2018, both from Stanford University. He received his B.A.Sc. in Engineering Science from the University of Toronto in 2016.

  • Coffee Break
  • Deepak Pathak
    CMU, USA
    Theme TBD
    Biography

    Deepak Pathak is the Raj Reddy Assistant Professor at Carnegie Mellon University in the School of Computer Science. He is a member of the Robotics Institute and affiliated with the Machine Learning Department. He works in Artificial Intelligence at the intersection of Computer Vision, Machine Learning & Robotics.

    Previously, he spent a year as a researcher at Meta AI Research collaborating with Jitendra Malik and was a visiting postdoc at UC Berkeley with Pieter Abbeel. He received his PhD from UC Berkeley, advised by Alyosha Efros & Trevor Darrell, and his Bachelor's in Computer Science from IIT Kanpur.

  • Yilun Du
    Harvard, USA
    Theme TBD
    Biography

    Yilun Du is currently a senior research scientist at Google DeepMind and an incoming Assistant Professor at Harvard starting in Fall 2025 in the Kempner Institute and CS. He received his PhD at MIT EECS, advised by Prof. Leslie Kaelbling, Prof. Tomas Lozano-Perez, and Prof. Joshua B. Tenenbaum. Previously, he also obtained his bachelor's degree from MIT, was a research fellow at OpenAI, an intern and visiting researcher at FAIR and Google DeepMind, and won a gold medal at the International Biology Olympiad. His research focuses on generative models, decision making, robot learning, embodied agents, and the applications of such tools to scientific domains.

  • Closing Remarks

Speakers

Davide Scaramuzza

Professor, University of Zurich

Deepak Pathak

Assistant Professor, CMU

Guanya Shi

Assistant Professor, CMU

Boris Ivanovic

Senior Research Scientist, NVIDIA

Dhruv Shah

Senior Research Scientist, Google DeepMind

Yilun Du

Assistant Professor, Harvard

Chonghao Sima

Research Scientist, OpenDriveLab

Organizers

- Huijie Wang, OpenDriveLab
- Christos Sakaridis, ETH Zürich
- Jean Oh, CMU
- Chonghao Sima, OpenDriveLab