Enjoy our hosted events


Autonomous Grand Challenge

The field of autonomy is rapidly evolving, and recent advancements from the machine learning community, such as large language models (LLMs) and world models, bring great potential. We believe the future lies in explainable, end-to-end models that understand the world and generalize to unvisited environments. In light of this, we propose seven new challenges that push the boundary of existing perception, prediction, and planning pipelines.
  • Congratulations to the winners! Results here

  • 3,000+ submissions made by 480+ teams from 28 countries and regions across ALL continents

  • Participants come from renowned research institutes, including Harvard, Oxford, TTUM, NUS, Tsinghua, etc., and top-tier enterprises, including NVIDIA, AMD, Bosch, Wayve, SAMSUNG, Huawei, etc.

  • 30,000+ total views on the Challenge websites and 110,000+ total views on posts on social media
  • Workshop

    Foundation Models for Autonomous Systems

    Autonomous systems, such as robots and self-driving cars, have rapidly evolved over the past decades. Recently, foundation models have emerged as a promising approach to building more generalist autonomous systems due to their ability to learn from vast amounts of data and generalize to new tasks. The motivation behind this workshop is to explore the potential of foundation models for autonomous agents and discuss the challenges and opportunities associated with this approach.
    Summit 435, June 17

    Sergey Levine UC Berkeley
    Alex Kendall Wayve
    Andrei Bursuc Valeo
    Rares Ambrus TRI
    Ted Xiao Google DeepMind
    Sherry Yang Google DeepMind
    Li Chen Shanghai AI Lab


    Towards Building AGI in Autonomy and Robotics

    In this tutorial, we explore the intersection of AGI technologies and the advancement of autonomous systems, specifically in the field of robotics. We invite participants to embark on an investigative journey that covers essential concepts, frameworks, and challenges. Through discussion, we aim to shed light on the crucial role of foundation models in enhancing the cognitive abilities of autonomous agents. Through cooperation, we aim to chart a path for the future of robotics, where the integration of AGI enables autonomous systems to push the limits of their capabilities and intelligence, ushering in a new era of intelligent autonomy.
    Summit 447, June 18 Morning

    Kristen Grauman UT Austin
    Deva Ramanan CMU
    Chelsea Finn Stanford
    Kashyap Chitta University of Tübingen
    Chonghao Sima Shanghai AI Lab


    How to Balance Academic Roles and Interest amidst the Wave of Emerging Technologies?

    Although the past century witnessed an unprecedented expansion of scientific and technological knowledge, there are concerns that innovative activity is slowing. It is not uncommon for individuals to compromise their personal research interests in order to fulfill academic obligations, such as funding, service, etc. Nevertheless, preserving one's research interests is crucial for fostering diversity in research.
    Summit Elliott Bay, June 19 17:00 - 19:00
    Register here

    Explore our participating talks

    Workshop on Data-Driven Autonomous Driving Simulation

    Real-world on-road testing of autonomous vehicles can be expensive or dangerous, making simulation a crucial tool to accelerate the development of safe autonomous driving (AD), a technology with enormous real-world impact. However, to minimise the sim-to-real gap, good agent behaviour models and sensor/perception imitation are paramount. A recent surge in published papers in this fast-growing field has led to a lot of progress, but several fundamental questions remain unanswered, for example regarding the fidelity and diversity of generative behaviour and perception models, generation of realistic controllable scenes at scale and the safety assessment of the simulation toolchain. In this workshop, our goal is to bring together practitioners and researchers from all areas of AD simulation and to discuss pressing challenges, recent breakthroughs and future directions.
    Summit 342, June 18

    Raquel Urtasun Waabi & University of Toronto
    Dragomir Anguelov Waymo
    Chonghao Sima Shanghai AI Lab

    VLADR: Vision and Language for Autonomous Driving and Robotics

    The contemporary discourse in technological advancement underscores the increasingly intertwined roles of vision and language processing, especially within the realms of autonomous driving and robotics. The necessity for this symbiosis is apparent when considering the multifaceted dynamics of real-world environments.
    Summit 345-346, June 18

    Jitendra Malik UC Berkeley
    Trevor Darrell UC Berkeley
    Chonghao Sima Shanghai AI Lab

    End-to-End Autonomy: A New Era of Self-Driving

    This tutorial aims to dissect the complexities and nuances of end-to-end autonomy, covering theoretical foundations, practical implementations and challenges, and future directions of this evolving technology.
    Summit 444, June 18 Afternoon

    Jamie Shotton Wayve
    Long Chen Wayve
    Hongyang Li Shanghai AI Lab

    The Sixth Workshop on Precognition: Seeing Through the Future

    The workshop will discuss recent approaches and research trends not only in anticipating human behavior from videos, but also in precognition for many other visual applications, such as medical imaging, healthcare, human face aging prediction, early event prediction, and autonomous driving forecasting.
    Summit Elliott Bay, June 18 Afternoon

    Louis Foucard Figure.ai
    Monroe Kennedy III Stanford
    Hongyang Li Shanghai AI Lab

    Meet our team


    Generalized Predictive Model for Autonomous Driving

    GenAD is the first large-scale video world model in the autonomous driving discipline. To strengthen the model's generalization ability, we collect over 2000 hours of driving videos from the web, spanning areas all over the world with diverse weather conditions and traffic scenarios. Inheriting the merits of recent latent diffusion models, GenAD handles the challenging dynamics in driving scenes with novel temporal reasoning blocks. We show that it can generalize to various unseen driving datasets in a zero-shot manner. GenAD can be adapted into an action-conditioned prediction model or a motion planner, holding great potential for real-world driving applications. The OpenDV-YouTube Dataset is hosted here.
    Venue: Poster Session 4 & Exhibit Hall (Arch 4A-E), June 20 17:15 - 18:45


    Visual Point Cloud Forecasting enables Scalable Autonomous Driving

    ViDAR is a pioneering multi-modal world model designed for autonomous driving. It predicts future point clouds from historical visual input by understanding 3D structures and temporal dynamics, which in turn benefits downstream tasks. All code, pre-trained models, and fine-tuned models are released here.
    Venue: Poster Session 4 & Exhibit Hall (Arch 4A-E), June 20 17:15 - 18:45

    Workshop Oral

    DriveLM: Driving with Graph Visual Question Answering

    In DriveLM, we study how vision-language models (VLMs) trained on web-scale data can be integrated into end-to-end driving systems to boost generalization and enable interactivity with human users. Specifically, we aim to facilitate Perception, Prediction, Planning, Behavior, and Motion tasks, with human-written reasoning logic as the connection between them. We propose the task of Graph Visual Question Answering (GVQA) to connect the QA pairs in a graph-style structure. To support this novel task, we provide DriveLM-Data. All code is released here.
    Summit 345-346, June 18

    Acknowledge our members for professional service

    Many team members contributed to CVPR 2024; together we are building a more professional community to shape the future of AI. We sincerely thank them all for their service:

  • Area Chair: Ping Luo and Hongyang Li
  • Reviewer: Li Chen, Chonghao Sima, Jiazhi Yang, Huijie Wang, Jia Zeng, Zetong Yang, Tianyu Li, Shenyuan Gao, Qingwen Bu, Bangjun Wang, Linyan Huang, and Yunsong Zhou

  • Great thanks to our sponsors

    Diversity, equity, and inclusion

    In keeping with the CVPR 2024 DEI statement, the organizers, speakers, and committee members of our events (the workshop, challenge, and tutorial) include a wide variety of researchers from both academia and industry, with different backgrounds, regions, genders, and ages.

    Many of the greatest ideas come from a diverse mix of minds, backgrounds, and experiences. We provide equal opportunities to all participants without regard to nationality, affiliation, race, religion, color, age, disability, or any other restriction. We believe diversity drives innovation. When we say we welcome participation from everyone, we mean everyone.

    Join us

    Looking for opportunities in Shanghai / Hong Kong?

    We are looking for talent from all over the world. Are you looking for an opportunity as a postdoc, full-time employee, or intern? Don't hesitate to contact us via [email protected] or Dr. Hongyang Li.