Challenge

Autonomous Grand Challenge

The field of autonomy is rapidly evolving, and recent advancements from the machine learning community, such as large language models (LLM) and world models, bring great potential. We believe the future lies in explainable, end-to-end models that understand the world and generalize to unvisited environments. In light of this, we propose seven new challenges that push the boundary of existing perception, prediction, and planning pipelines.

Congratulations to the winners! Results here

3,000+ submissions made by 480+ teams from 28 countries and regions across ALL continents

Participants come from renowned research institutes, including Harvard, Oxford, TTUM, NUS, Tsinghua, etc., and top-tier enterprises, including NVIDIA, AMD, Bosch, Wayve, SAMSUNG, Huawei, etc.

30,000+ total views on the Challenge websites and 110,000+ total views on posts on social media

Workshop

Foundation Models for Autonomous Systems

Autonomous systems, such as robots and self-driving cars, have rapidly evolved over the past decades. Recently, foundation models have emerged as a promising approach to building more generalist autonomous systems due to their ability to learn from vast amounts of data and generalize to new tasks. The motivation behind this workshop is to explore the potential of foundation models for autonomous agents and discuss the challenges and opportunities associated with this approach.
Summit 442, June 17

Sergey Levine UC Berkeley

Alex Kendall Wayve

Andrei Bursuc Valeo

Rares Ambrus TRI

Ted Xiao Google DeepMind

Sherry Yang Google DeepMind

Li Chen Shanghai AI Lab

Tutorial

Towards Building AGI in Autonomy and Robotics

In this tutorial, we explore the intersection of AGI technologies and the advancement of autonomous systems, specifically in the field of robotics. We invite participants to embark on an investigative journey that covers essential concepts, frameworks, and challenges. Through discussion, we aim to shed light on the crucial role of foundation models in enhancing the cognitive abilities of autonomous agents. Through cooperation, we aim to chart a path for the future of robotics, where the integration of AGI enables autonomous systems to push the limits of their capabilities and intelligence, ushering in a new era of intelligent autonomy.
Summit 447, June 18 Morning

Kristen Grauman UT Austin

Deva Ramanan CMU

Chelsea Finn Stanford

Kashyap Chitta University of Tübingen

Chonghao Sima Shanghai AI Lab

Social

How to Balance Academic Roles and Interest amidst the Wave of Emerging Technologies?

Although the past century witnessed an unprecedented expansion of scientific and technological knowledge, there are concerns that innovative activity is slowing. It is not uncommon for individuals to compromise their personal research interests in order to fulfill academic obligations, such as funding, service, etc. Nevertheless, preserving one's research interests is crucial for fostering diversity in research.
Summit Elliott Bay, June 19 17:00 - 19:00
Register here

Explore our participating talks

Workshop on Data-Driven Autonomous Driving Simulation

Real-world on-road testing of autonomous vehicles can be expensive or dangerous, making simulation a crucial tool to accelerate the development of safe autonomous driving (AD), a technology with enormous real-world impact. However, to minimise the sim-to-real gap, good agent behaviour models and sensor/perception imitation are paramount. A recent surge in published papers in this fast-growing field has led to a lot of progress, but several fundamental questions remain unanswered, for example regarding the fidelity and diversity of generative behaviour and perception models, generation of realistic controllable scenes at scale and the safety assessment of the simulation toolchain. In this workshop, our goal is to bring together practitioners and researchers from all areas of AD simulation and to discuss pressing challenges, recent breakthroughs and future directions.
Summit 342, June 18

Felix Heide Torc Robotics & Princeton University

Dragomir Anguelov Waymo

Chonghao Sima Shanghai AI Lab

VLADR: Vision and Language for Autonomous Driving and Robotics

The contemporary discourse in technological advancement underscores the increasingly intertwined roles of vision and language processing, especially within the realms of autonomous driving and robotics. The necessity for this symbiosis is apparent when considering the multifaceted dynamics of real-world environments.
Summit 345-346, June 18

Jitendra Malik UC Berkeley

Trevor Darrell UC Berkeley

Chonghao Sima Shanghai AI Lab

End-to-End Autonomy: A New Era of Self-Driving

This tutorial aims to dissect the complexities and nuances of end-to-end autonomy, covering theoretical foundations, practical implementations and challenges, and future directions of this evolving technology.
Summit 444, June 18 Afternoon

Jamie Shotton Wayve

Long Chen Wayve

Hongyang Li Shanghai AI Lab

The Sixth Workshop on Precognition: Seeing Through the Future

The workshop will discuss recent approaches and research trends not only in anticipating human behavior from videos, but also precognition in multiple other visual applications, such as medical imaging, health-care, human face aging prediction, early event prediction, autonomous driving forecasting, and so on.
Summit Elliott Bay, June 18 Afternoon

Louis Foucard Figure.ai

Monroe Kennedy III Stanford

Hongyang Li Shanghai AI Lab

Highlight

Generalized Predictive Model for Autonomous Driving

GenAD is the first large-scale video world model in the autonomous driving discipline. To improve the model's generalization ability, we acquire over 2000 hours of driving videos from the web, spanning areas all over the world with diverse weather conditions and traffic scenarios. Inheriting the merits of recent latent diffusion models, GenAD handles the challenging dynamics in driving scenes with novel temporal reasoning blocks. We show that it can generalize to various unseen driving datasets in a zero-shot manner. GenAD can be adapted into an action-conditioned prediction model or a motion planner, holding great potential for real-world driving applications. The OpenDV-YouTube Dataset is hosted here.
Venue: Poster Session 4 & Exhibit Hall (Arch 4A-E), June 20 17:15 - 18:45

Highlight

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

ViDAR is a pioneering multi-modal world model designed for autonomous driving. It predicts future point clouds from historical visual input by understanding 3D structure and temporal dynamics, which in turn benefits downstream tasks. All code, pre-trained models, and fine-tuned models are released here.
Venue: Poster Session 4 & Exhibit Hall (Arch 4A-E), June 20 17:15 - 18:45

Workshop Oral

DriveLM: Driving with Graph Visual Question Answering

In DriveLM, we study how vision-language models (VLMs) trained on web-scale data can be integrated into end-to-end driving systems to boost generalization and enable interactivity with human users. Specifically, we aim to facilitate Perception, Prediction, Planning, Behavior, and Motion tasks, with human-written reasoning logic as the connection between them. We propose the task of Graph Visual Question Answering (GVQA) to connect the QA pairs in a graph-style structure. To support this novel task, we provide DriveLM-Data. All code is released here.
Summit 345-346, June 18

Acknowledge our members for professional service

Many of our team members contributed to CVPR 2024; together we are building a more professional community to shape the future of AI. We sincerely thank all of them for their service:

Area Chair: Ping Luo and Hongyang Li

Reviewer: Li Chen, Chonghao Sima, Jiazhi Yang, Huijie Wang, Jia Zeng, Zetong Yang, Tianyu Li, Shenyuan Gao, Qingwen Bu, Bangjun Wang, Linyan Huang, and Yunsong Zhou

Great thanks to our sponsors

Diversity, equity, and inclusion

In keeping with the CVPR 2024 DEI statement, the organizers, speakers, and committee members of our events (the workshop, challenge, and tutorial) comprise a wide variety of researchers from both academia and industry, with different backgrounds, regions, genders, and ages.

Many of the greatest ideas come from a diverse mix of minds, backgrounds, and experiences. We provide equal opportunities to all participants without regard to nationality, affiliation, race, religion, color, age, disability, or any other restriction. We believe diversity drives innovation. When we say we welcome participation from everyone, we mean everyone.

OpenDriveLab at CVPR 2024

June 17 - 21, Seattle, USA
