OpenDriveLab


      Publications

        • Editor's Pick
        • End-to-end Autonomous Driving
        • Bird's-eye-view Perception
        • Prediction and Planning
        • Computer Vision at Large

        We position OpenDriveLab as one of the top research teams worldwide: we have talented people and publish work at top venues.

        Editor's Pick



        Planning-oriented Autonomous Driving

        Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Hongyang Li, et al.
        CVPR 2023 Award Candidate GitHub [Sell a New Philosophy]
        An up-to-date, comprehensive framework that incorporates full-stack driving tasks in one network.


        BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

        Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, et al.
        ECCV 2022 GitHub [Baseline] [nuScenes First Place] [Waymo Challenge 2022 Official First Place]

        A paradigm for autonomous driving that applies both Transformer and Temporal structure to generate BEV features.
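        The core idea behind BEV feature generation is that each cell of the bird's-eye-view grid should only gather image features from cameras that actually see its location. Below is a minimal pure-Python sketch of that visibility test using a simplified pinhole model; the function name, camera yaw/FOV parameters, and image width are illustrative assumptions, not BEVFormer's actual implementation.

```python
import math

def project_to_camera(point_xyz, cam_yaw_deg, fov_deg=90.0, image_width=800):
    """Project a 3D point (ego coordinates) into a pinhole camera facing
    `cam_yaw_deg`. Returns the pixel column, or None when the point is
    behind the camera or outside its field of view. (Toy model.)"""
    x, y, z = point_xyz
    yaw = math.radians(cam_yaw_deg)
    # Rotate the point into the camera frame (camera looks along +x').
    cx = math.cos(yaw) * x + math.sin(yaw) * y
    cy = -math.sin(yaw) * x + math.cos(yaw) * y
    if cx <= 0:  # behind the camera
        return None
    angle = math.degrees(math.atan2(cy, cx))
    if abs(angle) > fov_deg / 2:  # outside the horizontal FOV
        return None
    focal = (image_width / 2) / math.tan(math.radians(fov_deg / 2))
    return image_width / 2 - focal * (cy / cx)

# A BEV cell 10 m straight ahead is visible to the front camera
# but not to a rear-facing one.
bev_cell = (10.0, 0.0, 0.0)
print(project_to_camera(bev_cell, cam_yaw_deg=0))    # 400.0 (image center)
print(project_to_camera(bev_cell, cam_yaw_deg=180))  # None
```

        In the full model, a BEV query attends (via deformable attention) to multi-scale image features around each such projected location, rather than reading a single pixel.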


        PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark

        Li Chen, Chonghao Sima, Yang Li, Xiangwei Geng, Junchi Yan, et al.
        ECCV 2022 (Oral) GitHub [Redefine the Community]
        PersFormer adopts a unified 2D/3D anchor design and an auxiliary task to detect 2D/3D lanes; we release one of the first large-scale real-world 3D lane datasets, OpenLane.



        End-to-end Autonomous Driving


        Policy Pre-Training for End-to-End Autonomous Driving via Self-Supervised Geometric Modeling

        Penghao Wu, Li Chen, Hongyang Li, Xiaosong Jia, Junchi Yan, Yu Qiao
        ICLR 2023 GitHub
        An intuitive, fully self-supervised framework designed for policy pre-training in visuomotor driving.


        Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline

        Penghao Wu, Xiaosong Jia, Li Chen, Junchi Yan, Hongyang Li, Yu Qiao
        NeurIPS 2022 GitHub [Carla First Place]
        An initial attempt to combine a controller guided by a planned trajectory with direct control prediction.


        ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

        Shengchao Hu, Li Chen, Penghao Wu, Hongyang Li, et al.
        ECCV 2022 GitHub
        A spatial-temporal feature learning scheme that yields more representative features for perception, prediction, and planning tasks simultaneously.



        Bird's-eye-view Perception


        Delving into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe

        Hongyang Li, Chonghao Sima, Jifeng Dai, Wenhai Wang, Lewei Lu, et al.
        arXiv 2022 GitHub [Setup the Table]
        We review the most recent work on BEV perception and provide an analysis of the different solutions.


        BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision

        Chenyu Yang, Xizhou Zhu, Hongyang Li, Jifeng Dai, et al.
        CVPR 2023 Highlight
        A novel bird's-eye-view (BEV) detector with perspective supervision, which converges faster and better suits modern image backbones.



        Prediction and Planning


        Towards Capturing the Temporal Dynamics for Trajectory Prediction: a Coarse-to-Fine Approach

        Xiaosong Jia, Li Chen, Penghao Wu, Jia Zeng, et al.
        CoRL 2022
        We find that a refinement module built on structures with a temporal prior, taking coarse trajectories generated by an MLP as input, can boost prediction accuracy.
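        The coarse-to-fine pipeline can be illustrated with a toy two-stage sketch: a first stage proposes a trajectory with no temporal structure, and a second stage enforces smoothness across neighbouring timesteps. Both stage implementations below (straight-line extrapolation, moving average) are hypothetical stand-ins for the paper's MLP and temporal refinement module.

```python
def coarse_trajectory(start, velocity, steps):
    """Stage 1 stand-in: straight-line extrapolation that treats each
    timestep independently, like an MLP decoding all points at once."""
    return [(start[0] + velocity[0] * t, start[1] + velocity[1] * t)
            for t in range(1, steps + 1)]

def refine(traj, window=3):
    """Stage 2 stand-in: a moving average over neighbouring timesteps,
    standing in for a refinement module with a temporal prior."""
    out = []
    for i in range(len(traj)):
        lo = max(0, i - window // 2)
        hi = min(len(traj), i + window // 2 + 1)
        xs = [p[0] for p in traj[lo:hi]]
        ys = [p[1] for p in traj[lo:hi]]
        out.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return out

coarse = coarse_trajectory((0.0, 0.0), (1.0, 0.5), steps=4)
refined = refine(coarse)  # same length, temporally smoothed
```

        The point of the design is the division of labour: the cheap first stage fixes the rough shape, so the second stage only has to model local temporal consistency.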


        HDGT: Heterogeneous driving graph transformer for multi-agent trajectory prediction via scene encoding

        Xiaosong Jia, Penghao Wu, Li Chen, Hongyang Li, Yu Liu, Junchi Yan
        arXiv 2022 GitHub
        HDGT formulates the driving scene as a heterogeneous graph with different types of nodes and edges.
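        The heterogeneous-graph formulation can be sketched in a few lines: nodes and edges each carry a type, so a downstream model can apply type-specific encoders per relation. The class name, node types, and edge-type labels below are illustrative assumptions, not HDGT's actual schema.

```python
from collections import defaultdict

class HeteroSceneGraph:
    """Toy heterogeneous scene graph: typed nodes (vehicle, pedestrian,
    lane, ...) connected by typed edges, one bucket per relation."""

    def __init__(self):
        self.nodes = {}                 # node id -> node type
        self.edges = defaultdict(list)  # edge type -> [(src, dst), ...]

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, src, dst, edge_type):
        self.edges[edge_type].append((src, dst))

# Build a tiny driving scene with two relation types.
g = HeteroSceneGraph()
g.add_node("ego", "vehicle")
g.add_node("ped1", "pedestrian")
g.add_node("lane3", "lane")
g.add_edge("ego", "lane3", "vehicle_on_lane")
g.add_edge("ped1", "ego", "pedestrian_near_vehicle")
```

        Keeping relations in separate typed buckets is what lets a transformer over this graph learn different interaction functions for, say, vehicle-lane versus pedestrian-vehicle pairs.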



        Computer Vision at Large


        Stare at What You See: Masked Image Modeling without Reconstruction

        Hongwei Xue, Peng Gao, Hongyang Li, et al.
        CVPR 2023 GitHub
        An efficient MIM paradigm, MaskAlign, together with a Dynamic Alignment module that applies learnable alignment to tackle input inconsistency.


      Copyright © 2023 - All Rights Reserved - OpenDriveLab