Mingzhe Guo1,2, Yixiang Yang1, Chuanrong Han1, Rufeng Zhang2, Shirui Li2, Ji Wan2, Zhipeng Zhang1 ✉
1 AutoLab, School of Artificial Intelligence, Shanghai Jiao Tong University, 2 Baidu Inc.
✉ corresponding author: zhipeng.zhang.cv@outlook.com
Accepted to ICLR 2026!
This repository contains the official implementation of FlowAD, a novel ego-scene interactive modeling framework for autonomous driving. Unlike traditional approaches that treat each timestamp in isolation, FlowAD explicitly models the feedback of ego-vehicle motion to future observations, fundamentally improving the understanding of the driving process and enhancing planning capabilities.
The FlowAD architecture is structured around three core components: 1) ego-guided scene partition, 2) spatial and temporal flow prediction, and 3) task-aware enhancement.
Inspired by human perception, FlowAD represents ego-scene interaction as scene flow relative to the ego-vehicle, capturing relative motion as learnable scene flow within the latent feature space. This enables modeling ego-motion feedback using existing log-replay datasets without requiring complex scenario simulations.
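The idea of scene flow relative to the ego-vehicle can be illustrated with a toy example. The sketch below is not FlowAD's implementation: the function `warp_by_flow`, the integer per-cell flow, and the 3x3 grid are all hypothetical, and the flow is set by hand rather than learned in a latent feature space. It only shows how a feature grid shifts relative to the ego when the ego moves.

```python
from typing import List, Tuple

Grid = List[List[float]]
Flow = List[List[Tuple[int, int]]]  # per-cell (row shift, column shift)


def warp_by_flow(features: Grid, flow: Flow, fill: float = 0.0) -> Grid:
    """Backward-warp a 2D grid: out[r][c] = features[r - dr][c - dc].

    Hypothetical helper for illustration only. Cells whose source
    falls outside the grid receive `fill`.
    """
    rows, cols = len(features), len(features[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dr, dc = flow[r][c]
            src_r, src_c = r - dr, c - dc
            if 0 <= src_r < rows and 0 <= src_c < cols:
                out[r][c] = features[src_r][src_c]
    return out


# Toy scene: one static object ahead of the ego (row 0 is "far ahead").
features = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

# Assumption: the ego advances by one cell, so every static cell appears
# to move one row toward the ego in the ego-centric frame.
flow = [[(1, 0)] * 3 for _ in range(3)]

warped = warp_by_flow(features, flow)
for row in warped:
    print(row)
```

In this toy setting the object moves from row 0 to row 1, which is the kind of ego-induced relative motion that log-replay data already contains.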
Key Achievements:
- 19% collision rate reduction over SparseDrive on nuScenes
- 60% improvement in FCP, our proposed metric (1.39 frames over SparseDrive), on the nuScenes validation set
- 51.77 driving score on Bench2Drive closed-loop evaluation
- Demonstrated generality across perception, end-to-end planning, and VLM analysis
Our method achieves significant improvements across multiple tasks:
| Method | Backbone | Detection mAP↑ | Detection NDS↑ | Tracking AMOTA↑ | Tracking AMOTP↓ | Motion Prediction minADE↓ | Motion Prediction minFDE↓ | Planning Avg. L2 (m)↓ | Planning Avg. Col.↓ | FCP↓ |
|---|---|---|---|---|---|---|---|---|---|---|
| UniAD | ResNet101 | 0.380 | 0.498 | 0.359 | 1.320 | 0.71 | 1.02 | 0.69 | 0.12 | 2.96 |
| SparseDrive | ResNet101 | 0.496 | 0.588 | 0.501 | 1.085 | 0.60 | 0.96 | 0.58 | 0.06 | 2.30 |
| FlowAD (Ours) | ResNet101 | 0.523 | 0.605 | 0.518 | 1.040 | 0.56 | 0.93 | 0.52 | 0.05 | 0.91 |
FCP (Frames before Correct Planning): lower is better. FlowAD reduces FCP from 2.30 to 0.91 frames relative to SparseDrive, a 1.39-frame improvement (about 60%), and from 2.96 to 0.91 frames relative to UniAD, a 2.05-frame improvement (about 69%).
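As a quick check, the FCP reductions can be recomputed from the FCP column of the table above. The helper name `reduction` is illustrative and not part of this repository.

```python
def reduction(baseline: float, ours: float) -> tuple:
    """Return (absolute, relative) reduction of a lower-is-better metric."""
    absolute = baseline - ours
    return absolute, absolute / baseline


# FCP values copied from the results table (frames, lower is better).
fcp = {"UniAD": 2.96, "SparseDrive": 2.30, "FlowAD": 0.91}

gains = {
    name: reduction(fcp[name], fcp["FlowAD"])
    for name in ("UniAD", "SparseDrive")
}

for name, (absolute, relative) in gains.items():
    print(f"{name}: {absolute:.2f} frames ({relative:.0%})")
# UniAD: 2.05 frames (69%)
# SparseDrive: 1.39 frames (60%)
```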
FlowAD demonstrates superior detection of occluded objects, small targets, and dense scenes through learned scene flow dynamics.
On the Bench2Drive closed-loop benchmark, FlowAD achieves a driving score of 51.77, demonstrating robust closed-loop performance.
- Clone this repository:

      git clone https://github.com/your-repo/FlowAD.git
      cd FlowAD

- Navigate to the desired sub-project:

      cd SparseDrive-Flow

- Follow the sub-project's README for environment setup and training/evaluation.
If you find this work useful in your research, please consider citing:
@inproceedings{FlowAD2026,
title={FlowAD: Ego-Scene Interactive Modeling for Autonomous Driving},
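author={Guo, Mingzhe and Yang, Yixiang and Han, Chuanrong and Zhang, Rufeng and Li, Shirui and Wan, Ji and Zhang, Zhipeng},
booktitle={International Conference on Learning Representations (ICLR)},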
year={2026},
url={https://openreview.net/pdf?id=m4JpoJRgAr}
}

This work builds upon several excellent open-source projects:
- SparseDrive - End-to-end autonomous driving
This project is released under the Apache 2.0 license. Please see the LICENSE files in each sub-project for more details.