[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving; [CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

This repository contains the code for the PAMI 2023 paper TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving. This work is a journal extension of the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. The code for the CVPR 2021 paper is available in the cvpr2021 branch.

If you find our code or papers useful, please cite:

@article{Chitta2023PAMI,
  author = {Chitta, Kashyap and
            Prakash, Aditya and
            Jaeger, Bernhard and
            Yu, Zehao and
            Renz, Katrin and
            Geiger, Andreas},
  title = {TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving},
  journal = {Pattern Analysis and Machine Intelligence (PAMI)},
  year = {2023},
}
@inproceedings{Prakash2021CVPR,
  author = {Prakash, Aditya and
            Chitta, Kashyap and
            Geiger, Andreas},
  title = {Multi-Modal Fusion Transformer for End-to-End Autonomous Driving},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}

Also, check out the code for other recent work on CARLA from our group:

Contents

  1. Setup
  2. Dataset and Training
  3. Evaluation

Setup

Clone the repo, set up CARLA 0.9.10.1, and build the conda environment:

git clone https://github.com/autonomousvision/transfuser.git
cd transfuser
git checkout 2022
chmod +x setup_carla.sh
./setup_carla.sh
conda env create -f environment.yml
conda activate tfuse
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu102.html
pip install mmcv-full==1.5.3 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.11.0/index.html

Dataset and Training

Our dataset is generated via a privileged agent which we call the autopilot (/team_code_autopilot/autopilot.py) in 8 CARLA towns using the routes and scenario files provided in this folder. See the tools/dataset folder for detailed documentation regarding the training routes and scenarios. You can download the dataset (210GB) by running:

chmod +x download_data.sh
./download_data.sh

The dataset is structured as follows:

- Scenario
    - Town
        - Route
            - rgb: camera images
            - depth: corresponding depth images
            - semantics: corresponding segmentation images
            - lidar: 3d point cloud in .npy format
            - topdown: topdown segmentation maps
            - label_raw: 3d bounding boxes for vehicles
            - measurements: contains ego-agent's position, velocity and other metadata
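
After downloading, it can be worth checking that every route directory contains the sub-folders listed above. The following is a minimal sketch, not part of the repository: the function name `find_incomplete_routes` is hypothetical, and it assumes the three-level Scenario/Town/Route nesting shown in the listing.

```python
from pathlib import Path

# Sub-folders each route directory should contain, per the layout above.
EXPECTED_SUBDIRS = (
    "rgb", "depth", "semantics", "lidar",
    "topdown", "label_raw", "measurements",
)


def find_incomplete_routes(dataset_root):
    """Return {route_path: [missing sub-folders]} for incomplete routes.

    Assumption: routes sit exactly three directory levels below
    dataset_root (Scenario/Town/Route).
    """
    incomplete = {}
    for route in sorted(Path(dataset_root).glob("*/*/*")):
        if not route.is_dir():
            continue
        missing = [name for name in EXPECTED_SUBDIRS
                   if not (route / name).is_dir()]
        if missing:
            incomplete[str(route)] = missing
    return incomplete
```

Calling `find_incomplete_routes("/path/to/dataset_root/")` returns an empty dictionary when every route has all seven sub-folders.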

Data generation

In addition to the dataset itself, we have provided the scripts for data generation with our autopilot agent. To generate data, the first step is to launch a CARLA server:

./CarlaUE4.sh --world-port=2000 -opengl

For more information on running CARLA servers (e.g. on a machine without a display), see the official documentation. Once the server is running, use the script below for generating training data:

./leaderboard/scripts/datagen.sh <carla root> <working directory of this repo (*/transfuser/)>

The main variables to set for this script are SCENARIOS and ROUTES.

Training script

The code for training via imitation learning is provided in train.py.
A minimal example of running the training script on a single machine:

cd team_code_transfuser
python train.py --batch_size 10 --logdir /path/to/logdir --root_dir /path/to/dataset_root/ --parallel_training 0

The training script has many more useful features documented at the start of the main function. One of them is parallel training. The script has to be started differently when training on a multi-GPU node:

cd team_code_transfuser
CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=16 OPENBLAS_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node=2 --max_restarts=0 --rdzv_id=1234576890 --rdzv_backend=c10d train.py --logdir /path/to/logdir --root_dir /path/to/dataset_root/ --parallel_training 1

Enumerate the GPUs you want to train on with CUDA_VISIBLE_DEVICES. Set the variable OMP_NUM_THREADS to the number of CPUs available on your system. Set OPENBLAS_NUM_THREADS=1 if you want to avoid threads spawning other threads. Set --nproc_per_node to the number of available GPUs on your node.
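
To keep CUDA_VISIBLE_DEVICES and --nproc_per_node consistent, the multi-GPU command can be assembled from a single list of GPU ids. This is a sketch under that assumption: `build_torchrun_command` is a hypothetical helper, not part of the repository, and it uses only the flags and variables that appear in the command above.

```python
import shlex


def build_torchrun_command(gpu_ids, num_cpus, logdir, root_dir):
    """Return (env_overrides, argv) mirroring the multi-GPU command above."""
    env = {
        # GPUs visible to the training processes, e.g. "0,1".
        "CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids),
        "OMP_NUM_THREADS": str(num_cpus),
        "OPENBLAS_NUM_THREADS": "1",
    }
    argv = [
        "torchrun",
        "--nnodes=1",
        # One process per listed GPU.
        f"--nproc_per_node={len(gpu_ids)}",
        "--max_restarts=0",
        "--rdzv_id=1234576890",
        "--rdzv_backend=c10d",
        "train.py",
        "--logdir", logdir,
        "--root_dir", root_dir,
        "--parallel_training", "1",
    ]
    return env, argv


if __name__ == "__main__":
    env, argv = build_torchrun_command(
        [0, 1], 16, "/path/to/logdir", "/path/to/dataset_root/")
    prefix = " ".join(f"{k}={v}" for k, v in env.items())
    print(prefix, shlex.join(argv))
```

To launch training, the result could be passed to `subprocess.run(argv, env={**os.environ, **env}, cwd="team_code_transfuser", check=True)`.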

The evaluation agent file is built to evaluate models trained with multiple GPUs. If you want to evaluate a model trained with a single GPU, you need to remove this line.
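
The line in question is not reproduced here. One common cause of such a difference is that PyTorch's DistributedDataParallel wrapper stores parameters under a `module.` key prefix, so checkpoints saved from a wrapped model use different key names than those saved from an unwrapped one. Assuming that is the reason (an assumption, not confirmed by the text above), a prefix-tolerant alternative could look like this sketch. The helper `strip_ddp_prefix` is hypothetical.

```python
def strip_ddp_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from keys.

    Keys without the prefix are kept unchanged, so the same code handles
    checkpoints from both single-GPU and multi-GPU training.
    Assumption: the only difference between the two is the key prefix.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }
```

The cleaned dictionary could then be passed to the model's `load_state_dict` in place of the raw checkpoint.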

Evaluation

Longest6 benchmark

We make some minor modifications to the CARLA leaderboard code for the Longest6 benchmark, which are documented here. See the leaderboard/data/longest6 folder for a description of Longest6 and how to evaluate on it.

Pretrained agents

Pre-trained agent files for all 4 methods can be downloaded from AWS:

mkdir model_ckpt
wget https://s3.eu-central-1.amazonaws.com/avg-projects/transfuser/models_2022.zip -P model_ckpt
unzip model_ckpt/models_2022.zip -d model_ckpt/
rm model_ckpt/models_2022.zip

Running an agent

To evaluate a model, we first launch a CARLA server:

./CarlaUE4.sh --world-port=2000 -opengl

Once the CARLA server is running, evaluate an agent with the script:

./leaderboard/scripts/local_evaluation.sh <carla root> <working directory of this repo (*/transfuser/)>

By editing the arguments in local_evaluation.sh, we can benchmark performance on the Longest6 routes. You can evaluate both privileged agents (such as autopilot.py) and sensor-based models. To evaluate the sensor-based models, use submission_agent.py as the TEAM_AGENT and set TEAM_CONFIG to the folder containing the downloaded model weights. The code is automatically configured to use the correct method based on the args.txt file in the model folder.

You can look at qualitative examples of the expected driving behavior of TransFuser on the Longest6 routes here.

Parsing longest6 results

To compute additional statistics from the results of evaluation runs, we provide a parser script, tools/result_parser.py.

${WORK_DIR}/tools/result_parser.py --xml ${WORK_DIR}/leaderboard/data/longest6/longest6.xml --results /path/to/folder/with/json_results/ --save_dir /path/to/output --town_maps ${WORK_DIR}/leaderboard/data/town_maps_xodr

It will generate a results.csv file containing the average results of the run as well as additional statistics. It also generates town maps and marks the locations where infractions occurred.

Submitting to the CARLA leaderboard

To submit to the CARLA leaderboard, you need docker installed on your system. Edit the paths at the start of make_docker.sh. Create the folder team_code_transfuser/model_ckpt/transfuser. Copy the model.pth files and args.txt that you want to evaluate to team_code_transfuser/model_ckpt/transfuser. If you want to evaluate an ensemble, simply copy multiple .pth files into the folder. The code will load all of them and ensemble the predictions.
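
The idea of loading every .pth file in the folder and ensembling the predictions can be illustrated with the sketch below. How the repository actually combines predictions is not stated here. The element-wise mean is an assumption made for illustration, and both helper names are hypothetical.

```python
from pathlib import Path


def list_checkpoints(model_dir):
    """Return all .pth files in model_dir, sorted for a stable load order."""
    return sorted(Path(model_dir).glob("*.pth"))


def average_predictions(per_model_predictions):
    """Element-wise mean over equally long prediction lists, one per model.

    Assumption: averaging is used to combine the models' outputs.
    """
    if not per_model_predictions:
        raise ValueError("need predictions from at least one model")
    n = len(per_model_predictions)
    return [sum(values) / n for values in zip(*per_model_predictions)]
```

With a single checkpoint in the folder, the average reduces to that model's own predictions, so the same code path covers both the single-model and ensemble cases.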

cd leaderboard
cd scripts
./make_docker.sh

The script will create a docker image with the name transfuser-agent. Follow the instructions on the leaderboard to make an account and install alpha.

alpha login
alpha benchmark:submit --split 3 transfuser-agent:latest

The command will upload the docker image to the cloud and evaluate it.