
VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos (CVPR 2023)



Installation

sudo apt install libsparsehash-dev
conda env create -f environment.yaml
conda activate visfusion
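
To quickly verify the setup, here is a minimal sanity check (a sketch; it assumes environment.yaml installs PyTorch and torchsparse, which is what the libsparsehash-dev build dependency above is for):

# Environment sanity check: confirms the two core dependencies import and
# reports whether a CUDA device is visible.
import torch
import torchsparse

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torchsparse:", torchsparse.__version__)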

ScanNet Dataset

We use the same input data structure as NeuralRecon. You can download and extract the ScanNet v2 dataset by following the instructions at http://www.scan-net.org/ or by using the scannet_wrangling_scripts provided by SimpleRecon.

Expected directory structure of ScanNet:

DATAROOT
└───scannet
│   └───scans
│   |   └───scene0000_00
│   |       └───color
│   |       │   │   0.jpg
│   |       │   │   1.jpg
│   |       │   │   ...
│   |       │   ...
│   └───scans_test
│   |   └───scene0707_00
│   |       └───color
│   |       │   │   0.jpg
│   |       │   │   1.jpg
│   |       │   │   ...
│   |       │   ...
|   └───scannetv2_test.txt
|   └───scannetv2_train.txt
|   └───scannetv2_val.txt
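
Before generating fragments, you can sanity-check this layout with a small helper (hypothetical code, not part of the repo; DATAROOT is a placeholder for your data root):

import os

def check_split(dataroot, split_file, scans_dir):
    # Read scene IDs from the split file and verify each scene has a color/ folder.
    with open(os.path.join(dataroot, "scannet", split_file)) as f:
        scenes = [line.strip() for line in f if line.strip()]
    missing = [s for s in scenes
               if not os.path.isdir(os.path.join(dataroot, "scannet", scans_dir, s, "color"))]
    print(f"{split_file}: {len(scenes) - len(missing)}/{len(scenes)} scenes found")

check_split("DATAROOT", "scannetv2_train.txt", "scans")
check_split("DATAROOT", "scannetv2_val.txt", "scans")
check_split("DATAROOT", "scannetv2_test.txt", "scans_test")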

Then generate the input fragments and the ground truth TSDFs for the training/val data split by

python tools/tsdf_fusion/generate_gt.py --data_path PATH_TO_SCANNET \
                                        --save_name all_tsdf_9 \
                                        --window_size 9

and for the test split by

python tools/tsdf_fusion/generate_gt.py --test \
                                        --data_path PATH_TO_SCANNET \
                                        --save_name all_tsdf_9 \
                                        --window_size 9
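
Equivalently, both runs can be driven from one script (a sketch of the two commands above; PATH_TO_SCANNET is a placeholder, as before):

import subprocess

# Shared arguments for tools/tsdf_fusion/generate_gt.py, mirroring the commands above.
base = ["python", "tools/tsdf_fusion/generate_gt.py",
        "--data_path", "PATH_TO_SCANNET",
        "--save_name", "all_tsdf_9",
        "--window_size", "9"]

subprocess.run(base, check=True)               # training/val split
subprocess.run(base + ["--test"], check=True)  # test split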

Example data

We provide an example ScanNet scene (scene0785_00) to quickly try out the code. Download it from here and unzip it into the main directory of the project code.

The reconstructed meshes will be saved to PROJECT_PATH/results.

python main.py --cfg ./config/test.yaml \
               SCENE scene0785_00 \
               TEST.PATH ./example_data/ScanNet \
               LOGDIR ./checkpoints \
               LOADCKPT pretrained/model_000049.ckpt

By default, it outputs double-layer meshes (for NeuralRecon's evaluation). Set MODEL.SINGLE_LAYER_MESH=True to directly output single-layer meshes for TransformerFusion's evaluation.

python main.py --cfg ./config/test.yaml \
               SCENE scene0785_00 \
               TEST.PATH ./example_data/ScanNet \
               LOGDIR ./checkpoints \
               LOADCKPT pretrained/model_000049.ckpt \
               MODEL.SINGLE_LAYER_MESH True
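
To take a quick look at a reconstructed mesh, something like the sketch below works (it uses trimesh, which is not a stated dependency, and the output filename is hypothetical; check ./results for the actual path):

import trimesh

# Hypothetical path -- the directory name matches the evaluation commands below,
# but the per-scene filename may differ in your run.
mesh = trimesh.load("results/scene_scannet_checkpoints_fusion_eval_49/scene0785_00.ply")
print("vertices:", mesh.vertices.shape, "faces:", mesh.faces.shape)
mesh.show()  # interactive viewer; requires pyglet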

Training

Change TRAIN.PATH to your own data path in config/train.yaml and start training by running ./train.sh.

train.sh:

#!/usr/bin/env bash
export CUDA_VISIBLE_DEVICES=0

python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 20 MODEL.FUSION.FUSION_ON False
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 41
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 44 TRAIN.FINETUNE_LAYER 0 MODEL.PASS_LAYERS 0
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 47 TRAIN.FINETUNE_LAYER 1 MODEL.PASS_LAYERS 1
python main.py --cfg ./config/train.yaml TRAIN.EPOCHS 50 TRAIN.FINETUNE_LAYER 2 MODEL.PASS_LAYERS 2

The training is separated into five phases (a sketch of how the KEY VALUE overrides above are merged follows the list):

  • Phase 1 (epochs 1-20): train on single fragments. MODEL.FUSION.FUSION_ON=False

  • Phase 2 (epochs 21-41): train the whole model with GRUFusion.

  • Phase 3 (epochs 42-44): finetune the first layer with GRUFusion. TRAIN.FINETUNE_LAYER=0, MODEL.PASS_LAYERS=0

  • Phase 4 (epochs 45-47): finetune the second layer with GRUFusion. TRAIN.FINETUNE_LAYER=1, MODEL.PASS_LAYERS=1

  • Phase 5 (epochs 48-50): finetune the third layer with GRUFusion. TRAIN.FINETUNE_LAYER=2, MODEL.PASS_LAYERS=2
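
The KEY VALUE pairs after --cfg override entries in the YAML config. Here is a minimal sketch of how such overrides are merged, assuming a yacs-style CfgNode (which the override syntax above suggests; the option defaults here are illustrative only):

from yacs.config import CfgNode as CN

# Toy config with the option names used in train.sh.
cfg = CN()
cfg.TRAIN = CN()
cfg.TRAIN.EPOCHS = 50
cfg.MODEL = CN()
cfg.MODEL.FUSION = CN()
cfg.MODEL.FUSION.FUSION_ON = True

# Equivalent of: python main.py --cfg ... TRAIN.EPOCHS 20 MODEL.FUSION.FUSION_ON False
cfg.merge_from_list(["TRAIN.EPOCHS", "20", "MODEL.FUSION.FUSION_ON", "False"])
print(cfg.TRAIN.EPOCHS, cfg.MODEL.FUSION.FUSION_ON)  # -> 20 False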

Test

Change TEST.PATH to your own data path in config/test.yaml and start testing by running

python main.py --cfg ./config/test.yaml

Evaluation

We use NeuralRecon's evaluation for our main results.

python tools/evaluation.py --model ./results/scene_scannet_checkpoints_fusion_eval_49 --n_proc 16

You can print previously saved evaluation results with

python tools/visualize_metrics.py --model ./results/scene_scannet_checkpoints_fusion_eval_49

Here are the 3D metrics on ScanNet produced by the provided checkpoint using NeuralRecon's evaluation (Acc, Comp, and Chamfer in cm):

Acc ↓    Comp ↓    Chamfer ↓    Prec ↑    Recall ↑    F-Score ↑
5.6      10.0      7.80         0.694     0.537       0.604

and here are the metrics using TransformerFusion's evaluation (set MODEL.SINGLE_LAYER_MESH=True to output single-layer meshes):

Acc ↓    Comp ↓    Chamfer ↓    Prec ↑    Recall ↑    F-Score ↑
4.10     8.66      6.38         0.757     0.588       0.660
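
As a consistency check, the Chamfer and F-Score columns follow the standard definitions (Chamfer as the mean of accuracy and completeness, F-Score as the harmonic mean of precision and recall):

\text{Chamfer} = \tfrac{1}{2}(\text{Acc} + \text{Comp}), \qquad
\text{F-Score} = \frac{2 \cdot \text{Prec} \cdot \text{Recall}}{\text{Prec} + \text{Recall}}

For the first row, (5.6 + 10.0)/2 = 7.8 and 2(0.694)(0.537)/(0.694 + 0.537) ≈ 0.605.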

ARKit data

To try with your own data captured from ARKit, please refer to NeuralRecon's DEMO.md for more details.

python test_scene.py --cfg ./config/test_scene.yaml \
                     DATASET ARKit \
                     TEST.PATH ./example_data/ARKit_scan \
                     LOADCKPT pretrained/model_000049.ckpt

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{gao2023visfusion,
  title={VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos},
  author={Gao, Huiyu and Mao, Wei and Liu, Miaomiao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={17317--17326},
  year={2023}
}

Acknowledgment

This repository is partly based on the repo NeuralRecon. Many thanks to Jiaming Sun for the great code!
