Development of a depth estimation model based on a UNET architecture - connection of Bi-directional Feature Pyramid Network (BIFPN) and EfficientNet.

kzaleskaa/depth-estimation-with-compression

Depth Estimation (BiFPN + EfficientNet)

PyTorch Lightning Config: Hydra Template

Description

This project develops and optimizes a depth estimation model based on a U-Net architecture that combines a Bi-directional Feature Pyramid Network (BiFPN) with EfficientNet components. The model is trained on the NYU Depth V2 dataset and evaluated with the Structural Similarity Index (SSIM) metric.
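To make the evaluation metric concrete, here is a minimal sketch of SSIM computed over a whole image as a single window. This is a simplification for illustration: practical SSIM implementations average the score over local (usually Gaussian) windows, and this README does not say which implementation the project uses. The function name `global_ssim` and the sample values are hypothetical.

```python
def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM over two equally sized flat pixel lists, treated as one window."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and have the same length")
    n = len(x)
    # Stabilizing constants from the standard SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 depth maps, flattened and scaled to [0, 1].
target = [0.1, 0.5, 0.9, 0.3]
prediction = [0.12, 0.48, 0.85, 0.33]
score = global_ssim(prediction, target)
```

Identical inputs score 1, and the score drops as the prediction's structure diverges from the target.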

Installation

Pip

# clone project
git clone https://github.com/kzaleskaa/depth-estimation-with-compression
cd depth-estimation-with-compression

# [OPTIONAL] create conda environment
conda create -n myenv python=3.11
conda activate myenv

# install pytorch according to instructions
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt

Conda

# clone project
git clone https://github.com/kzaleskaa/depth-estimation-with-compression
cd depth-estimation-with-compression

# create conda environment and install dependencies
conda env create -f environment.yaml -n myenv

# activate conda environment
conda activate myenv

How to run

Train model with default configuration

# train on CPU
python src/train.py trainer=cpu

# train on GPU
python src/train.py trainer=gpu

Train model with chosen experiment configuration from configs/experiment/

python src/train.py experiment=experiment_name.yaml

You can override any parameter from the command line like this:

python src/train.py trainer.max_epochs=20 data.batch_size=64
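The dotted overrides above address nested keys in the configuration. The sketch below shows that mapping on a plain dictionary. It is not Hydra's implementation, it covers only `key.path=value` overrides (not config group selection such as `trainer=gpu`), and the default values in `defaults` are hypothetical.

```python
import copy


def parse_value(raw):
    """Interpret a raw override value as int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_overrides(config, overrides):
    """Return a copy of `config` with 'dotted.key=value' overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            # Walk down the nested dicts, creating levels that do not exist yet.
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return result


# Hypothetical defaults; the real values live in the project's configs/ directory.
defaults = {"trainer": {"max_epochs": 10}, "data": {"batch_size": 32}}
merged = apply_overrides(defaults, ["trainer.max_epochs=20", "data.batch_size=64"])
```

After the call, `merged["trainer"]["max_epochs"]` is 20 and `merged["data"]["batch_size"]` is 64, while `defaults` is left unchanged.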

Results for BiFPN + FFNet

The base model was trained for 25 epochs. Quantization-aware training (QAT) was performed for 10 epochs.

Baseline and Fuse

| Method   | test/ssim (Per tensor) | model size (MB) (Per tensor) |
|----------|------------------------|------------------------------|
| baseline | 0.778                  | 3.53                         |
| fuse     | 0.778                  | 3.45                         |
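The README does not define "fuse". Assuming it refers to folding batch normalization into the preceding layer (a common pre-quantization step), the sketch below shows the arithmetic on a small linear layer, which behaves like a 1x1 convolution at a single pixel. All names and numbers are hypothetical. Folding removes the separate batch-norm parameters without changing the computed function, which would be consistent with an unchanged SSIM and a slightly smaller model.

```python
import math


def linear(weight, bias, x):
    """One output per row of `weight`: dot(row, x) + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm using per-channel running statistics."""
    return [
        (yi - m) / math.sqrt(v + eps) * g + bt
        for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)
    ]


def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into the preceding layer's weight and bias."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        fused_w.append([w * scale for w in row])
        fused_b.append((b - m) * scale + bt)
    return fused_w, fused_b


# Hypothetical layer with 2 inputs and 2 output channels.
weight = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [0.9, 1.6]
x = [1.0, 2.0]

reference = batchnorm(linear(weight, bias, x), gamma, beta, mean, var)
fused_w, fused_b = fold_batchnorm(weight, bias, gamma, beta, mean, var)
fused = linear(fused_w, fused_b, x)
```

Here `fused` matches `reference` up to floating-point rounding, so the fused layer can replace the layer and its batch norm.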

PTQ, QAT, and PTQ + QAT (Per tensor and Per channel)

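| Method    | test/ssim (Per tensor) | model size (MB) (Per tensor) | test/ssim (Per channel) | model size (MB) (Per channel) |
|-----------|------------------------|------------------------------|-------------------------|-------------------------------|
| ptq       | 0.6480                 | 0.96791                      | 0.6518                  | 0.9679                        |
| qat       | 0.7715                 | 0.96791                      | 0.7627                  | 0.9681                        |
| ptq + qat | 0.7724                 | 0.96899                      | 0.7626                  | 0.9692                        |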
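The difference between the "Per tensor" and "Per channel" columns is the granularity of the quantization scale: one scale for a whole weight tensor, or one scale per output channel. The sketch below illustrates this with symmetric 8-bit quantization on a hypothetical two-channel weight. It is not the project's quantization code and does not reproduce the numbers in the table.

```python
def quantize_dequantize(values, num_bits=8):
    """Symmetric quantization to signed integers and back, with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def per_tensor(weight):
    """Quantize all rows together using a single scale."""
    flat = [v for row in weight for v in row]
    deq = quantize_dequantize(flat)
    width = len(weight[0])
    return [deq[i:i + width] for i in range(0, len(deq), width)]


def per_channel(weight):
    """Quantize each row (output channel) with its own scale."""
    return [quantize_dequantize(row) for row in weight]


def max_error(original, restored):
    """Largest absolute difference between two same-shaped nested lists."""
    return max(
        abs(a - b)
        for row_o, row_r in zip(original, restored)
        for a, b in zip(row_o, row_r)
    )


# Hypothetical weight: channel 0 has a much smaller range than channel 1.
weight = [[0.01, -0.02, 0.015], [1.0, -2.0, 1.5]]
tensor_err_ch0 = max_error(weight[:1], per_tensor(weight)[:1])
channel_err_ch0 = max_error(weight[:1], per_channel(weight)[:1])
```

With a shared scale, the small-range channel is rounded on a grid sized for the large-range channel, so `channel_err_ch0` is smaller than `tensor_err_ch0`. Both variants store the same number of 8-bit weights, and per-channel quantization adds only a few extra scale values.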
