GokulVSD/MonoDAC

Monocular Depth Estimation via a Fully Convolutional Deep Neural Network, utilising Atrous Convolutions, with 3D Point Cloud Visualisation.

Final Year Project

Developed during our 7th and 8th semesters as part of our undergraduate degree coursework.

Generating a depth map from a single monocular RGB image, colloquially known as depth estimation, has long been known to be an ill-posed problem, since many different 3D scenes can project to the same 2D image. Traditional depth estimation techniques infer depth from stereo RGB pairs, from depth cues, or from laser-based LiDAR sensors, which produce sparse or dense point clouds depending on the size and cost of the sensor. Most modern smartphones contain more than one image sensor; however, using these sensors for depth estimation is infeasible because smartphone vendors restrict access to one image sensor at a time. In other cases, the sensors differ in quality and focal length, making them inadequate for depth inference. Producing depth maps from monocular RGB images is nonetheless a crucial task because of their use in Depth-of-Field (DoF) image processing, Augmented Reality (AR), and Simultaneous Localisation and Mapping (SLAM).

To tackle the above problem, we propose a fully convolutional deep convolutional neural network (DCNN) that learns to generate depth maps from single RGB images. The network employs an encoder-decoder architecture and uses atrous convolution layers and Atrous Spatial Pyramid Pooling (ASPP), as used in semantic segmentation, for feature pooling and extraction. We also apply bicubic upsampling convolutions to further boost depth estimation accuracy, and we simplify previously proposed architectures to improve performance, taking into account the computational and accuracy constraints that affect prior efforts.
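As a rough illustration of the atrous (dilated) convolution idea mentioned above, here is a minimal pure-Python sketch. It is not taken from this repository (the model code is withheld); the function names `atrous_conv2d` and `receptive_field` are hypothetical, and the example uses nested lists instead of TensorFlow tensors so it runs without any dependencies. It shows how spacing the kernel taps `rate` pixels apart widens the receptive field without adding weights, which is the property an ASPP block exploits by running several rates in parallel.

```python
def atrous_conv2d(image, kernel, rate):
    """Valid-mode 2D atrous (dilated) cross-correlation on nested lists.

    Hypothetical helper for illustration only; a real model would use a
    framework layer with a dilation setting instead.
    """
    kh, kw = len(kernel), len(kernel[0])
    # The kernel's effective extent grows with the dilation rate.
    eff_h = (kh - 1) * rate + 1
    eff_w = (kw - 1) * rate + 1
    out_h = len(image) - eff_h + 1
    out_w = len(image[0]) - eff_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image too small for this kernel and rate")
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    # Taps are spaced `rate` pixels apart.
                    acc += kernel[u][v] * image[i + u * rate][j + v * rate]
            row.append(acc)
        out.append(row)
    return out


def receptive_field(kernel_size, rate):
    """Side length of the input window covered by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


# A 7x7 ramp image and a 3x3 box kernel (both made up for the example).
image = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
kernel = [[1.0] * 3 for _ in range(3)]

dense = atrous_conv2d(image, kernel, rate=1)    # ordinary 3x3 convolution
dilated = atrous_conv2d(image, kernel, rate=2)  # same 9 weights, 5x5 window
```

With `rate=1` the kernel covers a 3x3 window; with `rate=2` the same nine weights cover a 5x5 window, so the output shrinks from 5x5 to 3x3 in valid mode.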

We showcase the results of our model in a 3D point cloud view. The model was trained on a subset of the NYUv2 dataset, which contains pairs of RGB images and depth maps that were constructed using Kinect sensors in an unsupervised manner.
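To make the step from a depth map to a 3D point cloud concrete, the sketch below back-projects each pixel through a pinhole camera model. This is an assumption about how such a view can be built, not a description of `point_cloud.py`; the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical placeholders, and real values would come from the camera's calibration.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (nested lists, metres) to a list of (x, y, z).

    Assumes a pinhole camera with focal lengths fx, fy and principal
    point (cx, cy), all in pixels. Pixels with non-positive depth are
    treated as missing and skipped.
    """
    points = []
    for v, row in enumerate(depth):      # v: pixel row (image y)
        for u, z in enumerate(row):      # u: pixel column (image x)
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth map with one missing pixel (values made up for the example).
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

The resulting list of `(x, y, z)` tuples could then be handed to a 3D viewer such as Open3D, optionally paired with the RGB value of each source pixel.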

Documentation

Project report and manual

Data-Flow Diagram Level 2

Network Architecture

Screenshots

Dependencies

TensorFlow: dataflow and differentiable programming library

Keras: neural-network library

NumPy: multi-dimensional arrays and matrices

PIL: Python Imaging Library

Pillow: a fork of PIL

Scikit-learn: various classification, regression and clustering algorithms

Scikit-image: segmentation, transformations, color manipulation, filtering, morphology, feature detection

Open3D: 3D rendering

OpenCV2: real-time computer vision library

Flask: web server

Other Requirements

OpenGL 3.5.5 or newer

IP Camera and a network connection

Training code and model withheld due to academic constraints.

Folders and files (37 commits)

art
static
temp
templates
.gitignore
LICENSE
README.md
bicubic_upsampler.py
evaluate.py
monodac_demo.py
monodac_predictor.py
point_cloud.py
