This project builds a fully convolutional neural network that performs semantic segmentation to identify the drivable road in car dashcam images. The network is trained and tested on the KITTI data set.
The architecture of the fully convolutional neural network is based on the VGG-16 image classifier. The final fully connected layer of VGG-16, Layer 7, is converted into a 1x1 convolution with an output depth of two, one channel each for road and not-road. The outputs of Layer 3 and Layer 4 are converted in the same way and serve as skip connections: the Layer 7 output is decoded by upsampling it with transposed convolutions, and the converted Layer 3 and Layer 4 outputs are added to the upsampled result. Regularization is applied to each convolutional and transposed convolutional layer.
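To make the skip connections easier to follow, here is a shape-only sketch of the decoder. It creates no tensors and is not the code in main.py. It assumes that Layer 3, Layer 4 and Layer 7 sit at 1/8, 1/16 and 1/32 of the input resolution, that the upsampling strides are 2, 2 and 8, and that the input is 160x576; none of these numbers are stated in this README, and the function names are hypothetical.

```python
# Shape-only walk through an FCN-8-style decoder; no tensors are created.
NUM_CLASSES = 2  # road and not-road, as described above


def encoder_shapes(height, width):
    """Spatial sizes of the VGG-16 outputs used by the decoder.

    Assumes Layer 3, Layer 4 and Layer 7 are at 1/8, 1/16 and 1/32 of the
    input resolution and that the input size is divisible by 32.
    """
    if height % 32 or width % 32:
        raise ValueError("input size must be divisible by 32 in this sketch")
    return {
        "layer3": (height // 8, width // 8),
        "layer4": (height // 16, width // 16),
        "layer7": (height // 32, width // 32),
    }


def upsample(shape, stride):
    """Output size of a transposed convolution with 'same' padding."""
    return (shape[0] * stride, shape[1] * stride)


def decoder_shapes(height, width):
    """List the (name, shape) of each decoder stage for a given input size."""
    enc = encoder_shapes(height, width)
    steps = []
    # A 1x1 convolution keeps the spatial size and sets the depth to NUM_CLASSES.
    x = enc["layer7"]
    steps.append(("layer7 1x1", x + (NUM_CLASSES,)))
    # Upsample by 2 (assumed stride) so Layer 7 matches Layer 4, then add.
    x = upsample(x, 2)
    assert x == enc["layer4"]
    steps.append(("upsample + layer4 skip", x + (NUM_CLASSES,)))
    # Upsample by 2 again to match Layer 3, then add.
    x = upsample(x, 2)
    assert x == enc["layer3"]
    steps.append(("upsample + layer3 skip", x + (NUM_CLASSES,)))
    # Final upsample by 8 (assumed stride) back to the input resolution.
    x = upsample(x, 8)
    steps.append(("output", x + (NUM_CLASSES,)))
    return steps


for name, shape in decoder_shapes(160, 576):
    print(name, shape)
```

The two asserts inside decoder_shapes check the one constraint that matters for an element-wise addition: each skip connection must have the same spatial size as the upsampled tensor it is added to.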
The hyperparameters used for training are:
- keep probability: 0.8
- learning rate: 0.001
- epochs: 90
- batch size: 5
- regularization scalar: 0.001
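One way to keep these values in a single place is a small configuration object, sketched below. The class and field names are hypothetical, and the two helpers rest on assumptions not stated in this README: that an epoch covers every training image once, and that the regularization scalar multiplies the summed per-layer penalties before they are added to the cross-entropy loss.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    keep_prob: float = 0.8       # dropout keep probability
    learning_rate: float = 0.001
    epochs: int = 90
    batch_size: int = 5
    reg_scale: float = 0.001     # regularization scalar


def batches_per_epoch(num_images, batch_size):
    """Batches needed to see every image once (the last one may be short)."""
    return math.ceil(num_images / batch_size)


def total_loss(cross_entropy, reg_terms, reg_scale):
    """Cross-entropy plus the scaled sum of per-layer penalties (assumed form)."""
    return cross_entropy + reg_scale * sum(reg_terms)


hp = Hyperparameters()
# 100 is a hypothetical training-set size used only for illustration.
print(batches_per_epoch(100, hp.batch_size))
print(total_loss(0.25, [2.0, 3.0], hp.reg_scale))
```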
The following figure shows the average loss for each epoch. The blue line shows the case where the outputs of the pooling layers, Layer 3 and Layer 4, are scaled before they are added to Layer 7, and the orange line shows the case where they are added without scaling.
In both cases, the average loss decreases over time. Although the blue line decreases more slowly than the orange line, it reaches a slightly lower average loss.
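The difference between the two variants can be illustrated with a toy element-wise addition. This is a simplified sketch, not the code in main.py: it works on plain nested lists, it skips the 1x1 convolution that sits between the pooling output and the addition, and the 0.01 factor and all activation values are made up, since this README does not state the scale factors used.

```python
def scale_feature_map(feature_map, factor):
    """Multiply every activation in a 2-D feature map by a constant factor."""
    return [[value * factor for value in row] for row in feature_map]


def add_feature_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse(upsampled, skip, skip_scale=None):
    """Add a skip connection to the upsampled decoder output.

    skip_scale=None corresponds to the unscaled variant (orange line);
    a float corresponds to the scaled variant (blue line).
    """
    if skip_scale is not None:
        skip = scale_feature_map(skip, skip_scale)
    return add_feature_maps(upsampled, skip)


# Toy 2x2 maps; the values and the 0.01 factor are made up for illustration.
upsampled_layer7 = [[0.5, -0.5], [1.0, 0.0]]
layer4_skip = [[100.0, 200.0], [-100.0, 0.0]]

print(fuse(upsampled_layer7, layer4_skip))
print(fuse(upsampled_layer7, layer4_skip, skip_scale=0.01))
```

In the unscaled call the large skip activations dominate the sum, while in the scaled call both inputs contribute at a similar magnitude. Whether this explains the slower but slightly lower loss curve reported above is not something this sketch can establish.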
The following images show some inference results from the trained networks. The top one is from the network with scaled pooling output and the bottom one is from the network without scaled pooling output.
One observation is that the network with scaled pooling output produces better results in difficult situations, e.g. when there are shadows or the lanes are unclear, whereas the network without scaled pooling output produces clearer segmentation when other objects such as vehicles are in the scene.
In this project, you'll label the pixels of a road in images using a Fully Convolutional Network (FCN).
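As a rough illustration of what labelling pixels means for a two-class output, the sketch below turns per-pixel logits into a road mask. It is pure Python on toy data rather than the project's inference code; the channel order (not-road first, road second), the 0.5 threshold and the function names are assumptions.

```python
import math


def road_probability(not_road_logit, road_logit):
    """Softmax probability of the road class for one pixel's two logits."""
    # Subtract the max for numerical stability before exponentiating.
    m = max(not_road_logit, road_logit)
    e_not = math.exp(not_road_logit - m)
    e_road = math.exp(road_logit - m)
    return e_road / (e_not + e_road)


def road_mask(logits, threshold=0.5):
    """Turn an H x W grid of (not_road, road) logit pairs into a boolean mask."""
    return [
        [road_probability(n, r) > threshold for (n, r) in row]
        for row in logits
    ]


# Toy 2x2 "image"; the logit values are made up for illustration.
logits = [
    [(2.0, -1.0), (0.0, 3.0)],
    [(-0.5, 0.5), (1.0, 1.0)],
]
for row in road_mask(logits):
    print(row)
```

A pixel is marked as road only when its road probability is strictly above the threshold, so a pixel with equal logits stays unlabelled in this sketch.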
Make sure you have the following installed:
Download the KITTI Road dataset from here. Extract the dataset in the data folder. This will create the folder data_road with all the training and test images.
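Before running the project, it can help to confirm that the archive was extracted where expected. The helper below is a hypothetical convenience check, not part of helper.py or main.py; the data and data_road folder names come from the instructions above, while the training and testing subfolder names are an assumption about the archive layout.

```python
from pathlib import Path


def check_dataset(data_dir="data"):
    """Return a list of problems found with the extracted dataset layout."""
    root = Path(data_dir) / "data_road"
    problems = []
    if not root.is_dir():
        problems.append(f"missing folder: {root}")
        return problems
    # 'training' and 'testing' are assumed subfolder names.
    for split in ("training", "testing"):
        if not (root / split).is_dir():
            problems.append(f"missing folder: {root / split}")
    return problems


issues = check_dataset()
print("dataset layout looks OK" if not issues else "\n".join(issues))
```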
Implement the code in the main.py module indicated by the "TODO" comments.
The comments marked with the "OPTIONAL" tag are not required.
Run the following command to run the project:
python main.py
Note: If you run this in a Jupyter Notebook, system messages, such as those regarding test status, may appear in the terminal rather than the notebook.
- Ensure you've passed all the unit tests.
- Ensure you pass all points on the rubric.
- Submit the following in a zip file:
  - helper.py
  - main.py
  - project_tests.py
  - Newest inference images from the runs folder (all images from the most recent run)
A well-written README file can enhance your project and portfolio. Develop your abilities to create professional README files by completing this free course.