Synthesize Hazy/Foggy Image using Monodepth and Atmospheric Scattering Model




The recent era of deep learning and computer vision has driven the rapid development of autonomous driving technology, in which object detection plays an extremely important role. Various object detectors such as the R-CNN family (R-CNN, Fast R-CNN, Faster R-CNN, Cascade R-CNN) and the YOLO series (YOLOv1–v4) have been proposed, and abundant driving object detection datasets, including BDD100K and WAYMO, are also available. However, there is still a huge gap between academic research and real-world deployment. The performance of object detectors in practice often drops despite near-perfect accuracy during offline training. One of the most common causes of this degradation is the domain shift problem.

Domain shift happens when the distribution of the testing data differs from that of the training data. For instance, the object detector of an autonomous vehicle may be trained on favorable data captured in good weather conditions, while the actual weather during deployment can include rain or fog. Hence, to perform reliably, object detectors should be made more robust against the domain shift problem. The figure below compares the performance of a pre-trained YOLOv4 model, trained on the original clean data, on a hazy input (left) and a haze-free input (right) (red: ground truth, green: detection). The figure clearly shows the effect of domain shift on the detector's performance.

The difference in performance between a hazy input (left) and a haze-free input (right) (red: ground truth, green: detection) (Image by Author)

A simple but effective solution is to pair the object detector with an independent environmental condition classifier that recognizes environmental changes, as presented in the paper Enhancement of Robustness in Object Detection Module for Advanced Driver Assistance Systems (read full paper). Alternatively, a unified system called MultiScale Domain Adaptive YOLO (MS-DAYOLO) has been proposed, which integrates the state-of-the-art one-stage object detector YOLOv4 with a Domain Adaptive Network (DAN) for domain-invariant feature learning.

For such studies, we need a diverse dataset that can be used to train a powerful object detection model to tackle the domain shift issue. In this post, I introduce a robust tool for synthesizing hazy/foggy image data from clean images using Monodepth and the atmospheric scattering model. More details of the project can be found here.

Atmospheric Scattering Model

The atmospheric scattering model, which is widely used to describe a hazy image, is given by equation (1):

I(x) = J(x)·t(x) + A·(1 − t(x)) (1)

where x is the pixel position, I(x) is the hazy image that we aim to create, J(x) is the clean image that we already have, A is the atmospheric light (commonly set to 1), and t(x) is the medium transmission of the scene. Based on equation (1), to obtain I(x) we need to know the transmission t(x). When the atmosphere is homogeneous, t(x) can be expressed as equation (2):

t(x) = e^(−β·d(x)) (2)

where β represents the scattering coefficient of the atmosphere. What remains to be found is therefore the depth map d(x) of the scene. A modern method for estimating the depth map of a scene from a single image is Monodepth, which is covered in the next section.
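To make this concrete, below is a minimal NumPy sketch of equations (1) and (2); the function name, the [0, 1] image normalization, and the depth-map scaling are my own assumptions rather than code from the project's repository.

```python
import numpy as np

def synthesize_haze(clean, depth, beta=1.5, A=1.0):
    """Apply the atmospheric scattering model I(x) = J(x)t(x) + A(1 - t(x)).

    clean : float32 RGB image in [0, 1], shape (H, W, 3)
    depth : float32 depth map, shape (H, W); larger values = farther away
    beta  : scattering coefficient controlling the haze density
    A     : atmospheric light, commonly set to 1
    """
    t = np.exp(-beta * depth)           # transmission, equation (2)
    t = t[..., np.newaxis]              # broadcast over the RGB channels
    hazy = clean * t + A * (1.0 - t)    # equation (1)
    return np.clip(hazy, 0.0, 1.0)
```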

Monodepth

Single image depth estimation (Image in Monodepth paper [source])

Estimating scene depth from a single RGB image is a challenging and long-standing problem in computer vision. Monodepth (ver1, ver2) has recently made a breakthrough on this topic and established a new baseline for single-image depth estimation. Because version 2 brings several improvements over version 1, this tutorial uses Monodepth2. Its source code is available here.
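As a rough illustration, the sketch below loosely follows the test_simple.py script in the Monodepth2 repository to predict a depth map from a single image. The model directory, the resizing choices, and the final disparity-to-depth normalization are assumptions on my part (the repo itself offers disp_to_depth in layers.py for proper scaling).

```python
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

import networks  # from the Monodepth2 repository


def estimate_depth(image_path, model_dir="models/mono_640x192", device="cpu"):
    """Estimate a relative depth map from a single RGB image with Monodepth2."""
    # Build the encoder/decoder and load the pretrained weights.
    encoder = networks.ResnetEncoder(18, False)
    decoder = networks.DepthDecoder(num_ch_enc=encoder.num_ch_enc, scales=range(4))

    enc_dict = torch.load(f"{model_dir}/encoder.pth", map_location=device)
    feed_h, feed_w = enc_dict["height"], enc_dict["width"]
    encoder.load_state_dict(
        {k: v for k, v in enc_dict.items() if k in encoder.state_dict()})
    decoder.load_state_dict(torch.load(f"{model_dir}/depth.pth", map_location=device))
    encoder.eval(); decoder.eval()

    # Preprocess: resize the input to the network's expected feed size.
    img = Image.open(image_path).convert("RGB")
    orig_w, orig_h = img.size
    x = transforms.ToTensor()(img.resize((feed_w, feed_h))).unsqueeze(0)

    with torch.no_grad():
        disp = decoder(encoder(x))[("disp", 0)]   # sigmoid disparity map
        disp = torch.nn.functional.interpolate(
            disp, (orig_h, orig_w), mode="bilinear", align_corners=False)

    disp = disp.squeeze().cpu().numpy()
    # Monodepth2 predicts disparity; a simple relative depth is its inverse,
    # normalized to [0, 1] so that beta in equation (2) stays interpretable.
    depth = 1.0 / np.maximum(disp, 1e-6)
    return depth / depth.max()
```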

Results

The example images below are taken from the WAYMO dataset, which provides driving-scene images from five different camera viewing angles. The figure shows several hazy images synthesized with the method introduced in this post: 1st row: original images, 2nd row: depth maps estimated with Monodepth2, 3rd row: synthesized hazy images.

1st row: original image, 2nd row: estimated depth map, 3rd row: synthesized hazy image (Image by Author)

We can control the thickness of the haze in the synthesized image by choosing different values of the scattering coefficient β. In the following results, the images in the 2nd row are synthesized with sparse haze using β = 1.2, while the 3rd-row images are synthesized with dense haze using β = 2.

1st row: original image, 2nd row: synthesized hazy image with sparse haze, 3rd row: synthesized hazy image with dense haze (Image by Author)
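Putting the pieces together, here is a short, hypothetical usage sketch that reproduces the two settings above with the helper functions sketched earlier (file names are placeholders):

```python
import numpy as np
from PIL import Image

# Load a clean frame and estimate its relative depth (sketch functions above).
clean = np.asarray(Image.open("waymo_frame.png").convert("RGB"),
                   dtype=np.float32) / 255.0
depth = estimate_depth("waymo_frame.png")              # relative depth in [0, 1]

sparse_haze = synthesize_haze(clean, depth, beta=1.2)  # sparse haze
dense_haze  = synthesize_haze(clean, depth, beta=2.0)  # dense haze
```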

Conclusions

In this post, I have introduced a method for synthesizing hazy/foggy image data that can serve as a data augmentation step to improve the robustness of object detection models against the domain shift problem. The full implementation can be found here.

Readers are welcome to visit my Facebook fan page, where I share things regarding Machine Learning: Diving Into Machine Learning. Further posts of mine on object detection can be found at YOLOv4–5D review, Darkeras, and EFPN.

Thanks for reading!
