From Pixels to Artificial Perception


Understanding Computer Vision Fundamentals: An Introduction to Image Intrinsics, Representation, Features, Filters, and Morphological Operations

Computer vision is a fascinating field that aims to teach machines how to “see” the world as we do. It has numerous practical applications in areas such as self-driving cars, facial recognition, object detection, and medical imaging. In this article, I will first cover what constitutes features in images and how we can manipulate them, and then discuss various priors from computer vision that are used in deep learning.

Unsurprisingly, we humans perceive only a portion of the electromagnetic spectrum. As a result, imaging devices are designed to match human perception of a scene as closely as possible. Cameras process raw sensor data through a series of operations to produce a representation that looks familiar to the human eye. Likewise, even radiographic images are calibrated to aid human perception [2].

Bayer filter procedure [source: Wikipedia]

The camera sensor produces a grayscale grid that is exposed through a Bayer color filter mosaic. Each cell in this grid records the intensity of a single color. Thus, instead of recording a 3×8-bit value for every pixel, each site behind the Bayer filter records only one color channel. Because the human eye is more sensitive to green light, Bryce Bayer allocated twice as many green filters as red or blue ones.
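The sampling pattern above can be sketched in a few lines of NumPy. This is a minimal simulation (the function name `to_bayer_mosaic` and the RGGB layout are illustrative assumptions, not a real sensor pipeline): from a full RGB image, each pixel keeps only the one channel its Bayer cell would see, and green is sampled twice as often as red or blue.

```python
import numpy as np

def to_bayer_mosaic(rgb):
    """Simulate an RGGB Bayer filter: keep one color sample per pixel."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red at even rows, even cols
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green at even rows, odd cols
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green again: sampled twice as often
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue at odd rows, odd cols
    return mosaic

rgb = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
raw = to_bayer_mosaic(rgb)
print(raw.shape)  # (4, 4) — one 8-bit sample per pixel instead of three
```

Note how the raw output is a single-channel grid: recovering three channels per pixel is exactly the job of the demosaicing step described next.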

The camera's ISP then reconstructs a full-color image by applying a demosaicing algorithm, which interpolates the missing color values at each pixel. In computer vision, images are represented using different color spaces. The most common color space is RGB (Red, Green, Blue), where an image is stored as a 3D array with dimensions of height, width, and depth (3 channels for RGB).
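To make the height × width × 3 layout concrete, here is a tiny hand-built RGB image as a NumPy array (the 2×2 example values are my own, chosen only to illustrate indexing):

```python
import numpy as np

# A 2x2 RGB image: height x width x 3 channels, 8 bits per channel
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

print(img.shape)            # (2, 2, 3)
red_channel = img[:, :, 0]  # a 2x2 grayscale slice: red intensity per pixel
print(img[0, 0])            # [255 0 0] — the top-left pixel is pure red
```

Slicing out a single channel, as with `red_channel` above, is a common first step before filtering or thresholding.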

Another widely used channel ordering is BGR (Blue, Green, Red), the convention adopted when OpenCV was first developed. BGR stores the same information as RGB, only with the channel order reversed. After demosaicing, a series of transformations is applied: black level correction, intensity adjustment, white balance adjustment, color correction, gamma correction, and finally, compression.

HSV vs HSL [source: Wikipedia]

Apart from RGB and BGR, there are other color spaces such as HSV (Hue, Saturation, Value) and HSL (Hue, Saturation, Lightness). HSV isolates the value component of each pixel, which is the component that varies most under changing lighting conditions. The H channel in HSV remains fairly consistent, even in the presence of shadows or excessive brightness. HSL, on the other hand, describes colors by their hue, saturation, and lightness values.
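The claim that hue is stable under brightness changes is easy to verify with Python's standard-library `colorsys` module (which works on floats in [0, 1]): halving the intensity of a red pixel changes only V, not H or S.

```python
import colorsys

# Compare a bright red and a dim red: only the value (V) differs
bright_red = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
dim_red    = colorsys.rgb_to_hsv(0.5, 0.0, 0.0)
print(bright_red)  # (0.0, 1.0, 1.0)
print(dim_red)     # (0.0, 1.0, 0.5) — same hue and saturation, half the value
```

This invariance is why segmenting by hue in HSV is often more robust to shadows than thresholding raw RGB channels.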


In computer vision, we look for features that identify relevant patterns or structures within an image. Features include edges, corners, and blobs, which serve as distinctive attributes for further analysis. Edges are areas in an image where the intensity changes abruptly, often indicating object boundaries. Understanding the frequency content of images is also essential, as high-frequency components correspond to the edges of objects.
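The idea that edges are abrupt intensity changes can be demonstrated with a minimal gradient-based edge detector. This is a simplified sketch using finite differences (real pipelines typically use Sobel or similar smoothed kernels, which the article has not introduced yet; the function name `gradient_edges` is my own):

```python
import numpy as np

def gradient_edges(gray):
    """Approximate edge strength from horizontal/vertical finite differences."""
    gray = gray.astype(float)
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:] = np.diff(gray, axis=1)  # change between neighboring columns
    gy[1:, :] = np.diff(gray, axis=0)  # change between neighboring rows
    return np.hypot(gx, gy)            # gradient magnitude: large where intensity jumps

# A vertical step edge: dark left half, bright right half
gray = np.zeros((4, 4))
gray[:, 2:] = 255.0
edges = gradient_edges(gray)
print(edges[0])  # [  0.   0. 255.   0.] — strongest response where intensity jumps
```

The response is zero in the flat regions and large only at the step, which is exactly the high-frequency behavior described above.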

