Project1 - Guanyou Li

Section 1: Basic Function

The main goal of this project is to implement an image alignment algorithm to align red, green, and blue color channels to produce a color-accurate image output. Specifically, the code uses a pyramid image alignment method combined with normalized cross correlation (NCC) as an evaluation metric to achieve multi-level alignment of images.

We use NCC as the basis for judging the similarity of images. First, remove the mean of the two images separately to remove the average brightness in the image. Calculate the sum of the products of the images after removing the mean as the numerator. Calculate the square root of the product of the sum of the squares of each image after removing the mean as the denominator. Finally, divide the numerator by the denominator to get the NCC value. If the denominator is zero, the NCC is set to zero. NCC is used to evaluate the similarity of two image blocks under different offsets to find the best alignment position.

However, when processing large image files such as tif files, there may be a large offset, which may be slow and ineffective, so we need to build a pyramid alignment method. The pyramid alignment method is to achieve accurate alignment of different color channels by reducing the resolution of the image layer by layer. Starting from the original image, gradually reduce the size of the image until the minimum size of the image reaches a certain threshold. Perform preliminary alignment on the top layer of the pyramid, that is, the image with the smallest size. Move down from the top layer of the pyramid, each time using the alignment result of the previous layer as the initial translation of the current layer for further alignment. In order to ensure accuracy, I usually search all layers step by step. Although the whole process is very slow, it can ensure the accuracy of image synthesis. Of course, you can also adjust the research range to 15 to reduce the time.

Section 2: Bells and Whistles

2.1 Edge Detection

For some images, edge detection is not so effective, but for some images with a huge difference between red and blue, such as emir.tif, edge detection is a good optimization. I used the Canny edge detection algorithm to extract edge information in the image, and then performed image alignment on these edge maps. Compared with directly using the pixel intensity values of the image for alignment, this method can better ignore changes in color or brightness and focus on important shapes and structures in the image, thereby improving the accuracy of alignment.

2.2 Remove Edge Parts

My idea is to remove y edge parts in the image, especially those areas with almost no information (usually white or nearly white areas), so as to process and align the image more effectively. The function checks each row and column of the image to find those areas containing non-white pixels (that is, pixel values below the threshold parameter setting value, which is 0.9 by default). Find the location of the outermost non-white area respectively. According to the detected non-white area boundaries, crop the image to the smallest rectangular frame containing these areas to remove the useless edges of the image. Obviously, I only considered the white case, but most of the time, the invalid information on the edge may also be blue and red, but in my attempt, removing blue and red is likely to affect the picture, so I only considered white.

2.3 Crop

The main purpose of the crop function I added is to crop the two images so that they are the same size. In image processing, especially the alignment of multi-channel images, the size of the images may be inconsistent due to different alignment offsets. This ensures that the size of each channel image is consistent, making the subsequent image overlay processing smoother and without problems due to size mismatch. First, the minimum height and minimum width of the two images are calculated, and then it crops the two images based on these minimum heights and widths, keeping only the front part of the image. Finally, the two cropped images are returned.

2.4 White Balance

White balance is a technique for correcting image color deviations. By adjusting the average value of each color channel, the overall tone of the image is more natural and balanced. The white balance function calculates the average value and average grayscale value of the three channels of red, green, and blue in the image. By adjusting the pixel value of each color channel, the average value of each channel is close to avg_gray. Specifically, the pixel value of each channel is multiplied by a scaling factor, which is the ratio of avg_gray to the average value of the channel. The pixel value of each channel is clipped to ensure that it is in the range of 0 to 1. However, the actual effect makes the entire tone very dim, which does not conform to the aesthetics of the image, so the code is finally implemented but the final display part is not used.

2.5 Parallel Processing

In the main function, I used the Parallel and delayed functions of the joblib library to parallelize the image processing tasks. Since I am writing a batch process of a whole folder of pictures, this can greatly improve the average speed of processing pictures.