Introduction
Each image in the Prokudin-Gorskii collection consists of three exposures taken through red, green, and blue color-separation filters. This project uses algorithms to automatically align the three channels of each image and then stacks them to produce a color image. For larger images, it uses image pyramids to make the alignment efficient. It also experiments with automatic cropping and automatic contrasting.
cathedral.jpg
Blue Channel
cathedral.jpg
Green Channel
cathedral.jpg
Red Channel
Small Images
Approach
To align the three color channels, the algorithm uses one channel as the base and computes, for each of the other two channels, the displacement over the base channel that maximizes the similarity between the shifted channel and the base channel.
I used an exhaustive search on the small .jpg images that takes the blue channel of each image as the base and rolls the red and green channels over a window of possible displacements, -15 to 15 pixels in both height and width. To measure similarity, I used the Sum of Squared Differences (SSD) between the shifted channel and the base channel. Over the window of possible displacements, the displacement with the minimum SSD score yields the maximum similarity and is therefore optimal.
At first, I applied this method to each image directly, without any preprocessing, and the resulting color channels did not align well. After experimenting, I realized that every channel has border artifacts that interfere with the displacement search: borders matching between channels can lower the SSD score even when the image content is misaligned. Therefore, I preprocess each image by cropping 10% from all four sides before alignment, which minimizes the borders' interference during the displacement search.
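The exhaustive search and the 10% preprocessing crop can be sketched as follows. This is a minimal NumPy sketch, not the project's actual code; the function names are mine.

```python
import numpy as np

def ssd(a, b):
    # Sum of Squared Differences between two equally sized channels.
    return np.sum((a - b) ** 2)

def crop_borders(img, frac=0.10):
    # Preprocessing: drop `frac` of the image from all four sides.
    h, w = img.shape
    return img[int(h * frac): int(h * (1 - frac)),
               int(w * frac): int(w * (1 - frac))]

def align_exhaustive(channel, base, window=15):
    # Try every (dy, dx) in [-window, window]^2 and keep the shift
    # with the minimum SSD score against the base channel.
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ssd(np.roll(channel, (dy, dx), axis=(0, 1)), base)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

In practice the channels would be cropped with `crop_borders` before being passed to `align_exhaustive`, and the returned shift applied to the full-size channel.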
Results
cathedral.jpg
Red: (12, 3)
Green: (5, 2)
monastry.jpg
Red: (3, 2)
Green: (-3, 2)
tobolsk.jpg
Red: (6, 3)
Green: (3, 3)
Custom Images
custom_kivach.jpg
Red: (12, 3)
Green: (3, 1)
custom_na_dunaie.jpg
Red: (8, 11)
Green: (3, 5)
custom_pamiatnik.jpg
Red: (8, 5)
Green: (1, 3)
Larger Images
Approach
For larger .tif images, which are thousands of pixels on each side, the window of possible displacements is much larger, and the exhaustive search described above would have to iterate over a larger window on many more pixels, making it neither effective nor efficient on its own.
To handle the increase in image size, I used the image pyramid technique, which recursively downscales the image by a factor of 2 until its width is less than 500 pixels. At each layer of the pyramid, I ran the exhaustive search, but with a growing window of possible displacements: (-4, 4) on the original image, widening by (-4, 4) at each coarser layer. This keeps the alignment efficient without sacrificing much quality. Once the image is about the size of the .jpg images, I used the default (-15, 15) window, which the previous part showed is efficient and effective at that scale. After each recursive call, the displacement computed on the downscaled image is multiplied by 2 and then added to the displacement computed at the current layer.
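The recursion can be sketched as below. This is an illustrative sketch rather than the project's code: the names are mine, and the downscale here is plain subsampling (the project may blur or average before halving).

```python
import numpy as np

def align_exhaustive(channel, base, window):
    # Exhaustive SSD search over shifts in [-window, window]^2.
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = np.sum((np.roll(channel, (dy, dx), axis=(0, 1)) - base) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, base, window=4):
    # Base case: once the image is roughly .jpg-sized, run the (-15, 15) search.
    if base.shape[1] < 500:
        return align_exhaustive(channel, base, window=15)
    # Recurse on a half-resolution copy with a window widened by 4.
    coarse = align_pyramid(channel[::2, ::2], base[::2, ::2], window + 4)
    dy, dx = 2 * coarse[0], 2 * coarse[1]  # scale the coarse shift back up
    # Refine around the scaled-up coarse estimate at the current layer.
    fy, fx = align_exhaustive(np.roll(channel, (dy, dx), axis=(0, 1)),
                              base, window=window)
    return (dy + fy, dx + fx)
```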
Results
church.tif
Red: (58, -4)
Green: (25, 4)
emir.tif
Red: (104, 56)
Green: (49, 24)
harvesters.tif
Red: (124, 14)
Green: (60, 17)
icon.tif
Red: (90, 23)
Green: (41, 17)
lady.tif
Red: (112, 12)
Green: (52, 9)
melons.tif
Red: (178, 13)
Green: (82, 10)
onion_church.tif
Red: (108, 36)
Green: (52, 26)
sculpture.tif
Red: (140, -27)
Green: (33, -11)
self_portrait.tif
Red: (176, 37)
Green: (79, 29)
three_generations.tif
Red: (112, 11)
Green: (53, 14)
train.tif
Red: (87, 32)
Green: (42, 6)
Custom Images
custom_milanie.tif
Red: (-2, -55)
Green: (-10, -19)
custom_oranzhereie.tif
Red: (126, 34)
Green: (60, 28)
custom_piony.tif
Red: (156, 30)
Green: (75, 21)
custom_siren.tif
Red: (96, -25)
Green: (50, -6)
custom_stantsiia_soroka.tif
Red: (107, 0)
Green: (28, 1)
custom_zakat.tif
Red: (114, -68)
Green: (75, -41)
Bells & Whistles
Automatic Cropping
The naive method of cropping a fixed 10% from all sides does not work well on every aligned image, since the border artifacts vary in size across images and do not run parallel to the image's four sides. Therefore, I implemented an automatic cropping method that uses the average intensity of the pixels in each row/column of each color channel to determine whether that row/column is part of the border.
For each color channel, starting from each of the four sides, the method computes the average intensity of all pixels in the current row/column and compares it against the thresholds (0.12, 0.945). If the average is greater than 0.945 or less than 0.12, the row/column is classified as part of the border, and the method moves inward to the next row/column and repeats. The first row/column whose average falls between 0.12 and 0.945 is considered the start of the actual image content, and the image is cropped from that side up to this row/column. The method then moves to the next side until all four sides have been cropped.
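The per-side scan can be sketched as follows, assuming a single channel with intensities in [0, 1]; the function names are mine, not the project's.

```python
import numpy as np

LO, HI = 0.12, 0.945  # border-detection thresholds

def first_content_index(means):
    # Index of the first row/column whose mean intensity lies strictly
    # between the thresholds, i.e. the start of actual image content.
    for i, m in enumerate(means):
        if LO < m < HI:
            return i
    return 0  # no border detected on this side

def auto_crop(channel):
    # channel: 2-D array with intensities in [0, 1].
    row_means = channel.mean(axis=1)
    col_means = channel.mean(axis=0)
    top = first_content_index(row_means)
    bottom = len(row_means) - first_content_index(row_means[::-1])
    left = first_content_index(col_means)
    right = len(col_means) - first_content_index(col_means[::-1])
    return channel[top:bottom, left:right]
```

Scanning the reversed mean arrays handles the bottom and right sides with the same helper.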
Examples
harvesters.tif
Before Auto Cropping
harvesters.tif
After Auto Cropping
self_portrait.tif
Before Auto Cropping
self_portrait.tif
After Auto Cropping
Automatic Contrasting
For automatic contrasting, I first applied the automatic cropping described above, plus an additional 5% crop, to minimize interference from border artifacts. Then I rescaled the image intensities so that, for each image, the lowest pixel intensity across all three channels maps to 0 and the highest maps to 1. In edge cases where intensities from the extra 5% crop region would map above 1 or below 0, those values are clamped to 1 and 0 (elementwise, with np.minimum and np.maximum).
For all pixels in between, I implemented both a linear and a nonlinear mapping, each of which maps the original intensity values into [0, 1]. The linear mapping uses the simple formula below; its effect during automatic contrasting is inconspicuous on most images. The nonlinear mapping, given by the formula below, has its steepest slope where the original intensity is 0.5 and flattens as the intensity approaches 0 or 1. It therefore stretches the intensity differences between most pairs of pixels, producing images with more contrast.
Formula
Linear Mapping Formula
Nonlinear Mapping Formula
Nonlinear Mapping Plot
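The two mappings can be sketched as below. The project's actual nonlinear formula is the one shown in the figure above; as an illustrative stand-in with the same shape (steepest slope at 0.5, flattening toward 0 and 1), this sketch uses the smoothstep polynomial 3v^2 - 2v^3. The function names and the optional `lo`/`hi` parameters (for bounds computed on the additionally cropped region) are mine.

```python
import numpy as np

def contrast_linear(img, lo=None, hi=None):
    # Map intensities linearly so that lo -> 0 and hi -> 1; by default
    # the bounds are the image's own min and max. Values outside the
    # bounds (e.g. from the extra 5% crop region) are clamped to [0, 1].
    lo = img.min() if lo is None else lo
    hi = img.max() if hi is None else hi
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)

def contrast_nonlinear(img, lo=None, hi=None):
    # Illustrative S-curve (smoothstep): the slope is steepest at v = 0.5
    # and approaches zero near v = 0 and v = 1, stretching mid-range
    # intensity differences for a higher-contrast result.
    v = contrast_linear(img, lo, hi)
    return 3 * v**2 - 2 * v**3
```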
Examples
church.tif
Before Auto Contrasting
church.tif
After Auto Contrasting with Linear Mapping
church.tif
After Auto Contrasting with Nonlinear Mapping
icon.tif
Before Auto Contrasting
icon.tif
After Auto Contrasting with Linear Mapping
icon.tif
After Auto Contrasting with Nonlinear Mapping
melons.tif
Before Auto Contrasting
melons.tif
After Auto Contrasting with Linear Mapping
melons.tif
After Auto Contrasting with Nonlinear Mapping