We’ve all heard the saying “A picture is worth a thousand words.” But is a tarnished image with gaping holes, splotches, or blurs worth even a few hundred? What if you just found an age-old photograph of your grandparents’ wedding, but the surface was so worn that you could barely make out their faces? Or perhaps you got photobombed in what would otherwise have been the perfect picture. Or maybe you’re like me and wonder why no one has built an option into a smartphone camera app to remove unwanted objects from images. This view from my school would be just the sort of thing inpainting could improve.

What is Image Inpainting?

  Inpainting is the process of reconstructing lost or deteriorated parts of images and videos. In the museum world, in the case of a valuable painting, this task would be carried out by a skilled art conservator or art restorer. In the digital world, inpainting refers to the application of sophisticated algorithms to replace lost or corrupted parts of the image data. (source)

This Wikipedia definition already accounts for “sophisticated algorithms” that do the same work a human restorer would, overwriting imperfections and repairing defects, but in a fraction of the time.

As deep learning technology progresses, however, the process of inpainting has become so completely automated that it now requires no human intervention at all. Simply feed a damaged image to a neural network and receive the corrected output. Go ahead and try it yourself with NVIDIA’s web playground, which demonstrates how their network fills in a missing portion of any image.

Simply drag and drop any image file, erase a portion of it with the cursor, and watch the AI patch it up. I tried it on a few pictures lying around on my desktop. Here’s one of them below, with a big chunk of my face missing and the neural network restoring it in a matter of seconds, albeit making me look like I just got out of a street fight.

You can also use it to quickly get rid of something in a picture. Here’s another image I had lying around: a great view of Hangzhou’s West Lake with the picturesque Leifeng Pagoda in the distance. The AI does a great job of envisioning the lake with no pagoda at all.

Traditional forms of image restoration usually revolve around one simple concept: given a gap in pixels, fill the gap with pixels that are the same as, or similar to, the neighboring pixels. These techniques work well for removing noise or small defects from images, but they will most likely fail when the image has huge gaps or a significant amount of missing data.
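To make the neighbor-filling idea concrete, here is a minimal NumPy sketch. The `neighbor_fill` function and its iteration cap are my own illustrative choices, not a standard algorithm: masked pixels repeatedly absorb the mean of their known 4-neighbors until the hole closes.

```python
import numpy as np

def neighbor_fill(image, mask, iterations=50):
    """Fill masked pixels with the mean of their known 4-neighbors,
    repeating until the hole closes (a simple diffusion-style fill)."""
    img = image.astype(float).copy()
    unknown = mask.astype(bool).copy()
    for _ in range(iterations):
        if not unknown.any():
            break
        padded = np.pad(img, 1, mode="edge")
        known = np.pad(~unknown, 1, mode="constant", constant_values=False)
        neighbor_sum = np.zeros_like(img)
        neighbor_cnt = np.zeros_like(img)
        # Gather values from the four shifted copies of the image.
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            vals = padded[1 + dy:padded.shape[0] - 1 + dy,
                          1 + dx:padded.shape[1] - 1 + dx]
            ok = known[1 + dy:known.shape[0] - 1 + dy,
                       1 + dx:known.shape[1] - 1 + dx]
            neighbor_sum += np.where(ok, vals, 0.0)
            neighbor_cnt += ok
        # Fill only pixels that currently have at least one known neighbor.
        fillable = unknown & (neighbor_cnt > 0)
        img[fillable] = neighbor_sum[fillable] / neighbor_cnt[fillable]
        unknown &= ~fillable
    return img
```

This kind of local averaging is exactly why classical methods struggle with large holes: the farther a pixel is from any known data, the less information propagates to it.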

The most straightforward and conventional technique for image restoration is deconvolution: compute the Fourier transforms of both the image and the point spread function (PSF), then operate in the frequency domain to undo the resolution loss caused by the blurring. Applying this technique usually yields an imperfectly deblurred image. (source)
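As a rough illustration of frequency-domain deconvolution, here is a Wiener-style sketch in NumPy. The function name and the regularization constant `k` are assumptions for this example: we divide the blurred image’s spectrum by the PSF’s spectrum, with a small regularizer so that frequencies where the PSF response is near zero don’t blow up.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, k=1e-3):
    """Deblur in the frequency domain: apply the filter H* / (|H|^2 + k),
    where H is the PSF's spectrum and k tames noise amplification."""
    H = np.fft.fft2(psf, s=blurred.shape)
    G = np.fft.fft2(blurred)
    F = G * np.conj(H) / (np.abs(H) ** 2 + k)
    return np.real(np.fft.ifft2(F))
```

With `k = 0` this reduces to naive inverse filtering, which is exactly the “imperfect” case the text describes: any noise at frequencies the blur suppressed gets amplified without bound.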

In a basic sense, inpainting refers to the restoration of missing parts of an image based on the surrounding information. It can be thought of as a process of filling in missing data in a designated region of the visual input.


A New Approach with Machine Learning

The new age alternative is to use deep learning to inpaint images by utilizing supervised image classification. The idea is that each image has a specific label, and neural networks learn to recognize the mapping between images and their labels by repeatedly being taught or “trained”.

When trained on huge datasets (millions of images with thousands of labels), deep networks achieve remarkable classification performance that can often surpass human accuracy. Generative adversarial networks are typically used for this sort of implementation, given their ability to “generate” new data, or in this case, the missing information.

(Image courtesy: NVIDIA)

The basic workflow is as follows: feed the network an input image with “holes” or “patches” that need to be filled. These patches must be supplied as an additional input, since the network has no way of discerning on its own what actually needs to be filled in. For instance, a picture of a person with a missing face conveys no meaning to the network beyond varying pixel values.

To enable the neural network to understand which part of the image actually needs filling in, we supply a separate layer mask that marks the pixels with missing data. The input image then passes through several convolutions and deconvolutions as it traverses the network layers. The network produces an entirely synthetic image generated from scratch. The layer mask lets us discard the portions that are already present in the incomplete image, since those parts don’t need to be filled in. The newly generated content is then superimposed on the incomplete image to yield the output.
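That final compositing step can be sketched in a few lines of NumPy. The `composite` helper below is hypothetical, not NVIDIA’s actual code: the mask selects the network’s synthetic pixels only where the input had holes, and keeps the original pixels everywhere else.

```python
import numpy as np

def composite(incomplete, generated, mask):
    """Superimpose the generated image on the incomplete one.
    mask == 1 marks missing pixels (take the network's output there);
    mask == 0 marks known pixels (keep the original values)."""
    mask = mask.astype(float)
    return mask * generated + (1.0 - mask) * incomplete
```

This is why the fully synthetic output doesn’t matter outside the hole: everything the mask marks as known is discarded from the generated image before the two are merged.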