
This article originally appeared in Japanese on Codezine; below follows HACARUS' translation of the same:

This series of articles is for engineers who want to jump into sparse modeling, including those who already have experience with deep learning and machine learning. The focus is on a technique called “sparse modeling”, which lets you perform analysis without large amounts of data. The last article covered image reconstruction using dictionary learning. This time, three advanced methods will be introduced: deficit interpolation for damaged images, anomaly detection, and super-resolution. These applications of dictionary learning are sure to have you intrigued. Let’s now take a look at them in order.

## Deficit Interpolation for Missing Data in Images

In real-world problems, there is often missing data. In such cases, the assumption of sparseness can be used to interpolate missing data.

When image reconstruction was introduced in the last article, we found that if you train a dictionary by separating natural images into patches, each patch can be represented sparsely using the given dictionary. In fact, such a dictionary and sparse representation can be used even if the training image has some deficits.

The reason this is possible can be seen by analogy with simultaneous equations. A system with 100 variables can normally be solved once you have 100 equations, so if you have 10,000 equations as data, you only need to use 1% of them to solve the problem. By the same logic, if you know in advance that the solution is almost entirely zero, a solution can be obtained from a small number of equations relative to the number of variables. In other words, you don’t need 10,000 equations to solve a simultaneous equation with 10,000 variables. This is the reason why sparse modeling is said to be highly compatible with small amounts of data.

From this, if we treat the entries of the dictionary and the sparse code obtained by dictionary learning as the unknowns of a simultaneous equation, we can successfully estimate the dictionary and sparse code using only the equations obtained from the non-missing parts. That is to say, there is no need to have 100% of the data.
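To make this concrete, here is a small sketch of recovering a sparse solution from an underdetermined system, using scikit-learn's Orthogonal Matching Pursuit. The numbers (50 equations, 100 unknowns, 3 non-zeros) are illustrative choices, not from the article:

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.RandomState(0)

# An underdetermined system: 50 equations, 100 unknowns.
n_equations, n_variables = 50, 100
A = rng.randn(n_equations, n_variables)

# The true solution is sparse: only 3 of the 100 entries are non-zero.
x_true = np.zeros(n_variables)
x_true[[5, 40, 77]] = [1.5, -2.0, 3.0]
b = A @ x_true

# Orthogonal Matching Pursuit recovers the sparse solution
# even though there are far fewer equations than variables.
x_hat = orthogonal_mp(A, b, n_nonzero_coefs=3)

print(np.allclose(x_hat, x_true, atol=1e-6))  # → True
```

Without the sparsity assumption, a 50-equation system in 100 unknowns would have infinitely many solutions; knowing that only a few entries are non-zero is what makes the problem solvable.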

Now let’s actually apply dictionary learning to a missing image. First, let’s create an image with 50% of the image missing at random, as shown below.

import numpy as np
from PIL import Image

deficit_rate = 0.5
img = np.asarray(Image.open("img/recipe.jpg").convert('L'))
mask = (np.random.rand(img.shape[0], img.shape[1]) > deficit_rate)
deficit_img = mask * img

Creating an image with Missing Data

We will apply dictionary learning to this image. However, the values of the missing areas are set to 0, so the algorithm needs to account for this. The library we used last time, spm-image, already provides a dictionary learning algorithm that assumes missing data, so let's use it here. If you’re interested, you might want to read the internal implementation for more detail. As shown below, simply passing missing_value=0 lets you perform learning with the damaged data taken into consideration.

from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d
from spmimage.decomposition import KSVD
from spmimage.decomposition import sparse_encode_with_mask
from sklearn.preprocessing import StandardScaler

# Set the dictionary learning parameters
patch_size = (8, 8)
n_nonzero_coefs = 5
n_components = 64

# Extract patches from the image
patches = extract_patches_2d(deficit_img, patch_size).reshape(-1, np.prod(patch_size)).astype(np.float64)

# Run dictionary learning to obtain the dictionary D and sparse code X
model = KSVD(n_components=n_components, transform_n_nonzero_coefs=n_nonzero_coefs, max_iter=15, missing_value=0)
X = model.fit_transform(patches)
D = model.components_
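The step that maps the learned dictionary and sparse codes back to an image is not shown above; it can be sketched as follows. Here D and X are random stand-ins for the learned quantities, and the 32×32 image size is an illustrative assumption:

```python
import numpy as np
from sklearn.feature_extraction.image import reconstruct_from_patches_2d

patch_size = (8, 8)
img_shape = (32, 32)

# Hypothetical stand-ins for the learned dictionary D and sparse code X;
# in the article these come from KSVD.fit_transform on the deficit image.
rng = np.random.RandomState(0)
n_patches = (img_shape[0] - patch_size[0] + 1) * (img_shape[1] - patch_size[1] + 1)
n_components = 64
D = rng.randn(n_components, np.prod(patch_size))
X = rng.randn(n_patches, n_components)

# Each row of X @ D is one reconstructed patch; overlapping patches
# are averaged back into a full image.
patches = (X @ D).reshape(-1, *patch_size)
reconstructed = reconstruct_from_patches_2d(patches, img_shape)
print(reconstructed.shape)  # (32, 32)
```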

Using the dictionary and sparse representation trained on the missing image, the image was reconstructed as follows:

Image Reconstruction with Missing Data

As you can see, we were able to reconstruct the damaged images to some extent. Have a go and try to see how far you can restore various damaged images!

## Anomaly Detection

The next example is about image anomaly detection using dictionary learning.

Anomaly detection is the problem of judging whether an image is normal or not. There are a variety of methods available, ranging from the classical method of pattern matching to the latest methods using deep learning.

First, I will introduce the basic idea of anomaly detection using dictionary learning. As training data, several normal images are prepared, from which a dictionary and sparse representations are learned. If the dictionary’s representation power is kept low, images with characteristics similar to the training images can be recreated by combining the learned dictionary atoms, while other kinds of images cannot be reconstructed by such combinations.

Therefore, if a new image can be successfully recreated from the learned dictionary, the new image can be judged as normal; if it cannot, the new image contains an anomaly. In this way, anomaly detection can be achieved with dictionary learning.

Let’s try to see if we can perform anomaly detection with dictionary learning. In this example, we have prepared the following two wooden boards.

As you can see, there are six scratches or holes in the abnormal image. Since the grain pattern differs slightly between the normal and abnormal boards, it is difficult to detect anomalies from a simple pixel-wise difference. Instead, we need to learn the grain pattern well and detect deviations from it. Let’s try to see if we can do so.

First, dictionary learning is performed using only the normal image. The resulting dictionary should then only be able to reconstruct the grain pattern of the wood, not other patterns. To achieve this, the number of dictionary atoms and the number of atoms combined per patch are set to small values. The whole flow of this process is as follows:

from spmimage.feature_extraction.image import extract_simple_patches_2d

# Read Normal Image
ok_img = np.asarray(Image.open("img/wood-ok.jpg").convert('L'))

# Reduce the dictionary's representation power so that only normal images can be recreated
patch_size = (16, 16)
n_components = 10
transform_n_nonzero_coefs = 3
max_iter = 15

# Prepare Training Data
scl = StandardScaler()
patches = extract_simple_patches_2d(ok_img, patch_size)
patches = patches.reshape(-1, np.prod(patch_size)).astype(np.float64)
Y = scl.fit_transform(patches)

# Dictionary Learning
ksvd = KSVD(n_components=n_components, transform_n_nonzero_coefs=transform_n_nonzero_coefs, max_iter=max_iter)
X = ksvd.fit_transform(Y)
D = ksvd.components_

From the dictionary D and sparse code X, the image below was reconstructed, showing that this model worked very well.

Reconstructed Result of OK Image

Next, let’s perform image reconstruction on abnormal images.

# Load the Abnormal Image
ng_img = np.asarray(Image.open("img/wood-ng.jpg").convert('L'))

# Calculate the Sparse Code for the Abnormal Image
patches = extract_simple_patches_2d(ng_img, patch_size)
patches = patches.reshape(-1, np.prod(patch_size)).astype(np.float64)
Y = scl.transform(patches)
X = ksvd.transform(Y)

The obtained reconstructed image is as follows:

Reconstructed Result of Abnormal Image

This shows that while the normal parts of the abnormal image are reconstructed without any problems, the anomalous areas, such as scratches, are missing or poorly reconstructed. In fact, if we take the absolute reconstruction error and binarize it with a threshold of 10, the histogram and resulting error map look like the following:

Histogram of Reconstruction Error for Anomalous Images

Identifying Where the Errors Are

In this way, by extracting only the areas where the reconstruction error is extremely large, it is possible to expose the anomalous parts of the image. This result makes it clear that the normal parts are well represented by the learned dictionary, while only the abnormal parts are insufficiently reconstructed.
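The thresholding step described above can be sketched as follows. The patches here are synthetic stand-ins (the real Y and its reconstruction come from scl.transform and the KSVD model), and the threshold value is an illustrative assumption rather than the article's pixel-scale threshold of 10:

```python
import numpy as np

# Synthetic stand-ins for the standardized patches Y and their
# reconstructions from the sparse codes (X @ D in the article).
rng = np.random.RandomState(0)
Y = rng.randn(100, 64)                  # original (standardized) patches
Y_hat = Y + 0.1 * rng.randn(100, 64)    # reconstructions of normal patches are close
Y_hat[:5] += 5.0                        # a few patches reconstruct poorly (anomalies)

# Per-patch reconstruction error: mean absolute difference
errors = np.abs(Y - Y_hat).mean(axis=1)

# Patches whose error exceeds the threshold are flagged as anomalous
threshold = 1.0
anomalous = errors > threshold
print(anomalous.sum())  # → 5
```

Mapping the flagged patches back to their image coordinates then highlights exactly where the scratches and holes are.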

In general, both normal and abnormal data need to be used as training data when detecting anomalies in machine learning – but in most cases, such as for industrial products, it is difficult to collect a sufficient amount of abnormal data.

However, with dictionary learning, you can determine what is normal from normal data only, and anomalies can be detected from their differences compared to the normal state. There is no longer any need to prepare a vast amount of abnormal data to train your anomaly detection model. In fact, SPECTRO – HACARUS’s flagship visual inspection solution – adopts this dictionary-learning-based anomaly detection algorithm.

## Super-Resolution

Finally, let’s take a look into super-resolution. Once a high-resolution image is converted to a low-resolution image, the high-frequency components are lost. In principle, it is impossible to recover these high-frequency components from a low-resolution image alone. However, super-resolution is a way to estimate the high-frequency components of the image.

In many cases, a large amount of data – pairs of high- and low-resolution images from the same domain – needs to be collected beforehand. The rules for converting such low-resolution images to high-resolution images are then learned, and super-resolution is achieved by applying them to a new low-resolution image. In this section, how to achieve this conversion using dictionary learning will be discussed.

Firstly, a pair of dictionaries – a low-resolution dictionary Dl and a high-resolution dictionary Dh – is learned from pairs of low- and high-resolution images. The fact that two dictionaries are learned is key: one represents patches of low-resolution images and the other patches of high-resolution images, and they are trained so that corresponding patches share the same sparse representation. In other words, atom number l in the low-resolution dictionary corresponds to atom number l in the high-resolution dictionary.
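One common way to couple the two dictionaries, sketched here under assumptions (this is a hedged variant, not necessarily the article's exact training procedure, and the patch dimensions are illustrative), is to stack each low-resolution patch with its high-resolution counterpart and learn a single joint dictionary, which is then split into Dl and Dh sharing the same sparse codes:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.RandomState(0)
n_patches, low_dim, high_dim = 200, 16, 64  # e.g. 4x4 and 8x8 patches

# Synthetic stand-ins for paired low- and high-resolution patches
low_patches = rng.randn(n_patches, low_dim)
high_patches = rng.randn(n_patches, high_dim)

# Learn one dictionary over the concatenated patches, so that each
# low/high pair is forced to share a single sparse code.
joint = np.hstack([low_patches, high_patches])
model = DictionaryLearning(n_components=32, transform_n_nonzero_coefs=3,
                           max_iter=10, random_state=0)
model.fit(joint)

# Split each atom into its low- and high-resolution halves
Dl = model.components_[:, :low_dim]   # low-resolution dictionary
Dh = model.components_[:, low_dim:]   # high-resolution dictionary
print(Dl.shape, Dh.shape)  # (32, 16) (32, 64)
```

Because atom l of Dl and atom l of Dh come from the same joint atom, a sparse code computed against Dl can later be decoded with Dh.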

Super Resolution Learning with Dictionary Learning

Using the learned low-resolution dictionary Dl and the high-resolution dictionary Dh, we will now try to create a high-resolution image from a low-resolution one. The flow of the process is as shown in the following figure.

How Super Resolution is Performed with Dictionary Learning

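The inference step can be sketched as follows: encode the low-resolution patches against Dl, then decode the same sparse codes with Dh. The dictionaries here are random stand-ins for the learned ones:

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)
Dl = rng.randn(32, 16)           # stand-in for the learned low-resolution dictionary
Dh = rng.randn(32, 64)           # stand-in for the learned high-resolution dictionary
low_patches = rng.randn(10, 16)  # patches from a new low-resolution image

# Sparse codes computed against the low-resolution dictionary
codes = sparse_encode(low_patches, Dl, algorithm='omp', n_nonzero_coefs=3)

# The same codes, decoded with the high-resolution dictionary,
# yield the estimated high-resolution patches.
high_patches = codes @ Dh
print(high_patches.shape)  # (10, 64)
```

Averaging the decoded high-resolution patches back into an image, as in the earlier sections, produces the super-resolved result.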

It may be difficult to see the difference, but the loss of high-frequency components in the low-resolution image results in a slightly blurred outcome, whereas in the super-resolution image the edges are far clearer. Comparing the original high-resolution image and the super-resolution image gives PSNR: 23.141 and SSIM: 0.802.
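For reference, PSNR can be computed directly from the mean squared error between two images. The sketch below uses toy images in place of the article's actual results:

```python
import numpy as np

def psnr(original, restored, data_range=255.0):
    """Peak signal-to-noise ratio: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((original.astype(np.float64) - restored.astype(np.float64)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy images standing in for the high-resolution original and the
# super-resolved result (the article's images are not reproduced here).
rng = np.random.RandomState(0)
original = rng.randint(0, 256, size=(64, 64))
restored = np.clip(original + rng.normal(0, 10, size=(64, 64)), 0, 255)

print(round(psnr(original, restored), 1))
```

SSIM is more involved, as it compares local luminance, contrast, and structure; in practice the skimage.metrics module provides both metrics ready-made.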

## In Conclusion

In this article, three examples of advanced applications of dictionary learning were explained: deficit interpolation for damaged images, anomaly detection, and super-resolution. As covered, the idea of learning a dictionary has many different use cases.

The source code used here in this article is available on GitHub. If you’re interested in the detailed theory and algorithms of deficit interpolation for damaged images, and super-resolution, I recommend you read more about it in “Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing” by Michael Elad.

Stay tuned for the next and final article in this series, which will cover recent advancements in the academic field of sparse modeling.