# Survey: Deep learning with Self-Organizing Map

Posted on June 12, 2017

Self-Organizing Map (SOM) is one of my favorite bionics models. In most cases, it is applied to visualize high-dimensional data, and indeed it can generate pretty amazing results:

SOM is also closely related to Vector Quantization (VQ). After training, each reference vector in a SOM can represent a specific type of sample in the input space. However, based on my reading, the self-organizing mechanism is inspired by how the cortical system perceives the world, namely, how humans see things. I tried my best to study Self-organization of orientation sensitive cells in the striate cortex, but it contains so many biological terms that I can hardly follow its content. I will keep updating this post.
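
Before going into the papers, here is a minimal sketch of the standard SOM training loop the rest of this post builds on: find the winner (best-matching unit) for each input, then pull the winner's grid neighborhood toward that input, with decaying learning rate and neighborhood radius. All hyperparameter values below are illustrative defaults, not taken from any of the papers.

```python
import numpy as np

def train_som(data, grid_h=10, grid_w=10, epochs=20, lr0=0.5, sigma0=3.0):
    """Train a 2-D SOM; returns reference vectors of shape (grid_h*grid_w, dim)."""
    rng = np.random.default_rng(0)
    dim = data.shape[1]
    weights = rng.standard_normal((grid_h * grid_w, dim))
    # Pre-compute each neuron's (row, col) position on the map grid.
    coords = np.array([(i // grid_w, i % grid_w)
                       for i in range(grid_h * grid_w)], dtype=float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            # Linearly decay learning rate and neighborhood radius over time.
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3
            # Winner: the neuron whose reference vector is closest to the input.
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood around the winner, measured on the grid.
            d2 = np.sum((coords - coords[winner]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights
```

This is the "traditional training method" referred to throughout; the papers below differ mainly in where this map sits inside a deeper architecture.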

Still, I am very interested in how to use SOM to improve current computer vision methods, especially deep learning. Surprisingly, I can only find a limited number of papers related to this direction:

The first paper is Convolutional Self Organizing Map. Reading this paper is quite a struggle: the writing is poor, the illustrations are badly drawn, the details of the algorithm are vague, and it lacks convincing experiments. Although this paper relates to my goal, I believe it does not show the full capability of SOM in visual tasks. Anyway, it proposes two kinds of CSOM:

The first type of CSOM replaces the convolutional layer with a SOM layer. After the SOM layer is trained with the standard SOM algorithm, image patches are correlated against the learned kernels, and the pooling operation is then applied to those correlation maps.
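
My reading of this first variant can be sketched as follows: slide over the image, compute a Pearson-style correlation between each patch and every learned SOM kernel, then max-pool each correlation map. The function name, patch size, and pooling choice are my assumptions, since the paper leaves these details vague.

```python
import numpy as np

def som_conv_layer(image, kernels, patch=5, stride=1, pool=2):
    """Hypothetical sketch of CSOM type 1: correlate patches with SOM
    kernels (each kernel is a flattened patch-sized vector), then max-pool."""
    h, w = image.shape
    out_h = (h - patch) // stride + 1
    out_w = (w - patch) // stride + 1
    n_k = len(kernels)
    corr = np.zeros((n_k, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            p = image[i*stride:i*stride+patch, j*stride:j*stride+patch].ravel()
            p = p - p.mean()
            for k, kern in enumerate(kernels):
                q = kern - kern.mean()
                denom = np.linalg.norm(p) * np.linalg.norm(q) + 1e-8
                corr[k, i, j] = p @ q / denom  # correlation coefficient in [-1, 1]
    # Non-overlapping max pooling on each correlation map.
    ph, pw = out_h // pool, out_w // pool
    pooled = (corr[:, :ph*pool, :pw*pool]
              .reshape(n_k, ph, pool, pw, pool)
              .max(axis=(2, 4)))
    return pooled
```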

The second type of CSOM replaces the pooling layer. A winner array records the position of the winner neuron and its Euclidean distance to the input, and the pooling layer then selects the position with the minimum distance. It is weird that the output of this structure is a grid of positions (on the SOM map). Both CSOM variants are mainly used to visualize the data space. Again, this paper is really substandard in both writing and experiments.
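
As I understand it, one pooling window of this second variant behaves roughly like the sketch below: each patch in the window finds its SOM winner, and the window's output is the grid position of the winner with the smallest distance. The function and its interface are my own reconstruction, not the paper's code.

```python
import numpy as np

def winner_position_pool(patches, weights, grid_w):
    """Hypothetical sketch of CSOM type 2: for each patch in a pooling
    window, find the SOM winner and its distance to the patch; output the
    (row, col) grid position of the winner with the minimum distance."""
    best_pos, best_dist = None, np.inf
    for p in patches:
        dists = np.linalg.norm(weights - p, axis=1)
        win = int(np.argmin(dists))
        if dists[win] < best_dist:
            best_dist = dists[win]
            best_pos = (win // grid_w, win % grid_w)
    return best_pos
```

Applied over all windows, this indeed yields a grid of map positions rather than activations, which is what makes the design feel odd to me as a feature map.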

The next paper is Deep Self-Organizing Map for Visual Classification. It uses the traditional SOM training method to train multiple maps from patches. Each SOM corresponds to an area of the original image, and the network is trained layer by layer in an unsupervised manner.

However, when it comes to combining multiple SOMs, the writing is somewhat vague. If my understanding is correct, for each location it uses the index of the winner neuron of the corresponding SOM as the output.
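
Under that reading, one layer of the network reduces to the sketch below: a separate SOM per location, and the layer emits the winner index at each location. This is my interpretation of the paper's combination step, not its actual code.

```python
import numpy as np

def encode_with_local_soms(patch_grid, soms):
    """Hypothetical sketch of one Deep SOM layer: each image location has
    its own SOM (trained only on patches from that location); the layer's
    output is the winner neuron's index at each location."""
    out = np.zeros(len(patch_grid), dtype=int)
    for loc, patch in enumerate(patch_grid):
        weights = soms[loc]  # reference vectors for this location's SOM
        out[loc] = int(np.argmin(np.linalg.norm(weights - patch, axis=1)))
    return out
```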

Two things can be improved from my perspective. Firstly, the SOMs differ across areas, which means each of them aims at representing patches at one particular location of the image. This might result in poor generalization under shifts. Instead, the SOMs should be shared across locations to learn a general representation of visual appearance, just as convolutional kernels are.

Secondly, using the index of the winner neuron as the output seems too non-linear to me. If one layer fails to encode the image well, the following layers have no way to recover the lost information or to compensate for it. Some residual information about the original image should be preserved.

The last paper to introduce is Classifier with Hierarchical Topographical Maps as Internal Representation. To me it reads more like an “improved Radial Basis Function Network” than an “improved SOM”. It borrows SOM ideas such as winner-takes-all and neighborhood updates.

From the gradient, we can see that the update formula for the reference vectors differs from the standard SOM one. The reference vector $W$ is no longer always pulled toward the input vector $O$; its direction now depends on the supervised information $\delta$.
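
For reference, the standard SOM update always moves each reference vector toward the input, scaled by the neighborhood function (standard notation, not taken from this paper):

$$\Delta W_i = \eta \, h(i, c) \, (O - W_i)$$

where $c$ is the winner and $h$ the neighborhood function. On my reading of CRSOM, the gradient-derived update additionally carries the supervised term $\delta$, i.e. something of the form $\Delta W_i \propto \eta \, h(i, c) \, \delta \, (O - W_i)$, so a sign change in $\delta$ can push $W_i$ away from $O$. The exact form here is my guess from the paper's description, not its formula.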

Compared with a normal SOM, CRSOM can cluster objects together with their context (supervised information):

But it is odd that the paper only conducts a very simple classification experiment, with just 5 classes from MNIST. And the comparison of generality is not convincing enough to prove the superiority of SOM over other feature-abstraction methods.

To sum up, bionics is a promising field to work on. Methods from Computational Intelligence can be transferred to deep learning, like coevolution to GANs. And SOM is one mechanism that I will bet my effort on. If you are also intrigued by SOM, feel free to discuss with me.
