Week 17 - Pixel extension

1 minute read

One of the possible causes that the strategy of combining a CNN with an LSTM does not work correctly is that images with a single active pixel are being used, complicating the task of finding spatial correlations for the network. To analyze this fact, we have decided to expand the pixel and train the network again.

Pixel extension

The moving object that is currently being used is a single active pixel with the value of 255, as shown in the image:

Discrete pixel

I expand the active area of the image, the object size, gradually reducing the intensity level of the pixels around the active pixel, making use of an isotropic Gaussian function center at said pixel. For this I convolve a 5x5 Gaussian filter with the original image getting this result:

Extended pixel

CNN+LSTM Network results

To check if this idea really provides an improvement I have trained the CNN+LSTM network with the same dataset as last week (URM) with the same 800 samples and tested with 100.

Extended pixel results

Although we can see an improvement in the results (from a mean relative error of 4% to 2.5%), this improvement is not enough to consider the use of CNN with a subsequent LSTM as a good strategy.