Week 23: Driving videos, Pilotnet multiple (stacked), Metrics table, Basic LSTM
Driving videos
Pilotnet network [whole image]
I’ve used the predictions of the Pilotnet network (regression network) to drive a Formula 1 car (test3):
- Simple circuit clockwise (simulation time: 1min 41s):
- Simple circuit anti-clockwise (simulation time: 1min 39s):
- Monaco circuit clockwise (simulation time: 1min 21s):
- Monaco circuit anti-clockwise (simulation time: 1min 23s):
- Nürburgring circuit clockwise (simulation time: 1min 03s):
- Nürburgring circuit anti-clockwise (simulation time: 1min 06s):
Tinypilotnet network [whole image]
I’ve used the predictions of the Tinypilotnet network (regression network) to drive a Formula 1 car (a minimal sketch of the prediction loop that both regression networks use is shown after the list):
- Simple circuit clockwise (simulation time: 1min 39s):
- Simple circuit anti-clockwise (simulation time: 1min 38s):
- Monaco circuit clockwise (simulation time: 1min 19s):
- Monaco circuit anti-clockwise (simulation time: 1min 20s):
- Nürburgring circuit clockwise (simulation time: 1min 05s):
- Nürburgring circuit anti-clockwise (simulation time: 1min 06s):
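Both regression networks are used the same way at drive time: every camera frame is passed through the network and the predicted (v, w) pair is sent to the car. The following is only a minimal sketch of that loop, assuming hypothetical model file names and simulator callbacks (set_speed, set_rotation); it is not the project's actual driving code.

import numpy as np
from keras.models import load_model

# Hypothetical file names for the trained regression models
model_v = load_model('model_pilotnet_v.h5')
model_w = load_model('model_pilotnet_w.h5')

def control_step(img, set_speed, set_rotation):
    # img: preprocessed camera frame with shape (H, W, 3)
    batch = np.expand_dims(img, axis=0)
    v = float(model_v.predict(batch)[0][0])   # predicted linear speed
    w = float(model_w.predict(batch)[0][0])   # predicted angular speed
    set_speed(v)
    set_rotation(w)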
Biased classification network [cropped image]
I’ve used the predictions of the classification networks for w (7 classes) and v (4 classes) to drive a Formula 1 car (simulation time: 1min 38s):
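For the classification networks, the predicted class has to be converted back into a numeric command. This is only a minimal sketch of that step, assuming hypothetical class-to-value tables (the real bin centres depend on how v and w were discretised) and hypothetical model file names:

import numpy as np
from keras.models import load_model

# Assumed bin centres: 7 angular-velocity classes and 4 linear-velocity classes
W_VALUES = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
V_VALUES = [5.0, 8.0, 11.0, 13.0]

model_w = load_model('model_classification_7w.h5')  # hypothetical file names
model_v = load_model('model_classification_4v.h5')

def predict_commands(img):
    # img: preprocessed (cropped) camera frame with shape (H, W, 3)
    batch = np.expand_dims(img, axis=0)
    w_idx = int(np.argmax(model_w.predict(batch)[0]))
    v_idx = int(np.argmax(model_v.predict(batch)[0]))
    return V_VALUES[v_idx], W_VALUES[w_idx]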
Results table (cropped image)
Driving results (regression networks). Each cell shows the percentage of the circuit completed and, for completed laps, the simulation time (this format is used in all the driving tables below):

| Circuits | Manual | Pilotnet v+w | TinyPilotnet v+w | Stacked v+w | Stacked (diff) v+w | LSTM-Tinypilotnet v+w | DeepestLSTM-Tinypilotnet v+w |
|---|---|---|---|---|---|---|---|
| Simple (clockwise) | 100% / 1min 35s | 100% / 1min 40s | 100% / 1min 40s | 100% / 1min 41s | 100% / 1min 39s | 100% / 1min 40s | 100% / 1min 37s |
| Simple (anti-clockwise) | 100% / 1min 32s | 100% / 1min 45s | 100% / 1min 40s | 10% | 100% / 1min 38s | 100% / 1min 38s | 100% / 1min 38s |
| Monaco (clockwise) | 100% / 1min 15s | 85% | 85% | 85% | 45% | 50% | 55% |
| Monaco (anti-clockwise) | 100% / 1min 15s | 100% / 1min 20s | 100% / 1min 18s | 15% | 5% | 35% | 55% |
| Nürburgring (clockwise) | 100% / 1min 02s | 100% / 1min 04s | 100% / 1min 04s | 8% | 8% | 40% | 100% / 1min 04s |
| Nürburgring (anti-clockwise) | 100% / 1min 02s | 100% / 1min 05s | 100% / 1min 05s | 80% | 50% | 50% | 80% |
Driving results (classification networks):

| Circuits | Manual | 1v+7w biased | 4v+7w biased | 1v+7w balanced | 4v+7w balanced | 1v+7w imbalanced | 4v+7w imbalanced |
|---|---|---|---|---|---|---|---|
| Simple (clockwise) | 100% / 1min 35s | 100% / 2min 16s | 100% / 1min 38s | 100% / 2min 16s | 98% | 100% / 2min 16s | 100% / 1min 42s |
| Simple (anti-clockwise) | 100% / 1min 32s | 100% / 2min 16s | 100% / 1min 38s | 100% / 2min 16s | 100% / 1min 41s | 100% / 2min 16s | 100% / 1min 39s |
| Monaco (clockwise) | 100% / 1min 15s | 45% | 5% | 5% | 5% | 5% | 5% |
| Monaco (anti-clockwise) | 100% / 1min 15s | 15% | 5% | 5% | 5% | 5% | 5% |
| Nürburgring (clockwise) | 100% / 1min 02s | 8% | 8% | 8% | 8% | 8% | 8% |
| Nürburgring (anti-clockwise) | 100% / 1min 02s | 80% | 90% | 80% | 80% | 80% | 80% |
Results table (whole image)
Driving results (regression networks):

| Circuits | Manual | Pilotnet v+w | TinyPilotnet v+w | Stacked v+w | Stacked (diff) v+w | LSTM-Tinypilotnet v+w | DeepestLSTM-Tinypilotnet v+w |
|---|---|---|---|---|---|---|---|
| Simple (clockwise) | 100% / 1min 35s | 100% / 1min 41s | 100% / 1min 39s | 100% / 1min 40s | 100% / 1min 43s | 100% / 1min 39s | 100% / 1min 39s |
| Simple (anti-clockwise) | 100% / 1min 32s | 100% / 1min 39s | 100% / 1min 38s | 100% / 1min 46s | 10% | 10% | 100% / 1min 41s |
| Monaco (clockwise) | 100% / 1min 15s | 100% / 1min 21s | 100% / 1min 19s | 50% | 5% | 100% / 1min 27s | 50% |
| Monaco (anti-clockwise) | 100% / 1min 15s | 100% / 1min 23s | 100% / 1min 20s | 7% | 5% | 50% | 100% / 1min 21s |
| Nürburgring (clockwise) | 100% / 1min 02s | 100% / 1min 03s | 100% / 1min 05s | 50% | 8% | 100% / 1min 08s | 100% / 1min 05s |
| Nürburgring (anti-clockwise) | 100% / 1min 02s | 100% / 1min 06s | 100% / 1min 06s | 80% | 50% | 50% | 100% / 1min 07s |
Driving results (classification networks):

| Circuits | Manual | 1v+7w biased | 4v+7w biased | 1v+7w balanced | 4v+7w balanced | 1v+7w imbalanced | 4v+7w imbalanced |
|---|---|---|---|---|---|---|---|
| Simple (clockwise) | 100% / 1min 35s | 100% / 2min 17s | 70% | 75% | 7% | 100% / 2min 17s | 40% |
| Simple (anti-clockwise) | 100% / 1min 32s | 100% / 2min 17s | 10% | 100% / 2min 16s | 7% | 100% / 2min 16s | 10% |
| Monaco (clockwise) | 100% / 1min 15s | 5% | 5% | 5% | 5% | 5% | 5% |
| Monaco (anti-clockwise) | 100% / 1min 15s | 5% | 5% | 5% | 5% | 5% | 5% |
| Nürburgring (clockwise) | 100% / 1min 02s | 8% | 8% | 8% | 8% | 8% | 8% |
| Nürburgring (anti-clockwise) | 100% / 1min 02s | 8% | 8% | 8% | 8% | 8% | 8% |
Pilotnet multiple (stacked)
In this method (stacked frames), we concatenate several consecutive input images into a single stacked image and feed it to the network as one input. In this case, we stack 2 images separated by 10 frames (a minimal sketch of this preprocessing is given after the table). The results are:
Driving results (regression networks):

| Circuits | Stacked, constant v (whole image) | Stacked (whole image) | Stacked, constant v (cropped image) | Stacked (cropped image) |
|---|---|---|---|---|
| Simple (clockwise) | 100% / 3min 45s | 100% / 1min 40s | 100% / 3min 46s | 100% / 1min 41s |
| Simple (anti-clockwise) | 100% / 3min 47s | 100% / 1min 46s | 100% / 3min 46s | 10% |
| Monaco (clockwise) | 100% / 2min 56s | 50% | 100% / 2min 56s | 85% |
| Monaco (anti-clockwise) | 7% | 7% | 7% | 15% |
| Nürburgring (clockwise) | 8% | 50% | 8% | 8% |
| Nürburgring (anti-clockwise) | 100% / 2min 27s | 80% | 90% | 80% |
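As an illustration only (not the exact training code), the stacked input can be built by concatenating the current frame and the frame 10 steps earlier along the channel axis; the helper name and dataset layout below are assumptions.

import numpy as np

FRAME_OFFSET = 10  # separation between the two stacked frames, as described above

def build_stacked_dataset(frames):
    # frames: list of time-ordered images, each with shape (H, W, C)
    # Returns an array of shape (N, H, W, 2*C): frame t-10 stacked with frame t
    stacked = []
    for t in range(FRAME_OFFSET, len(frames)):
        pair = np.concatenate([frames[t - FRAME_OFFSET], frames[t]], axis=-1)
        stacked.append(pair)
    return np.stack(stacked, axis=0)

The network then simply uses an input_shape with twice as many channels as a single image.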
We have also tried stacking 2 images where one is the image at instant t and the other is the difference between the images at instants t and t-10 (a small sketch of this variant follows the table). The results are:
Driving results (regression networks):

| Circuits | Stacked, constant v (whole image) | Stacked (whole image) | Stacked, constant v (cropped image) | Stacked (cropped image) |
|---|---|---|---|---|
| Simple (clockwise) | 100% / 3min 45s | 100% / 1min 43s | 100% / 3min 46s | 100% / 1min 39s |
| Simple (anti-clockwise) | 100% / 3min 36s | 10% | 100% / 3min 46s | 100% / 1min 38s |
| Monaco (clockwise) | 45% | 5% | 50% | 45% |
| Monaco (anti-clockwise) | 5% | 5% | 7% | 5% |
| Nürburgring (clockwise) | 8% | 8% | 8% | 8% |
| Nürburgring (anti-clockwise) | 90% | 50% | 90% | 50% |
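A sketch of this second variant under the same assumptions: the current frame is stacked with the difference between the current frame and the one 10 frames earlier.

import numpy as np

FRAME_OFFSET = 10

def build_diff_stacked_dataset(frames):
    # Each input stacks frame t with the difference image (frame t minus frame t-10) on the channel axis
    stacked = []
    for t in range(FRAME_OFFSET, len(frames)):
        diff = frames[t].astype(np.float32) - frames[t - FRAME_OFFSET].astype(np.float32)
        pair = np.concatenate([frames[t].astype(np.float32), diff], axis=-1)
        stacked.append(pair)
    return np.stack(stacked, axis=0)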
Metrics table (cropped image)
- Metrics results (Classification networks) (Train data):
| Network | Accuracy | Accuracy top 2 | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Classification 7w biased | 97% | 99% | 98% | 97% | 97% |
| Classification 4v biased | 98% | 99% | 98% | 98% | 98% |
| Classification 7w balanced | 95% | 99% | 96% | 95% | 95% |
| Classification 4v balanced | 94% | 97% | 95% | 95% | 95% |
| Classification 7w imbalanced | 98% | 99% | 99% | 99% | 99% |
| Classification 4v imbalanced | 98% | 99% | 98% | 98% | 98% |
- Metrics results (Classification networks) (Test data):
| Network | Accuracy | Accuracy top 2 | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Classification 7w biased | 94% | 99% | 95% | 95% | 95% |
| Classification 4v biased | 95% | 98% | 95% | 95% | 95% |
| Classification 7w balanced | 93% | 99% | 94% | 94% | 94% |
| Classification 4v balanced | 92% | 96% | 94% | 93% | 93% |
| Classification 7w imbalanced | 95% | 99% | 95% | 95% | 95% |
| Classification 4v imbalanced | 95% | 97% | 95% | 95% | 95% |
- Metrics results (Regression networks) (Train data):
| Network | Mean squared error | Mean absolute error |
|---|---|---|
| Pilotnet w | 0.001754 | 0.027871 |
| Pilotnet v | 0.626956 | 0.452977 |
| Pilotnet w multiple (stacked) | 0.110631 | 0.230633 |
| Pilotnet v multiple (stacked) | 5.215044 | 1.563034 |
- Metrics results (Regression networks) (Test data):
| Network | Mean squared error | Mean absolute error |
|---|---|---|
| Pilotnet w | 0.002206 | 0.030515 |
| Pilotnet v | 0.849241 | 0.499219 |
| Pilotnet w multiple (stacked) | 0.108316 | 0.226848 |
| Pilotnet v multiple (stacked) | 5.272124 | 1.552658 |
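Metrics like the ones above can be computed from the model predictions with scikit-learn. This is only a minimal sketch (the helper names and the weighted averaging are assumptions, not the project's actual evaluation script):

import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, mean_squared_error, mean_absolute_error)

def classification_metrics(model, x, y_true):
    # y_true: integer class labels; the model outputs one probability per class
    probs = model.predict(x)
    y_pred = np.argmax(probs, axis=1)
    top2 = np.argsort(probs, axis=1)[:, -2:]  # indices of the two most probable classes
    acc_top2 = np.mean([label in row for label, row in zip(y_true, top2)])
    return {'accuracy': accuracy_score(y_true, y_pred),
            'accuracy_top_2': acc_top2,
            'precision': precision_score(y_true, y_pred, average='weighted'),
            'recall': recall_score(y_true, y_pred, average='weighted'),
            'f1': f1_score(y_true, y_pred, average='weighted')}

def regression_metrics(model, x, y_true):
    y_pred = model.predict(x).flatten()
    return {'mse': mean_squared_error(y_true, y_pred),
            'mae': mean_absolute_error(y_true, y_pred)}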
Metrics table (whole image)
- Metrics results (Classification networks) (Train data):
| Network | Accuracy | Accuracy top 2 | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Classification 7w biased | 97% | 99% | 97% | 97% | 97% |
| Classification 4v biased | 97% | 99% | 98% | 98% | 98% |
| Classification 7w balanced | 95% | 99% | 96% | 96% | 96% |
| Classification 4v balanced | 90% | 95% | 90% | 90% | 90% |
| Classification 7w imbalanced | 98% | 99% | 98% | 98% | 98% |
| Classification 4v imbalanced | 96% | 98% | 96% | 96% | 96% |
- Metrics results (Classification networks) (Test data):
| Network | Accuracy | Accuracy top 2 | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Classification 7w biased | 95% | 99% | 95% | 95% | 95% |
| Classification 4v biased | 94% | 97% | 95% | 95% | 95% |
| Classification 7w balanced | 93% | 99% | 94% | 93% | 93% |
| Classification 4v balanced | 89% | 95% | 91% | 89% | 90% |
| Classification 7w imbalanced | 95% | 99% | 95% | 95% | 95% |
| Classification 4v imbalanced | 94% | 97% | 95% | 95% | 95% |
- Metrics results (Regression networks) (Train data):
| Network | MSE | MAE |
|---|---|---|
| Pilotnet w | 0.000660 | 0.015514 |
| Pilotnet v | 0.809848 | 0.548209 |
| Stacked w | 0.068739 | 0.167565 |
| Stacked v | 8.973208 | 1.997035 |
| DeepestLSTM-Tinypilotnet w | 1.997035 | 0.021000 |
| DeepestLSTM-Tinypilotnet v | 0.491759 | 0.383216 |
- Metrics results (Regression networks) (Test data):
| Network | MSE | MAE |
|---|---|---|
| Pilotnet w | 0.000938 | 0.017433 |
| Pilotnet v | 1.374714 | 0.659400 |
| Stacked w | 0.067305 | 0.164354 |
| Stacked v | 9.402403 | 2.039585 |
| DeepestLSTM-Tinypilotnet w | 0.000982 | 0.020472 |
| DeepestLSTM-Tinypilotnet v | 0.549310 | 0.399267 |
Basic CNN+LSTM
I have created a CNN+LSTM network and trained it with a set of 10 images. This is very little data, but it let me test the network, which did not work with the original dataset. The code is:
import glob
import cv2
import numpy as np
from time import time
from matplotlib import pyplot as plt
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Flatten, Dense, Conv2D, BatchNormalization, Dropout, Reshape, MaxPooling2D, Activation
from keras.layers.recurrent import LSTM
from keras.optimizers import Adam
def get_images(list_images):
    # We read and downscale the images
    array_imgs = []
    for name in list_images:
        img = cv2.imread(name)
        img = cv2.resize(img, (int(img.shape[1] / 6), int(img.shape[0] / 6)))
        array_imgs.append(img)
    return array_imgs


def lstm_model(img_shape):
    model = Sequential()
    model.add(Conv2D(32, (3, 3), padding='same', input_shape=img_shape, activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(MaxPooling2D(pool_size=(3, 3)))
    model.add(Dropout(0.25))
    model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Conv2D(128, (3, 3), padding='same', activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(128, (3, 3), padding='same', activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(1024))
    model.add(Activation('relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    # The 1024 CNN features are reshaped into a sequence of 1024 one-dimensional steps for the LSTM
    model.add(Reshape((1024, 1)))
    model.add(LSTM(10, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(10))
    model.add(Dropout(0.2))
    model.add(Dense(5, activation="relu"))
    model.add(Dense(1))
    adam = Adam(lr=0.0001)
    model.compile(optimizer=adam, loss="mse", metrics=['accuracy', 'mse', 'mae'])
    return model


if __name__ == "__main__":
    # Load data
    list_images = glob.glob('Images/' + '*')
    images = sorted(list_images, key=lambda x: int(x.split('/')[1].split('.png')[0]))
    y = [71.71, 56.19, -44.51, 61.90, 67.86, -61.52, -75.73, 44.75, -89.51, 44.75]

    # We preprocess the images
    x = get_images(images)
    X_train = x
    y_train = y
    X_t, X_val, y_t, y_val = train_test_split(x, y, test_size=0.20, random_state=42)

    # Variables
    batch_size = 8
    nb_epoch = 200
    img_shape = (39, 53, 3)

    # We adapt the data
    X_train = np.stack(X_train, axis=0)
    y_train = np.stack(y_train, axis=0)
    X_val = np.stack(X_val, axis=0)
    y_val = np.stack(y_val, axis=0)

    # Get model
    model = lstm_model(img_shape)
    model_history_v = model.fit(X_train, y_train, epochs=nb_epoch, batch_size=batch_size, verbose=2,
                                validation_data=(X_val, y_val))
    print(model.summary())

    # We evaluate the model
    score = model.evaluate(X_val, y_val, verbose=0)
    print('Evaluating')
    print('Test loss: ', score[0])
    print('Test accuracy: ', score[1])
    print('Test mean squared error: ', score[2])
    print('Test mean absolute error: ', score[3])
The results are:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 39, 53, 32) 896
_________________________________________________________________
batch_normalization_1 (Batch (None, 39, 53, 32) 128
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 17, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 13, 17, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 13, 17, 64) 18496
_________________________________________________________________
batch_normalization_2 (Batch (None, 13, 17, 64) 256
_________________________________________________________________
conv2d_3 (Conv2D) (None, 13, 17, 64) 36928
_________________________________________________________________
batch_normalization_3 (Batch (None, 13, 17, 64) 256
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 6, 8, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 6, 8, 64) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 6, 8, 128) 73856
_________________________________________________________________
batch_normalization_4 (Batch (None, 6, 8, 128) 512
_________________________________________________________________
conv2d_5 (Conv2D) (None, 6, 8, 128) 147584
_________________________________________________________________
batch_normalization_5 (Batch (None, 6, 8, 128) 512
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 3, 4, 128) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 3, 4, 128) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 1536) 0
_________________________________________________________________
dense_1 (Dense) (None, 1024) 1573888
_________________________________________________________________
activation_1 (Activation) (None, 1024) 0
_________________________________________________________________
batch_normalization_6 (Batch (None, 1024) 4096
_________________________________________________________________
dropout_4 (Dropout) (None, 1024) 0
_________________________________________________________________
reshape_1 (Reshape) (None, 1024, 1) 0
_________________________________________________________________
lstm_1 (LSTM) (None, 1024, 10) 480
_________________________________________________________________
dropout_5 (Dropout) (None, 1024, 10) 0
_________________________________________________________________
lstm_2 (LSTM) (None, 10) 840
_________________________________________________________________
dropout_6 (Dropout) (None, 10) 0
_________________________________________________________________
dense_2 (Dense) (None, 5) 55
_________________________________________________________________
dense_3 (Dense) (None, 1) 6
=================================================================
Total params: 1,858,789
Trainable params: 1,855,909
Non-trainable params: 2,880
_________________________________________________________________
None
Evaluating
('Test loss: ', 5585.3828125)
('Test accuracy: ', 0.0)
('Test mean squared error: ', 5585.3828125)
('Test mean absolute error: ', 72.8495864868164)
Basic LSTM
I’ve followed an LSTM tutorial to create an LSTM network in Keras and used it to classify reviews from the IMDB dataset. LSTM networks don’t just propagate output information to the next time step; they also store and propagate the state of the so-called LSTM cell. This cell holds four neural networks inside (the gates), which decide which information is stored in the cell state and pushed to the output. As a result, the output of the network at one time step depends not only on the previous time step but on n previous time steps.
The dataset was collected by Stanford researchers in 2011. It contains 25,000 movie reviews (good or bad) for training and the same number of reviews for testing. Our goal is to create a network able to determine which of these reviews are positive and which are negative. Words are encoded as real-valued vectors in a high-dimensional space, where similarity between words in terms of meaning translates to closeness in the vector space.
The code is the following:

from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM
from keras.datasets import imdb

# We load the dataset of the top 1000 words
num_words = 1000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=num_words)

# We pad the sequences (using sequence from keras.preprocessing)
# with maxlen=200, so our sequences are 200 words long
X_train = sequence.pad_sequences(X_train, maxlen=200)
X_test = sequence.pad_sequences(X_test, maxlen=200)

# Define the network architecture and compile
model = Sequential()
model.add(Embedding(num_words, 50, input_length=200))
model.add(Dropout(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(250, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# We train the model
model.fit(X_train, y_train, batch_size=64, epochs=15)

# We evaluate the model
score = model.evaluate(X_test, y_test)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

We got an accuracy of 86.42%:
25000/25000 [==============================] - 134s 5ms/step - loss: 0.2875 - acc: 0.8776
25000/25000 [==============================] - 47s 2ms/step
('Test loss:', 0.32082191239356994)
('Test accuracy:', 0.86428)