Week 6 - Refactoring code

Meeting summary

  • All planned training sessions have been completed. The NoRNN and ConvLSTM networks gave fairly reasonable results, while the LSTM networks performed poorly.
  • Given these results, we decided to discard the LSTM network because of its low performance and focus on the other two types.
  • We also decided to return to the modeled images in order to compare performance. Given the results obtained so far, we expect them to work quite well.
  • We will also consider analyzing two other types of dynamics: parabolic and sinusoidal.

To Do

The tasks proposed for this week are:

  • Refactor the code to be able to train and evaluate the networks with modeled data.
  • Generate the datasets for the new dynamics: parabolic and sinusoidal.

Code refactoring

Several files have to be adapted, as described below.

For training:


  data_model: modeled  # raw or modeled

First, we set the configuration file to train with modeled samples.


  data_model = conf['data_model']
  dim = (int(samples_dir.split('_')[-2]), int(samples_dir.split('_')[-1]))
  if data_model == "modeled":
    print("Modeled images")
    loss = conf['func_loss']
    filename = root + "_Modeled/"
    _, trainX, trainY = frame_utils.read_frame_data(data_dir + 'train/', 'modeled_samples', dim)
    _, valX, valY = frame_utils.read_frame_data(data_dir + 'val/', 'modeled_samples', dim)
    train_data = [trainX, trainY]
    val_data = [valX, valY]

    # Model settings
    in_dim = trainX.shape[1:]
    out_dim = np.prod(in_dim[1:])
    if net_type == "NoRec":
        to_train_net = Net.Convolution1D(activation=activation, loss=loss, dropout=dropout,
                                         drop_percentage=drop_percentage, input_shape=in_dim,
    else:  # net_type == "Rec"
        to_train_net = Net.Lstm(activation=activation, loss=loss, dropout=dropout,
                                drop_percentage=drop_percentage, input_shape=in_dim,

Because modeled samples have a different structure, we need to design different networks for training, and the way the data is read also changes.
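The shape bookkeeping in the training excerpt can be illustrated with a small worked example. This is only a sketch: the directory name, the helper `parse_dim`, and the array shape `(1000, 20, 2)` are assumptions made for illustration (1000 samples of 20 input positions, each a (y, x) pair), not values taken from the project.

```python
import math


def parse_dim(samples_dir):
    """Read (height, width) from the last two '_'-separated fields of a directory name."""
    # rstrip('/') is an added guard: a trailing slash would otherwise break int()
    parts = samples_dir.rstrip('/').split('_')
    return int(parts[-2]), int(parts[-1])


# Hypothetical directory name ending in <height>_<width>
dim = parse_dim("some_modeled_dataset_80_120")

# Assumed shape of trainX for modeled data: (n_samples, 20 positions, 2 coordinates)
train_shape = (1000, 20, 2)
in_dim = train_shape[1:]          # per-sample input shape, as in trainX.shape[1:]
out_dim = math.prod(in_dim[1:])   # number of values to predict, as in np.prod(in_dim[1:])
```

Under these assumptions `dim` is `(80, 120)`, `in_dim` is `(20, 2)` and `out_dim` is `2`, that is, the network predicts a single (y, x) position.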


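  import numpy as np
  import pandas as pd


  def get_modeled_samples(samples_paths, dim):
      dataX = []
      dataY = []

      for p in samples_paths:
          sample = pd.read_csv(p)
          # Reverse the column order; np.float was removed in NumPy 1.24, so use float
          positions = np.fliplr(sample.values).astype(float)
          # Scale each position to [0, 1) using the image dimensions
          for i in range(len(positions)):
              positions[i][0] /= dim[0]
              positions[i][1] /= dim[1]
          # (excerpt: the step that stores the 20 input positions in dataX
          #  and the target in dataY is omitted here)

      return np.array(dataX), np.array(dataY)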

Each sample is stored in a .csv file. To obtain the data, we read the file, reverse the column order to match the dimensions, scale the values to the range [0, 1) by dividing by the height and width of the image, and separate the 20 input positions from the target.
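The reading steps described above can be sketched on an in-memory array. This is a minimal illustration rather than the project's code: it assumes the .csv columns hold (x, y) pixel positions, that `dim` is (height, width), and that the target is the row immediately after the 20 inputs. The helper `split_sample` and the toy trajectory are hypothetical.

```python
import numpy as np


def split_sample(positions_xy, dim, n_inputs=20):
    """Flip (x, y) rows to (y, x), scale to [0, 1), and split inputs from target."""
    # Convert to float (this copies the data), then reverse the column order
    positions = np.fliplr(np.asarray(positions_xy, dtype=float))
    # Broadcasting divides the y column by the height and the x column by the width
    positions /= np.array(dim, dtype=float)
    # Assumption: the first n_inputs rows are the input, the next row is the target
    return positions[:n_inputs], positions[n_inputs]


# Toy sample: a point moving one pixel to the right per frame along row 40
sample = np.array([[x, 40] for x in range(21)])
inputs, target = split_sample(sample, dim=(80, 120))
```

Here `inputs` has shape `(20, 2)` and `target` is the normalized (y, x) position of the 21st frame. Dividing the whole array at once replaces the per-row loop without changing the result.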

For testing:


  sample_type = conf['data_path'].split('/')[-1]
  dim = (int(samples_dir.split('_')[-2]), int(samples_dir.split('_')[-1]))
  if sample_type == "modeled_samples":
    parameters, testX, testY = frame_utils.read_frame_data(conf['data_path'], sample_type, dim)
    if net_type == "NOREC":
        print('Putting the test data into the right shape...')
        to_test_net = Net.Convolution1D(model_file=conf['model_path'])
    else:
        print('Putting the test data into the right shape...')
        to_test_net = Net.Lstm(model_file=conf['model_path'])

As in training, the networks and the way the samples are read differ from those used for raw images. The samples are read in the same way as in training.


  raw = True
  if "modeled" in data_type:
      raw = False
  predict_values, real_values, maximum = frame_utils.get_positions(predict, test_y, dim, raw)

To obtain both the real and the predicted positions, a flag is introduced to distinguish between modeled and raw samples.


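  def get_positions(predictions, real, dim, raw):
    predict_pos = []
    real_pos = []
    maximum = []
    for i, p in enumerate(predictions):
        if not raw:
            # Modeled samples: rescale the normalized (y, x) values to pixel coordinates
            predict_pos.append(utils.scale_position(p, dim[1], dim[0]))
            real_pos.append(utils.scale_position(real[i], dim[1], dim[0]))

        # Largest possible distance in the image (its diagonal)
        maximum.append(np.linalg.norm(np.array((0, 0)) - np.array(dim)))

    return np.array(predict_pos), np.array(real_pos), np.array(maximum)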


  def scale_position(pos, x_max, y_max):
    return [int(pos[0] * y_max), int(pos[1] * x_max)]

For modeled images, it is only necessary to rescale the (y, x) values obtained from [0, 1) to the ranges [0, h) and [0, w), respectively.
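A short worked example of this rescaling follows. `scale_position` is copied from the excerpt above. The image size, the two normalized positions, and the use of the diagonal to normalize the error are assumptions made for illustration only.

```python
import math


def scale_position(pos, x_max, y_max):
    # pos is a normalized (y, x) pair; int() truncates toward zero
    return [int(pos[0] * y_max), int(pos[1] * x_max)]


dim = (80, 120)  # assumed (height, width)

# Hypothetical normalized (y, x) outputs for one sample
predicted = scale_position((0.5, 0.25), dim[1], dim[0])
real = scale_position((0.5, 0.5), dim[1], dim[0])

# Pixel error between both positions and the image diagonal (largest possible distance)
error = math.dist(predicted, real)
max_dist = math.hypot(*dim)
relative_error = error / max_dist
```

With these values the predicted position is `[40, 30]` and the real one is `[40, 60]`, so the error is 30 pixels. Because `int()` truncates, a rescaled coordinate always stays inside the image as long as the normalized value is below 1.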