Working with DIFODO on mrpt platform

10 minute read

Testing DIFODO - MRPT Installation

There are several ways to install the MRPT platform, as can be seen here https://www.mrpt.org/download-mrpt/.

I will download the latest version for ubuntu (Bionic-Beaver) 1:1.5.5-1 at the time of writing, which is quite updated (2017-12-18). In ubuntu installing is as simple as:

sudo apt-get install mrpt-apps

This will install the mrpt-apps, so all the binaries with different algorithms can be used, so we can try DIFODO.

If you want to develop and also install the source code to link to it on C++ install:

sudo apt-get install libmrpt-dev

Now a lot of application (binaries) are installed along mrpt-apps. The ones that concern us the most are:

For a full list of the apps available go here

Testing with the DifOdometry-Datasets app

First I would like to test DIFODO on a known dataset provided by the MRPT here. I downloaded the following:

wget http://downloads.sourceforge.net/project/mrpt/Datasets%20%28Rawlogs%29/Datasets/RGBD-datasets/rawlog_rgbd_dataset_freiburg1_360.tgz

Extract the file downloaded to get a set of files (groundtruth, image sequences, raw data…).

Secondly, both DifOdometry-Camera and DifOdometry-Datasets uses a configuration file that can be generated with the following option, –create-config or -cc, here an example:

DifOdometry-Datasets --create-config config_files/config_dataset.txt

Edit the config_dataset.txt and point the filename to your .rawlog file extracted. In my case it will look like this:

filename = /home/omar/workspaces/TFM/DIFODO/rawlog/rawlog_rgbd_dataset_freiburg1_360/rgbd_dataset_freiburg1_360.rawlog

Now in order to run the program, choose your configuration file as follows:

DifOdometry-Datasets --config config_files/config_dataset.txt 

At least for me (Ubuntu 18.04.3 LTS, kernel 4.15.0-70) this app didnt work. It opens but the keys that when pressed should start the app or exit it, did not do anything.

NOTE: I also test the same apporach with a different dataset and on Windows with the windows installer provided here

I had the same problem in Windows with version 1.5.7, keyboard wasnt recognized.

So back at my Ubuntu I remove all packages related to mrpt to install a different version:

sudo apt-get --purge remove libmrpt*
sudo apt-get --purge remove mrpt*

Now I will try their own repository using their ppa, as here

sudo add-apt-repository ppa:joseluisblancoc/mrpt
sudo apt-get update
sudo apt-get install libmrpt-dev mrpt-apps

This installed the MRPT v1.9.9 in my system.

Still not working, not recognizing the keyboard.

Testing with the DifOdometry-Datasets app (More errors) Part 2

It seems like there was a bug in the mrpt code, after I openned an Issue on github it was fixed quite fast. More info here -> issue994

Just a quick note here: This kind of commits and last fixes can be found on their ppa not in the ubuntu offical one which is a stable version that they dont update quite often but the one on their repo is updated with each commit Ill say.

To update all the mprt libs:

apt list --installed | grep mrpt
#To update all libs within the list with
apt install libmrpt* mrpt*

Once I got the lastest fixes I did:

DifOdometry-Datasets --config config_files/config_dataset.txt 

But, it seems to be something wrong, the grountruth and the images can be seen but there is no pose retrieved from the DIFODO algorithm.

The following error appears:

Eigensolver couldn’t find a solution. Pose is not updated

This error comes from mrpt/libs/vision/src/CDifodo.cpp specifically from CDifodo::filterLevelSolution().

Ill open a new Issue on github and try to solve it by myself.

The parts of the mrpt that are interesting about DIFODO can be found here:

  • mrpt/libs/vision/src/CDifodo.cpp
  • mrpt/libs/vision/include/mrpt/vision/CDifodo.h

And the DIFODO camera app can be found here:

  • mrpt/apps/DifOdometry-Camera/

The code that implements DIFODO algorithm is on the CDifodo module and an example and also graphic UI is implemented in the DifOdometry-Camera also in the DifOdometry-Datasets app.

Working on MRPT platform to get DIFODO to work again

In order to do this I must get to know the MRPT platform. Following are some links of interest regarding documentation, how to compile etc:

So in order to start changing the code and compile, I created my own CMakefiles.txt since the ones provided takes too much time since they link against all the libraries again in a hierarchical way. I just want to link against the libs that /mrpt/apps/DifOdometry-Datasets and /mrpt/apps/DifOdometry-Camera are using which can be found on their original CMakefiles.txt.

    
#-----------------------------------------------------------------
# CMake file for the MRPT application:  ReactiveNav3D-Demo
#
#  Run with "cmake ." at the root directory
#
#  January 2014, Mariano Jaimez Tarifa <marianojt@uma.es>
#-----------------------------------------------------------------
project(DifOdometry-Datasets)

# So it doesnt complain with older versions
cmake_minimum_required(VERSION 2.4)
if(COMMAND cmake_policy)
      cmake_policy(SET CMP0003 NEW)  # Required by CMake 2.7+
endif()

find_package(mrpt-gui REQUIRED)
find_package(mrpt-opengl REQUIRED)
find_package(mrpt-vision REQUIRED)

# Another way to do it on one line
# find_package(mrpt REQUIRED gui opengl vision)

# ---------------------------------------------
# TARGET:
# ---------------------------------------------
# Define the executable target:
add_executable(${PROJECT_NAME}
			   DifOdometry_Datasets_main.cpp
			   DifOdometry_Datasets.cpp
			   DifOdometry_Datasets.h
			    ${MRPT_VERSION_RC_FILE})

# Dependencies on MRPT libraries for this files
target_link_libraries(${PROJECT_NAME}
	mrpt::gui
	mrpt::opengl
	mrpt::vision
	)
# I function declare on another file in MRPT /cmakemodules directories
# DeclareAppForInstall(${PROJECT_NAME})


# Set optimized building:
if(CMAKE_COMPILER_IS_GNUCXX AND NOT CMAKE_BUILD_TYPE MATCHES "Debug")
	set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3")
endif()
    
  

Now that I can compile the apps I have to go and look for the fail in the program which is much behind than this app.

UPDATE: Issue996 solves the problem. Now the app is fully functional so I can start to test the results of DIFODO and test with different scenarios and test with different configuration values.

Testing with the DifOdometry-Datasets app (Working finally) Part 3

So the app DifOdometry-Datasets is working lets test with different configuration and test the results. Several datasets that are in rawlog format (format that the app uses) can be found here DifOdometry-Datasets.

I will use two of the samples datasets, one easier and one harder:

Easy - rawlog_rgbd_dataset_freiburg1_desk

This is an easy dataset since there are low movements and they are smooth. The results are great for different configurations.

Processing Time

Resolution Pyramid Levels Average processing time
160x120 1 4ms
160x120 3 8.5ms
160x120 5 9ms
320x240 1 15ms
320x240 3 27ms
320x240 5 30ms
640x480 1 80ms
640x480 3 130ms
640x480 5 130ms

The results are similar to the original paper. We can notice that with the increase in resolution the computational time is increased and also with the pyramid levels the computational time is increased. (Their paper has a lot of good tables comparing times and different resolutions that are worth checking out).

Since there is no easy way to measure the RME error between the groundtruth and the DIFODO results (there isprobably a way but i havent search in depth) Ill use images to analise the results by eye.

NOTE: There are two trjectories shown in the images. One green and the other red. The red one represents the groundtruth trajectory, the real one. The green one is the estimation from DIFODO algorithm.

Different resolutions

Fig.1 - 160x120 5 levels.
Fig.2 - 320x240 5 levels.
Fig.3 - 640x320 5 levels.

Different pyramid levels

Fig.4 - 320x240 1 levels.
Fig.5 - 320x240 3 levels.
Fig.6 - 320x240 5 levels.

As one can see using the pictures and also by reading the paper, the differences when using less than 5 pyramid levels are remarkable. While the decrease of resolution also gives worth results, the differences are not so hard as with the pyramid levels. Also the penalty in time is bigger when increasing the resolution compare to increasing the levels of the pyramid. The best configuration is going to deppend on the dataset and the environment in most cases but as a good average configuration we find those with 5 levels of gaussian pyramid and resolutions of (160x120, 320x240, 640x480) lower resolutions are also taken into account in the paper but they start to give worst results.

The best results, by looking at one of the tables of the paper are achieved with the higher resolution (640x480) however the decrease in performance when the resolution used are 320x240 and 160x120 is almost unnoticeable (just a few cms. 3-8cm). While this error is something to take into account the speed increase is a requirement more important when we talk about real time operation. Following theres a table extracted from DIFODO paper to show the differences of performance between differents resolutions.

As one can see the differences between 320x240 and 160x120 are extremely low. The performance is almost the same while the processing speed is decreased from around ~30ms (30FPS) to only ~9ms (solid 90 FPS). While both options can be considered real time, the advantages to have 3x speed using the QQVGA resolution (160x120) are promising. For example, having up to 90FPS, we can get more frames por second than we will have with a lets say 30FPS camera, which translates into lower traslation and rotation between frames, making easier for the algorithm to give a good estimation of the pose. Remember that the algorithm makes the assumption of low displacements, the lower the closer it gets to a linear model so the better the results.

Another advantage is that If we attempt to implement an SLAM algorithm, we still have to spend computational power to do the mapping part and also the loop closure that SLAM algorithms benefit from. So having less computational charge is always a plus.

Hard - rawlog_rgbd_dataset_freiburg1_360

This dataset is a little bit harder than the previous one due to higher speed movements and changing direction very roughly.

In this dataset we can notice on the images how the displacement in this case is much bigger between the estimation and the groundtruth. Is clear that the speed environment are the things that have more impact to the algorithms.

Fig.7 - 160x120 5 levels.
Fig.8 - 320x240 5 levels.
Fig.9 - 640x320 5 levels.

Once again the results seems similar for the same level of pyramid and different resolutions.

Making DIFODO app with intel realsense D435 sensor

As we have already test, there is a simple app with a visualizer that allows to test the algorithm against some datasets. The problem is that this datasets are in a special format called rawlog created by mrpt itself.

They also have an application to work with depth cameras, called DifOdometry-Camera but it doesnt works with the intel realsense D435, at least not plug and play.

So we have to think how to make the realsense D435 work along Difodo from mrpt. To do this they are several approaches that I can think of right now:

  • Use their rawlog format: Look for a way to translate to their own format, although this doesnt seem the approach for a real time application it could be worth to check if this is possible to do just to test against some recorder videos from the realsense. (They seem to have a rawlog-grabber which records some available sensors whose drivers are implemented in the mrpt platform (list of sensors) although the intel realsense D435 doesnt appear in the list, they have 3D camera support for those who use OpenNI2. The realsense doesnt support the openNI2 standard directly but it seems like a possibility. More info here.

  • Use ros packages: The intel realsense ros package allows to send the streams of the camera over ros packages. (They seem to have a ros package to convert between ros messages and mrpt classes). Check if they already have a conversion between the ros messages that the realsense streams the depth information and the mrpt platform.

  • Use the realsense drivers: We dont neccesarily need to use rawlog so we can just create a C++ program that takes the depth frame using the realsense drivers are then converting it to the desired format by the mrpt which is just a float matrix that they have implemented.

  • Use ros messages: Just like in the previous option, we could use the ros messages and traslate them to the desired format or class required by mrpt. This would be the option that could bring more people to use the algoritm since ros messages are kind of standart.