Weeks 5-6: Caffe implementation

1 minute read

These weeks I’ve followed the Jupyter notebook that shows how to use CPMs models trained with Caffe. I’ve divided the workload in two classes: PersonDetector and PoseEstimator.

  • PersonDetector is fed with an image containing one or more humans and returns a heatmap. These images must have a size multiple of eight because of the model architecture, so it must be padded with zeros to meet this requirement. The returned heatmap has greater intensity values where the probability of finding a human is higher, as it is shown in the following image. In order to discard false humans detected, a maximum filter and a threshold are applied, resulting in the human center coordinates. Once the humans have been detected, the heatmap is resized back to the original image size, that is 8 times bigger.

  • PoseEstimator first crops all the humans that have been found in the original image to fit the model input. The input of the pose estimation model will be these cropped images and a 2D gaussian.

PoseEstimator outputs 14 heatmaps containing each one the probability of finding every joint of the human body (ankles, shoulders, head, hips…). Another heatmap containing the probability of every joint together is returned.

Finding the coordinates of every joint is as easy as finding the maximum values in those heatmaps. The resulting points can be linked together and drew over the original image after being resized, as it is shown in the next image.

As we want to build a JdeRobot component that can predict based on both TensorFlow and Caffe models, I have wrapped all this procedure in another Python will be loaded when the component starts and then, every time a new frame is captured it will go through the prediction method.

The whole process is far from real-time, oscillating between 30-50 seconds for images with a 640x360 resolution with an i5 processor without GPU acceleration.

I’m currently having issues installing JdeRobot with OpenCV and Caffe, so next step will be setting a stable environment in order to build the first version of the pose estimation component. Because of that some of the pictures are missing. Whenever I manage to get Caffe to work again I will upload them. Besides that, the TensorFlow version of the CPMs has to be explored and adapted as well.

Categories:

Updated: