Weeks 22-25: LSP + MPII datasets
To increase the number of samples available for future training and evaluation, the LSP and MPII datasets have been combined. A script has been developed that parses the labels of both datasets and stores them in a common structure. More specifically, the script performs the following tasks:
- Images and labels from both datasets are parsed.
- HDF5 files for the train, validation and test sets are initialized. The percentage of samples assigned to each set is defined by the user. Each file is divided into two subsets. One of them contains the images of the humans labeled in MPII and LSP; the size of these images depends on the boxsize chosen by the user. The other subset contains the labels for those images. Each label is composed of the following fields:
  - dataset: LSP or MPII.
  - fname: original image file name. It works as an identifier of each sample.
  - scale: size of the human in the image with respect to the image.
  - center: center coordinates of the human in the image (x, y).
  - joints: each joint's coordinates (x, y).
  - headsize: size of the diagonal within the head bounding box of the human in the image. It is later used for evaluation.
- The samples and labels previously parsed are shuffled and stored in those HDF5 datasets.
- HDF5 files are saved.
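The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the function names `split_indices` and `write_subset`, the group names `images` and `labels`, and the in-memory layout (a NumPy array of cropped images plus a dictionary with one array per label field) are all assumptions made for the example.

```python
import h5py
import numpy as np


def split_indices(n_samples, train_frac, val_frac, seed=0):
    """Shuffle sample indices and split them into train/val/test parts.

    The test part receives whatever is left after train and validation.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    return {
        "train": order[:n_train],
        "val": order[n_train:n_train + n_val],
        "test": order[n_train + n_val:],
    }


def write_subset(path, images, labels, indices):
    """Write one HDF5 file holding the cropped images and their labels.

    images:  array of shape (N, boxsize, boxsize, 3)
    labels:  dict mapping each field name to an array with N entries
    indices: positions of the samples that belong to this subset
    """
    str_dtype = h5py.string_dtype(encoding="utf-8")
    with h5py.File(path, "w") as f:
        f.create_dataset("images", data=images[indices])
        grp = f.create_group("labels")
        for name, values in labels.items():
            subset = np.asarray(values)[indices]
            if subset.dtype.kind in ("U", "O"):
                # Text fields (dataset, fname) need a variable-length string type.
                grp.create_dataset(name, data=subset.astype(object), dtype=str_dtype)
            else:
                grp.create_dataset(name, data=subset)
```

With these helpers, calling `split_indices` once and then `write_subset` for each of the three parts would produce the train, validation and test files with the user-defined proportions.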
The new HDF5 files are much lighter than the "images folder + annotations" provided by MPII and LSP. Besides that, evaluation is faster now because the datasets contain the "ready-to-go" images, i.e. humans cropped from the image with the appropriate boxsize, instead of the original ones. Hopefully, this will also decrease the computational cost during training.
LSP headsize
As mentioned before, one of the attributes stored for each sample is its headsize. This parameter is required for computing the PCKh metric and it is already provided in the MPII annotations. However, LSP images do not have an associated headsize. To solve this issue, the heads present in the LSP images have been annotated by hand using the online annotation tool LabelMe.
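To make the role of headsize concrete, here is a minimal sketch of PCKh as it is commonly defined: a predicted joint counts as correct when its distance to the ground truth is at most a fraction (usually 0.5) of the headsize. The function name `pckh`, the array shapes and the absence of any visibility mask are assumptions of this example, not details of the evaluation code used in this project.

```python
import numpy as np


def pckh(pred, gt, headsize, alpha=0.5):
    """Fraction of joints whose error is within alpha * headsize.

    pred, gt: arrays of shape (N, J, 2) with (x, y) joint coordinates
    headsize: array of shape (N,) with one head size per sample
    """
    dist = np.linalg.norm(pred - gt, axis=-1)     # (N, J) pixel errors
    correct = dist <= alpha * headsize[:, None]   # per-sample threshold
    return float(correct.mean())
```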
After annotating every image, the resulting labels can be downloaded in .xml files. These files are then parsed and stored in a NumPy .npy file which is read later to build our datasets.
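The parsing step might look like the sketch below. It assumes the usual LabelMe XML layout (an `annotation` root with a `filename` element and `object` elements holding a `name` and a `polygon` of `pt` points) and that each head was annotated as an object named `head`; the object name, the function names and the structured-array layout of the .npy file are hypothetical choices made for this example.

```python
import math
import xml.etree.ElementTree as ET

import numpy as np


def head_diagonal(xml_text):
    """Return (file name, diagonal of the box enclosing the head polygon)."""
    root = ET.fromstring(xml_text)
    fname = root.findtext("filename").strip()
    xs, ys = [], []
    for obj in root.iter("object"):
        # "head" is an assumed label; skip any other annotated object.
        if obj.findtext("name", default="").strip() != "head":
            continue
        for pt in obj.iter("pt"):
            xs.append(float(pt.findtext("x")))
            ys.append(float(pt.findtext("y")))
    if not xs:
        raise ValueError(f"no head annotation found for {fname}")
    return fname, math.hypot(max(xs) - min(xs), max(ys) - min(ys))


def build_headsizes(xml_texts, out_path):
    """Parse several XML documents and save their headsizes as a .npy file."""
    records = [head_diagonal(text) for text in xml_texts]
    table = np.array(records, dtype=[("fname", "U64"), ("headsize", "f4")])
    np.save(out_path, table)
    return table
```

The saved table can be read back with `np.load` and matched against the LSP samples through the `fname` field when the datasets are built.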