Training semantic segmentation with simulated data
Simulated dataset
With our unstructured map built using properly categorized static meshes, we can now leverage CARLA’s semantic segmentation camera to create simulated datasets. For this purpose, we’ve utilized ROS bags. For example:
ros2 bag record -o gaia_sim_dataset_train /carla/hero/rgb_front/image /carla/hero/semantic_segmentation_front/image /segmentation/rgb/carla/hero/semantic_segmentation_front/image
To validate our complete pipeline, we’ve created a minimal dataset recorded at 20 fps, consisting of:
- Train: A jeep on autopilot traversing our unstructured map.
- Validation: The same jeep manually driven for a few seconds.
- Test: The same jeep on autopilot, traversing the map in the opposite direction.
We extracted all images and masks from the generated ROS bags using our rosbag_to_dataset.py script. After keeping only those timestamps with both RGB and semantic segmentation data available, we ended up with 1,886 samples for training, 405 for validation, and 1,827 for testing.
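The exact logic lives in rosbag_to_dataset.py; as a rough illustration of the pairing step, a minimal sketch could look like the following (topic names come from the recording command above, while the sqlite3 storage, the output layout, and matching on header stamps are assumptions, not necessarily what the script does):

```python
# Illustrative sketch: extract RGB / segmentation images from a ROS 2 bag and
# keep only the timestamps present on both topics. Assumes the default sqlite3
# storage and matches on header stamps; the real rosbag_to_dataset.py may differ.
import cv2
import rosbag2_py
from cv_bridge import CvBridge
from rclpy.serialization import deserialize_message
from sensor_msgs.msg import Image

RGB_TOPIC = "/carla/hero/rgb_front/image"
SEG_TOPIC = "/carla/hero/semantic_segmentation_front/image"


def extract_pairs(bag_uri: str, out_dir: str) -> int:
    reader = rosbag2_py.SequentialReader()
    reader.open(
        rosbag2_py.StorageOptions(uri=bag_uri, storage_id="sqlite3"),
        rosbag2_py.ConverterOptions(
            input_serialization_format="cdr",
            output_serialization_format="cdr",
        ),
    )
    bridge = CvBridge()
    rgb, seg = {}, {}  # header stamp (ns) -> decoded image

    while reader.has_next():
        topic, data, _ = reader.read_next()
        if topic not in (RGB_TOPIC, SEG_TOPIC):
            continue
        msg = deserialize_message(data, Image)
        stamp = msg.header.stamp.sec * 10**9 + msg.header.stamp.nanosec
        image = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        (rgb if topic == RGB_TOPIC else seg)[stamp] = image

    # Keep only samples where both RGB and semantic segmentation are available.
    common = sorted(rgb.keys() & seg.keys())
    for i, stamp in enumerate(common):
        cv2.imwrite(f"{out_dir}/images/{i:06d}.png", rgb[stamp])
        cv2.imwrite(f"{out_dir}/masks/{i:06d}.png", seg[stamp])
    return len(common)
```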
Fine-tuning ACF-Net
We adapted the Jupyter notebook previously built by Rebeca to fine-tune an ACF-Net with an EfficientNet backbone (pretrained on ImageNet) for 30 epochs. This configuration has performed best in the past when working with real datasets (more info). The complete code is available at Perception/3_SemanticSegmentation/CARLA/SEM_ACFNET_EFFNET_CARLA.ipynb.
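The notebook holds the full training code; the configuration can be summarised with a sketch along these lines, where ACFNet and CarlaSegmentationDataset are illustrative placeholders rather than the notebook's exact API:

```python
# Condensed fine-tuning sketch. ACFNet and CarlaSegmentationDataset are
# illustrative placeholders, not the notebook's exact classes.
import torch
from torch.utils.data import DataLoader

NUM_CLASSES = 23   # assumption: one output channel per CARLA semantic tag
EPOCHS = 30        # same budget that has worked best on the real datasets

# ACF-Net with an EfficientNet encoder pretrained on ImageNet (hypothetical constructor).
model = ACFNet(encoder="efficientnet-b0",
               encoder_weights="imagenet",
               num_classes=NUM_CLASSES).cuda()
train_loader = DataLoader(CarlaSegmentationDataset("train"),  # hypothetical dataset wrapper
                          batch_size=8, shuffle=True)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(EPOCHS):
    model.train()
    for images, masks in train_loader:
        optimizer.zero_grad()
        logits = model(images.cuda())                 # (B, NUM_CLASSES, H, W)
        loss = criterion(logits, masks.cuda().long())
        loss.backward()
        optimizer.step()
```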
Our newly trained model achieves an accuracy of 88% when evaluated against our test set, compared to 56% accuracy for the same model fine-tuned with Rellis3D. The following plots show a qualitative comparison between both approaches:
Validating our pipeline
While these results may not be particularly meaningful—given the similarity of objects and atmospheric conditions between our train and test datasets—they serve as a proof of concept, validating our entire pipeline for:
- Simulating unstructured outdoor environments
- Generating customized semantic segmentation ground truth data
- Building synthetic datasets using CARLA sensor data
- Training our own models using synthetic data
- Performing real-time inference over CARLA sensors
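For the last item, a minimal inference node could look roughly like the sketch below; the model export, the output topic name, and the preprocessing are illustrative placeholders rather than the exact code behind the demo:

```python
# Minimal sketch of a real-time inference node over the CARLA RGB stream.
# The TorchScript export, output topic, and preprocessing are placeholders.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
import torch


class SegmentationNode(Node):
    def __init__(self):
        super().__init__("carla_segmentation_inference")
        self.bridge = CvBridge()
        self.model = torch.jit.load("acfnet_effnet_carla.pt").cuda().eval()  # hypothetical export
        self.sub = self.create_subscription(
            Image, "/carla/hero/rgb_front/image", self.on_image, 1)
        self.pub = self.create_publisher(
            Image, "/segmentation/estimation/image", 1)  # hypothetical output topic

    def on_image(self, msg: Image):
        frame = self.bridge.imgmsg_to_cv2(msg, "bgr8")
        tensor = (torch.from_numpy(frame).permute(2, 0, 1).float()
                  .unsqueeze(0).cuda() / 255.0)        # placeholder preprocessing
        with torch.no_grad():
            pred = self.model(tensor).argmax(dim=1).squeeze(0).byte().cpu().numpy()
        out = self.bridge.cv2_to_imgmsg(pred, encoding="mono8")
        out.header = msg.header  # keep the original stamp so rviz overlays stay in sync
        self.pub.publish(out)


def main():
    rclpy.init()
    rclpy.spin(SegmentationNode())


if __name__ == "__main__":
    main()
```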
For this final step, the following video shows an rviz dashboard featuring a third-person view, LiDAR, the ego RGB image, and both the ego semantic segmentation ground truth and our model's estimation.