Significant Driving Improvements: Hybrid Dataset, Cropping and Balancing Strategies
May 29, 2026
Following the promising cropping results of Week 31, this week we designed a comprehensive data collection and training pipeline targeting three core objectives: zero oscillations, lane departure avoidance, and robust turning & lane keeping. A systematic construction of 15 datasets (balancing + cropping variants) and parallel training of 15 PilotNet models reveal that oversampling combined with top 35% cropping yields stable, oscillation‑free driving with excellent turn execution.
0. Work Plan & Objectives
🎯 Target driving characteristics:
- No oscillations
- No lane departures
- Correct turns and lane keeping
Experimental workflow:
- Expand dataset with images from Town 01, Town 02, Town 04, Town 12.
- Define target composition: 50% turns, 15% lane recoveries, 35% straight driving.
- Apply augmentations: vertical cropping (top 35% / top 50%), horizontal reflection, horizontal shifting.
- Generate balanced dataset variants (oversampling and class weighting).
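As a rough illustration of the cropping and reflection steps listed above, the following minimal sketch works on an image stored as a list of pixel rows. The function names are hypothetical. It assumes that steering is symmetric around zero, so a mirrored frame takes the negated steering value. Horizontal shifting is left out because the steering correction applied per shifted pixel is not stated in this post.

```python
def crop_top(image, fraction):
    """Drop the top `fraction` of rows (0.35 removes the top 35%)."""
    first_kept_row = round(len(image) * fraction)
    return image[first_kept_row:]


def mirror(image, steering):
    """Reflect the image left-right and negate the steering value.

    Assumes steering is symmetric around 0 (left negative, right positive).
    """
    flipped = [row[::-1] for row in image]
    return flipped, -steering


# Toy 20x4 "image": each pixel encodes its row and column.
image = [[row * 10 + col for col in range(4)] for row in range(20)]

crop35 = crop_top(image, 0.35)   # keeps the bottom 13 of 20 rows
crop50 = crop_top(image, 0.50)   # keeps the bottom 10 of 20 rows
flipped, steering = mirror(image, 0.25)
```

Because slicing works the same way on NumPy arrays, `crop_top` can be applied unchanged to an array of shape (height, width, channels).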
1. Data Acquisition & Composition
Recordings were performed on full routes of Town 01 using on‑demand sampling (G29 wheel buttons). The following driving logs were collected:
- 🔹 Bubble forward driving: 16,209 samples
- 🔸 Bubble right turns: 3,205 samples
- 🔹 Bubble left turns: 633 samples (augmented via mirroring: +527)
- 🔸 DAgger “drunk” strong (intensity 0.6): 8,963 samples
- 🔹 DAgger medium (intensity 0.45): 16,482 samples
- 🔸 DAgger weak (intensity 0.3): 11,760 samples
- 🔹 Recoveries from right lane divider: 12,483 samples
- 🔸 Recoveries from left lane: 8,657 samples
- 🔹 Recoveries from centre line: 9,241 samples
From the total pool, a stratified random selection was used to build a dataset with 7,443 samples following the target composition: 50% turns (3,721), 15% recoveries (1,116), 35% straight (2,606).
This composition ensures the network sees balanced steering commands and explicit corrective demonstrations.
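The target counts can be reproduced with integer arithmetic. The sketch below is one possible allocation, not necessarily the script used here: it floors the turn and recovery shares and gives the remainder to straight driving, which matches the reported 3,721 / 1,116 / 2,606 split. The pool layout (a dict of category name to sample list) and the function names are assumptions, and the post does not state how each driving log maps to a category.

```python
import random

# Percent shares; straight driving receives whatever is left (35%).
TARGET_PERCENT = {"turn": 50, "recovery": 15}


def allocate(total):
    """Return per-category sample counts that sum exactly to `total`."""
    counts = {name: total * pct // 100 for name, pct in TARGET_PERCENT.items()}
    counts["straight"] = total - sum(counts.values())
    return counts


def stratified_sample(pool, total, seed=0):
    """Draw `total` samples without replacement, category by category."""
    rng = random.Random(seed)
    selection = []
    for name, count in allocate(total).items():
        if count > len(pool[name]):
            raise ValueError(f"not enough '{name}' samples: need {count}")
        selection.extend(rng.sample(pool[name], count))
    rng.shuffle(selection)
    return selection


# Toy pool with hypothetical sample identifiers.
pool = {
    "turn": [("turn", i) for i in range(80)],
    "recovery": [("recovery", i) for i in range(40)],
    "straight": [("straight", i) for i in range(200)],
}
subset = stratified_sample(pool, total=100)
```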
2. Dataset Variants – Balancing & Cropping
Fifteen different dataset configurations were generated by combining the following factors:
- Balancing method: none (raw), oversampling of minority steering bins, or class‑weighted loss.
- Cropping: none, top 35% crop, top 50% crop.
Naming convention:
- Dataset 00 : raw distribution, no cropping.
- 01–02 : raw + crop 35% / crop 50%.
- 03–05 : oversampled (no crop, crop 35%, crop 50%); the class‑weighted counterparts carry the suffix ‑weighted (03‑weighted, 04‑weighted, 05‑weighted).
- 06–08 : additional oversampled/weighted with crops (full factorial design).
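The two balancing methods can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used for these datasets: steering is assumed to lie in [-1, 1], the bin count of 7 is arbitrary, samples are hypothetical (image_id, steering) pairs, and the weights are plain inverse bin frequencies normalised to an average of 1.

```python
import random
from collections import Counter


def steering_bin(steering, n_bins=7):
    """Map a steering value in [-1, 1] to a bin index in 0..n_bins-1."""
    clipped = max(-1.0, min(1.0, steering))
    return min(int((clipped + 1.0) / 2.0 * n_bins), n_bins - 1)


def oversample(samples, n_bins=7, seed=0):
    """Duplicate samples of minority bins until each non-empty bin
    has as many samples as the largest bin."""
    rng = random.Random(seed)
    bins = {}
    for sample in samples:
        bins.setdefault(steering_bin(sample[1], n_bins), []).append(sample)
    target = max(len(members) for members in bins.values())
    balanced = []
    for members in bins.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced


def bin_weights(samples, n_bins=7):
    """Inverse-frequency loss weight per bin (average sample weight is 1)."""
    counts = Counter(steering_bin(s[1], n_bins) for s in samples)
    total = len(samples)
    return {b: total / (len(counts) * c) for b, c in counts.items()}


# Toy data: mostly straight driving, few turns.
samples = [("img", 0.0)] * 8 + [("img", 0.8)] * 2 + [("img", -0.8)]
balanced = oversample(samples)
weights = bin_weights(samples)
```

Oversampling changes the dataset itself, while the weights leave the data untouched and would instead scale each sample's contribution to the loss.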
All models share the same PilotNet architecture (Table 1, Appendix). Training: 20 epochs, batch size 64, Adam, learning rate 1e‑4.
3. Quantitative & Qualitative Results
The table below summarises the driving performance for the most representative dataset variants (a condensed view of the full 15‑model evaluation). Dataset 04 (oversampled + crop35) is the best‑performing configuration.
| Dataset (balancing + crop) | Oscillations | Lane departures | Lane keeping | Centre line driving | Right turns | Left turns | Off‑road / critical failure | Video link |
|---|---|---|---|---|---|---|---|---|
| 00 (raw, no crop) | Fails | Fails | Fails | OK | OK | OK | Turn failures | watch |
| 01 (raw + crop35) | OK | OK | Medium | OK | OK | OK | OK | watch |
| 02 (raw + crop50) | OK | OK | Medium | OK | OK | OK | Head‑on failure (wide spaces) | watch |
| 03 (oversampled, no crop) | Fails | Fails | Fails | Fails | Fails | Fails | Head‑on failure | watch |
| 03‑weighted (weighted loss, no crop) | Fails | Fails | Fails | Fails | Fails | Fails | Head‑on failure | watch |
| 04 (oversampled + crop35) | OK | OK | Medium | OK | OK | OK | OK | watch |
| 04‑weighted (weighted + crop35) | OK | OK | Medium | OK | OK | OK | Fails on turns | pending |
| 05 (oversampled + crop50) | Fails | Fails | OK | Fails | Fails | Fails on left turns | Fails | watch |
| 05‑weighted (weighted + crop50) | Fails | Fails | Fails | OK | Fails | Fails | Turn failures | watch |
| 06‑08 (oversampled/weighted with crop35 or crop50) | OK / Medium | OK | OK/Medium | OK | OK | OK | OK (some left‑turn issues) | multiple pending |
🔍 Main observations:
- Cropping alone (35%) dramatically reduces oscillations and lane departures even without balancing. However, the best overall driving quality is achieved when oversampling is combined with top 35% cropping (Dataset 04).
- Crop 50% introduces a “tunnel vision” effect: in wide open spaces the vehicle loses global context, leading to head‑on failures.
- Oversampling without cropping (Dataset 03) worsens performance because it amplifies the original bias without removing distracting sky/horizon features.
- Left turns remain the weakest point: even in the best models, the vehicle occasionally takes the wrong lane when turning left. This is attributed to an insufficient number of raw left‑turn examples (mitigated by mirroring, but not fully resolved).
Conclusion: Combining oversampling, top 35% cropping, and the balanced turn/recovery composition (Dataset 04) comes close to the target driving behaviour on Town 01: no oscillations, rare lane departures, correct turns, and stable lane keeping (rated Medium in the table above).
4. Interpretation – The Synergy between Cropping and Oversampling
🧠 Why does crop 35% work so well? Removing the top 35% of the image eliminates the sky, horizon, and distant trees – features that are invariant to steering. The network is forced to focus on the road surface and near‑field lane markings. This inductive bias reduces variance in the steering output and prevents overfitting to background scenery.
Oversampling corrects the imbalance between straight and turn samples. Applied together with cropping, it lets the model learn smooth turning commands without the distracting visual noise. The result is a stable end‑to‑end policy on Town 01; whether it carries over to other towns is still to be evaluated.
The failure of 50% cropping indicates that removing half of the image discards too much: the far‑field context needed for forward planning is lost, especially in wide intersections or on highways.
5. Next Steps – Week 33 Plan
🚀 Refinements based on Week 32 results:
- Increase left‑turn examples – collect additional real left turns (instead of relying solely on mirroring) to eliminate the occasional “wrong lane take” during left manoeuvres.
- Scale to multiple towns – evaluate the best model (oversampled + crop35) on Town 02, Town 04, and Town 12, then perform final validation on completely unseen Town 07.
- Quantitative metric upgrade – replace subjective oscillation scoring with time outside right lane (TORL) and steering smoothness (integral of jerk).
- Generalisation test – run the best model in dynamic traffic scenarios (Town 05) to measure robustness.
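The two proposed metrics are not defined precisely in this post, so the following is only one possible reading. TORL is taken as the summed duration of logged intervals that start with the vehicle outside the right lane. Steering "jerk" is taken as the third finite-difference derivative of the steering signal (treating steering like a position), integrated as an absolute value over time. The function names, the uniform time step `dt`, and the per-frame `inside_lane` flags are all assumptions.

```python
def time_outside_right_lane(timestamps, inside_lane):
    """Sum the duration of intervals that begin with the vehicle outside the lane."""
    total = 0.0
    for i in range(len(timestamps) - 1):
        if not inside_lane[i]:
            total += timestamps[i + 1] - timestamps[i]
    return total


def finite_difference(values, dt):
    """First-order forward difference of a uniformly sampled signal."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def steering_jerk_integral(steering, dt, order=3):
    """Integrate the absolute `order`-th derivative of steering over time.

    Lower values indicate a smoother steering signal.
    """
    signal = list(steering)
    for _ in range(order):
        signal = finite_difference(signal, dt)
    return sum(abs(v) for v in signal) * dt


timestamps = [0.0, 0.5, 1.0, 1.5, 2.0]
inside_lane = [True, False, False, True, True]
torl = time_outside_right_lane(timestamps, inside_lane)

constant = [0.1] * 10                  # no steering change at all
ramp = [0.02 * i for i in range(6)]    # smooth, steady turn-in
zigzag = [0.0, 0.2, 0.0, 0.2, 0.0, 0.2]  # oscillating steering
```

With this definition a constant or steadily ramping steering signal scores near zero, while an oscillating one scores high, which is the behaviour the subjective oscillation rating is meant to capture.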
We expect that after augmenting left‑turn data, the model will achieve near‑human performance with zero departures on all test routes.
All code, trained models and datasets will be released on Hugging Face after final validation. A full technical report is being prepared for the lab’s internal repository.
6. References & Team Notes
- [1] Rodríguez, J. (2025): Semana 10 – Data augmentation with cropping. https://roboticslaburjc.github.io/2025-tfg-jorge-rodriguez/semana10/
- [2] Bojarski, M., et al. (2016): End to end learning for self‑driving cars (PilotNet). arXiv:1604.07316
- [3] Ross, S., et al. (2011): DAgger: A reduction of imitation learning to no‑regret online learning. AISTATS 2011
Special thanks to Jorge Rodríguez for sharing his dataset pipeline and to the Robotics Lab team for continuous feedback. The systematic dataset construction and the 15‑model training campaign were made possible by the lab’s computing cluster.
— Armando Mateus, Robotics Lab URJC
📌 WEEK 32 SUMMARY – MAY 29, 2026
📊 Built a balanced dataset (50% turns, 15% recoveries, 35% straight) from 9 different driving logs (total raw pool >85k images).
✂️ Top 35% cropping + oversampling yields oscillation‑free driving, correct turns, and only rare lane departures (best model: Dataset 04).
⚠️ Crop 50% leads to “tunnel vision” failures in wide spaces; oversampling alone (without cropping) degrades performance.
🔜 Week 33: augment left‑turn examples, evaluate on Town 02/04/12/07, and introduce advanced smoothness metrics.
Appendix – PilotNet Architecture (identical for all 15 models)
Model: "sequential"
| Layer (type) | Output shape | Param # |
|---|---|---|
| conv2d (Conv2D) | (None, 58, 78, 24) | 1,824 |
| batch_normalization | (None, 58, 78, 24) | 96 |
| activation | (None, 58, 78, 24) | 0 |
| dropout | (None, 58, 78, 24) | 0 |
| conv2d_1 | (None, 27, 37, 36) | 21,636 |
| batch_normalization_1 | (None, 27, 37, 36) | 144 |
| activation_1 | (None, 27, 37, 36) | 0 |
| dropout_1 | (None, 27, 37, 36) | 0 |
| conv2d_2 | (None, 12, 17, 48) | 43,248 |
| batch_normalization_2 | (None, 12, 17, 48) | 192 |
| activation_2 | (None, 12, 17, 48) | 0 |
| dropout_2 | (None, 12, 17, 48) | 0 |
| conv2d_3 | (None, 10, 15, 64) | 27,712 |
| batch_normalization_3 | (None, 10, 15, 64) | 256 |
| activation_3 | (None, 10, 15, 64) | 0 |
| conv2d_4 | (None, 8, 13, 64) | 36,928 |
| batch_normalization_4 | (None, 8, 13, 64) | 256 |
| activation_4 | (None, 8, 13, 64) | 0 |
| flatten | (None, 6656) | 0 |
| dense | (None, 100) | 665,700 |
| batch_normalization_5 | (None, 100) | 400 |
| activation_5 | (None, 100) | 0 |
| dropout_3 | (None, 100) | 0 |
| dense_1 | (None, 50) | 5,050 |
| batch_normalization_6 | (None, 50) | 200 |
| activation_6 | (None, 50) | 0 |
| dropout_4 | (None, 50) | 0 |
| dense_2 | (None, 10) | 510 |
| batch_normalization_7 | (None, 10) | 40 |
| activation_7 | (None, 10) | 0 |
| dense_3 | (None, 1) | 11 |
Total params: 804,203 (3.07 MB)
Trainable params: 803,411 (3.06 MB)
Non‑trainable params: 792 (3.09 KB)
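All models were trained with Adam (learning rate 1e‑4, batch size 64, 20 epochs). The input resolution is stated as 66×200 (after cropping), although the output shapes listed above correspond to an input of roughly 120×160 (height × width).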
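The layer shapes and parameter counts in the summary can be checked with the standard size formula for unpadded convolutions. The sketch below assumes the PilotNet layout from [2]: three 5×5 convolutions with stride 2, then two 3×3 convolutions with stride 1, all without padding. The kernel sizes agree with the parameter counts for a 3-channel input, but the strides and padding are an assumption, since the summary lists only output shapes.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (size - kernel) // stride + 1


# (kernel, stride, filters) per convolution; assumed PilotNet layout.
CONV_LAYERS = [(5, 2, 24), (5, 2, 36), (5, 2, 48), (3, 1, 64), (3, 1, 64)]


def trace_shapes(height, width):
    """Return the (height, width, filters) output shape of every convolution."""
    shapes = []
    for kernel, stride, filters in CONV_LAYERS:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        shapes.append((height, width, filters))
    return shapes


def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a 2D convolution layer."""
    return kernel * kernel * in_channels * filters + filters


def flatten_size(shape):
    height, width, filters = shape
    return height * width * filters


shapes_120x160 = trace_shapes(120, 160)
shapes_66x200 = trace_shapes(66, 200)
```

Under these assumptions a 120×160 input reproduces every convolutional shape in the summary, including the 6,656‑unit flatten layer, whereas a 66×200 input would flatten to 1,152 units.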