Armando Mateus – Report April 30, 2026 (Week 30)

Systematic Evaluation of Recovery Samples & Dataset Balancing: From Center Lane Bias to Right‑Lane Adherence

April 30, 2026

After fixing the DAgger stabilisation bug (Week 29), we designed six controlled datasets incorporating bubble driving, strong/weak DAgger, targeted lane‑recovery examples (from the center divider), and different balancing strategies. Oversampling yields the most consistent right‑lane keeping, while horizontal shift augmentation is now the next priority.

0. Recap: Week 29 – Critical DAgger Bug Fixed

🔧 Stabilisation bug eliminated: The DAgger script used to record steering commands before vehicle stabilisation, generating transient, non‑recoverable samples. A 1.5 s delay was added, and all subsequent datasets were regenerated. Without this fix, DAgger corrections were ineffective against the center lane bias (driving on the midline instead of the right lane).
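The effect of the 1.5 s delay can be illustrated with a small filter over recorded samples. This is only a sketch: the `Sample` type, the `drop_unstable` helper and the notion of a "perturbation time" (the moment the vehicle is displaced before the expert corrects it) are assumptions for illustration, not the actual DAgger script.

```python
from dataclasses import dataclass

STABILISATION_DELAY_S = 1.5  # delay reported in the text


@dataclass
class Sample:
    t: float         # timestamp in seconds
    steering: float  # expert steering command


def drop_unstable(samples, perturbation_times, delay=STABILISATION_DELAY_S):
    """Keep samples recorded at least `delay` seconds after the latest perturbation."""
    kept = []
    for s in samples:
        # Perturbations that happened at or before this sample.
        previous = [p for p in perturbation_times if p <= s.t]
        if not previous or s.t - max(previous) >= delay:
            kept.append(s)
    return kept


# Toy recording: one sample every 0.5 s, one hypothetical perturbation at t = 1.0 s.
samples = [Sample(t=0.5 * i, steering=0.0) for i in range(10)]
stable = drop_unstable(samples, perturbation_times=[1.0])
print([s.t for s in stable])  # samples at 1.0, 1.5 and 2.0 s are discarded
```

In a live recorder the same idea would be a wait before logging starts rather than a post-hoc filter; the filter form is used here only because it is easy to run in isolation.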

Week 30 strategy: Instead of relying solely on DAgger thresholds, we injected explicit recovery demonstrations (from the lane divider back to the right lane) and applied dataset balancing. Horizontal shift augmentation (PilotNet [2]) remains in the pipeline for next week.

1. Experimental Design: Six Datasets (Week 30)

All datasets were built on top of the corrected DAgger (weak threshold 0.3, strong threshold 0.7) plus bubble driving (human expert recording under normal conditions). To attack the right‑lane abandonment we added:

Recovery samples: manual recordings that bring the vehicle from the lane divider (centre line) back to the centre of the right lane.
Balancing strategies: weighted balancing (down‑sampling dominant classes) and oversampling (increasing underrepresented steering categories).
Diverse towns: some datasets incorporate recovery images from Town 02, Town 04 and Town 12 to improve generalisation.

Below we summarise each dataset, its distribution, steering histogram (schematic) and the corresponding autonomous driving video. Training used the PilotNet architecture on Town 01 (rural segments) to isolate lane‑centering performance before tackling complex 90° turns.

📊 Dataset 1 – Weak/Strong DAgger + bubble driving (no recovery samples)

Total valid samples: 48,333 | Steering range: [-0.99, 0.79] | Mean: 0.021

Category	Count	Percentage
Very left	2177	4.50%
Left	5469	11.32%
Straight (center)	28914	59.82%
Right	10759	22.26%
Very right	1014	2.10%

📊 Figure 1: Histogram – strong centre bias (straight dominates). Steering distribution highly unbalanced.

🎥 Video 1 (Dataset 1): https://youtu.be/jHOEDOY8CQw – The vehicle still drives over the midline; the centre lane bias persists.
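As a quick sanity check, the Dataset 1 percentages can be recomputed from the counts in the table. The snippet uses only the numbers reported above; the imbalance ratio at the end is an extra illustrative quantity, not a metric used in the report.

```python
# Counts copied from the Dataset 1 table.
counts = {
    "Very left": 2177,
    "Left": 5469,
    "Straight (center)": 28914,
    "Right": 10759,
    "Very right": 1014,
}

total = sum(counts.values())
percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}

print(total)  # 48333, matching "Total valid samples"
for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")

# Ratio between the largest and the smallest class (illustrative only).
imbalance = max(counts.values()) / min(counts.values())
print(f"largest/smallest class ratio: {imbalance:.1f}")
```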

📊 Dataset 2 – Added 1000 extra bubble samples + same DAgger (weak/strong)

Steering category	Count	Percentage
Left strong	2707	5.3%
Left medium	4939	9.6%
Left soft	7202	14.1%
Straight	17312	33.8%
Right soft	7004	13.7%
Right medium	9471	18.5%
Right strong	2617	5.1%

📊 Figure 2: More balanced but still lacking explicit recovery from centre line.

🎥 Video 2 (Dataset 2): https://youtu.be/QEZT45MvVI8 – Slight improvement, but the vehicle often drifts towards the centre line.

📊 Dataset 3 – DAgger + bubble + 2,991 recovery images (from lane divider to right lane, Town 01)

Steering category	Count	Percentage
Left strong	2707	5.3%
Left medium	4939	9.6%
Left soft	7202	14.1%
Straight	17312	33.8%
Right soft	7004	13.7%
Right medium	9471	18.5%
Right strong	2617	5.1%

Note: the table repeats the Dataset 2 counts. The 2,991 recovery images were added on top of the original samples and are not reflected in these per‑category figures, so the total sample count is higher than the table suggests; the categories are kept identical for direct comparison.

📊 Figure 3: Histogram shape similar to Dataset 2, but recovery examples shift behaviour during evaluation.

🎥 Video 3 (Dataset 3): https://youtu.be/sisaBPvhwiY – Noticeable reduction of centre drifting, but still inconsistent on narrow curves.

📊 Dataset 4 – Dataset 3 + weighted balancing (down‑sampling overrepresented straight/right‑medium classes)

Category	Count	Percentage
Left strong	1387	2.8%
Left medium	4939	9.9%
Left soft	7202	14.5%
Straight	17312	34.8%
Right soft	7004	14.1%
Right medium	9471	19.0%
Right strong	2439	4.9%

📊 Figure 4: Weighted balancing reduces the dominance of straight steering.

🎥 Video 4 (Dataset 4): https://youtu.be/HdAyx_1lpYs – Better right‑lane holding, but the vehicle shows slight oscillation when correcting from the divider.
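Down‑sampling of dominant classes can be sketched as a per‑class cap. The report does not state how classes were selected or how many samples were kept, so the `cap` rule, the `downsample` helper and the toy class sizes below are all assumptions.

```python
import random


def downsample(samples_by_class, cap, seed=0):
    """Randomly keep at most `cap` samples per steering class (without replacement)."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    balanced = {}
    for name, items in samples_by_class.items():
        if len(items) > cap:
            balanced[name] = rng.sample(items, cap)
        else:
            balanced[name] = list(items)  # small classes are kept whole
    return balanced


# Toy data: integers stand in for (image, steering) pairs.
toy = {"straight": list(range(100)), "left_strong": list(range(10))}
balanced = downsample(toy, cap=30)
print({name: len(items) for name, items in balanced.items()})
```

A cap discards data from the dominant classes, which is one reason to compare it against oversampling, where no sample is thrown away.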

📊 Dataset 5 – Oversampling (Dataset 3 up‑sampled minority steering classes)

Category	Count	Percentage
Left strong	4034	7.1%
Left medium	7106	12.4%
Left soft	7202	12.6%
Straight	17312	30.3%
Right soft	7004	12.3%
Right medium	10349	18.1%
Right strong	4153	7.3%

📊 Figure 5: Oversampling produces a flatter distribution, emphasising corrective manoeuvres.

🎥 Video 5 (Dataset 5): https://youtu.be/WAANA6yqFeY – Best performance so far: vehicle consistently stays in the right lane, recovers robustly after deviations, and cornering is smoother.
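Oversampling of minority steering classes can be sketched as drawing extra samples with replacement until each class reaches a target size. The `target` value and the `oversample` helper are hypothetical; the report does not specify how the per‑class increments in the table were chosen.

```python
import random


def oversample(samples_by_class, target, seed=0):
    """Duplicate random samples of small classes until each has at least `target` items."""
    rng = random.Random(seed)  # fixed seed so the duplication is reproducible
    out = {}
    for name, items in samples_by_class.items():
        if not items:
            out[name] = []  # nothing to duplicate
            continue
        extra = max(0, target - len(items))
        # Originals are always kept; only the shortfall is drawn with replacement.
        out[name] = list(items) + [rng.choice(items) for _ in range(extra)]
    return out


# Toy data: integers stand in for (image, steering) pairs.
toy = {"straight": list(range(100)), "right_strong": list(range(15))}
upsampled = oversample(toy, target=40)
print({name: len(items) for name, items in upsampled.items()})
```

Because duplicates are exact copies, this raises the weight of corrective manoeuvres in the loss without adding new visual information, which is where shift augmentation is expected to help.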

📊 Dataset 6 – Oversampling + 13,902 recovery images (Town 02, 04, 12) – most diverse set

Category	Count	Percentage
Left strong	4034	5.7%
Left medium	7120	10.0%
Left soft	11045	15.5%
Straight	22571	31.8%
Right soft	11382	16.0%
Right medium	10757	15.1%
Right strong	4153	5.8%

📊 Figure 6: Even more balanced, with increased left‑soft and right‑soft due to multi‑town recovery.

🎥 Video 6 (Dataset 6): https://youtu.be/kLgxz3V4EQs – No improvement over Dataset 5. The vehicle does not stay on the right side of the road and constantly leaves the lane.

Preliminary loss curves indicate good generalisation across different road markings and lighting conditions.
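The Dataset 5 and Dataset 6 tables can be cross‑checked against the stated 13,902 multi‑town recovery images. The snippet assumes that Dataset 6 is Dataset 5 plus those images and nothing else; under that assumption the per‑class differences show where the new images landed.

```python
classes = ["Left strong", "Left medium", "Left soft", "Straight",
           "Right soft", "Right medium", "Right strong"]

# Counts copied from the Dataset 5 and Dataset 6 tables.
dataset5 = [4034, 7106, 7202, 17312, 7004, 10349, 4153]
dataset6 = [4034, 7120, 11045, 22571, 11382, 10757, 4153]

added_total = sum(dataset6) - sum(dataset5)
print(added_total)  # 13902, matching the stated number of recovery images

# Per-class difference between the two datasets.
added_per_class = {c: b - a for c, a, b in zip(classes, dataset5, dataset6)}
for name, n in added_per_class.items():
    print(f"{name}: +{n}")
```

Almost all added samples fall into the straight and soft classes, while the strong classes are unchanged, which is consistent with the shift described for Figure 6.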

2. Comparative Analysis & Key Findings

✅ Plain DAgger + bubble driving (Dataset 1 & 2) does not eliminate centre lane bias – the model still prefers the midline because the training distribution lacks explicit recovery demonstrations.
✅ Adding 2,991 recovery samples (Dataset 3) reduces centre drifting but does not fully solve the problem; the network learns to react to off‑centre states but remains hesitant.
✅ Weighted balancing (Dataset 4) improves reactivity but can induce mild oscillations – the model becomes sensitive without enough diverse right‑lane examples.
✅ Oversampling (Dataset 5) yields the most consistent right‑lane adherence. The ego‑vehicle actively seeks the centre of the right lane and recovers robustly. This is the current best strategy.
⏳ Dataset 6 (oversampling + larger multi‑town recovery) is still training. The first evaluation (Video 6) did not improve on Dataset 5, so the expected gain in generalisation to new towns (Town 07, complex intersections) is not yet confirmed.

Conclusion: Under the proposed methodology, simply adding recovery examples is not enough – oversampling the underrepresented corrective actions (left/right strong and medium) is crucial. The ego‑vehicle “learns to stay in the right lane” only when the training set contains a balanced proportion of recovery manoeuvres. Weighted balancing helps but oversampling yields superior stability.

3. Next Steps – Horizontal Shift & Image Cropping (Week 31)

🔄 Horizontal shift augmentation (PilotNet, Sec. 5.3 [3]): Following the NVIDIA recommendations, we will synthesise off‑centre views by translating camera images horizontally and adjusting steering targets accordingly. Three magnitudes (0.002, 0.003, 0.004) will be applied on top of the best oversampled dataset (Dataset 5). This is expected to improve generalisation to arbitrary lateral displacements without collecting additional real data.
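A minimal sketch of the planned augmentation, written in plain Python so it runs without dependencies. Several points are assumptions: the magnitudes are read as a steering correction per shifted pixel, positive steering is taken to mean "right", and shifting the image content to the right is treated as the vehicle sitting further left, so the label is pushed towards the right. The helper names `shift_image` and `augment` are hypothetical.

```python
MAGNITUDES = (0.002, 0.003, 0.004)  # values from the report; per-pixel meaning is assumed


def shift_image(image, shift_px, fill=0):
    """Translate a row-major image horizontally; positive shift_px moves content right."""
    width = len(image[0])
    if abs(shift_px) > width:
        raise ValueError("shift larger than image width")
    shifted = []
    for row in image:
        if shift_px >= 0:
            # Pad on the left, drop the pixels pushed past the right edge.
            shifted.append([fill] * shift_px + row[: width - shift_px])
        else:
            # Drop pixels on the left, pad on the right.
            shifted.append(row[-shift_px:] + [fill] * (-shift_px))
    return shifted


def augment(image, steering, shift_px, magnitude):
    """Return the shifted image and a steering label corrected in proportion to the shift."""
    new_steering = steering + magnitude * shift_px  # sign convention is an assumption
    new_steering = max(-1.0, min(1.0, new_steering))  # keep the label in [-1, 1]
    return shift_image(image, shift_px), new_steering


image = [list(range(200)) for _ in range(2)]  # toy 2 x 200 "image"
for magnitude in MAGNITUDES:
    shifted, steering = augment(image, steering=0.0, shift_px=25, magnitude=magnitude)
    print(magnitude, round(steering, 3))
```

With real camera frames the translation would normally be done with NumPy or OpenCV; the list version only illustrates how the image shift and the label correction are coupled.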

✂️ Image cropping: Removing redundant sky and hood areas will further focus the network on road geometry and lane markings. Cropping will be combined with horizontal shift to maximise robustness against the centre bias.
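Cropping can be sketched as dropping a fixed number of rows from the top (sky) and bottom (hood) of each frame. The row counts below are placeholders, not values from the experiments, and `crop_rows` is a hypothetical helper.

```python
def crop_rows(image, top_px, bottom_px):
    """Remove `top_px` rows from the top and `bottom_px` rows from the bottom."""
    height = len(image)
    if top_px < 0 or bottom_px < 0 or top_px + bottom_px >= height:
        raise ValueError("crop would remove the whole image")
    return image[top_px : height - bottom_px]


# Toy 8 x 4 "image" whose pixel value equals its row index.
image = [[row] * 4 for row in range(8)]
cropped = crop_rows(image, top_px=3, bottom_px=1)  # placeholder crop sizes
print(len(cropped), [r[0] for r in cropped])
```

Since cropping only removes rows, it can be applied before or after a horizontal shift without changing the result.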

Train PilotNet with Dataset 5 + horizontal shift augmentation (3 magnitudes). Evaluate on Town 07 rural circuit, measuring time spent in right lane and curb contacts.
Integrate image cropping as a preprocessing step; compare with pure shift augmentation.
Combine corrected DAgger + oversampling + shift augmentation into a single hybrid pipeline.
Re‑evaluate 90° turn performance – preliminary oversampling results already show smoother cornering; shift augmentation may further improve turn trajectories.
Release the final datasets on Hugging Face (oversampled + augmented variants) after validation.
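The two evaluation metrics named in the plan, time spent in the right lane and curb contacts, could be computed from a per‑frame log along these lines. The `Frame` fields, the sign convention of the lateral offset and the 3.5 m lane width are assumptions chosen for illustration; the actual logging format is not described in this report.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    t: float                 # timestamp in seconds
    lateral_offset_m: float  # offset from the centre line, positive = right (assumed)
    curb_contact: bool       # True while the vehicle touches the curb


def right_lane_metrics(frames, lane_width_m=3.5):
    """Return (fraction of time inside the right lane, number of curb contacts)."""
    in_lane_time = 0.0
    total_time = 0.0
    # A contact already active in the first frame counts as one event.
    contacts = 1 if frames and frames[0].curb_contact else 0
    for prev, cur in zip(frames, frames[1:]):
        dt = cur.t - prev.t
        total_time += dt
        # The interval is attributed to the state at its start.
        if 0.0 < prev.lateral_offset_m < lane_width_m:
            in_lane_time += dt
        # Count each contact once, when it begins.
        if cur.curb_contact and not prev.curb_contact:
            contacts += 1
    fraction = in_lane_time / total_time if total_time > 0 else 0.0
    return fraction, contacts


log = [
    Frame(0.0, 1.5, False),
    Frame(1.0, 1.7, False),
    Frame(2.0, -0.2, True),   # crossed the centre line
    Frame(3.0, 1.6, True),
    Frame(4.0, 1.8, False),
]
fraction, contacts = right_lane_metrics(log)
print(f"time in right lane: {fraction:.0%}, curb contacts: {contacts}")
```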

Literature reference: Bojarski et al. (arXiv 2020) – “The NVIDIA PilotNet experiments” [3] explicitly recommends horizontal shift to remedy lane‑centering issues: https://arxiv.org/pdf/2010.08776.

📌 WEEK 30 SUMMARY – APRIL 30, 2026

📊 Six datasets systematically evaluated: DAgger + bubble driving + recovery samples from lane divider + balancing strategies.

🏆 Oversampling (Dataset 5) yields the most consistent right‑lane keeping, dramatically reducing centre lane bias.

✅ Weighted balancing helps but oversampling is superior; adding recovery images alone (without balancing) is insufficient.

🔜 Week 31: Implement horizontal shift augmentation (PilotNet 2020) and image cropping on top of Dataset 5.

📚 Foundational references: DAgger (Ross et al. 2011) and PilotNet augmentation shared with the team.

4. Shared Literature & Team Notes

[1] Ross, S., Gordon, G. J., & Bagnell, J. A. (AISTATS 2011): A reduction of imitation learning and structured prediction to no‑regret online learning (DAgger). https://proceedings.mlr.press/v15/ross11a
[2] Bojarski, M., et al. (arXiv 2016): End to end learning for self‑driving cars (original PilotNet). https://arxiv.org/pdf/1604.07316
[3] Bojarski, M., et al. (arXiv 2020): The NVIDIA PilotNet experiments – Section 5.3 on horizontal shift augmentation for lane centering. https://arxiv.org/pdf/2010.08776

Special thanks to @jmplaza and @Juan Calderon for continuous feedback and discussion on imitation learning pitfalls.

— Armando Mateus, Robotics Lab URJC

Detailed technical note: All experiments continue using a 1 s frame margin (temporal redundancy removal, validated Week 22) to avoid oscillatory zigzag behaviour. The stabilised DAgger recording delay (1.5 s) is now part of the standard pipeline. The next milestone is to practically eliminate the centre lane bias on Town 07 and then systematically benchmark turning performance.
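One way to read the 1 s frame margin is as a minimum time gap between consecutive frames kept for training. That reading, and the `apply_frame_margin` helper below, are assumptions; the Week 22 implementation may differ.

```python
def apply_frame_margin(timestamps, margin_s=1.0):
    """Return indices of frames that are at least `margin_s` after the last kept frame."""
    kept = []
    last_kept_t = None
    for i, t in enumerate(timestamps):
        if last_kept_t is None or t - last_kept_t >= margin_s:
            kept.append(i)
            last_kept_t = t
    return kept


# Toy timestamps in seconds; near-duplicate frames are dropped.
timestamps = [0.0, 0.25, 0.5, 1.0, 1.5, 2.25, 3.0]
print(apply_frame_margin(timestamps))
```

The gap is measured from the last kept frame rather than from the previous raw frame, so a dense burst of frames cannot slip through one by one.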