Week 82 - Burbuja vs DAgger experiment (Town02, Efficientnet_v2-s, consecutive repetitions)
Summary table: Burbuja
| Run | Completed distance (m) | Effective distance (m) | Avg speed (km/h) | Pos. dev. mean (m) | Lane invasions | Collisions | Reward mean | Reward sum | Offroad frames |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 758.20 | 277.0 | 64.22 | 1.039 | 1101 | 0 | 0.554 | 609.68 | 244 |
| 2 | 757.76 | 218.0 | 64.46 | 1.032 | 935 | 0 | 0.552 | 516.53 | 208 |
| 3 | 764.10 | 161.5 | 62.61 | 1.121 | 845 | 0 | 0.496 | 418.97 | 212 |
| 4 | 757.80 | 200.5 | 66.00 | 1.041 | 896 | 0 | 0.528 | 462.46 | 218 |
| 5 | 762.98 | 224.0 | 64.01 | 0.989 | 873 | 0 | 0.543 | 509.25 | 215 |
| 6 | 764.21 | 244.5 | 61.15 | 0.957 | 889 | 0 | 0.520 | 487.28 | 214 |
| 7 | 759.86 | 258.5 | 60.67 | 0.857 | 919 | 0 | 0.503 | 462.39 | 227 |
| 8 | 764.68 | 200.5 | 61.59 | 1.067 | 881 | 0 | 0.454 | 400.30 | 239 |
| 9 | 757.33 | 197.5 | 63.83 | 1.044 | 934 | 0 | 0.522 | 487.17 | 222 |
| 10 | 758.31 | 214.5 | 25.42 | 1.459 | 2241 | 1354 | -0.398 | -891.47 | 1565 |
Statistical summary (nominal runs 1-9, excluding failed run 10)
| Metric | Mean | Std. dev. | Comment |
|---|---|---|---|
| Completed distance (m) | 760.8 | ±3.2 | Very stable |
| Effective distance (m) | 220.2 | ±35.4 | Expected variability |
| Avg speed (km/h) | 63.2 | ±1.8 | Consistent driving |
| Pos. dev. mean (m) | 1.02 | ±0.08 | Good centering |
| Lane invasions | 919 | ±74 | Stable |
| Collisions | 0 | 0 | ✔️ |
| Reward mean | 0.519 | ±0.032 | Coherent reward |
| Offroad frames | 222 | ±12 | Low |
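The summary figures can be re-derived from the per-run table. The sketch below is one way to do it, assuming the summary covers only the nominal runs (run 10 excluded) and uses the sample standard deviation; the helper name `nominal_summary` and the `FAILED_RUNS` set are illustrative, not part of the original experiment code.

```python
from statistics import mean, stdev

# Per-run Burbuja values copied from the table above (runs 1..10).
offroad_frames = [244, 208, 212, 218, 215, 214, 227, 239, 222, 1565]
pos_dev_mean = [1.039, 1.032, 1.121, 1.041, 0.989, 0.957, 0.857, 1.067, 1.044, 1.459]

FAILED_RUNS = {10}  # assumption: run 10 is treated as a failure and excluded


def nominal_summary(values, failed_runs):
    """Return (mean, sample std) over runs whose 1-based index is not in failed_runs."""
    kept = [v for i, v in enumerate(values, start=1) if i not in failed_runs]
    return mean(kept), stdev(kept)


offroad_mean, offroad_std = nominal_summary(offroad_frames, FAILED_RUNS)
dev_mean, dev_std = nominal_summary(pos_dev_mean, FAILED_RUNS)
print(f"Offroad frames: {offroad_mean:.0f} ± {offroad_std:.0f}")
print(f"Pos. dev. mean: {dev_mean:.2f} ± {dev_std:.2f}")
```

Passing an empty set as `failed_runs` gives the statistics over all ten runs, which makes the effect of the single failed run easy to see.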
Summary table: DAgger
| Run | Completed distance (m) | Effective distance (m) | Avg speed (km/h) | Pos. dev. mean (m) | Lane invasions | Collisions | Reward mean | Reward sum | Offroad frames |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 759.49 | 336.5 | 59.71 | 0.90 | 1179 | 0 | 0.585 | 689.21 | 243 |
| 2 | 762.53 | 195.0 | 64.20 | 1.08 | 940 | 19 | 0.551 | 518.40 | 209 |
| 3 | 761.13 | 181.5 | 60.96 | 1.10 | 963 | 0 | 0.540 | 520.20 | 220 |
| 4 | 761.64 | 184.0 | 63.40 | 1.06 | 905 | 0 | 0.531 | 442.63 | 194 |
| 5 | 762.89 | 182.5 | 61.21 | 1.12 | 910 | 0 | 0.427 | 749.60 | 454 |
| 6 | 760.54 | 199.5 | 61.03 | 1.15 | 1012 | 0 | -0.239 | -411.52 | 1062 |
| 7 | 761.51 | 206.5 | 62.27 | 0.94 | 819 | 0 | 0.538 | 440.60 | 188 |
| 8 | 759.29 | 212.0 | 24.09 | 1.22 | 2378 | 1451 | -0.383 | -909.76 | 1643 |
| 9 | 760.25 | 232.5 | 62.34 | 0.95 | 900 | 0 | 0.547 | 491.94 | 203 |
| 10 | 572.76 | 150.0 | 20.68 | 2.93 | 1751 | 1102 | -0.357 | -624.76 | 1187 |
Statistical summary (nominal runs 1-5, 7 and 9, excluding failed runs 6, 8 and 10)
| Metric | Mean | Std. dev. | Comment |
|---|---|---|---|
| Completed distance (m) | 761.35 | ±1.20 | Very stable |
| Effective distance (m) | 216.93 | ±55.76 | High variability |
| Avg speed (km/h) | 62.01 | ±1.53 | Consistent driving |
| Pos. dev. mean (m) | 1.02 | ±0.09 | Good centering |
| Lane invasions | 945 | ±112 | Slightly higher than Burbuja |
| Collisions | 2.7 | ±7.2 | Nominal runs collision-free except run 2 (19) |
| Reward mean | 0.531 | ±0.049 | Coherent reward |
| Offroad frames | 244 | ±94 | Sensitive to isolated failures |
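The large spread in offroad frames comes mostly from a single run. As a rough illustration, the sketch below compares the mean with the median over the nominal DAgger runs; it assumes the summary covers runs 1-5, 7 and 9, and the median is added here only as a robustness check, not as a metric used in the original analysis.

```python
from statistics import mean, median

# Offroad frames for the nominal DAgger runs (1-5, 7, 9), copied from the table above.
nominal_offroad = {1: 243, 2: 209, 3: 220, 4: 194, 5: 454, 7: 188, 9: 203}

values = list(nominal_offroad.values())
avg = mean(values)      # pulled upwards by the largest run
med = median(values)    # barely affected by a single large value
worst_run = max(nominal_offroad, key=nominal_offroad.get)
print(f"mean={avg:.1f}, median={med}, largest value in run {worst_run}")
```

A gap between the mean and the median of this size suggests reporting both when a metric is sensitive to isolated bad runs.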
Interpretation
The new reward metric, based on centering with respect to the roadway, clearly separates nominal stability from catastrophic failures.
Burbuja behaves very stably: only 1 of 10 runs fails (run 10), and the remaining runs keep a consistent mean reward, low offroad counts and good alignment with the roadway center.
DAgger shows a similar or slightly higher mean reward under nominal conditions (0.531 vs 0.519), but with greater variance: 3 of 10 runs (6, 8 and 10) enter unrecoverable states, which show up directly as negative reward, high offroad counts and loss of centering.
This supports the view that the reward metric is sensitive, interpretable and aligned with actual driving quality: it penalizes persistent deviations, detects stalls and failures, and distinguishes sustained stability from one-off corrections.
Overall, DAgger brings local improvements, but Burbuja retains better global robustness (1 failed run versus 3), as the roadway-centered reward shows directly.
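The failure counts quoted above (1 of 10 for Burbuja, 3 of 10 for DAgger) can be recovered from the reward column alone. The sketch below assumes that a negative reward mean marks a failed run; the threshold and the helper `failed_runs` are illustrative choices, not a rule stated in the experiment.

```python
# Reward mean per run (runs 1..10), copied from the two summary tables.
reward_mean = {
    "Burbuja": [0.554, 0.552, 0.496, 0.528, 0.543, 0.520, 0.503, 0.454, 0.522, -0.398],
    "DAgger": [0.585, 0.551, 0.540, 0.531, 0.427, -0.239, 0.538, -0.383, 0.547, -0.357],
}

FAILURE_THRESHOLD = 0.0  # assumption: a negative reward mean marks a failed run


def failed_runs(rewards, threshold=FAILURE_THRESHOLD):
    """Return the 1-based indices of runs whose reward mean is below threshold."""
    return [i for i, r in enumerate(rewards, start=1) if r < threshold]


failures = {method: failed_runs(r) for method, r in reward_mean.items()}
for method, runs in failures.items():
    print(f"{method}: {len(runs)}/{len(reward_mean[method])} failed runs -> {runs}")
```

With this threshold the flagged runs are exactly those excluded from the statistical summaries, which is consistent with the reward acting as a direct failure indicator.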