Archive — Previous work (Phase 1: I-JEPA + MLP Planning)
Leveraging I-JEPA Vision Representations for Data-Efficient Motion Planning
This project demonstrates a novel approach to autonomous vehicle trajectory planning that achieves state-of-the-art performance using 90% less labeled training data. By leveraging self-supervised learning with pre-trained vision models, we show that effective planning can be learned with minimal supervision.
The system processes camera inputs to predict safe, efficient trajectories for autonomous vehicles navigating complex urban environments. The key innovation is the use of pre-trained visual representations that already understand driving scenes, requiring only a lightweight planning head to be trained on a small subset of labeled data.
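A minimal sketch of this frozen-encoder-plus-planning-head design is shown below. It assumes a ViT-style I-JEPA backbone that returns patch tokens; the embedding dimension, trajectory horizon, and head sizes are illustrative placeholders, not the project's actual configuration.

```python
import torch
import torch.nn as nn


class IJEPAPlanner(nn.Module):
    """Pre-trained I-JEPA encoder (kept frozen) feeding a lightweight MLP planning head."""

    def __init__(self, encoder: nn.Module, embed_dim: int = 1280,
                 horizon: int = 8, hidden_dim: int = 512):
        super().__init__()
        self.encoder = encoder      # pre-trained I-JEPA ViT backbone (assumed loaded elsewhere)
        self.horizon = horizon      # number of future (x, y) waypoints to predict
        # The planning head is the only component trained on the small labeled subset.
        self.head = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, horizon * 2),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # Assumed encoder output: patch tokens of shape (B, N, embed_dim).
        tokens = self.encoder(image)
        scene = tokens.mean(dim=1)  # pool patch tokens into a single scene embedding
        return self.head(scene).view(-1, self.horizon, 2)
```

Training then reduces to regressing expert waypoints on the small labeled subset, for example with an L1 loss over the predicted waypoint coordinates.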
End-to-end pipeline from multi-camera input to trajectory prediction
Per-view fusion of the left, front, and right cameras (L0+F0+R0) significantly outperforms the single front camera.
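One plausible way to realize this per-view fusion is late fusion: encode each camera independently and concatenate the pooled embeddings before the planning head. The camera ordering, dimensions, and pooling choice below are assumptions for illustration, not the project's exact implementation.

```python
from typing import List

import torch
import torch.nn as nn


class PerViewFusionPlanner(nn.Module):
    """Encode each camera view separately, then fuse embeddings before planning."""

    def __init__(self, encoder: nn.Module, embed_dim: int = 1280,
                 num_views: int = 3, horizon: int = 8):
        super().__init__()
        self.encoder = encoder
        self.horizon = horizon
        self.head = nn.Sequential(
            nn.Linear(num_views * embed_dim, 512),
            nn.ReLU(),
            nn.Linear(512, horizon * 2),
        )

    def forward(self, views: List[torch.Tensor]) -> torch.Tensor:
        # views: [left (L0), front (F0), right (R0)] image batches, each (B, 3, H, W).
        feats = [self.encoder(v).mean(dim=1) for v in views]  # pooled per-view embeddings
        fused = torch.cat(feats, dim=-1)                      # simple late fusion by concatenation
        return self.head(fused).view(-1, self.horizon, 2)
```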
A sweet spot at 50% trainable encoder layers balances transfer learning with task adaptation.
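A sketch of how a "50% trainable" setting might be applied, assuming a ViT-style encoder that exposes its transformer blocks as `encoder.blocks` (as timm ViTs do); the attribute name and the last-blocks-first unfreezing strategy are assumptions.

```python
import torch.nn as nn


def set_trainable_fraction(encoder: nn.Module, fraction: float = 0.5) -> None:
    """Freeze the early encoder blocks, leaving the last `fraction` of them trainable."""
    blocks = list(encoder.blocks)
    n_trainable = int(len(blocks) * fraction)

    # Freeze everything first (patch embedding, positional embeddings, all blocks).
    for p in encoder.parameters():
        p.requires_grad = False

    # Unfreeze only the last n_trainable blocks so high-level features can adapt
    # to the planning task while low-level features keep their pre-trained weights.
    if n_trainable > 0:
        for block in blocks[-n_trainable:]:
            for p in block.parameters():
                p.requires_grad = True
```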
Comparison with TransFuser using different SSL backbones (100% trainable, 30 epochs):
The lightweight I-JEPA + MLP approach (82.06% PDM) slightly exceeds the more complex TransFuser architecture (81.88% PDM) while using significantly fewer parameters and requiring no LiDAR data. This demonstrates the effectiveness of strong visual representations for planning tasks.
Multi-node distributed training (DDP) across 4 nodes × 4 GPUs, with automatic checkpointing and recovery for production-grade reliability.
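A condensed sketch of such a DDP loop with checkpoint-based recovery, using standard torchrun / torch.distributed conventions; the loss function, checkpoint path, and epoch count are placeholders rather than the project's actual settings.

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def train_ddp(model, dataloader, optimizer, ckpt_path="checkpoint.pt", epochs=30):
    # Launched via torchrun, which sets RANK / LOCAL_RANK / WORLD_SIZE
    # (4 nodes x 4 GPUs -> world size 16). The dataloader is assumed to use
    # a DistributedSampler so each rank sees a distinct data shard.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = DDP(model.cuda(local_rank), device_ids=[local_rank])

    start_epoch = 0
    if os.path.exists(ckpt_path):
        # Automatic recovery: resume from the most recent checkpoint.
        ckpt = torch.load(ckpt_path, map_location=f"cuda:{local_rank}")
        model.module.load_state_dict(ckpt["model"])
        optimizer.load_state_dict(ckpt["optimizer"])
        start_epoch = ckpt["epoch"] + 1

    for epoch in range(start_epoch, epochs):
        for images, targets in dataloader:
            optimizer.zero_grad()
            pred = model(images.cuda(local_rank, non_blocking=True))
            loss = torch.nn.functional.l1_loss(
                pred, targets.cuda(local_rank, non_blocking=True))
            loss.backward()
            optimizer.step()
        if dist.get_rank() == 0:
            # Only rank 0 writes the checkpoint to avoid concurrent writes.
            torch.save({"model": model.module.state_dict(),
                        "optimizer": optimizer.state_dict(),
                        "epoch": epoch}, ckpt_path)
    dist.destroy_process_group()
```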
Mixed-precision training (FP16) for 2× speedup, optimized data pipelines, and efficient memory management for large-scale experiments.
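A minimal mixed-precision training step using torch.cuda.amp; the L1 waypoint loss is an assumption about the training objective, not a confirmed detail of the project.

```python
import torch


def train_step_amp(model, images, targets, optimizer, scaler):
    """One FP16 mixed-precision step with autocast and gradient scaling."""
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        pred = model(images)
        loss = torch.nn.functional.l1_loss(pred, targets)
    scaler.scale(loss).backward()  # scale the loss to avoid FP16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


# Usage (hypothetical names):
# scaler = torch.cuda.amp.GradScaler()
# loss = train_step_amp(model, images, targets, optimizer, scaler)
```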
Interactive browser showcase with real-time visualization, supporting both cached replays and live inference for easy demonstration and evaluation.
Experiment tracking, model versioning, and deployment pipelines for reproducible research and production readiness.
Experience the planning system in action with real-time trajectory visualization and performance metrics
This project demonstrates research capabilities in computer vision, deep learning, and autonomous systems.