Research

We’re looking to collaborate with researchers who are working on, or are interested in, the following open problems in machine learning and computer vision.

1. Dynamic Agent Replay and Tracking from Real-world Logs

We are working on accurately tracking dynamic objects, such as cars and pedestrians, in real-world logs and replaying them inside controlled synthetic scenes. Current approaches like 4D Gaussian splatting offer promising visual realism but give little direct control over object trajectories and make edits expensive. We’re evaluating these methods alongside simpler alternatives, such as decoupling objects from the background using precise 3D bounding boxes derived from integrated LiDAR, to find practical solutions.
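As a concrete starting point for the decoupling route, here is a minimal NumPy sketch: carve an agent’s points out of a sweep using its LiDAR-derived oriented 3D box, then re-insert them at an edited pose. The data layout, function names, and toy usage are illustrative assumptions, not our actual pipeline.

```python
# Minimal sketch of "decouple then replay": extract an agent's LiDAR points
# with its oriented 3D box, then move them to an edited pose. Hypothetical
# data layout; real logs would supply boxes and poses per frame.
import numpy as np

def yaw_rotation(yaw: float) -> np.ndarray:
    """3x3 rotation about the z (up) axis."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def points_in_box(points, center, size, yaw):
    """Boolean mask of points inside an oriented 3D box.

    points: (N, 3); center: (3,); size: (length, width, height); yaw: radians.
    """
    local = (points - center) @ yaw_rotation(yaw)  # world -> box frame
    half = np.asarray(size) / 2.0
    return np.all(np.abs(local) <= half, axis=1)

def replay_agent(points, mask, old_pose, new_pose):
    """Move the masked agent points from old_pose to new_pose.

    Each pose is (center, yaw); background points stay untouched.
    """
    (c0, y0), (c1, y1) = old_pose, new_pose
    agent_local = (points[mask] - c0) @ yaw_rotation(y0)  # world -> agent frame
    moved = agent_local @ yaw_rotation(y1).T + c1         # agent frame -> new pose
    out = points.copy()
    out[mask] = moved
    return out

# Toy usage: move a car-sized agent to an edited trajectory waypoint.
pts = np.random.uniform(-20, 20, size=(10_000, 3))
center, size, yaw = np.array([5.0, 2.0, 0.5]), (4.5, 1.9, 1.6), 0.3
m = points_in_box(pts, center, size, yaw)
edited = replay_agent(pts, m, (center, yaw), (np.array([8.0, 4.0, 0.5]), 0.9))
```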

2. Controlled Scenario Variation with GANs and Diffusion Models

We need better ways to introduce controlled visual variations (weather, lighting, road conditions) into synthetic scenes. GAN-based techniques are efficient and controllable but struggle with multi-camera consistency. Diffusion methods like NVIDIA COSMOS deliver excellent realism but carry significant computational cost and offer limited fine-grained control. We’re testing both approaches to find the best balance of visual quality, consistency, and computational practicality.
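A simple photometric check can help compare candidates on the consistency axis. The sketch below warps one camera’s edited image into a neighboring view via a planar homography and scores agreement in the overlap; the homography is an assumed calibration input and is only valid for roughly planar content such as the road surface.

```python
# Hedged evaluation sketch for weighing GAN vs. diffusion variations: warp the
# edited image from camera A into camera B and score photometric agreement.
import numpy as np
import cv2

def cross_camera_consistency(img_a, img_b, H_ab):
    """Mean absolute photometric error between img_b and img_a warped into B.

    img_a, img_b: (H, W, 3) float32 images in [0, 1] from synchronized cameras.
    H_ab: 3x3 homography mapping camera-A pixels onto camera-B pixels
          (an assumed calibration input, valid for near-planar content only).
    """
    h, w = img_b.shape[:2]
    warped = cv2.warpPerspective(img_a, H_ab, (w, h))
    # Track which B pixels actually receive content from A.
    mask = cv2.warpPerspective(np.ones(img_a.shape[:2], np.float32), H_ab, (w, h))
    overlap = mask > 0.5
    if not overlap.any():
        return float("nan")
    return float(np.abs(warped - img_b)[overlap].mean())

# Usage idea: lower score = more consistent edit across the camera pair.
# score_gan  = cross_camera_consistency(gan_a, gan_b, H_ab)
# score_diff = cross_camera_consistency(diff_a, diff_b, H_ab)
```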

3. Automated 3D Lane and Parking Detection

Accurately detecting and annotating lanes and parking spaces is a major bottleneck. We are researching automated methods based on BEV (bird’s-eye-view) multi-view detection. Our goal is to reliably fuse predictions from multiple camera views into coherent, accurate 3D annotations, significantly reducing manual labeling effort and improving data quality.
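As a sketch of the fusion step, the snippet below casts each camera’s predicted lane pixels as rays, intersects them with a flat ground plane (z = 0, an assumption), and merges nearby ground points across views on a BEV grid. The calibration and per-camera detections are hypothetical inputs.

```python
# Fuse per-camera lane detections into one BEV frame via ground-plane
# ray casting. Assumes a flat ground plane and camera-to-world calibration.
import numpy as np

def pixels_to_ground(pixels, K, R, t):
    """Intersect pixel rays with the z = 0 ground plane.

    pixels: (N, 2) lane points in image coords; K: 3x3 intrinsics;
    R, t: camera-to-world rotation (3x3) and camera center (3,).
    Returns (N, 3) world points on the ground plane.
    """
    ones = np.ones((len(pixels), 1))
    rays = (np.linalg.inv(K) @ np.hstack([pixels, ones]).T).T  # camera-frame rays
    rays_w = rays @ R.T                                        # rotate into world
    s = -t[2] / rays_w[:, 2]          # solve t_z + s * ray_z = 0 per ray
    return t + rays_w * s[:, None]

def fuse_views(ground_points_per_view, cell=0.2):
    """Merge ground points from all cameras by averaging within BEV grid cells."""
    pts = np.vstack(ground_points_per_view)
    keys = np.round(pts[:, :2] / cell).astype(int)
    fused = {}
    for k, p in zip(map(tuple, keys), pts):
        fused.setdefault(k, []).append(p)
    return np.array([np.mean(v, axis=0) for v in fused.values()])
```

Real scenes need a proper ground model and lane-level association rather than per-cell averaging; the grid merge here is just the simplest stand-in for the fusion idea.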

4. LiDAR Integration for Improved Scene Geometry and Accuracy

Image-based reconstruction often produces geometric errors, especially in textureless areas. We are integrating LiDAR data to obtain accurate depth and geometry. Inspired by recent research such as LI-GS and LiGSM, we aim to significantly enhance scene accuracy, camera pose estimation, and depth supervision in our synthetic replicas.
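To illustrate the kind of depth supervision we mean, here is a hedged PyTorch sketch in the spirit of LI-GS/LiGSM (a generic formulation, not their exact losses): project LiDAR points into the camera and penalize the rendered depth wherever a LiDAR return lands. The calibration tensors and the rendered_depth input are assumed to come from the surrounding reconstruction pipeline.

```python
# Generic LiDAR depth-supervision loss: L1 between the differentiable
# renderer's depth map and sparse projected LiDAR depths.
import torch

def lidar_depth_loss(rendered_depth, lidar_xyz, K, T_cam_from_lidar):
    """rendered_depth: (H, W) depth from the renderer (requires grad).

    lidar_xyz: (N, 3) points in the LiDAR frame; K: 3x3 intrinsics;
    T_cam_from_lidar: 4x4 extrinsics from sensor calibration.
    """
    H, W = rendered_depth.shape
    ones = torch.ones(len(lidar_xyz), 1, device=lidar_xyz.device)
    cam = (torch.cat([lidar_xyz, ones], dim=1) @ T_cam_from_lidar.T)[:, :3]
    z = cam[:, 2]
    uv = (cam @ K.T)[:, :2] / z.clamp(min=1e-6).unsqueeze(1)  # perspective divide
    u, v = uv[:, 0].round().long(), uv[:, 1].round().long()
    # Keep points in front of the camera that land inside the image.
    valid = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    pred = rendered_depth[v[valid], u[valid]]
    return torch.abs(pred - z[valid]).mean()
```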

5. Generative Diffusion Models for Autonomous Scenario Creation

We are partnering with NVIDIA to explore their diffusion model, COSMOS, for generating realistic synthetic scenarios. The main challenges we are addressing include reducing hallucinations, improving fine-grained control over scenario details, and ensuring multi-view consistency. Our goal is to make these generative models practically usable for accurate, reliable simulation scenarios.

6. Multi-Camera Semantic and Geometric Consistency

Maintaining semantic and geometric consistency across multiple synchronized camera views remains a significant unsolved problem. GAN- and diffusion-based methods often process views independently, causing visual inconsistencies. We are actively investigating how to reliably synchronize semantic content and geometry across camera arrays, enabling the realistic multi-sensor simulation essential for advanced AV perception testing.
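One inexpensive way to quantify the problem is to project shared 3D scene points into every camera’s semantic map and count label agreement across views. The sketch below does exactly that; the calibration, semantic maps, and shared points are hypothetical inputs.

```python
# Measure cross-view semantic consistency: a point is consistent when every
# camera that sees it predicts the same class label.
import numpy as np

def project(points_w, K, T_cam_from_world):
    """Project (N, 3) world points to pixel coords and depth for one camera."""
    ones = np.ones((len(points_w), 1))
    cam = (np.hstack([points_w, ones]) @ T_cam_from_world.T)[:, :3]
    uv = (cam @ K.T)[:, :2] / cam[:, 2:3]
    return uv, cam[:, 2]

def semantic_agreement(points_w, cameras):
    """Fraction of points whose visible views all predict the same class.

    cameras: list of (semantic_map (H, W) int array, K, T_cam_from_world).
    """
    votes = []
    for sem, K, T in cameras:
        uv, z = project(points_w, K, T)
        u, v = uv[:, 0].round().astype(int), uv[:, 1].round().astype(int)
        h, w = sem.shape
        ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        labels = np.full(len(points_w), -1)   # -1 = point not visible here
        labels[ok] = sem[v[ok], u[ok]]
        votes.append(labels)
    votes = np.stack(votes)                   # (num_cams, N) predicted labels
    seen = votes >= 0
    consistent = [
        len(set(col[s])) <= 1                 # all viewing cameras agree
        for col, s in zip(votes.T, seen.T)
    ]
    return float(np.mean(consistent))
```

The same projection machinery extends to geometric checks (e.g., comparing per-view depth estimates at the shared points), which is one direction we are exploring for synchronized camera arrays.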

We’re open to building research partnerships, co-developing solutions, and even publishing together on problems that matter to both academia and industry. Reach out with a brief summary of your work or interests, and we’ll find ways to collaborate.