February 25, 2025

Diagnose Model Failure Points with PD Replica Sim

The Parallel Domain Team

In our previous post, “Signs of Trust: Matching Real-World Performance with PD Replica Sim,” we shared our latest research qualitatively and quantitatively assessing the simulation-to-real gap. We found that sign-detection models trained exclusively on real data exhibit nearly identical behavior on real and PD Replica Sim validation data sets, a strong indication that PD Replica Sim is a trustworthy environment for validation.

In this post, we’ll delve into how ML teams can use PD Replica Sim to push their autonomous systems to their limits, interrogating model weaknesses and testing generalization more rigorously than real-world data typically allows: covering tricky edge cases, handling occlusion and orientation sensitivity, and even evaluating model performance on unseen traffic sign types.

Expanding the Testing Playground with PD Replica Sim

Capturing enough diverse, high-quality data for robust testing is expensive, time-consuming, and often incomplete. Most of the labeled data that ML teams manage to collect ends up going toward training, leaving little in the validation pool for comprehensive testing. These blind spots can be effectively addressed using simulation, where PD Replica Sim offers two key advantages:

  • Trustworthiness: Models trained on real data perform similarly on PD Replica Sim data and on real-world data for parking spot detection and sign detection, confirming its use as a validation tool in those areas. We are continuing our research to quantitatively and qualitatively validate other components for testing.
  • Flexibility: The simulation environment allows ML teams to manipulate scene elements—adjusting conditions, inserting objects, and fine-tuning parameters—providing a richer, more systematic approach to testing.

Investigating Model Failure Points Using Simulation

Common Failure Points

Even high-performing models often struggle with specific failure points in real-world deployment. Some common failure points, and areas we explored in our latest research, include:

  • Occlusion: Objects partially hidden behind others.
  • Distance: Distant or extremely close objects in the camera frame.
  • Object Orientation: Signs angled or tilted in unusual ways.

These failure scenarios are challenging to capture in large numbers in real-world datasets, but PD Replica Sim allows developers to intentionally generate and study them.

Orientation

If a traffic sign detection model struggles with angled signs, simulation enables precise experimentation. With PD Replica Sim, developers can (see the sketch after this list):

  • Programmatically place traffic signs in a scene.
  • Adjust sign rotation angle across multiple test cases.
  • Swap in different sign types and locations.
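
As a concrete illustration, here is a minimal sketch of such a rotation sweep in Python. The helper functions are hypothetical placeholders for your own simulation and model interfaces, not the actual PD Replica Sim API:

    # Hypothetical rotation sweep: measure detection rate as a sign
    # rotates away from the camera. Both helpers below are placeholders,
    # not the actual PD Replica Sim API.

    def place_and_render(sign_type: str, yaw_deg: float):
        """Placeholder: place a sign at the given yaw angle and render
        a camera frame from the simulated scene."""
        raise NotImplementedError("wire this to your simulation backend")

    def sign_detected(image) -> bool:
        """Placeholder: run the perception model; True if the sign is found."""
        raise NotImplementedError("wire this to your model")

    def rotation_sweep(sign_types, angles_deg):
        """Detection rate per yaw angle, averaged over several sign types."""
        rates = {}
        for angle in angles_deg:
            hits = [sign_detected(place_and_render(s, angle)) for s in sign_types]
            rates[angle] = sum(hits) / len(hits)
        return rates

    # Example: sweep yaw from head-on (0 deg) to nearly edge-on (80 deg).
    # rates = rotation_sweep(["speed_limit_30", "stop"], range(0, 90, 10))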

This methodology allows teams to pinpoint the exact angle at which detection performance drops. In one of our internal tests, we observed a sharp decline in detection confidence when signs exceeded a certain tilt, despite strong front-facing performance. As the table below shows, the drop-off corresponds to a bounding-box width-to-height ratio of around 30%, a proxy for how far the sign has rotated away from the camera. Identifying this failure mode in a controlled environment provided actionable insights, such as collecting additional training data at those orientations or refining the model architecture to improve viewpoint robustness.

Rotation experiment results, run against a Mapillary-trained perception model for sign detection:

bbox ratio (width/height)    <30%    30-40%    40-60%    >60%
missed                          6        3         0        0
detected                        1        4         7       32
detection rate                14%      57%      100%     100%
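
As a sanity check, the detection rates above follow directly from the missed and detected counts; a few lines of Python reproduce them:

    # Recompute the per-bucket detection rates from the table above.
    # Each entry: (bbox width/height bucket, missed, detected).
    buckets = [
        ("<30%",   6,  1),
        ("30-40%", 3,  4),
        ("40-60%", 0,  7),
        (">60%",   0, 32),
    ]

    for label, missed, detected in buckets:
        rate = detected / (missed + detected)
        print(f"{label:>7}: {rate:.0%} detection rate")
    # -> 14%, 57%, 100%, 100%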

Occlusion

What happens when a traffic sign is partially blocked by another object? Data Lab makes it possible to (see the sketch after this list):

  • Place an occluding object (e.g., another sign, a tree, or a pole) in front of the primary sign.
  • Vary the type of occluding object (e.g., labeled vs. unlabeled in training data).
  • Systematically adjust the occlusion levels and measure detection outcomes.
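
A minimal sketch of such an occlusion sweep, with hypothetical helpers standing in for your own scene setup and model inference code, might look like this:

    # Hypothetical occlusion sweep: vary the occluder type and the fraction
    # of the target sign it covers, then record detection outcomes.
    # Both helpers are placeholders, not the actual Data Lab API.

    def render_occluded(occluder: str, occlusion_fraction: float):
        """Placeholder: render the target sign with the given fraction
        (0.0 = fully visible, 0.8 = mostly hidden) covered by the occluder."""
        raise NotImplementedError("wire this to your scene-generation code")

    def sign_detected(image) -> bool:
        """Placeholder: run the perception model on the rendered frame."""
        raise NotImplementedError("wire this to your model")

    def occlusion_sweep(occluders, levels):
        """Detection outcome for every occluder type and occlusion level."""
        return {
            (occ, lvl): sign_detected(render_occluded(occ, lvl))
            for occ in occluders
            for lvl in levels
        }

    # Example: compare a labeled sign against unlabeled occluders.
    # outcomes = occlusion_sweep(["labeled_sign", "tree", "pole"],
    #                            [0.0, 0.2, 0.4, 0.6, 0.8])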

A surprising pattern emerged in our experiments:

  • When the occluding sign was not part of the training labels, the model was more robust to occlusion, likely because it had learned to ignore foreground objects it was never trained to label.
  • When the occluding sign was labeled in the training data, detection rates dropped, indicating the model had rarely encountered overlapping labeled signs.

This insight revealed a dataset bias—training data lacked examples of overlapping labeled signs. The solution? Either collect or synthesize more training images featuring these occlusion scenarios or refine the label ontology to better handle occluded signs.

Generalization on Unseen Traffic Signs

Even if a model performs well on known traffic sign types, can it generalize to novel variants? For instance, if a training set includes only “30 mph” and “70 mph” speed limit signs but lacks “50 mph,” how does the model handle the missing class?

PD Replica Sim enables precise evaluation by generating simulation tests featuring:

  • The unseen sign type (e.g., “50 mph”) in various locations and lighting conditions.
  • A mix of new and familiar elements to test feature generalization.

By testing model behavior in these generated scenarios, ML teams can determine if their models are learning abstract traffic sign features or simply memorizing specific designs. If generalization fails, additional real-world or simulation data can be gathered to strengthen the training set.
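
As an illustration, a held-out-class check can be scripted along these lines; generate_scenes and sign_localized are hypothetical placeholders for your own scene generation and evaluation code:

    # Hypothetical held-out-class check: the model was trained on 30 mph
    # and 70 mph speed limit signs only; we probe how often it localizes
    # an unseen 50 mph sign at all. Helper names are placeholders.

    HELD_OUT_CLASS = "speed_limit_50"   # absent from the training set

    def generate_scenes(sign_type: str, n: int):
        """Placeholder: generate n simulated scenes featuring sign_type
        across varied locations and lighting conditions."""
        raise NotImplementedError("wire this to your scene generation")

    def sign_localized(scene) -> bool:
        """Placeholder: True if the model localizes the sign as any sign
        class, which is the generalization signal of interest here."""
        raise NotImplementedError("wire this to your model")

    def held_out_detection_rate(sign_type: str, n: int = 100) -> float:
        scenes = generate_scenes(sign_type, n)
        return sum(sign_localized(s) for s in scenes) / n

    # rate = held_out_detection_rate(HELD_OUT_CLASS)
    # A low rate suggests the model memorized specific sign designs rather
    # than learning abstract traffic sign features.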

Next Steps and Takeaways

  • Fill the Gaps in Testing – If your current validation sets lack coverage of certain scenarios, such as unusual sign orientations or heavy occlusion, you can easily produce those scenarios with PD Replica Sim. This frees you from the unpredictable and often prohibitive constraints of real-world data collection.
  • Diagnose Model Weaknesses – By systematically varying one parameter at a time—like rotation angle or sign category—you can isolate precisely why a model fails. This leads to more targeted improvements, whether that’s gathering more data, updating your label definitions, or refining the neural network architecture.
  • Explore Generalization – PD Replica Sim allows you to insert unseen sign types or novel configurations of existing classes. This is critical for evaluating whether your model has learned robust features or is merely overfitting to the limited classes in your training set.
  • Accelerate Iterations – Data generation in simulation is rapid and controlled. You don’t have to wait for a rare real-world scenario to happen. Spin up new test sets on demand, assess the model’s performance, and iterate quickly—shortening your overall development cycle.

Conclusion

Realistic sensor simulation is a powerful ally in testing autonomous systems. By leveraging PD Replica Sim’s trusted fidelity and flexible, iterative control, you can be more creative with your validation scenarios: pinpoint critical edge cases, validate model generalization, and continually refine your solutions before they ever hit the road or fly in the air.

Ready to See PD Replica Sim in Action?

If you’re looking for a reliable, multi-sensor, high-resolution simulation that’s fully controllable and repeatable, PD Replica Sim stands out from the pack. Fill out the form below, and our team will be happy to provide you with a personalized demo.
