The Parallel Domain SDK (PD SDK for short) gives the community easy access to Parallel Domain synthetic data. It is designed for two key purposes: loading data quickly from local or cloud storage into memory for use in machine learning pipelines, and converting data into other dataset formats, including the DGP format. Together, these capabilities make the PD SDK a practical tool for working with synthetic data in machine learning projects.
The PD SDK can decode several data formats into its own Python objects, including the Dataset Governance Policy (DGP) format as well as Cityscapes, nuImages, and nuScenes; support for more public dataset formats is planned. Currently, datasets can be decoded from the local file system and from S3 buckets. To start using it, head over to the PD SDK GitHub page, and see the PD SDK documentation page to learn more.
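To make the supported combinations concrete, here is a minimal sketch in plain Python that checks a dataset request before it is handed to the SDK. It does not call the PD SDK itself. The helper names (`classify_location`, `check_request`) and the lowercase format identifiers are hypothetical and chosen only for illustration; the formats and storage locations are the ones listed above. As far as we know the SDK exposes a `decode_dataset` helper that takes a dataset path and a format name, but treat that as an assumption and check the documentation for the exact call.

```python
from typing import Tuple
from urllib.parse import urlparse

# Formats named in the text above. The lowercase identifiers are an
# assumption made for this sketch, not confirmed SDK constants.
SUPPORTED_FORMATS = {"dgp", "cityscapes", "nuimages", "nuscenes"}


def classify_location(dataset_path: str) -> str:
    """Return "s3" for s3:// URLs and "local" for file system paths."""
    scheme = urlparse(dataset_path).scheme.lower()
    if scheme == "s3":
        return "s3"
    # No scheme, file://, or a single letter (a Windows drive such as C:)
    # is treated as a local path.
    if scheme in ("", "file") or len(scheme) == 1:
        return "local"
    raise ValueError(f"Unsupported dataset location: {dataset_path!r}")


def check_request(dataset_path: str, dataset_format: str) -> Tuple[str, str]:
    """Validate a (path, format) pair and return (location kind, format)."""
    normalized_format = dataset_format.strip().lower()
    if normalized_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported dataset format: {dataset_format!r}")
    return classify_location(dataset_path), normalized_format


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise the helpers.
    print(check_request("s3://example-bucket/my-dataset", "DGP"))
    print(check_request("/data/my-dataset", "cityscapes"))
```

A check like this fails fast on an unsupported location (for example a `gs://` URL) or format, instead of surfacing the problem later during decoding.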