Invited Talk
Workshop: DataPerf: Benchmarking Data for Data-Centric AI

Open Images: Lessons Learned from Collecting and Annotating 9M Images

Jordi Pont-Tuset


Open Images is a large-scale public dataset of 9M images richly annotated with 16M bounding boxes, 2.8M instance segmentations, 3.3M relationship annotations, 675k localized narratives, and 60M image-level labels. Collecting and annotating such a large and varied dataset was an enormous effort that came with significant challenges. In this talk, I'll present and discuss some of the lessons learned from creating Open Images, the challenges we encountered, and the solutions we came up with.

