ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision

Version 3 2021-10-22, 12:03

Version 2 2021-06-11, 11:52

Version 1 2021-04-07, 09:13

dataset

posted on 2021-04-07, 09:13 authored by Daniela Massiceti, Lida Theodorou, Luisa Zintgraf, Matthew Tobias Harris, Simone Stumpf, Cecily Morrison, Edward Cutrell, Katja Hofmann

Object recognition has made great advances in the last decade, but still predominantly relies on many high-quality training examples per object category. In contrast, learning new objects from only a few examples could enable many impactful applications, from robotics to user personalization. Most few-shot learning research, however, has been driven by benchmark datasets that lack the high variation these applications will face when deployed in the real world. To close this gap, we present the ORBIT dataset, grounded in a real-world application of teachable object recognizers for people who are blind/low vision. The full dataset contains 4,733 videos of 588 objects recorded by 97 people who are blind/low vision on their mobile phones, and we mark a subset of 3,822 videos of 486 objects from 77 collectors as the benchmark dataset. We propose a user-centric protocol for evaluating machine learning models on a teachable object recognition task using this benchmark dataset. The code for loading the dataset, computing all benchmark metrics, and running the baseline models is available at https://github.com/microsoft/ORBIT-Dataset