Skip to yearly menu bar Skip to main content


Poster
in
Workshop: DMLR Workshop: Data-centric Machine Learning Research

Put on your detective hat: What's wrong in this video?

Rohith Peddi · Shivvrat Arya · Bharath Challa · Likhitha Pallapothula · Akshay Vyas · Qifan Zhang · Jikai Wang · Vasundhara Komaragiri · Eric Ragan · Nicholas Ruozzi · Yu Xiang · Vibhav Gogate


Abstract:

Following step-by-step procedures is an essential component of various activities carried out by individuals in their everyday lives. These procedures serve as a guiding framework that helps achieve goals efficiently, whether assembling furniture or preparing a recipe. However, the complexity and duration of procedural activities inherently increase the likelihood of making errors. Understanding such procedural activities from a sequence of frames is a challenging task that demands an accurate interpretation of visual information and an ability to reason about the structure of the activity. To this end, we collected a new ego-centric 4D dataset comprising 380 recordings (90 hrs) of people performing recipes in kitchen environments. This dataset consists of two distinct activity types: one in which participants adhere to the provided recipe instructions and another where they deviate and induce errors. We provide 5K step annotations and 10K fine-grained action annotations for 20\% of the collected data and benchmark it on two tasks: error detection and procedure learning.

Chat is not available.