Recognising intent in collaborative human robot tasks can improve team performance and human perception of robots. Intent can differ from the observed outcome in the presence of mistakes which are likely in physically dynamic tasks. We created a dataset of 1227 throws of a ball at a target from 10 participants and observed that 47% of throws were mistakes with 16% completely missing the target. Our research leverages facial images capturing the person’s reaction to the outcome of a throw to predict when the resulting throw is a mistake and then we determine the actual intent of the throw. The approach we propose for outcome prediction performs 38% better than the two-stream architecture used previously for this task on front-on videos. In addition, we propose a 1D-CNN model which is used in conjunction with priors learned from the frequency of mistakes to provide an end-to-end pipeline for outcome and intent recognition in this throwing task.
All Science Journal Classification (ASJC) codes
- Artificial Intelligence