Exploring the Design Space of Automatically Generated Emotive Captions for Deaf or Hard of Hearing Users

Saad Hassan, Yao Ding, Agneya Kerure, Christi W. Miller, John Burnett, Emily Biondo, Brenden Gilbert

April 2023

Abstract

Caption text conveys salient auditory information to deaf or hard-of-hearing (DHH) viewers. However, the emotional information within the speech is not captured in the caption text. We developed three emotive captioning schemas that map the output of audio-based emotion detection models to expressive caption text that conveys the underlying emotions. The three schemas used typographic changes to the text, color changes, or both. Next, we designed a Unity framework to implement these schemas and used it to generate stimulus videos. In an experimental evaluation with 28 DHH viewers, we compared participants’ ability to understand emotions and their subjective judgments across the three captioning schemas. We found no significant differences among the schemas in either participants’ ability to understand the emotion from the captions or their subjective preference ratings. Open-ended feedback revealed factors contributing to individual differences in preferences among the participants, as well as challenges with automatically generated emotive captions that motivate future work.

Type

Conference paper

Publication

In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems

captioning