Sarcasm, a form of comedy most appreciated by those who understand it, is something even humans struggle with. Oscar Wilde called it “the lowest form of wit but the highest form of intelligence”, and it remains a tricky skill to master.
In writing, it is almost impossible to detect, and in speech it is not much easier. The subtle changes in tone that convey sarcasm often confuse people, so, understandably, computer algorithms struggle to detect this form of humour, which limits virtual assistants and content analysis tools.
Traditional sarcasm detection algorithms typically rely on a single parameter to produce their results, which is the main reason they often fall short. However, Xiyuan Gao, Shekhar Nayak and Matt Coler of the Speech Technology Lab at the University of Groningen have developed a multimodal algorithm for improved sarcasm detection that examines multiple aspects of audio recordings for increased accuracy. They combined two complementary approaches, text-based sentiment analysis and audio-based emotion recognition, for a more complete picture.
“We extracted acoustic parameters such as pitch, speaking rate and energy from speech, then used Automatic Speech Recognition to transcribe the speech into text for sentiment analysis”, Gao explained. “Next, we assigned emoticons to each speech segment, reflecting its emotional content. By integrating these multimodal cues into a machine learning algorithm, our approach leverages the combined strengths of auditory and textual information along with emoticons for a comprehensive analysis.”
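To make the steps Gao describes more concrete, here is a minimal Python sketch of how acoustic and textual cues might be gathered into one feature set per speech segment. It is not the researchers' implementation: the `Segment` class, the function names, the tiny word lists and the rule that picks an emoticon from text sentiment are all hypothetical stand-ins, the pitch estimate is a crude zero-crossing count, and the transcript is assumed to have already come from a speech recogniser. The machine learning step that would consume these features is left out.

```python
import math
from dataclasses import dataclass

# Hypothetical toy lexicon, for illustration only; a real system would use
# a trained sentiment model rather than two short word lists.
POSITIVE = {"great", "love", "wonderful", "perfect"}
NEGATIVE = {"terrible", "hate", "awful", "worst"}


@dataclass
class Segment:
    samples: list       # mono waveform as floats in [-1, 1]
    sample_rate: int    # samples per second
    transcript: str     # text assumed to come from a speech recogniser

    @property
    def duration(self) -> float:
        """Length of the segment in seconds."""
        return len(self.samples) / self.sample_rate


def rms_energy(seg: Segment) -> float:
    """Root-mean-square amplitude, a simple proxy for loudness."""
    return math.sqrt(sum(s * s for s in seg.samples) / len(seg.samples))


def rough_pitch_hz(seg: Segment) -> float:
    """Crude pitch estimate: a periodic tone crosses zero twice per cycle."""
    crossings = sum(
        1 for a, b in zip(seg.samples, seg.samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (2 * seg.duration)


def speaking_rate(seg: Segment) -> float:
    """Words per second, based on the transcript."""
    return len(seg.transcript.split()) / seg.duration


def text_sentiment(seg: Segment) -> float:
    """Score in [-1, 1]: (positive words - negative words) / total words."""
    words = [w.strip(".,!?").lower() for w in seg.transcript.split()]
    if not words:
        return 0.0
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return score / len(words)


def emoticon_for(sentiment: float) -> str:
    """Assumed rule: map the sign of the sentiment score to an emoticon."""
    if sentiment > 0:
        return ":)"
    if sentiment < 0:
        return ":("
    return ":|"


def fuse(seg: Segment) -> dict:
    """Collect acoustic, textual and emoticon cues into one feature set."""
    sentiment = text_sentiment(seg)
    return {
        "pitch_hz": rough_pitch_hz(seg),
        "speaking_rate": speaking_rate(seg),
        "energy": rms_energy(seg),
        "sentiment": sentiment,
        "emoticon": emoticon_for(sentiment),
    }


def demo_segment() -> Segment:
    """One second of a synthetic 200 Hz tone paired with a sample transcript."""
    rate = 8000
    samples = [
        0.5 * math.sin(2 * math.pi * 200 * n / rate + 0.5) for n in range(rate)
    ]
    return Segment(samples, rate, "Oh great, another meeting.")


if __name__ == "__main__":
    print(fuse(demo_segment()))
```

The demo pairs a flat synthetic tone with a transcript containing a positive word, the kind of mismatch between delivery and wording that a classifier trained on such combined features could learn to pick up.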
The researchers are confident they have designed a robust system for detecting sarcasm in human speech. However, Gao, Nayak and Coler are still working on improving the algorithm, aiming to integrate a wider range of expressions and gestures, as well as multiple languages.
Yet those who have mastered the art of sarcasm can convey and understand it without changes in tone or even facial expression, the best example being the late Stephen Hawking. With his famous robotic voice and extremely limited facial movement, Hawking often managed to employ sarcasm in his interviews and speeches, a sign of his genius perhaps, judging by Wilde.
Whether or not AI will beat humans to understanding sarcasm any time soon, the researchers say the same algorithm could be used for purposes beyond humour recognition. Using sentiment analysis and emotion recognition, the system could eventually be calibrated to focus on other tone and speech inflections. “Traditionally, sentiment analysis mainly focuses on text and is developed for applications such as online hate speech detection and customer opinion mining. Emotion recognition based on speech can be applied to AI-assisted health care”, Gao said.