Credit: Pixabay/CC0 Public Domain
Voice is key to detecting suicidal ideation and to understanding the mental and emotional state of people experiencing it. Suicide hotline counselors are trained to quickly pick up changes in a caller's voice to better assist them through a crisis.
But no system is perfect, and there is room for error when interpreting a caller's voice. To help hotline counselors assess callers' conditions more reliably, Concordia University PhD student Alaa Nfissi and his colleagues developed a speech emotion recognition (SER) model using artificial intelligence tools. The model analyzes and encodes the waveform modulation of a caller's voice, and Nfissi says it could improve responder performance in real-world suicide prevention hotlines.
The research was presented at the 2024 IEEE 18th International Conference on Semantic Computing (ICSC).
“Traditionally, SER has been performed manually by trained psychologists who annotate audio signals, which requires a great deal of time and expertise,” he says. “Our deep learning model automatically extracts the audio features relevant to emotion recognition.”
Nfissi is a member of the Center for Research and Interventions on Suicide, Ethical Issues, and End-of-Life Practices (CRISE). His paper was first presented at the IEEE 18th International Conference on Semantic Computing in California in February 2024, where it won the Best Student Paper Award.
Instantly read emotions
To build the model, Nfissi used a database of actual calls made to a suicide hotline and merged it with a database of recordings of different people expressing specific emotions. Both sets of recordings were segmented and annotated by trained researchers following a protocol tailored to this task.
Each segment was annotated to reflect a specific state of mind: angry, neutral, sad, or fearful/concerned/worried. The actor recordings broadened the emotional range of the dataset, compensating for the original hotline data's underrepresentation of anger, fear, concern, and anxiety.
Nfissi's deep learning model then analyzed the data using neural networks and gated recurrent units (GRUs). These architectures process data sequences and extract local, time-dependent features.
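The gated recurrent unit at the heart of such a model can be sketched in a few lines of NumPy. This is a generic GRU cell, not the authors' implementation; the layer sizes, initialization, and omission of bias terms are illustrative. The hidden state acts as a running summary of everything heard so far, which is what lets the model weigh earlier events when judging the current moment:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell (biases omitted for brevity)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_size)
        def mat(rows, cols):
            return rng.uniform(-scale, scale, (rows, cols))
        # One input/recurrent weight pair per gate: update (z), reset (r),
        # and the candidate state (h).
        self.Wz, self.Uz = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wr, self.Ur = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wh, self.Uh = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.hidden_size = hidden_size

    def step(self, x, h):
        z = sigmoid(self.Wz @ x + self.Uz @ h)             # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h)             # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h))  # candidate state
        # Blend old state and candidate: memory of earlier frames persists.
        return (1.0 - z) * h + z * h_cand

    def run(self, frames):
        """Fold a sequence of feature frames into one summary vector."""
        h = np.zeros(self.hidden_size)
        for x in frames:
            h = self.step(x, h)
        return h
```

Because `run` simply iterates over however many frames it is given, the same cell consumes clips of any duration, which is the property the next paragraph highlights.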
“This method conveys emotion through a temporal process,” he explains. “We can detect an emotion from the events that preceded a given moment. Knowing what happened, and what happened before it, allows us to better identify the emotional state at a specific point in time.”
According to Nfissi, the model improves on existing architectures. Older models required all segments to be the same length for processing, usually in the range of five to six seconds. His model accepts variable-length signals, handling segments of different durations without the need for hand-crafted features.
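The contrast can be sketched as a preprocessing choice. The sample rate, frame length, and hop size below are common defaults in speech processing, not values from the paper: fixed-length pipelines pad or truncate every clip to one duration, while a variable-length pipeline simply frames each clip as-is and lets a recurrent model consume however many frames result.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed; a common rate for speech audio

def fixed_length_segment(signal, seconds=5.0):
    """Older approach: pad or truncate every clip to one fixed duration."""
    target = int(seconds * SAMPLE_RATE)
    if len(signal) >= target:
        return signal[:target]          # truncation discards later audio
    return np.pad(signal, (0, target - len(signal)))  # padding adds silence

def frame_signal(signal, frame_len=400, hop=160):
    """Variable-length approach: frame the clip as-is (25 ms windows,
    10 ms hop at 16 kHz); the frame count varies with clip duration."""
    n_frames = 1 + max(0, len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len]
                     for i in range(n_frames)])
```

A one-second clip yields 98 frames and a two-second clip 198; nothing is discarded or invented, unlike the fixed-length path.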
The results validated Nfissi's model. It accurately recognized the four emotions in the merged dataset, correctly identifying fear/concern/anxiety 82% of the time, neutral 78%, sadness 77%, and anger 72%.
The model proved particularly good at accurately identifying segments recorded by experts, with success rates ranging from 78% for sadness to 100% for anger.
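Figures like those above are per-class recall rates: for each emotion, the fraction of its true segments that the model labeled correctly. A minimal stdlib illustration (not the paper's evaluation code; the label names are just examples):

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its segments labeled correctly."""
    totals = Counter(y_true)   # how many segments each class truly has
    hits = Counter()           # how many of those the model got right
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            hits[truth] += 1
    return {cls: hits[cls] / totals[cls] for cls in totals}
```

On a toy run where the model misses one of two "sad" segments but catches both "angry" ones, this returns 0.5 for sad and 1.0 for angry.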
This research is personal for Nfissi, who had to study suicide hotline interventions in detail while developing the model.
“Many of these people are suffering, and sometimes a simple intervention from a counselor can be of great help. However, not all counselors are equally adept at processing a caller's emotions; some may need more time to understand them.”
He hopes his model can be used to build real-time dashboards that help counselors choose appropriate intervention strategies when speaking with callers in distress.
“The hope is that this will ensure that the intervention can help them and ultimately prevent suicide.”
The paper was co-authored by Professor Nizar Bouguila of the Concordia Institute for Information Systems Engineering, along with Wassim Bouachir of TÉLUQ University and CRISE, and Brian Mishara of UQAM and CRISE.
For more information:
Alaa Nfissi et al. Uncovering the emotional state of high-risk suicide callers using voice analysis. 2024 IEEE 18th International Conference on Semantic Computing (ICSC) (2024). DOI: 10.1109/ICSC59802.2024.00012
