Using five minutes of recorded speech, machine learning seeks to distinguish among psychosis-related disorders, bipolar disorder, and major depression

In a new paper, researchers co-led by BBRF awardees demonstrate a computer-based method that analyzes five minutes of recorded speech to screen for schizophrenia, psychosis, and bipolar disorder (BD) and to distinguish among people with a variety of mental illnesses.

This research aims, among other things, to help diagnose mental disorders early, particularly those that involve psychosis. The consequences of psychotic disorders tend to be significantly worse if the illness is not treated promptly after the initial psychotic episode. The problem is compounded by the fact that many people considered at high risk of psychosis because of family history or genetics never become ill, and it is currently impossible to determine who will and who will not. Are there clues in their speech that could help guide their care?

For several years, researchers have recognized that patterns of speech and language use are associated with major psychotic symptoms, such as disordered thinking, altered vocal expression, and flattened affect. Various features of speech, including pitch, rhythm, and connectedness, may be related to changes in the brain's motor control and in patterns of neural connectivity, which in turn may correlate with the severity of psychotic symptoms.

Recently, efforts have been made to combine knowledge of various speech characteristics with machine learning algorithms to detect psychotic disorders, measure symptom severity, and predict relapse. Like the study reported in the new paper, these efforts rest on the hypothesis that computer- or AI-driven analysis of recorded speech can provide useful clinical information.

A team led by Julianna Olah, Ph.D., and Sunny X. Tang, M.D., a 2022 BBRF Young Investigator, wanted to address some of the issues that have kept such speech analytics from being adopted by clinicians. In addition to overall accuracy, these issues include the lack of a standardized method of collecting audio samples (most studies rely on recordings produced in a controlled laboratory setting); relatively small sample sizes; and a narrow clinical scope (for example, focusing on psychosis, but not on how reliably psychosis can be distinguished from other conditions).

In laying out their approach, the researchers noted that previous research has focused primarily on binary classification between "healthy" individuals and people with one specific disorder, such as BD or schizophrenia. This does not correspond to the differential diagnosis, or the detection of subtle changes, that clinicians perform on a daily basis. Binary discrimination between health and disease also offers little insight into whether speech-based machine learning models can capture speech changes that are unique to specific disorders, such as schizophrenia and psychosis.

The team's effort to address these issues began with recruiting a large, diagnostically varied sample of people, each of whom completed two surveys. One measured pre-psychotic ("prodromal") symptoms, and the other measured symptoms of depression. The 1,140 participants included 84 diagnosed with a schizophrenia spectrum disorder (SSD), 227 with BD, and 343 who had subclinical psychotic experiences (SPE). This last subgroup consisted of people thought to have some vulnerability to psychosis who did not meet the diagnostic criteria for a psychotic disorder. In addition to its main focus on the schizophrenia-bipolar spectrum, the team collected speech from 156 people with major depressive disorder (MDD), who had a psychiatric diagnosis but no psychosis. Finally, speech was collected from 330 healthy individuals, i.e., people with no psychiatric diagnosis and low scores on both surveys.
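As a quick arithmetic check, the five subgroup sizes given above can be summed and compared with the stated total of 1,140 participants. The short Python sketch below does this and also prints each group's share of the sample. The group labels follow the reading of the text adopted here (84 SSD, 227 BD, 343 SPE, 156 MDD, 330 healthy); the variable names are purely illustrative and are not taken from the paper.

```python
# Subgroup sizes as reported in the article. The assignment of each count
# to a group label is this article's reading, not data from the paper itself.
cohort = {
    "schizophrenia spectrum disorder (SSD)": 84,
    "bipolar disorder (BD)": 227,
    "subclinical psychotic experiences (SPE)": 343,
    "major depressive disorder (MDD)": 156,
    "healthy controls": 330,
}

# Sum the subgroups and compute each group's percentage of the whole sample.
total = sum(cohort.values())
shares = {group: 100 * n / total for group, n in cohort.items()}

for group, n in cohort.items():
    print(f"{group}: {n} ({shares[group]:.1f}%)")
print(f"total: {total}")
```

The subgroups add up to the reported 1,140, so the five groups account for the whole sample.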

Speech was collected remotely using an online platform on which participants provided audio samples in response to standardized prompts. There were five different speech-based tasks, which together took about 20 minutes to complete. Participants had one week to complete the five tasks.

In all, more than 943 hours of speech were recorded. Each participant was asked to recount a recent dream; describe eight black-and-white photographs; speak briefly about four neutral topics (e.g., "Describe your favorite food"); read aloud a simple, pre-written neutral text; and read aloud three short, pre-written emotional stories (angry, happy, and frightening).

After automatic transcription of the recordings, the team used natural language processing (NLP) techniques to extract features reflecting abnormalities in the semantics, syntax, and morphology of speech (morphology concerns the building blocks that form words). According to the team, feature selection was guided by previous research showing that these features help identify whether a speaker has a psychotic disorder. The analysis also sought to capture paralinguistic features, i.e., acoustic features of the recorded speech that indicate changes in emotional state (e.g., flattened affect) and changes in articulation (potentially related to changes in motor control). In all, 116 paralinguistic parameters were extracted from the audio files.

The team reported that even when collected remotely and online, speech shows enough between-group variability to distinguish among different forms and stages of psychosis-spectrum conditions, and between affective conditions (i.e., MDD) and psychotic conditions (such as schizophrenia).

The team found that the five tasks assigned to each participant differed in predictive power. Increasing the length of the audio sample did not necessarily lead to better predictions. The most useful tasks were those in which participants had to generate their own speech (e.g., "Describe your favorite food"), rather than read pre-written texts aloud.

Changes in affect and deviations in the motor control involved in speech production can be “easily captured from acoustic information,” the team noted. Disordered thinking, a key symptom of psychosis, is more closely related to changes in language, they said.

Using five minutes of speech, the team's machine learning model was able to distinguish healthy controls from people with a schizophrenia spectrum disorder or bipolar disorder with 86% accuracy. The same figure applied to the model's ability to discriminate among healthy controls, schizophrenia spectrum patients, and bipolar patients.

This suggested to the investigators that “screening for mental disorders is possible through a fully automated remote voice assessment pipeline.” The model's accuracy in distinguishing people on the psychosis spectrum from those with major depression or bipolar disorder may also have clinical benefits.

The team believes this method, if validated by further testing, could serve as a valuable complement to clinical decision-making in the future. Before clinical implementation, however, the team says it is important to test the method's real-world feasibility, acceptability, and accuracy. To that end, they are currently working on clinical implementation and monitoring at various psychiatric and behavioral health clinics across the United States.

Julianna Olah, Ph.D., first author of the team's paper, which appeared in Translational Psychiatry, was previously at King's College London and is an executive at Pilin, a King's College spin-out company seeking to develop digital biomarkers for serious mental illness. Dr. Tang is a scientific advisor to Pilin, and other members of the research team have connections with the company.
