Research report from Frank Rudzicz, University of Toronto


One of the things that I find unpleasant about the world is the nature of press reporting of research: so often it’s out-of-context, misrepresented, sensationalized and, just to put the cherry on the cake, completely unreferenced. I’d like the world to be a little better, so from time to time I’m going to be inviting researchers to come and share information about their work, hopefully in a way that is accurate and accountable while also being detailed and accessible. First up we have the wonderful Dr Frank Rudzicz of the Department of Computer Science at the University of Toronto, Canada, talking about some recent work involving Parkinson’s. You can read more in the proper peer-reviewed study at this reference:

Zhao, S., Rudzicz, F., Carvalho, L.G., Márquez-Chin, C., Livingstone, S. (2014) Automatic detection of expressed emotion in Parkinson’s disease. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP14), 4–9 May, Florence, Italy.

Although you may have to wait a little while, as it hasn’t been presented at the conference yet. 🙂

Patients with Parkinson’s disease (PD) frequently have trouble producing emotional speech. At the University of Toronto and the Toronto Rehabilitation Institute, in collaboration with researchers at Ryerson University, we have devised tools to classify emotional speech in patients with PD and to aid in the diagnosis of PD itself using short sentences of speech.

In our study, participants were recorded speaking short statements with different emotional emphasis in the voice (e.g., angry, scared, happy). This emphasis is generally called ‘prosody’ in speech and refers to changing one’s pitch, loudness, or duration of speech to convey meaning (or, in our case, emotion).
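For readers who like to see things concretely, here is a rough sketch of how the three ingredients of prosody (pitch, loudness and duration) might be measured from a recording. This is not the feature extraction used in the study: the function names, the 75 to 400 Hz pitch search range and the synthetic 200 Hz tone are all assumptions made for illustration, and real speech would need framing, voicing detection and a more robust pitch tracker.

```python
import math


def duration_seconds(samples, sample_rate):
    """Length of the recording in seconds."""
    return len(samples) / sample_rate


def rms_loudness(samples):
    """Root-mean-square amplitude, a crude proxy for loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def estimate_pitch_hz(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the pitch of one voiced frame from its autocorrelation peak.

    Searches the lags that correspond to fmin..fmax (an assumed range,
    not taken from the study) and returns sample_rate / best_lag.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Toy input: one second of a 200 Hz tone at amplitude 0.5 (not real speech).
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

features = {
    "duration_s": duration_seconds(tone, SAMPLE_RATE),
    "loudness_rms": rms_loudness(tone),
    "pitch_hz": estimate_pitch_hz(tone[:1024], SAMPLE_RATE),
}
print(features)
```

Statistics of measurements like these (means, ranges, variability over a sentence) are the kind of thing an acoustic feature set is typically built from.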

[Figure: Identifying emotion.]

We used state-of-the-art machine learning methods (naïve Bayes, random forests, and support vector machines, for those interested) to identify the emotion in the voice and to distinguish people with PD from ‘controls’. These methods were given 209 unique acoustic features, which are statistics and measurements derived from the voice. Our systems achieved accuracies of 65.5% in identifying the emotion and 73.33% in distinguishing PD from control, which is significantly more accurate than human speech-language pathologists whose training includes people with PD. We should point out that our accuracy measurements were much more stringent than those used in some similar studies: we used a method called ‘leave one speaker out’, in which the system is tested on a speaker whose recordings it never saw during training, whereas other recent studies allow everybody’s data to be included in both training and testing.
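To make the ‘leave one speaker out’ idea concrete, here is a minimal sketch of the evaluation loop. It is only an illustration under stated assumptions: a simple nearest-centroid classifier stands in for the naïve Bayes, random forest and support vector machine models, the data are randomly generated rather than taken from the study, and every function name here is hypothetical.

```python
import random
from collections import defaultdict


def nearest_centroid_fit(X, y):
    """Return one mean feature vector (centroid) per class label."""
    sums, counts = {}, defaultdict(int)
    for features, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def nearest_centroid_predict(centroids, features):
    """Predict the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def leave_one_speaker_out(X, y, speakers):
    """Hold out each speaker in turn; return (accuracy, folds).

    Each fold is a (held_out_speaker, train_indices, test_indices) tuple.
    """
    correct, folds = 0, []
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(nearest_centroid_predict(model, X[i]) == y[i] for i in test)
        folds.append((held_out, train, test))
    return correct / len(y), folds


def make_toy_data(n_speakers=6, per_emotion=4, n_features=5, seed=0):
    """Synthetic stand-in for acoustic features; NOT the study's data."""
    rng = random.Random(seed)
    emotions = ["angry", "happy", "scared"]
    X, y, speakers = [], [], []
    for spk in range(n_speakers):
        offset = rng.gauss(0, 0.5)  # a per-speaker quirk shared by all their recordings
        for e_idx, emotion in enumerate(emotions):
            for _ in range(per_emotion):
                X.append([e_idx * 2.0 + offset + rng.gauss(0, 0.7) for _ in range(n_features)])
                y.append(emotion)
                speakers.append(spk)
    return X, y, speakers


X, y, speakers = make_toy_data()
accuracy, folds = leave_one_speaker_out(X, y, speakers)
print(f"leave-one-speaker-out accuracy on toy data: {accuracy:.2f}")
```

The point of the design is that a held-out speaker's individual vocal quirks never leak into training, so the score reflects how well the system copes with a genuinely new voice. If every speaker's recordings were shuffled into both training and testing, the classifier could partly recognise the person rather than the emotion, and the accuracy would tend to look better than it deserves.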

[Figure: Identifying speaker.]

Our hope is that these results may assist in the future development of automated early detection systems for diagnosing patients with PD and in therapeutic software that aids in strengthening the ability to produce speech with the more common emotional prosody.
