Audio Visual Emotion Recognition Based on the Triple Stream DBN Models This publication appears in: Computer Engineering Authors: L. Lv, D. Jiang, F. Wang, H. Sahli and W. Verhelst Volume: 38 Pages: 161-166 Publication Date: Mar. 2012
Abstract: This paper presents a triple stream Dynamic Bayesian Network (DBN) model (T_AsyDBN) for audio-visual emotion recognition, in which the two audio streams are synchronous at the state level, while they are asynchronous with the visual stream within controllable constraints. MFCC features and local prosodic features are extracted as audio features, while geometric features as well as facial action unit coefficients are extracted as visual features. Emotion recognition experiments show that by adjusting the asynchrony constraint, T_AsyDBN performs better than the two-stream audio-visual DBN model (Asy_DBN), with the average recognition rate improving from 52.14% to 63.71%.
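The abstract names MFCC features as one of the audio streams. Below is a minimal, hypothetical sketch of frame-level MFCC extraction using the librosa library, included only to illustrate the kind of audio feature involved; the file path, parameter choices, and library are assumptions, not the authors' implementation.

    # Illustrative sketch only: frame-level MFCC extraction of the kind used
    # as an audio stream in the paper. Uses librosa; path and parameters are
    # hypothetical, not the authors' code.
    import librosa

    def extract_mfcc(wav_path, n_mfcc=13):
        """Load a waveform and return frame-level MFCCs (num_frames x n_mfcc)."""
        y, sr = librosa.load(wav_path, sr=None)   # keep the file's native sampling rate
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return mfcc.T                             # one row per analysis frame

    # Example usage (file name is hypothetical):
    # feats = extract_mfcc("utterance_001.wav")
    # print(feats.shape)  # (num_frames, 13)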