Audio Visual Fusion Speech Recognition Model Based on Articulatory Feature
This publication appears in: Computer Engineering
Authors: P. Wu, D. Jiang, F. Wang, H. Sahli and W. Verhelst
Volume: 2011
Issue: 22
Pages: 268-269
Publication Date: Nov. 2011
Abstract: A multi-stream Dynamic Bayesian Network (DBN) model based on Articulatory Features (AFs), denoted AF_AV_DBN, is proposed for audio-visual speech recognition. The conditional probability distribution of each node and the degree of asynchrony between the AFs are defined, and speech recognition experiments are carried out on an audio-visual connected-digit database. Compared with the audio-only AF_A_DBN model, the state-synchronous DBN model, and the state-asynchronous DBN model, the proposed AF_AV_DBN model achieves the highest recognition rate under various signal-to-noise ratios and is more robust to background noise.
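
As a rough illustration of what bounding the degree of asynchrony between two streams can mean, the following minimal Python sketch enumerates the joint positions that two streams may occupy under a maximum allowed lag. The function names, the use of the absolute index difference as the asynchrony measure, and the `max_async` parameter are assumptions made for illustration; the abstract does not give the paper's actual definition.

```python
from itertools import product


def async_degree(pos_a: int, pos_b: int) -> int:
    """Hypothetical asynchrony measure: absolute gap between two stream positions."""
    return abs(pos_a - pos_b)


def allowed_joint_states(n_states: int, max_async: int) -> list[tuple[int, int]]:
    """List the (pos_a, pos_b) pairs whose asynchrony stays within the bound.

    n_states  -- number of positions per stream (assumed equal for both streams)
    max_async -- largest permitted gap; 0 forces the streams to move in lockstep
    """
    return [
        (a, b)
        for a, b in product(range(n_states), repeat=2)
        if async_degree(a, b) <= max_async
    ]


# With no asynchrony allowed, only the diagonal pairs remain.
sync_states = allowed_joint_states(3, 0)

# Allowing a gap of one position admits the neighbouring off-diagonal pairs as well.
loose_states = allowed_joint_states(3, 1)
```

Under these assumptions, setting `max_async` to 0 corresponds to a fully synchronous pairing of the streams, while larger values let one stream run ahead of the other by a bounded amount.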