2009 IEEE International Conference on Systems, Man, and Cybernetics
Abstract
Human sounds can be roughly divided into two categories: speech and non-speech. Traditional audio scene analysis research places more emphasis on classifying audio signals into human speech, music, and environmental sounds. We take a different perspective in this paper. We are mainly interested in the analysis of non-speech human sounds, including laughs, screams, sneezes, and snores. Toward this goal, we investigate many commonly used acoustic features and select the useful ones for classification using multivariate adaptive regression splines (MARS) and support vector machines (SVMs). To evaluate the robustness of the selected features, we also perform extensive simulations to observe the effect of noise on classification accuracy. Finally, for the class of snoring sounds, we propose a robust approach to further categorize them into simple snores and snores of subjects with obstructive sleep apnea (OSA).
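
The noise-robustness evaluation mentioned above can be illustrated with a minimal sketch of how a clean signal might be corrupted at a chosen signal-to-noise ratio (SNR) before features are extracted. The abstract does not state which noise types or SNR levels were used, so the additive white Gaussian noise, the function names `add_noise_at_snr` and `measured_snr_db`, and the synthetic 440 Hz test tone are all assumptions made for illustration, not the authors' procedure.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared amplitudes) of a sample sequence."""
    return sum(x * x for x in samples) / len(samples)


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise scaled to the requested SNR (dB).

    Assumption: additive white Gaussian noise. The abstract does not say
    which noise types or SNR levels the authors used.
    """
    rng = rng or random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # SNR_dB = 10 * log10(P_signal / P_noise)
    # => P_noise = P_signal / 10 ** (SNR_dB / 10)
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(residual))


# Hypothetical stand-in for a recorded sound: 1 s of a 440 Hz tone at 8 kHz.
clean = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(8000)]
noisy = add_noise_at_snr(clean, snr_db=10.0)
```

In a robustness study of this kind, the same classifier would typically be scored on features extracted from `noisy` at several SNR levels and compared against its accuracy on the clean recordings.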