Conference_programme: 21.1 - Source localization and acoustic array processing



Lecture: Study of speaker localization in a room with circular azimuth modeling

Author(s): Maymon Yanir, Schymura Christopher, Kolossa Dorothea, Rafaely Boaz

Summary:
Localization of a speaker in a room is an important audio signal processing task, with applications in speech enhancement, source separation, and robot audition. A localization method that is robust to room reverberation has recently been developed. The method is based on spherical microphone arrays and signal processing in the spherical harmonics domain. A direct-path dominance (DPD) test is employed to identify time-frequency components that are dominated by the direct sound from the speaker, and the direction-of-arrival (DOA) is computed from these components. The method has been further developed to improve performance for short speech segments by using classification based on a Gaussian mixture model (GMM) on the estimated DOAs. The GMM-based method shows high robustness to reverberation and noise. However, with a GMM that employs Gaussian distributions over both azimuth and elevation, the circular nature of the azimuth angle is not accounted for in the model. This potential limitation has not been studied comprehensively for this reverberation-robust DOA estimation approach. In this paper, two DOA estimation methods designed for a circular azimuth representation are investigated under reverberation. The first approach directly averages the spherical harmonics vectors that represent the direct-sound component at each time-frequency bin that passed the DPD test. Due to the spherical harmonics representation, the circular nature of the azimuth angle is maintained. The second approach uses the von Mises distribution, which is designed to represent circular variables, to model DOAs over the azimuth angle. A simulation experiment compares these two approaches to the reference GMM-based approach, clearly showing the advantage of the proposed approaches when the speaker azimuth direction is near the discontinuity of the azimuth angle.
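
The azimuth discontinuity mentioned in the summary can be illustrated with a small numerical sketch. This is not the authors' method or data: the azimuth values are hypothetical, and the helper names `circular_mean_deg` and `resultant_length` are introduced only for this illustration. It shows that a linear (Gaussian-style) mean of azimuth estimates scattered around the ±180° wrap-around lands near 0°, whereas averaging unit vectors, which also gives the maximum-likelihood mean direction of a von Mises distribution, stays near 180°.

```python
import numpy as np

def circular_mean_deg(az_deg):
    """Mean direction obtained by averaging unit vectors on the circle.

    This is also the maximum-likelihood estimate of the mean direction
    of a von Mises distribution.
    """
    az = np.deg2rad(np.asarray(az_deg, dtype=float))
    return np.rad2deg(np.arctan2(np.sin(az).mean(), np.cos(az).mean()))

def resultant_length(az_deg):
    """Mean resultant length in [0, 1]; values near 1 mean tightly clustered angles."""
    az = np.deg2rad(np.asarray(az_deg, dtype=float))
    return np.hypot(np.sin(az).mean(), np.cos(az).mean())

# Hypothetical azimuth estimates (degrees) for a speaker located close to
# the +/-180 degree discontinuity. These are illustrative values only.
az_estimates = np.array([178.0, -179.0, 176.0, -177.0, 179.0, -178.0])

linear_mean = az_estimates.mean()            # ignores the wrap-around
circ_mean = circular_mean_deg(az_estimates)  # respects the wrap-around
spread = resultant_length(az_estimates)

print(f"linear mean:      {linear_mean:.2f} deg")
print(f"circular mean:    {circ_mean:.2f} deg")
print(f"resultant length: {spread:.4f}")
```

The linear mean points almost opposite to the true direction, although every individual estimate is within a few degrees of it. A resultant length close to 1 confirms that the estimates are in fact tightly clustered on the circle.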

Corresponding author

Name: Prof Boaz Rafaely

Country: Israel