"ASTRA" An Automatic Speaker Tracking System based on SOSM measures and an Interlaced Indexing

Authors: Sayoud H.; Ouamour-Sayoud S.; Boudraa M.

Publisher: S. Hirzel Verlag

ISSN: 0001-7884

Source: Acta Acustica, Vol. 89, Iss. 4, 2003-07, pp. 702-710

Abstract

This paper proposes a new system called ASTRA ("Automatic Speaker TRAcking") for speaker tracking under noisy conditions. ASTRA is part of a larger project called "ASTRAC" (Automatic Speaker TRAcking system by Camera), which focuses on audiovisual tracking. This paper discusses only the ASTRA part, in which we investigate the speaker tracking problem in a multi-speaker environment. Speaker tracking can be broadly divided into two problems: locating the points of speaker change (segmentation) and identifying the speaker in each segment (labeling). We also introduce in ASTRA a new indexing technique called ISI (Interlaced Speech Indexing), which considerably improves tracking performance. Our approach uses a speaker identification method based on Second Order Statistical Measures (SOSM). Among the SOSM measures, we choose "μGc", which is based on the covariance matrix. However, the experiments show that this method needs a speech length of at least 2 seconds, which means that the segmentation resolution would be 2 seconds. By combining the SOSM with the new indexing technique (ISI), we show that the average segmentation error is reduced to only 0.5 seconds, which is attractive for real-time applications. The results indicate that the SOSM-ISI combination provides both high resolution and high tracking performance: the tracking score (percentage of correctly labeled segments) is 95% on the TIMIT database and 92.4% on the Hub4 database.
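
The abstract does not define μGc or spell out how ISI works, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes that μGc takes a common covariance-only Gaussian likelihood form from the SOSM literature, (1/p)[tr(Γy Γx⁻¹) − log(det Γy / det Γx)] − 1, and that interlacing can be approximated by labeling overlapping 2-second windows shifted by 0.5 seconds and taking a majority vote in each 0.5-second cell. The names `mu_gc` and `interlaced_labels`, the frame rate, and the synthetic "speakers" are all hypothetical.

```python
import numpy as np


def mu_gc(x, y):
    """Assumed covariance-only measure between feature sequences x and y.

    x, y: arrays of shape (n_frames, p). Returns 0 when both covariances match
    and grows as they diverge.
    """
    p = x.shape[1]
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # tr(cov_x^-1 cov_y), computed without forming the inverse explicitly.
    trace_term = np.trace(np.linalg.solve(cov_x, cov_y))
    _, logdet_x = np.linalg.slogdet(cov_x)
    _, logdet_y = np.linalg.slogdet(cov_y)
    return (trace_term - (logdet_y - logdet_x)) / p - 1.0


def interlaced_labels(features, refs, frame_rate, win_s=2.0, hop_s=0.5):
    """Label overlapping windows, then vote per hop-sized cell (assumed scheme)."""
    win = int(round(win_s * frame_rate))
    hop = int(round(hop_s * frame_rate))
    n_cells = len(features) // hop
    votes = [dict() for _ in range(n_cells)]
    for start in range(0, len(features) - win + 1, hop):
        segment = features[start:start + win]
        # Closest reference speaker under the assumed measure.
        best = min(refs, key=lambda name: mu_gc(refs[name], segment))
        # Every cell covered by this window receives one vote.
        for cell in range(start // hop, (start + win) // hop):
            votes[cell][best] = votes[cell].get(best, 0) + 1
    return [max(v, key=v.get) if v else None for v in votes]


# Synthetic demo: two "speakers" that differ only in their covariance.
rng = np.random.default_rng(0)
scale_a = np.array([1.0, 1.0, 1.0, 1.0])
scale_b = np.array([3.0, 0.3, 3.0, 0.3])
refs = {
    "A": rng.standard_normal((1000, 4)) * scale_a,
    "B": rng.standard_normal((1000, 4)) * scale_b,
}
# 6 s of speaker A followed by 6 s of speaker B at 100 frames per second.
stream = np.vstack([
    rng.standard_normal((600, 4)) * scale_a,
    rng.standard_normal((600, 4)) * scale_b,
])
labels = interlaced_labels(stream, refs, frame_rate=100)
print(labels)
```

In this sketch each decision still rests on a full 2-second window, which is the minimum length the abstract reports for the measure, while the label sequence is produced on a 0.5-second grid. The synthetic data only exercises the mechanics and says nothing about the TIMIT or Hub4 results.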