A New Technique for Analyzing Speech by Computer

Authors: Meo A. R.; Righini G.

Publisher: S. Hirzel Verlag

ISSN: 1610-1928

Source: Acta Acustica united with Acustica, Vol. 25, Iss. 5, 1971-11, pp. 261-268


Abstract

A new technique for analyzing speech is introduced and applied to the problem of determining the curve of the fundamental period as a function of time.

The starting point is the evaluation of the following function:

$$\varepsilon(p, N) = \frac{1}{N} \sum_{i=0}^{N-1} \left| x([p + i]T) - x([p + i + N]T) \right|$$

where $x(kT)$ denotes the amplitude of the waveform sample at instant $kT$. This function is calculated for a certain number of values of $p$ and $N$ corresponding to zero-crossings of the signal, so that the set of extracted parameters can be interpreted as an extension of the conventional zero-crossing patterns. The information content of this parameter set can be summarized by introducing a number of symbols or points into a discrete two-dimensional diagram similar to a matrix. Here the row to which a given symbol $\sigma$ belongs is associated with a value of $p$, the abscissa of $\sigma$ denotes the value of $N$ on a suitable scale, and the symbol itself indicates the value of $\varepsilon(p, N)$ in an appropriate code.

Determining the curve of the pitch as a function of time from these data appears to be a typical pattern recognition problem, one that a person inspecting the diagram can solve rather easily but that is very difficult to treat on a computer. The approach adopted in the present case is a "structural" one, in the sense that the analyzed pattern is described in more or less formal terms before classification. The concepts of point, elemental arc, composed arc, and chain arc are introduced for the purposes of this description, and a suitable weight is assigned to each of these elements. At the end of the procedure, the chain of minimum weight is taken as the one representing the pitch curve.

The resolution of the fundamental period measurement is 0.1 ms. As concerns accuracy, the procedure appears to be very promising, because in all 30 cases analyzed in detail the resulting curves agree with the curves drawn by a person after careful inspection of the waveform.
The procedure also leads to a rough classification of the signal that can be useful for segmentation.
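
As an illustration of the function ε(p, N) defined in the abstract, the following minimal Python sketch evaluates it on a synthetic periodic signal. It is not the authors' implementation. The helper names (`epsilon`, `zero_crossings`), the two-harmonic test signal, and the 80-sample period are all assumptions made for this example. The structural arc and chain weighting step is not reproduced. Instead, the sketch simply picks the candidate N with the smallest ε, which is enough to show that ε drops to nearly zero when N equals the fundamental period.

```python
import math

def epsilon(x, p, N):
    """Mean absolute difference between the N-sample window starting at
    sample p and the adjacent N-sample window starting at sample p + N."""
    if N <= 0 or p < 0 or p + 2 * N > len(x):
        raise ValueError("windows fall outside the signal")
    return sum(abs(x[p + i] - x[p + i + N]) for i in range(N)) / N

def zero_crossings(x):
    """Indices k at which the sign of the signal differs from that at k - 1."""
    return [k for k in range(1, len(x)) if (x[k - 1] < 0) != (x[k] < 0)]

# Hypothetical test signal: fundamental plus second harmonic, with a period
# of 80 samples (8 ms if the sampling interval T is assumed to be 0.1 ms).
PERIOD = 80
signal = [
    math.sin(2 * math.pi * k / PERIOD)
    + 0.5 * math.sin(4 * math.pi * k / PERIOD + 1.0)
    for k in range(300)
]

crossings = zero_crossings(signal)
p = crossings[0]  # anchor the analysis at the first zero-crossing

# Candidate values of N are the distances from p to later zero-crossings,
# keeping only those for which both windows fit inside the signal.
candidates = [
    (q - p, epsilon(signal, p, q - p))
    for q in crossings[1:]
    if p + 2 * (q - p) <= len(signal)
]

# Simplification for this sketch: take the candidate with the smallest epsilon.
best_N, best_eps = min(candidates, key=lambda c: c[1])
print(f"p = {p}, estimated period N = {best_N} samples, epsilon = {best_eps:.2e}")
```

Restricting p and N to zero-crossings keeps the number of evaluations small, which is how the abstract relates this parameter set to conventional zero-crossing patterns. On real speech ε is never exactly zero, and multiples of the period also produce low values. Resolving such ambiguities is presumably what the weighted chain search is for.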