Amplitude spectrum correction to improve speech signal classification quality

Authors

  • Stanisław Gmyrek, Wrocław University of Technology, Department of Acoustics, Multimedia and Signal Processing
  • Robert Hossa, Wrocław University of Technology, Department of Acoustics, Multimedia and Signal Processing
  • Ryszard Makowski, Wrocław University of Technology, Department of Acoustics, Multimedia and Signal Processing

Abstract

A speech signal can be described by three key elements: the
excitation signal, the impulse response of the vocal tract, and a
system that models the radiation of speech at the lips. The
semantic content of speech is carried primarily by the
characteristics of the vocal tract. In the parameterisation
coefficients, however, the irregular periodicity of the glottal
excitation produces ripples in the amplitude spectrum, which lead
to notable variations in the values of the feature vectors. This
study proposes a method to mitigate this phenomenon. Inverse
filtering was used to estimate the excitation and the transfer
function of the vocal tract. Using the derived parameterisation
coefficients, statistical models of individual Polish phonemes
were then built as mixtures of Gaussian distributions, and the
impact of the correction on the classification accuracy of Polish
vowels was investigated. The proposed modification of the
parameterisation method met expectations: the scatter of feature
vector values was reduced.
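
The inverse-filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain linear-prediction (LPC) formulation, using the autocorrelation method with the Levinson-Durbin recursion. The abstract does not say which estimator the authors used, so this may differ from their procedure, and it does not reproduce the proposed ripple correction or the Gaussian mixture models. The function names, the toy second-order vocal tract, and the 100-sample pitch period are all hypothetical choices made for illustration.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one signal frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns the inverse-filter coefficients [1, a1, ..., ap] of
    A(z) = 1 + a1 z^-1 + ... + ap z^-p and the prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def inverse_filter(frame, a):
    """Apply A(z) to the frame; the output is the excitation estimate."""
    return [sum(a[k] * frame[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(frame))]


# Toy data (assumption): an impulse train standing in for the glottal
# excitation, passed through a stable two-pole "vocal tract" 1 / A(z)
# with A(z) = 1 - 1.3 z^-1 + 0.8 z^-2.
true_a = [1.0, -1.3, 0.8]
excitation = [1.0 if n % 100 == 0 else 0.0 for n in range(800)]
speech = []
for n, e in enumerate(excitation):
    y = e
    for k in range(1, len(true_a)):
        if n - k >= 0:
            y -= true_a[k] * speech[n - k]
    speech.append(y)

r = autocorrelation(speech, 2)
est_a, err = levinson_durbin(r, 2)          # vocal tract estimate
residual = inverse_filter(speech, est_a)    # excitation estimate
```

Here `est_a` approximates the vocal tract as the all-pole filter 1 / A(z), and `residual` approximates the excitation. Real speech would need windowed short frames, pre-emphasis, and a higher prediction order than this toy case.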

Published

2024-07-18

Section

Acoustics