EMD-based time-frequency analysis methods of audio signals


  • Marcin Lewandowski, Warsaw University of Technology
  • Qizhang Deng, University of New South Wales, Sydney


To ensure that time series data are interpreted appropriately, they should be analyzed with proper signal processing tools. The most common analysis methods are kernel-based transforms, which represent time series data using basis functions and their modifications. This work discusses the analysis of audio data with two of those transforms, the Fourier transform and the wavelet transform, both of which rely on a priori assumptions about the signal's linearity and stationarity. In audio engineering, these assumptions generally do not hold, because the statistical parameters of most audio signals change with time and the signals cannot be treated as the output of a linear time-invariant (LTI) system. That is why recent approaches decompose a signal into different modes in a data-dependent and adaptive way, which may provide advantages over kernel-based transforms. Such tools include empirical mode decomposition (EMD)-based methods and Holo-Hilbert Spectral Analysis. Simulations were performed with a speech signal for both kernel-based and data-dependent decomposition methods; the results indicate that the evaluated decomposition methods are promising approaches to analyzing nonstationary and nonlinear audio data.
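The stationarity assumption mentioned above can be illustrated with a small sketch. It is not the simulation reported in this work: the toy signal, the sampling rate `fs`, the frame length, and the helper names `dft_magnitudes` and `peak_hz` are all assumptions chosen for illustration. The sketch builds a signal whose frequency switches halfway through and compares a single Fourier spectrum of the whole signal with a frame-by-frame (short-time) spectrum.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the DFT bins 0 .. n/2 - 1 of a real-valued frame (direct O(n^2) sum)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def peak_hz(frame, fs):
    """Frequency in Hz of the strongest non-DC DFT bin of the frame."""
    mags = dft_magnitudes(frame)
    k_best = max(range(1, len(mags)), key=mags.__getitem__)
    return k_best * fs / len(frame)


fs = 1000       # assumed sampling rate in Hz
half = 200      # samples per segment (0.2 s each)

# Toy nonstationary signal (assumption): 50 Hz tone followed by a 200 Hz tone.
signal = [math.sin(2 * math.pi * 50 * i / fs) for i in range(half)]
signal += [math.sin(2 * math.pi * 200 * i / fs) for i in range(half)]

# One spectrum of the whole signal: both tones appear with equal strength,
# but nothing indicates when each of them was present.
global_mags = dft_magnitudes(signal)
bin_50 = 50 * len(signal) // fs     # bin index of 50 Hz
bin_200 = 200 * len(signal) // fs   # bin index of 200 Hz
print(global_mags[bin_50], global_mags[bin_200])

# Frame-by-frame spectra: the dominant frequency is tracked over time.
frame_len = 100
peaks = [
    peak_hz(signal[start:start + frame_len], fs)
    for start in range(0, len(signal), frame_len)
]
print(peaks)
```

The frame-based view recovers the time of the switch only because the frame length was picked to suit this toy signal. A fixed kernel and window must be chosen in advance, which is the limitation that data-dependent decompositions such as EMD aim to avoid.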


