Development of Speaker Voice Identification Using Main Tone Boundary Statistics for Applying To Robot-Verbal Systems

Authors

  • Yedilkhan Amirgaliyev, Institute of Information and computing technologies of the Science Committee of RK MES and Al-Farabi Kazakh National University
  • Timur Musabayev, Institute of Information and computing technologies of the Science Committee of RK MES
  • Didar Yedilkhan, Institute of Information and computing technologies of the Science Committee of RK MES, Astana IT University
  • Waldemar Wojcik, Lublin Technical University and Institute of Information and computing technologies of the Science Committee of RK MES
  • Zhazira Amirgaliyeva, Institute of Information and computing technologies of the Science Committee of RK MES

Abstract

This paper presents a basic speaker identification system. It discusses the application of voice interfaces, in particular speaker identification by voice in communication between a robot and a human. An information system for automatic speaker identification by voice, intended for robot-verbal systems, is described. Algorithms and machine learning libraries were reviewed, and ALGLIB was selected as the most appropriate according to the required criteria. The performance of the identification model was assessed for different sets of fundamental voice tone statistics. The percentage of incorrectly classified speaker identification cases was used as the accuracy criterion.
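
As an illustration only, the two quantities named in the abstract can be sketched in Python: a set of statistics over a fundamental tone (F0) contour, and the percentage of incorrectly classified cases. This is not the authors' implementation, which relies on ALGLIB. The function names, the particular statistics chosen, and the convention that unvoiced frames are marked with 0 are all assumptions made for this sketch.

```python
import statistics


def f0_statistics(f0_hz):
    """Summarise an F0 contour (in Hz) as a small set of statistics.

    Assumption: unvoiced frames are marked with 0 and are skipped.
    The choice of statistics here is illustrative, not the paper's.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return {
        "min": min(voiced),
        "max": max(voiced),
        "mean": statistics.fmean(voiced),
        "median": statistics.median(voiced),
        "stdev": statistics.stdev(voiced),  # sample standard deviation
    }


def misclassification_percent(true_ids, predicted_ids):
    """Percentage of cases where the predicted speaker differs from the true one."""
    if len(true_ids) != len(predicted_ids) or not true_ids:
        raise ValueError("label lists must be non-empty and of equal length")
    errors = sum(t != p for t, p in zip(true_ids, predicted_ids))
    return 100.0 * errors / len(true_ids)
```

A classifier would take such statistics as its input features, and the error percentage would then be compared across different feature sets.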

Author Biographies

Yedilkhan Amirgaliyev, Institute of Information and computing technologies of the Science Committee of RK MES and Al-Farabi Kazakh National University

Head of the Laboratory of Artificial Intelligence and Robotics

Timur Musabayev, Institute of Information and computing technologies of the Science Committee of RK MES

Senior Researcher, Artificial Intelligence and Robotics Laboratories

Didar Yedilkhan, Institute of Information and computing technologies of the Science Committee of RK MES, Astana IT University

Senior Researcher, Artificial Intelligence and Robotics Laboratories

Published

2024-04-19

Section

Space technologies, Astronomy