Development of Speaker Voice Identification Using Main Tone Boundary Statistics for Applying To Robot-Verbal Systems

Yedilkhan Amirgaliyev; Timur Musabayev; Didar Yedilkhan; Waldemar Wojcik; Zhazira Amirgaliyeva

Authors

Yedilkhan Amirgaliyev, Institute of Information and computing technologies of the Science Committee of RK MES and Al-Farabi Kazakh National University
Timur Musabayev, Institute of Information and computing technologies of the Science Committee of RK MES
Didar Yedilkhan, Institute of Information and computing technologies of the Science Committee of RK MES, Astana IT University
Waldemar Wojcik, Lublin Technical University and Institute of Information and computing technologies of the Science Committee of RK MES
Zhazira Amirgaliyeva, Institute of Information and computing technologies of the Science Committee of RK MES

Abstract

This paper presents a basic speaker identification system. It discusses the application of voice interfaces, in particular speaker voice identification in communication between a robot and a human. It describes an information system for automatic speaker identification by voice, intended for use in robot-verbal systems. Machine learning algorithms and libraries are reviewed, and ALGLIB is selected as the most appropriate according to the required criteria. The performance of the identification model is assessed for different sets of fundamental voice tone statistics. The accuracy criterion is the percentage of incorrectly classified speaker identification cases.
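The accuracy criterion named in the abstract, the percentage of incorrectly classified cases, can be illustrated with a minimal Python sketch. The function name, the argument names, and the speaker labels are hypothetical and chosen only for illustration; the abstract does not describe the authors' implementation, which relies on ALGLIB.

```python
from typing import Sequence


def misclassification_percent(true_ids: Sequence[str],
                              predicted_ids: Sequence[str]) -> float:
    """Return the share of wrongly identified speakers, in percent.

    Hypothetical helper: both sequences hold one speaker label per
    test utterance, in the same order.
    """
    if len(true_ids) != len(predicted_ids):
        raise ValueError("label sequences must have the same length")
    if len(true_ids) == 0:
        raise ValueError("at least one test case is required")
    # Count the utterances whose predicted speaker differs from the true one.
    errors = sum(1 for t, p in zip(true_ids, predicted_ids) if t != p)
    return 100.0 * errors / len(true_ids)


if __name__ == "__main__":
    # Illustrative labels only; these are not results from the paper.
    truth = ["spk1", "spk2", "spk1", "spk3"]
    guess = ["spk1", "spk2", "spk3", "spk3"]
    print(misclassification_percent(truth, guess))
```

A lower value means better identification; a model that names every speaker correctly scores 0.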

Author Biographies

Yedilkhan Amirgaliyev, Institute of Information and computing technologies of the Science Committee of RK MES and Al-Farabi Kazakh National University

Head of the Laboratory of Artificial Intelligence and Robotics

Timur Musabayev, Institute of Information and computing technologies of the Science Committee of RK MES

Senior Researcher, Artificial Intelligence and Robotics Laboratories

Didar Yedilkhan, Institute of Information and computing technologies of the Science Committee of RK MES, Astana IT University

Senior Researcher, Artificial Intelligence and Robotics Laboratories

Published

2024-04-19

Issue

Vol. 66 No. 3 (2020)

Section

Space technologies, Astronomy

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
