PRINCIPLES OF DIGITAL SPEECH SIGNAL PROCESSING AND ANALYSIS OF THE ACOUSTIC CHARACTERISTICS OF HUMAN AND SYNTHETIC SPEECH

Authors

  • S. U. Nasirov, Tashkent University of Information Technologies named after Muhammad al-Khorezmi

Keywords:

Speech signal processing, Text-to-speech synthesis, Uzbek language speech synthesis, Acoustic features, Assistive technology for visually impaired, Synthetic speech detection

Abstract

This article examines the growing importance of digital speech signal processing in cybersecurity, biometrics, and human-computer interaction, particularly in response to the threat posed by synthetic speech and deepfake voices. It traces the historical development of speech synthesis from early mechanical models (Kratzenstein, von Kempelen) to modern systems, and highlights a critical gap: the lack of a comprehensive text-to-speech (TTS) system for the Uzbek language. The article discusses the technical requirements for building an Uzbek-language speech synthesizer, including phonetic modeling, prosody assignment, and acoustic feature extraction (e.g., MFCCs), and emphasizes its potential to improve digital accessibility for visually impaired users and to enable voice-based interaction in underrepresented languages.
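
As an illustration of the acoustic feature extraction mentioned above, the following is a minimal sketch of one common MFCC pipeline (pre-emphasis, framing, Hamming window, power spectrum, mel filterbank, logarithm, DCT). It is not taken from the article. The function names and all parameter values (16 kHz sampling rate, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients, pre-emphasis 0.97) are assumptions chosen for illustration, and the DCT is left unnormalized.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    # Unnormalized DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    # Assumes the signal is at least frame_len samples long.
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies (coefficient 0.97 is an assumption).
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum, mel filterbank energies, log compression, DCT.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_ceps)

# Usage: one second of a synthetic 440 Hz tone stands in for recorded speech.
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2 * np.pi * 440.0 * t))
```

Each row of `features` describes one frame and each column one cepstral coefficient. A production system would more likely rely on an established audio library than on a hand-written pipeline like this one.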

Published

2026-06-14