This paper proposes a Packet Loss Concealment
(PLC) method for speech codecs based on linear predictive coding, which uses attention mechanisms and Long Short-Term Memory networks to reconstruct the linear predictive coefficients. A novel multiscale trend-aware multi-head self-attention network is designed to capture the long-term global correlations and short-term local dependencies of speech signals across different time scales, enabling effective global and local receptive fields during the reconstruction of lost packets. A new multiscale stack fusion method is introduced to further enhance reconstruction performance. It assigns higher weights to speech frames closer to the lost packets and lower weights to distant ones, enabling effective integration of global and local features across various
time scales. Additionally, a tailored loss function is proposed to guide model training by balancing the numerical precision, structural periodicity, and perceptual fidelity. Objective and subjective evaluations consistently indicate that the proposed method sustains robust performance across varying packet loss rates and speaker variability, underscoring its enhanced
generalization. The consistency of improvements across multiple evaluation metrics demonstrates that these gains are architectural rather than dataset-specific. Notably, the proposed method shows that integrating codec-internal parameters with multiscale temporal modeling provides greater intrinsic robustness than
post-processing PLC methods. Furthermore, the proposed model requires only 0.16 Giga Multiply-Accumulate Operations per second, underscoring its strong potential for high-quality real-time speech communication applications.
Funding
Loughborough University (Grant No. GS1016)
China Scholarship Council (Grant No. 202208060237)
History
School
Loughborough University, London
Published in
IEEE Transactions on Audio, Speech and Language Processing
This accepted manuscript has been made available under the Creative Commons Attribution licence (CC BY) under the IEEE JISC UK green open access agreement.