Document Details

Format: PDF
Pages: 9
File size: to be updated

Content Overview

Compensating Acoustic Mismatch Using Class-Based Histogram Equalization For Robust Speech Recognition

Authors: Youngjoo Suh, Sungtak Kim, and Hoirin Kim

Field: Signal Processing

Document Content:

This research article proposes a class-based histogram equalization (HEQ) method for improving the robustness of speech recognition systems. Its objective is to compensate for the acoustic mismatch between training and test environments, a common problem in real-world applications caused by additive noise and channel distortion. The method targets two limitations of conventional HEQ: the discrepancy between the phonetic distributions of the training and test data, and the nonmonotonic transformation that acoustic mismatch can produce. To address them, it uses multiple class-specific reference and test cumulative distribution functions (CDFs): noisy test features are first assigned to their corresponding classes and then equalized with the CDFs of those classes. To further reduce the effect of additive noise, speech enhancement based on the minimum mean-square error log-spectral amplitude (MMSE-LSA) estimator is applied as a preprocessor before feature extraction. Experiments on the Aurora2 database demonstrate the effectiveness of the approach, showing significant relative error reductions compared with existing methods.
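
The CDF-matching idea behind HEQ can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a single scalar feature, uses rank-based empirical CDFs, and assigns classes with a hypothetical nearest-centroid rule (the summary does not say how the paper classifies features). The names `heq`, `class_based_heq`, `centroids`, and `reference_by_class` are all assumptions made for illustration.

```python
from bisect import bisect_right


def heq(test, reference):
    """Conventional HEQ sketch: map each test value x to the reference
    value at the same empirical-CDF level, y = C_ref^{-1}(C_test(x))."""
    t, r = sorted(test), sorted(reference)
    n_t, n_r = len(t), len(r)
    out = []
    for x in test:
        rank = bisect_right(t, x)            # test samples <= x (1..n_t)
        idx = (rank * n_r + n_t - 1) // n_t - 1  # ceil(rank*n_r/n_t) - 1
        out.append(r[idx])                   # reference quantile lookup
    return out


def class_based_heq(test, reference_by_class, centroids):
    """Class-based HEQ sketch: label each test value, then equalize it
    with the reference and test distributions of its own class only."""
    # Hypothetical classifier: nearest centroid in feature value.
    labels = [min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))
              for x in test]
    out = list(test)
    for k in range(len(centroids)):
        members = [i for i, c in enumerate(labels) if c == k]
        if not members:
            continue  # no test data fell into this class
        equalized = heq([test[i] for i in members], reference_by_class[k])
        for i, v in zip(members, equalized):
            out[i] = v
    return out


if __name__ == "__main__":
    noisy = [0.5, 1.5, 10.5, 11.5]            # shifted test features
    refs = {0: [0.0, 1.0], 1: [10.0, 11.0]}   # per-class training data
    print(class_based_heq(noisy, refs, centroids=[0.0, 10.0]))
```

Because each class is equalized against its own reference distribution, a difference in class proportions between the training and test data does not distort the mapping the way a single global CDF would. In practice HEQ is usually applied to each feature dimension separately; that detail is left out of this sketch.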

Detailed Table of Contents:

  • 1. Introduction
  • 2. Speech Enhancement Based on MMSE-LSA
  • 3. Conventional Histogram Equalization
  • 4. Class-Based Histogram Equalization
  • 5. Experimental Results
  • 6. Conclusion