Chinese is a tonal language in which tones are distinguished by their pitch contours. During pitch contour extraction, the pitch is frequently detected at half or double its true frequency, which makes the contour discontinuous and causes errors in tone recognition. Building on the traditional methods, namely the Auto-Correlation Function (ACF), the Average Magnitude Difference Function (AMDF) and the Correlation Function (CF), this study aims to remedy these deficiencies in pitch contour detection. We propose a modified method that reduces the impact of noise in speech and estimates the fundamental frequency of each extracted signal as precisely as possible. Clustering and a linear regression model are used to correct and smooth pitch values detected at half or double frequency. The test corpus consists of 1331 Chinese words covering tone 1 to tone 4; tone 5 is excluded. In the experiments, the modified method raises the recognition rate by about 20% compared with the traditional methods, and the highest recognition rate achieved is 95.54%. This is also higher than the best recognition rate of 95.03% obtained with Unbroken Pitch Determination Using Dynamic Programming (UPDUDP), although the difference between the two rates is small. Compared with UPDUDP, the modified method reduces the rate of half-frequency or double-frequency pitch errors by about 3%. For tone 1, the modified method has an error rate of only 0.1%, far lower than the 8.28% double-frequency error rate of UPDUDP. These results indicate that the proposed method effectively improves pitch contour detection.
- Tone language
- Pitch contour
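
To make the correction step concrete, the following is a minimal sketch of how half-frequency and double-frequency frames might be folded back onto a contour. It is not the authors' algorithm: the function name `correct_octave_errors` is hypothetical, the median of the log-pitch values is used as a simple stand-in for the clustering step, and a single straight-line fit over the frame index stands in for the linear regression model. It also assumes every frame is voiced (all pitch values positive).

```python
import numpy as np

def correct_octave_errors(f0):
    """Fold half- and double-frequency frames back toward the main pitch track.

    Illustrative sketch only; assumes all values in f0 are positive (voiced frames).
    """
    f0 = np.asarray(f0, dtype=float)
    log_f0 = np.log2(f0)

    # Stand-in for clustering: take the median as the centre of the main cluster.
    centre = np.median(log_f0)

    # Shift each frame by a whole number of octaves so it lies
    # within half an octave of the cluster centre.
    folded = log_f0 - np.round(log_f0 - centre)

    # Linear regression over the frame index models a rising or falling contour.
    t = np.arange(len(f0))
    slope, intercept = np.polyfit(t, folded, 1)
    trend = slope * t + intercept

    # Re-pick, for each original frame, the octave closest to the fitted line.
    corrected = log_f0 - np.round(log_f0 - trend)
    return 2.0 ** corrected

# Example: a slowly falling contour with one half-frequency frame (99 Hz)
# and one double-frequency frame (390 Hz).
contour = [200, 198, 99, 196, 390, 192, 190]
fixed = correct_octave_errors(contour)
```

In this example the 99 Hz frame is doubled to about 198 Hz and the 390 Hz frame is halved to about 195 Hz, while the remaining frames are left unchanged. Working in the log2 domain makes an octave error a shift of exactly ±1, so rounding the deviation from the reference gives the number of octaves to undo.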