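Author (Chinese): 賴百威
Author (English): Pai-Wei Lai
Thesis title (Chinese): 華語捲舌音與非捲舌音之偵測
Thesis title (English): On the Detection of Retroflex and Non-retroflex for Mandarin Chinese
Advisor (Chinese): 張智星
Advisor (English): Jyh-Shing Roger Jang
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Computer Science (資訊工程學系)
Student ID: 944332
Year of publication (ROC calendar): 96
Academic year of graduation: 95
Language: English
Number of pages: 28
Keywords (Chinese): 捲舌音
Keywords (English): retroflex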
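This thesis studies the detection of retroflex and non-retroflex sounds in Mandarin Chinese. The goal is to determine accurately whether a segmented Mandarin consonant segment has the characteristics of a retroflex.
The detection method is similar to speaker verification. First, Gaussian mixture models (GMMs) are used to train a retroflex model and a non-retroflex model; the threshold on the likelihood ratio is then adjusted to obtain the equal error rate (EER). In addition to Mel-frequency cepstral coefficients (MFCC), the thesis also uses spectrum moments and formants as speech features.
The experimental results show that both spectrum moments and formants help retroflex detection; the best equal error rate obtained is 17.69%.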
This thesis addresses the detection of retroflex and non-retroflex consonants in Mandarin Chinese. The objective is to determine whether a syllable initial, obtained by forced alignment, has the characteristics of a retroflex.
The decision rule used in this thesis is similar to that of speaker verification. First, GMM-based retroflex and non-retroflex models are trained. Second, the threshold on the likelihood ratio is adjusted until the equal error rate (EER) is reached. In addition to MFCC, spectrum moments and formants are used as speech features.
The experimental results indicate that spectrum moments and formants improve the performance of retroflex and non-retroflex detection. The best equal error rate obtained in the experiments is 17.69%.
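The decision rule summarized in the abstract (two GMMs, a likelihood-ratio score, and a threshold chosen at the equal error rate) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the diagonal-covariance assumption, the per-frame averaging, and the toy model parameters are all hypothetical choices made for illustration, and the GMM parameters are taken as given rather than trained.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood of a segment under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def llr(frames, retro_gmm, non_retro_gmm):
    """Log-likelihood ratio: positive values favour the retroflex model."""
    return gmm_loglik(frames, retro_gmm) - gmm_loglik(frames, non_retro_gmm)

def equal_error_rate(retro_scores, non_retro_scores):
    """Sweep the threshold over the observed scores and return (eer, threshold)
    at the point where miss rate and false-alarm rate are closest."""
    best = None
    for t in sorted(set(retro_scores) | set(non_retro_scores)):
        miss = sum(s < t for s in retro_scores) / len(retro_scores)
        fa = sum(s >= t for s in non_retro_scores) / len(non_retro_scores)
        gap = abs(miss - fa)
        if best is None or gap < best[0]:
            best = (gap, (miss + fa) / 2, t)
    return best[1], best[2]

# Toy one-dimensional, single-component models (made-up parameters).
retro_gmm = [(1.0, [1.0], [1.0])]
non_retro_gmm = [(1.0, [-1.0], [1.0])]
score = llr([[0.5]], retro_gmm, non_retro_gmm)
eer, threshold = equal_error_rate([2.0, 1.0, -0.5], [-2.0, -1.0, 0.5])
```

A segment is labelled retroflex when its score is at or above the threshold. On real data the miss and false-alarm rates rarely coincide exactly at any observed score, which is why the sketch reports their average at the closest point.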
CONTENTS

ACKNOWLEDGEMENT
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
1.1 Motivation
1.2 Related Work
1.3 Summary of the Thesis
1.4 Organization of the Thesis
CHAPTER 2 TRAINING RETROFLEX AND NON-RETROFLEX MODELS
2.1 Retroflex and Non-retroflex Consonants
2.2 Training Phase Overview
2.3 Feature Extraction
2.3.1 Mel Frequency Cepstral Coefficients (MFCC)
2.3.2 Spectrum Moments
2.3.3 Formants
2.4 Speech Segmentation
2.5 GMM Model Training
CHAPTER 3 RETROFLEX AND NON-RETROFLEX DETECTION
3.1 Testing Phase Overview
3.2 Decision Function
3.3 Performance Measurement
CHAPTER 4 EXPERIMENTAL RESULTS
4.1 Corpus Introduction
4.2 Universal Detection: All Retroflex vs. All Non-retroflex
4.3 Specific Detection: ㄓ, ㄔ, ㄕ vs. ㄗ, ㄘ, ㄙ
CHAPTER 5 CONCLUSIONS AND FUTURE WORK
BIBLIOGRAPHY
 
 
 
 