Phoneme recognition (Spikegram, MFCC, Spectrogram, Melspectrogram)
ML/Speech Recognition · 2020. 1. 21. 16:41
I ran phoneme recognition experiments with several different input features.
https://github.com/HanSeokhyeon/Deep_learning_for_Phoneme_recognition

1. Spikegram
Obstruent - Stops : 0.5738
Obstruent - Affricate : 0.4219
Obstruent - Fricative : 0.7066
Sonorant - Glides : 0.5514
Sonorant - Nasals : 0.5915
Sonorant - Vowels : 0.5305
Others : 0.9224

Obstruent : 0.6576
Sonorant : 0.5412
Others : 0.9224

Non-mute : 0.5749
Mute : 0.9224

Total : 0.6526

2. MFCC
Obstruent - Stops : 0.5097
Obstruent - Affricate : 0.3061
Obstruent - Fricative : 0.6693
Sonorant - Glides : 0.5659
Sonorant - Nasals : 0.6236
Sonorant - Vowels : 0.5338
Others : 0.9196

Obstruent : 0.6106
Sonorant : 0.5499
Others : 0.9196

Non-mute : 0.5677
Mute : 0.9196

Total : 0.6550

3. Spectrogram
Obstruent - Stops : 0.5045
Obstruent - Affricate : 0.3606
Obstruent - Fricative : 0.6698
Sonorant - Glides : 0.5598
Sonorant - Nasals : 0.6039
Sonorant - Vowels : 0.5244
Others : 0.9179

Obstruent : 0.6123
Sonorant : 0.5399
Others : 0.9179

Non-mute : 0.5611
Mute : 0.9179

Total : 0.6496

4. Melspectrogram
Obstruent - Stops : 0.4918
Obstruent - Affricate : 0.3574
Obstruent - Fricative : 0.6723
Sonorant - Glides : 0.5543
Sonorant - Nasals : 0.6009
Sonorant - Vowels : 0.5370
Others : 0.9194

Obstruent : 0.6106
Sonorant : 0.5475
Others : 0.9194

Non-mute : 0.5661
Mute : 0.9194

Total : 0.6537
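
The four result lists are easier to compare side by side. The following is a minimal sketch, not part of the linked repository, that copies the per-class and total scores from this post into a dictionary and picks the highest-scoring feature for each class. The names `RESULTS` and `best_feature_per_class` are hypothetical, and the sketch assumes that a higher score is better.

```python
# Scores copied from the four result lists in this post.
# Column order: Stops, Affricate, Fricative, Glides, Nasals, Vowels, Others, Total.
CLASSES = ["Stops", "Affricate", "Fricative", "Glides",
           "Nasals", "Vowels", "Others", "Total"]

RAW = {
    "Spikegram":      [0.5738, 0.4219, 0.7066, 0.5514, 0.5915, 0.5305, 0.9224, 0.6526],
    "MFCC":           [0.5097, 0.3061, 0.6693, 0.5659, 0.6236, 0.5338, 0.9196, 0.6550],
    "Spectrogram":    [0.5045, 0.3606, 0.6698, 0.5598, 0.6039, 0.5244, 0.9179, 0.6496],
    "Melspectrogram": [0.4918, 0.3574, 0.6723, 0.5543, 0.6009, 0.5370, 0.9194, 0.6537],
}

# Hypothetical helper structure: feature name -> {class name -> score}.
RESULTS = {feature: dict(zip(CLASSES, scores)) for feature, scores in RAW.items()}


def best_feature_per_class(results):
    """Return {class name: feature with the highest score for that class}.

    Assumes higher scores are better (the post does not name the metric).
    """
    classes = next(iter(results.values())).keys()
    return {
        cls: max(results, key=lambda feature: results[feature][cls])
        for cls in classes
    }


if __name__ == "__main__":
    for cls, feature in best_feature_per_class(RESULTS).items():
        print(f"{cls:10s} {feature:15s} {RESULTS[feature][cls]:.4f}")
```

Under that assumption, Spikegram scores highest on all three obstruent classes and on Others, while MFCC has the highest total, although the totals of the four features differ by less than 0.006.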