TY - JOUR
T1 - CASRA+
T2 - A colloquial Arabic speech recognition application
AU - Haraty, Ramzi A.
AU - El Ariss, Omar
PY - 2007
Y1 - 2007
N2 - This research proposes an Arabic speech recognition application that concentrates on the Lebanese dialect. The system starts by sampling the speech, that is, converting the sound from analog to digital, and then extracts features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model; in this case, the stored model is phoneme-based. This reference model differs from direct word template matching, in which the speech features extracted from the input are compared directly to word templates. Each word template in the direct matching model is stored as a vector of feature parameters. Thus, as the vocabulary of the ASR system grows, the memory required for the word templates becomes very large. In contrast, the model used here is phoneme-like template matching: word templates are stored as phoneme-like template parameters, so the memory required for the word templates does not grow as fast as in the direct matching model.
AB - This research proposes an Arabic speech recognition application that concentrates on the Lebanese dialect. The system starts by sampling the speech, that is, converting the sound from analog to digital, and then extracts features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model; in this case, the stored model is phoneme-based. This reference model differs from direct word template matching, in which the speech features extracted from the input are compared directly to word templates. Each word template in the direct matching model is stored as a vector of feature parameters. Thus, as the vocabulary of the ASR system grows, the memory required for the word templates becomes very large. In contrast, the model used here is phoneme-like template matching: word templates are stored as phoneme-like template parameters, so the memory required for the word templates does not grow as fast as in the direct matching model.
UR - http://www.scopus.com/inward/record.url?scp=34447133115&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34447133115&partnerID=8YFLogxK
U2 - 10.3844/ajassp.2007.23.32
DO - 10.3844/ajassp.2007.23.32
M3 - Article
AN - SCOPUS:34447133115
SN - 1546-9239
VL - 4
SP - 23
EP - 32
JO - American Journal of Applied Sciences
JF - American Journal of Applied Sciences
IS - 1
ER -