INDONESIAN TEXT-TO-SPEECH SYSTEM USING DIPHONE CONCATENATIVE SYNTHESIS

Sutarman

Abstract


This paper describes the design and development of an Indonesian diphone database and a text-to-speech (TTS) system that synthesizes speech from segments of recorded voice and saves the result as an audio file such as WAV or MP3. The work has two stages. First, the diphone database is developed: a list of sample words containing the required diphones is compiled, preferring words in which the target diphone occurs in the middle and falling back to the beginning or end only when no such word exists; the sample words are recorded and segmented; and the diphones are created with the tool Diphone Studio 1.3. Second, the system is developed using Microsoft Visual Delphi 6.0, including the conversion of input numbers, acronyms, words, and sentences into diphone representations. The Indonesian TTS system involves two conversion processes: converting the input text into phonemes, and converting the phonemes into speech. The method used in this research is diphone concatenative synthesis, in which recorded sound segments are collected and each segment consists of one diphone (two phonemes). This kind of synthesizer can produce speech with a high level of naturalness. The system can differentiate special phonemes, as in ‘Beda’ and ‘Bedak’, but samples of other such specific words must be added to the system. It can also handle texts containing abbreviations, and provides a facility for adding such words.
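The two conversion steps summarized in the abstract, text to phonemes and phonemes to diphone units, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the digraph table, the function names `to_phonemes` and `to_diphones`, and the `_` silence marker are all assumptions made for illustration, and the sketch does not attempt the ‘Beda’/‘Bedak’ distinction, which would need an exception lexicon.

```python
# Illustrative sketch only: the digraph table and the "_" silence marker are
# assumptions, not the paper's actual rules or diphone inventory.
DIGRAPHS = ("ng", "ny", "kh", "sy")  # letter pairs treated as single phonemes


def to_phonemes(word):
    """Naive letter-to-phoneme split that keeps digraphs together."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            phonemes.append(pair)
            i += 2
        else:
            phonemes.append(word[i])
            i += 1
    return phonemes


def to_diphones(phonemes, silence="_"):
    """Pair each phoneme with its successor, padding with silence at both ends."""
    padded = [silence] + list(phonemes) + [silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


print(to_diphones(to_phonemes("bedak")))
# ['_-b', 'b-e', 'e-d', 'd-a', 'a-k', 'k-_']
```

In a concatenative synthesizer, each label in the resulting list would be looked up in the recorded diphone database and the matching audio segments joined in order.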

 

Keywords: diphone; text to speech; concatenative synthesis.




