DEVELOPMENT OF A MACHINE TRANSLATION SYSTEM FOR ENGLISH, IGBO AND YORÙBÁ LANGUAGES

dc.contributor.author AYOGU, IGNATIUS IKECHUKWU
dc.date.accessioned 2021-08-09T09:46:18Z
dc.date.available 2021-08-09T09:46:18Z
dc.date.issued 2018-07
dc.identifier.uri http://196.220.128.81:8080/xmlui/handle/123456789/4439
dc.description PhD THESIS en_US
dc.description.abstract Machine translation (MT) is the process by which computer software expresses the meaning of text or utterances from one human language in another human language. Though MT research has produced high-quality translation software for English and other privileged languages, research into translation systems for Nigerian languages has not yielded demonstrable practical results. The aim of this research was therefore to develop prototype MT systems for the English-Igbo, English-Yorùbá and Igbo-Yorùbá language pairs using data-driven, scalable, language-independent techniques. To realize the MT objectives, this research also developed part-of-speech tagging systems for the Igbo and Yorùbá languages using algorithms from the Markov random family. A research-size parallel corpus was created for each language pair and used to train and evaluate the respective MT system in a two-stage experimental setup: baseline Phrase-based Statistical Machine Translation (Pb-SMT) systems were first developed, tested and evaluated, and were then improved through linguistic enrichment by incorporating part-of-speech information into the translation model. The baseline Pb-SMT systems were trained on plain, unannotated parallel text and attained BLEU scores of 30.17, 29.64 and 19.02 for the English-Igbo, English-Yorùbá and Igbo-Yorùbá MT systems respectively. Error analyses of these baseline systems revealed that a greater percentage of errors occurred at the lexical, grammatical and semantic levels for each of the three translation systems. To improve upon the baseline translation performance, the Pb-SMT models were enriched with part-of-speech information using the factored Pb-SMT framework in a number of investigative experimental configurations involving direct translation steps and single-step procedures that combine translation with a generation step.
The best BLEU score gains for the factored models using only the direct translation step were 0.45 and 0.40 for the English-Igbo and English-Yorùbá systems respectively, while the best gain for models combining translation and generation steps with multiple decoding paths was 0.56 BLEU points for the English-Igbo system. The best models were evaluated against Google Translate; the results showed en_US
dc.description.sponsorship FEDERAL UNIVERSITY OF TECHNOLOGY AKURE en_US
dc.language.iso en en_US
dc.publisher FEDERAL UNIVERSITY OF TECHNOLOGY AKURE en_US
dc.subject MACHINE TRANSLATION SYSTEM en_US
dc.subject Linguistic communication en_US
dc.title DEVELOPMENT OF A MACHINE TRANSLATION SYSTEM FOR ENGLISH, IGBO AND YORÙBÁ LANGUAGES en_US
dc.type Thesis en_US

