| dc.contributor.author | AYOGU, IGNATIUS IKECHUKWU | |
| dc.date.accessioned | 2021-08-09T09:46:18Z | |
| dc.date.available | 2021-08-09T09:46:18Z | |
| dc.date.issued | 2018-07 | |
| dc.identifier.uri | http://196.220.128.81:8080/xmlui/handle/123456789/4439 | |
| dc.description | PhD THESIS | en_US |
| dc.description.abstract | Machine translation (MT) describes the process by which computer software is used to express the meaning of some text or utterances in a given human language in another human language. Though MT research has developed high-quality translation software systems for the translation of English and other privileged languages, research into the development of translation systems for Nigerian languages has not yielded demonstrable practical results. The aim of this research was therefore to develop prototype MT systems for the English-Igbo, English-Yorùbá and Igbo-Yorùbá language pairs using data-driven, scalable, language-independent techniques. In order to realize the MT objectives, this research also developed part-of-speech tagging systems for the Igbo and Yorùbá languages using algorithms from the Markov random family. A research-size parallel corpus was created for each language pair and used to train and evaluate the respective MT system in a two-stage experimental setup in which the baseline Phrase-based Statistical Machine Translation (Pb-SMT) systems were first developed, experimented with and evaluated before being improved upon through linguistic enrichment by incorporating part-of-speech information into the translation model. The baseline Pb-SMT systems were trained using plain, unannotated parallel text, attaining BLEU scores of 30.17, 29.64 and 19.02 respectively for the English-Igbo, English-Yorùbá and Igbo-Yorùbá MT systems. Error analyses of these baseline systems revealed that a greater percentage of errors occurred at the lexical, grammatical and semantic levels for each of the three translation systems. To improve upon the baseline translation performance, the Pb-SMT models were enriched with part-of-speech information using the factored Pb-SMT framework in a number of investigative experimental configurations involving, respectively, direct translation and single-step procedures that combine translation with a generation step.
The best BLEU score gains for the factored models utilizing only the direct translation step were found to be 0.45 and 0.40 respectively for the English-Igbo and English-Yorùbá systems, while the best gain for models combining translation and generation steps with multiple decoding paths was 0.56 BLEU points for the English-Igbo system. An evaluation of the best models was made against GoogleTranslate; the results showed | en_US |
| dc.description.sponsorship | FEDERAL UNIVERSITY OF TECHNOLOGY AKURE | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | FEDERAL UNIVERSITY OF TECHNOLOGY AKURE | en_US |
| dc.subject | MACHINE TRANSLATION SYSTEM | en_US |
| dc.subject | Linguistic communication | en_US |
| dc.title | DEVELOPMENT OF A MACHINE TRANSLATION SYSTEM FOR ENGLISH, IGBO AND YORÙBÁ LANGUAGES | en_US |
| dc.type | Thesis | en_US |