Abstract:
Head and neck cancers (HNC), otherwise known as craniocervical cancers, are malignancies
in which cells grow out of control in the head and neck region. They account for 5-50% of
all cancers globally and are often associated with facial disfigurement that causes cosmetic
embarrassment due to delayed referral to the appropriate specialists (ENT surgeons). They
are of various types and are therefore characterized as a heterogeneous disease. The
incidence of HNC is increasing owing to several factors. Patients often present late, which
can result in death, especially in Africa, owing to a paucity of specialists. These challenges
prompted the development of a stacked ensemble model for diagnosis of HNC to
facilitate prompt referral. The data used for this work were collected from the ENT/Head
and Neck and pathology departments at University Medical Sciences Teaching Hospital,
Akure; Federal Medical Centre, Owo; and Obafemi Awolowo University Teaching
Hospital, Ile Ife. The dataset consists of 1473 instances with 18 features. Three
filter-based feature selection methods (Consistency, Information Gain and Chi Square) were
used to select the relevant features from the HNC dataset. Three supervised learning
algorithms served as base learners: Decision Tree (C4.5), K-Nearest
Neighbors and Naïve Bayes. The predictions of the base learners were combined and
passed to two meta learners: Multinomial Logistic Regression (MLR) and Logistic Model
Tree (LMT). With stacked MLR, the Consistency, Information Gain and Chi Square
feature selection methods achieved 94.90%, 95.38% and 95.38% respectively. With
stacked LMT, the same three methods achieved 94.77%, 95.11% and 94.91% respectively.
Information Gain and Chi Square combined with stacked MLR therefore produced the
highest results.
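
As an illustration of the filter-based selection step named above, the following is a minimal sketch of how Information Gain can rank categorical features. It is not the authors' implementation: the function names, feature names and toy records are hypothetical, and the sketch assumes every feature is discrete.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names ordered from highest to lowest information gain."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy records, not drawn from the HNC dataset.
labels = ["malignant", "malignant", "benign", "benign"]
columns = {
    "feature_a": ["yes", "yes", "no", "no"],  # separates the two classes exactly
    "feature_b": ["x", "y", "x", "y"],        # independent of the class label
}
ranking = rank_features(columns, labels)
```

In this toy case `feature_a` is ranked ahead of `feature_b`, because splitting on it removes all uncertainty about the class. A filter method of this kind would keep the top-ranked features before the base learners are trained.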