Abstract:
Awareness and use of information technology are growing, and with them the need for protection against network security attacks. Despite the many security features available on networks today, security breaches still occur. Since complete protection of computer networks against all attacks cannot be guaranteed, the key task is to provide a network forensic system, as a complement to the existing security system, that can trace the source of an attack and identify the resources that were compromised when an attack succeeds. Such a system captures and records network packets and events for investigative purposes. In this research, the design and implementation of a support vector machine-based network forensic model is presented. A Support Vector Machine (SVM) performs classification by constructing an N-dimensional hyperplane that optimally separates data into two categories. In this thesis, a network forensic model was built using SVM, with the KDDCUP’99 dataset as a case study. Three experiments (Experiment_1, Experiment_2, and Experiment_3) were carried out. In Experiment_1, the training set consisted of 25,000 records, containing 63 normal traffic records and 24,937 attack traffic records, while the test set consisted of 5,200 records. Using two-class support vector classification, the accuracy of this experiment was found to be 98.35%, but with the penalty of misclassifying all normal traffic, resulting in the unnecessary logging of misclassified traffic. In Experiment_2, the number of normal traffic records was increased by 500 while all other conditions remained the same as in Experiment_1; the classification accuracy was found to be 96.46%, but with the advantage of correctly classifying all normal traffic.
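The Experiment_1 outcome, high overall accuracy with every normal record misclassified, is what a heavily imbalanced test set tends to produce. The following minimal sketch illustrates the effect with a hypothetical classifier that labels everything as an attack. The record counts (20 normal, 980 attack) and the helper names `accuracy` and `per_class_recall` are assumptions chosen for illustration; they are not taken from the thesis or from the KDDCUP’99 dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its records classified correctly."""
    recall = {}
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recall[label] = hits / len(idx)
    return recall


# Hypothetical, illustrative test set (NOT the thesis data):
# 20 normal records and 980 attack records.
y_true = ["normal"] * 20 + ["attack"] * 980

# A degenerate classifier that predicts "attack" for every record.
y_pred = ["attack"] * len(y_true)

acc = accuracy(y_true, y_pred)          # overall accuracy is high
rec = per_class_recall(y_true, y_pred)  # but recall on "normal" is zero

print(f"overall accuracy: {acc:.2%}")
print(f"per-class recall: {rec}")
```

Reporting per-class recall next to overall accuracy exposes this failure mode, which a single accuracy figure hides.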
Experiment_3, on the other hand, used a one-class SVM algorithm; 100% accuracy was obtained for normal traffic and for denial of service (DOS), remote to local (R2L), and user to root (U2Su) attacks, while 73.19% accuracy was obtained for probing attacks. In conclusion, the classification accuracy of SVM observed in this research depends largely on the size of the training set and on the number of records of each traffic class in the training set relative to the test set. The model shows a high degree of accuracy in detecting both known and unknown attacks.
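As an illustration of the one-class idea used in Experiment_3, the sketch below fits scikit-learn's `OneClassSVM` on synthetic "normal" points only and then flags points that lie far from them. It is not the thesis implementation: the two-dimensional synthetic data, the choice to train on normal records only, and the parameters (`kernel="rbf"`, `gamma="scale"`, `nu=0.05`) are all assumptions made for this example, and real KDDCUP’99 records would first need their categorical fields encoded and their numeric fields scaled.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-ins (assumed data, not KDDCUP'99 records).
rng = np.random.default_rng(0)
normal_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # "normal" traffic
far_points = rng.normal(loc=8.0, scale=0.5, size=(20, 2))     # unseen "attacks"

# nu is an upper bound on the fraction of training points treated as outliers.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(normal_train)

# predict() returns +1 for inliers and -1 for outliers.
train_pred = model.predict(normal_train)
far_pred = model.predict(far_points)

inlier_rate = float(np.mean(train_pred == 1))
flagged_rate = float(np.mean(far_pred == -1))

print(f"training points accepted as normal: {inlier_rate:.2%}")
print(f"far-away points flagged as anomalous: {flagged_rate:.2%}")
```

Because the model learns only the region occupied by the training class, it can flag traffic types it has never seen. That is one plausible reading of how a one-class approach detects unknown attacks.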