Web Document Classification Using Naïve Bayes

Adetunji, A.B. and Oguntoye, J.P. and Fenwa, O.D. and Akande, N.O. (2018) Web Document Classification Using Naïve Bayes. Journal of Advances in Mathematics and Computer Science, 29 (6). pp. 1-11. ISSN 2456-9968


The World Wide Web has become a huge collection of documents, and the number of documents available increases daily. Correctly assigning this volume of documents to categories, so that any document of interest can be located easily, is a challenge researchers have worked on for decades, applying different algorithms on different platforms. In this paper, a university web site was used as a case study, and WEKA (Waikato Environment for Knowledge Analysis), a machine learning workbench that provides a general-purpose environment for automatic classification, regression, clustering and feature selection, was used as the machine learning platform. Running Naïve Bayes with 10-fold cross-validation on the selected web data classified 77% of instances correctly, with a reported run time of zero seconds and a relative absolute error of 68.9937%. The authors take this as evidence that the Naïve Bayes algorithm can classify a large volume of web documents accurately in a short time.
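The evaluation procedure named in the abstract, Naïve Bayes scored with 10-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' WEKA setup or data: it assumes a multinomial Naïve Bayes with add-one smoothing over whitespace tokens, and the documents, category names, and the function names train_nb, predict_nb and cross_validate are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (NOT the paper's university web data).
ADMISSIONS = [
    "apply admission form deadline", "admission requirements apply online",
    "undergraduate admission form fees", "postgraduate admission deadline apply",
    "admission screening form online", "apply fees admission requirements",
    "admission list undergraduate apply", "deadline admission form postgraduate",
    "online admission screening apply", "admission fees form list",
]
RESEARCH = [
    "research journal paper publication", "laboratory research grant paper",
    "research seminar journal grant", "conference research paper seminar",
    "research publication laboratory conference", "grant research journal laboratory",
    "research thesis paper conference", "seminar research publication thesis",
    "journal research thesis grant", "research laboratory seminar paper",
]
DOCS = ADMISSIONS + RESEARCH
LABELS = ["admissions"] * len(ADMISSIONS) + ["research"] * len(RESEARCH)


def train_nb(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words unseen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(docs, labels, k=10):
    """k-fold cross-validation; fold f holds out indices f, f+k, f+2k, ..."""
    n = len(docs)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_docs = [docs[i] for i in range(n) if i not in test_idx]
        train_labels = [labels[i] for i in range(n) if i not in test_idx]
        model = train_nb(train_docs, train_labels)
        correct += sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
    return correct / n  # fraction of correctly classified instances


if __name__ == "__main__":
    print(f"Correctly classified: {cross_validate(DOCS, LABELS, k=10):.0%}")
```

The toy categories share no vocabulary, so the sketch separates them perfectly; that figure says nothing about the 77% reported for the real web data, where categories overlap. WEKA's own fold assignment (randomized and stratified) and its Naïve Bayes implementation may differ from the simple interleaved split and multinomial model assumed here.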

Item Type: Article
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA76 Computer software
Date Deposited: 31 Oct 2019 10:43
Last Modified: 31 Oct 2019 10:43
URI: https://eprints.lmu.edu.ng/id/eprint/2709
