Comparative Study of Supervised Machine Learning Algorithms on Thoracic Surgery Patients based on Ranker Feature Algorithms
Keywords: Ranker feature selection, Information gain, Gain ratio, Supervised machine learning algorithms, Thoracic surgery, Cross-validation.
The thoracic surgery dataset contains information gathered on patients suffering from lung cancer. Various machine learning techniques have been employed to predict the post-operative life expectancy of lung cancer patients. In this study, we used well-known and influential supervised machine learning algorithms, namely J48, Naïve Bayes (NB), Multilayer Perceptron (MLP), and Random Forest (RF). Two ranker feature selection methods, information gain and gain ratio, were then applied to the thoracic surgery dataset to examine their effect on the machine learning classifiers. The dataset, which originates from Wroclaw University, was obtained from the UCI Machine Learning Repository. We conducted two experiments to compare the performance of the supervised classifiers on the dataset with and without ranker feature selection. With ranker feature selection, the accuracy of J48, NB, and MLP improved, whereas the accuracy of RF decreased and that of the support vector machine (SVM) remained stable.
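The two ranker criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the column names, and the four-row toy table are all hypothetical and are not taken from the Thoracic Surgery dataset. The sketch assumes discrete features, scores each one by information gain (the reduction in class entropy after splitting on the feature) and by gain ratio (information gain divided by the entropy of the feature itself), and sorts the features by score.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(feature, labels):
    """Class entropy minus the weighted class entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature (split information)."""
    split_info = entropy(feature)
    if split_info == 0:  # a constant feature carries no information
        return 0.0
    return information_gain(feature, labels) / split_info


def rank_features(columns, labels, score):
    """Return (name, score) pairs sorted from most to least informative."""
    scored = [(name, score(values, labels)) for name, values in columns.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, invented purely for illustration.
toy_labels = ["T", "T", "F", "F"]
toy_columns = {
    "symptom": ["y", "y", "n", "n"],     # perfectly separates the classes
    "unrelated": ["y", "n", "y", "n"],   # independent of the class
    "patient_id": ["1", "2", "3", "4"],  # unique per row
}

print(rank_features(toy_columns, toy_labels, information_gain))
print(rank_features(toy_columns, toy_labels, gain_ratio))
```

On this toy table the unique-per-row column ties for the highest information gain, while gain ratio ranks it below the feature that truly separates the classes. This is the usual motivation for considering both criteria. In an experiment like the one described, the top-ranked features would then be kept and the classifiers re-evaluated under cross-validation.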
Copyright (c) 2021 Hezha M.Tareq Abdulhadi, Hardi Sabah Talabani
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.