Prediction of Lung Cancer Disease Using Machine Learning Techniques

Authors

  • Rukhsar Hatam Qadir, Department of Statistics and Informatics, College of Administration and Economics, University of Sulaimani, Sulaimani, Kurdistan Region, Iraq
  • Karwan Mohammed HamaKarim, Department of Information Technology, College of Science and Technology, University of Human Development, Sulaimani, Kurdistan Region, Iraq

DOI:

https://doi.org/10.21928/uhdjst.v8n2y2024.pp75-83

Keywords:

Machine Learning, Classifiers, Data Mining Techniques, Data Analysis, Learning Algorithms, Supervised Machine Learning

Abstract

Supervised machine learning (SML) is the search for algorithms that reason from externally supplied examples to form general hypotheses, which are then used to predict new instances. Supervised classification is one of the tasks most frequently performed by intelligent systems. The goal of this work is to evaluate supervised learning algorithms, explain SML classification methodologies, and identify the most effective classification algorithm for the available data. Two machine learning (ML) techniques were examined: Random Forest (RF) and Neural Networks (NN). The algorithms were implemented in Python. The classification used 310 cases from a lung cancer data set, with 15 features serving as independent variables and one as the dependent variable. The results show that RF achieved higher precision and accuracy than the NN classifier. The study finds that the kappa statistic and mean square error (MSE) are factors on the one hand, while the time required to build a model and precision (accuracy) are factors on the other. Consequently, supervised predictive ML algorithms need to be precise and accurate and to have minimal error. In this analysis, the NN classifier reached an accuracy of 0.75 with an MSE of 0.25, and the RF classifier reached an accuracy of 0.89 with an MSE of 0.21.
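The evaluation measures named in the abstract (accuracy, MSE, and the kappa statistic) can be illustrated with a minimal sketch in plain Python. The label vectors below are made up for illustration and are not taken from the study's lung cancer data set; the function names are likewise hypothetical and are not the authors' code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between numeric labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement beyond chance, (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    # Agreement expected by chance, from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one label ever occurs
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels: 1 = lung cancer, 0 = no lung cancer (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)

print(f"accuracy = {acc:.2f}")   # accuracy = 0.75
print(f"MSE      = {mse:.2f}")   # MSE      = 0.25
print(f"kappa    = {kappa:.2f}") # kappa    = 0.50
```

When predictions are hard 0/1 labels, each error contributes exactly 1 to the squared-error sum, so MSE equals 1 minus accuracy. The NN figures in the abstract (0.75 and 0.25) are consistent with that identity. The RF figures (0.89 and 0.21) are not, so the RF MSE was presumably computed in some other way, for example on predicted probabilities; the abstract does not say.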

Published

2024-11-17

How to Cite

Qadir, R. H., & HamaKarim, K. M. (2024). Prediction of Lung Cancer Disease Using Machine Learning Techniques. UHD Journal of Science and Technology, 8(2), 75–83. https://doi.org/10.21928/uhdjst.v8n2y2024.pp75-83

Section

Articles