A Study on Test Variable Selection and Balanced Data for Cervical Cancer Disease



Author(s)

Kemal Akyol 1,*

1. Department of Computer Engineering, Kastamonu University, Kastamonu, 37100, Turkey

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2018.05.01

Received: 1 Mar. 2018 / Revised: 17 Apr. 2018 / Accepted: 24 May 2018 / Published: 8 Sep. 2018

Index Terms

Cervical cancer, the importance of test variables, random over-sampling, random under-sampling, stability selection, random forest

Abstract

Cancer is a deadly disease. Cervical cancer, one of the most important types of cancer, is a malignant tumor that threatens women's lives. In this study, the importance of test variables for cervical cancer is investigated using the Stability Selection method. Random Under-Sampling and Random Over-Sampling are also applied to the dataset to balance its classes. The learning model is built with the Random Forest algorithm. The experimental results show that the model based on Stability Selection, Random Over-Sampling and Random Forest is the more successful configuration, with approximately 98% accuracy.
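
As a rough illustration of the two balancing methods named in the abstract, the sketch below shows one common way to do random over-sampling and random under-sampling in plain Python. It is not the author's implementation: the function names, the seed parameter, and the toy data (8 negative rows, 2 positive rows) are assumptions made only for this example, and the paper's actual dataset and settings are not reproduced here.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect the rows of each class label into its own list."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_over_sample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_under_sample(rows, labels, seed=0):
    """Keep a random subset of each class so that every class has as
    many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw without replacement, discarding the surplus rows.
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: one feature per row, 8 negatives and 2 positives.
toy_rows = [[i] for i in range(10)]
toy_labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_over_sample(toy_rows, toy_labels)
under_rows, under_labels = random_under_sample(toy_rows, toy_labels)

print(Counter(over_labels))   # both classes now have 8 rows
print(Counter(under_labels))  # both classes now have 2 rows
```

Over-sampling keeps every original row and adds duplicates of the minority class, whereas under-sampling discards majority-class rows. A balanced set produced this way would then be passed to a classifier such as Random Forest.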

Cite This Paper

Kemal Akyol, "A Study on Test Variable Selection and Balanced Data for Cervical Cancer Disease", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.10, No.5, pp.1-7, 2018. DOI: 10.5815/ijieeb.2018.05.01
