An Evolutionary Model for Selecting Relevant Textual Features

Full Text (PDF, 835KB), pp. 43-50

Author(s)

Taher Zaki 1,*, Mohamed Salim EL Bazzi 1, Driss Mammass 1

1. IRF-SIC Laboratory, Faculty of Science, Ibn Zohr University, Agadir, Morocco

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2018.11.06

Received: 20 Aug. 2018 / Revised: 20 Sep. 2018 / Accepted: 17 Oct. 2018 / Published: 8 Nov. 2018

Index Terms

Arabic text, classification, natural selection, semantic vicinity, textual data, keyword extraction

Abstract

From a philosophical point of view, the words of a text or a speech do not serve purely informational purposes: they act and react, and they have the power to act on their counterparts. Each word evokes similar or different senses that can influence and interact with the words that follow; in this sense, it has a vibratory property. It is not the words themselves that have the impact, but the semantic reaction behind them. In this context, we propose a new approach to textual data classification that imitates human altruistic behavior in order to show the semantic altruistic stakes of natural-language words through statistical, semantic and distributional analysis. We present the results of a word extraction method that combines a distributional proximity index, a selection coefficient and a co-occurrence index with respect to the neighborhood.

Cite This Paper

Taher Zaki, Mohamed Salim EL Bazzi, Driss Mammass, "An Evolutionary Model for Selecting Relevant Textual Features", International Journal of Modern Education and Computer Science (IJMECS), Vol. 10, No. 11, pp. 43-50, 2018. DOI: 10.5815/ijmecs.2018.11.06
