Semantic Indexing of Web Documents Based on Domain Ontology

Pages: 1-11

Author(s)

Abdeslem DENNAI 1,*, Sidi Mohammed BENSLIMANE 1

1. EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbes, Algeria

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2015.02.01

Received: 5 Apr. 2014 / Revised: 21 Jul. 2014 / Accepted: 9 Oct. 2014 / Published: 8 Jan. 2015

Index Terms

Reverse Engineering, Ontology, Semantic Distance, Semantic Indexing, Semantic Web

Abstract

The first phase of reverse engineering web-oriented applications is the extraction of concepts that are hidden in HTML pages (in tables, lists and forms) or marked up in XML documents. In this paper, we present an approach to semantically index these two sources of information (HTML pages and XML documents). It uses a domain ontology to validate the extracted concepts, and a similarity measure between ontology concepts to enrich the index. The approach is designed in three steps (modeling, attaching and enrichment) and is then implemented and illustrated with examples. The results obtained support better re-engineering of web applications and, in turn, a marked improvement in web structuring.
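The pipeline outlined in the abstract (extract candidate terms from HTML, validate them against a domain ontology, enrich the index with similar concepts) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy ontology `PARENT`, the helper names, the choice of `th`, `li` and `label` as the tags to scan, the 0.6 threshold, and the use of Wu-Palmer similarity as the similarity measure (the abstract does not name one).

```python
from html.parser import HTMLParser

# Hypothetical toy ontology: concept -> parent concept (single-rooted tree).
PARENT = {
    "Thing": None,
    "Person": "Thing",
    "Product": "Thing",
    "Customer": "Person",
    "Employee": "Person",
    "Book": "Product",
}


class TermExtractor(HTMLParser):
    """Collect text found inside table headers, list items and form labels."""

    TAGS = {"th", "li", "label"}  # assumed set of tags worth scanning

    def __init__(self):
        super().__init__()
        self._open = 0      # how many tracked tags are currently open
        self.terms = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            self._open += 1

    def handle_endtag(self, tag):
        if tag in self.TAGS and self._open:
            self._open -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._open and text:
            self.terms.append(text)


def path_to_root(concept):
    """Return the chain [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # First ancestor of b (walking upward) that also subsumes a.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def build_index(html, threshold=0.6):
    """Extract terms, keep those found in the ontology, add similar concepts."""
    parser = TermExtractor()
    parser.feed(html)
    validated = [t for t in parser.terms if t in PARENT]  # ontology validation
    index = {}
    for concept in validated:
        related = [c for c in PARENT
                   if c != concept and wu_palmer(concept, c) >= threshold]
        index[concept] = sorted(related)                  # enrichment
    return index


page = ("<table><tr><th>Customer</th><th>Total</th></tr></table>"
        "<ul><li>Book</li></ul>")
index = build_index(page)
```

With this toy data, `Total` is discarded because it is not an ontology concept, `Customer` is enriched with `Employee` and `Person`, and `Book` is enriched with `Product`. A real system would need term normalization (case, plurals, synonyms) before the ontology lookup, which this sketch leaves out.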

Cite This Paper

Abdeslem DENNAI, Sidi Mohammed BENSLIMANE, "Semantic Indexing of Web Documents Based on Domain Ontology", International Journal of Information Technology and Computer Science (IJITCS), vol. 7, no. 2, pp. 1-11, 2015. DOI: 10.5815/ijitcs.2015.02.01
