Donia Gamal

Workplace: Computer Science Department, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt

E-mail: donia.gamaleldin@cis.asu.edu.eg

Research Interests: Computer systems and computational processes, Artificial Intelligence, Computational Learning Theory, Data Structures and Algorithms

Biography

Donia Gamal is a teaching and research assistant in the Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt. She received the B.Sc. degree (very good with honors) in 2013 and the M.Sc. degree in 2019. Her research interests include Sentiment Analysis, Machine Learning, and Artificial Intelligence.

Author Articles
Introducing Arabic-SQuADv2.0 for Effective Arabic Machine Reading Comprehension

By Zeyad Ahmed, Mariam Zeyada, Youssef Amin, Donia Gamal, Hanan Hindy

DOI: https://doi.org/10.5815/ijeme.2023.05.03, Pub. Date: 8 Oct. 2023

Machine Reading Comprehension (MRC), the ability of computers to read and understand unstructured text and then answer questions about it, is still an open research field. MRC is considered one of the most research-demanding sub-tasks in Natural Language Processing (NLP) and Natural Language Understanding (NLU), and it introduces multiple research challenges. One challenge is that models should be trained to answer questions and to abstain from answering when the answer is not covered in the given context. Another lies in dataset availability. These challenges are amplified for non-Latin-based languages such as Arabic. Currently available Arabic MRC datasets are either small, high-quality collections or large, low-quality ones, and they do not include unanswerable questions. This lack of resources leaves models unfit for real-world deployment. To tackle these challenges, this paper proposes a novel large, high-quality Arabic MRC dataset that includes unanswerable questions, named “Arabic-SQuAD v2.0”. The dataset consists of 96,051 {question, context, answer} triplets, in an attempt to help enrich the field of Arabic MRC. Furthermore, a Machine Learning (ML)-based model is introduced that is capable of effectively solving Arabic MRC with unanswerable questions. The results of the proposed model are satisfactory and comparable with those of Latin-based language models, and they show a significant improvement over the current state of the art in Arabic MRC. To be exact, the model achieves an F1-score of 71.49 and an Exact Match (EM) score of 65.12. The proposed dataset and implementation pave the way for further Arabic MRC research, aiming to reach a state in which MRC models can mimic human text reasoning.

Twitter Benchmark Dataset for Arabic Sentiment Analysis

By Donia Gamal, Marco Alfonse, El-Sayed M. El-Horbaty, Abdel-Badeeh M. Salem

DOI: https://doi.org/10.5815/ijmecs.2019.01.04, Pub. Date: 8 Jan. 2019

Sentiment classification is one of the fastest-growing research areas in sentiment analysis and text mining, especially given the massive amount of opinions available on social media. Recent results and efforts have demonstrated that no single strategy achieves the best prediction performance across different datasets. Research on Arabic sentiment analysis is scarce compared with English sentiment analysis because of the unique nature and difficulty of the Arabic language, which has led to a shortage of Arabic datasets for sentiment analysis. This paper proposes an Arabic benchmark dataset for sentiment analysis and describes the methodology used to gather the most recent tweets in different Arabic dialects. The dataset includes more than 151,000 opinions in various Arabic dialects, labeled into two balanced classes, namely positive and negative. Different machine learning algorithms are applied to this dataset, including ridge regression, which gives the highest accuracy of 99.90%.
