In this study, we created a new Arabic dataset annotated according to Ekman’s basic emotions (Anger, Disgust, Fear, Happiness, Sadness and Surprise). The dataset is composed of Facebook posts written in the Iraqi dialect. We evaluated its quality using four external judges, which resulted in an average inter-annotator agreement of 0.751. We then explored six supervised machine learning methods to test the new dataset: the standard Weka classifiers ZeroR, J48, Naïve Bayes, Multinomial Naïve Bayes for Text, and SMO, plus a compression-based classifier called PPM that is not included in Weka. Our study reveals that the PPM classifier significantly outperforms the other classifiers, such as SVM and Naïve Bayes, achieving the highest accuracy, precision, recall, and F-measure.
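The abstract reports an average agreement score across four judges but does not say which statistic was used. As a minimal sketch, assuming the score is a mean of pairwise Cohen's kappa values (an assumption, not something the abstract confirms), the computation could look like this. The function names and the sample labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each annotator's marginal label proportions.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(annotations):
    """Average Cohen's kappa over every pair of annotators (assumed aggregation)."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Made-up labels for four posts from two annotators, for illustration only.
judge_1 = ["anger", "fear", "anger", "sadness"]
judge_2 = ["anger", "fear", "sadness", "sadness"]
kappa = cohen_kappa(judge_1, judge_2)
```

Other statistics, such as Fleiss' kappa or Krippendorff's alpha, handle more than two annotators directly and may be closer to what the authors computed.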
Text categorization refers to the process of grouping text or documents into classes or categories according to their content. The text categorization process consists of three phases: preprocessing, feature extraction, and classification. In comparison with English, only a few studies have addressed the categorization and classification of Arabic text. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because the Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and comparison of studies from the last five years based on the dataset, year, algorithms and the accuracy th
The dependable and efficient identification of Qin seal script characters is pivotal in the discovery, preservation, and inheritance of the distinctive cultural values embodied by these artifacts. This paper presents a character recognition model, based on histogram of oriented gradients (HOG) image features and an SVM, for identifying partial and blurred Qin seal script characters. The model achieves accurate recognition on a small, imbalanced dataset. First, a dataset of Qin seal script image samples is established, and Gaussian filtering is employed to remove image noise. Subsequently, a gamma transformation adjusts the image brightness and enhances the contrast between font structures and image backgrounds. After a s
The study aims to examine the reality of preparing teachers of Arabic for non-native speakers by presenting the experience of the Arabic Language Institute at the International University of Africa. It addresses the following questions: How can long scientific experience in preparing proposals and experiments be invested to qualify teachers of Arabic for non-native speakers? What is the reality of preparing an Arabic language teacher at the Institute? How did the Arabic Language Institute approach teacher preparation? What are the problems facing the preparation of Arabic language teachers, and what are the most important training mechanisms used in the Institute? What problems faced the implementation of the
Cryptography masks text using an encryption method so that only the authorized user can decrypt and read the message. An intruder may attack the communication channel in many ways, such as impersonation, repudiation, denial of service, modification of data, threats to confidentiality, and disruption of service availability. The high volume of electronic communication between people makes it necessary to ensure that transactions remain confidential. Cryptographic methods offer the best solution to this problem. This paper proposes a new cryptography method based on Arabic words; the method consists of two steps. The first step is binary encoding generation used t
The purpose of this paper is to discriminate between the poems of each poet based on the characteristics and attributes of the Arabic letters. Four categories are used for the Arabic letters; letter frequencies are entered into a multidimensional contingency table in which each dimension has two or more levels, and the contingency coefficient is then calculated.
The paper sample consists of six poets from different historical ages, with five poems per poet. The method was programmed in MATLAB; the efficiency of the proposed method is 53% for the whole sample, and between 90% and 95% for each poet's poems.
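The contingency coefficient mentioned in this abstract is commonly taken to be Pearson's coefficient, C = sqrt(chi2 / (chi2 + n)), where chi2 is the chi-square statistic of the table and n is the total count. The sketch below assumes that definition and a simple two-way table (for example, poems by letter category), whereas the paper describes a multidimensional table and a MATLAB implementation. The function name and the counts are hypothetical.

```python
import math


def contingency_coefficient(table):
    """Pearson's contingency coefficient for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("table must contain at least one observation")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (chi2 + n))


# Made-up letter-category counts for two poems, for illustration only.
counts = [
    [10, 0],
    [0, 10],
]
c = contingency_coefficient(counts)
```

A value near 0 indicates that the row and column variables are close to independent; the upper bound of C is below 1 and depends on the table dimensions, so values from tables of different sizes are not directly comparable.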
Background: Diabetic nephropathy (DN) is rapidly becoming the leading cause of end-stage renal disease (ESRD). The onset and course of DN can be ameliorated to a very significant degree if intervention is instituted very early in the development of this complication.
Objective: The aim of this study was to characterize risk factors associated with nephropathy in type I diabetes and to construct a model for early prediction of diabetic nephropathy (DN) by analyzing these risk factors.
Methods: A case-control design of 400 patients with type I diabetes mellitus (IDDM), aged 19-45 years. The cases were 200 diabetic patients with overt proteinuria, while the controls were 200 diabetic patients with no proteinuria or micr
Tight gas is one of the main types of unconventional gas. Typically, tight gas reservoirs are highly heterogeneous and have low permeability. The economic evaluation of production from tight gas reservoirs is very challenging because of the prevailing uncertainties associated with key reservoir properties, such as porosity, permeability, and drainage boundary. One important parameter required in this economic evaluation is the equivalent drainage area of the well, which relates to the actual volume of fluids (e.g., gas) produced or withdrawn from the reservoir at a given moment and changes with time. It is difficult to predict this equival