Text categorization is the process of grouping texts or documents into classes or categories according to their content. The process consists of three phases: preprocessing, feature extraction, and classification. Compared with English, few studies have addressed the categorization and classification of Arabic text. For applications such as text classification and clustering, representing Arabic text is difficult because the language is known for its richness, diversity, and complex morphology. This paper presents a comprehensive analysis and comparison of studies from the last five years, based on the dataset, year, algorithms, and reported accuracy. Deep Learning (DL) and Machine Learning (ML) models were used to enhance text classification for Arabic. The paper concludes with remarks for future work.
ST Alawi, NA Mustafa, Al-Mustansiriyah Journal of Science, 2013
Steganography hides information within other information, such as a file, message, picture, or video. Cryptography is the science of converting information from a readable form into a form that is unreadable to unauthorized persons. The main problem in a steganographic system is embedding data in the cover data without providing information that would facilitate its removal. This research proposes a method for embedding data into images that employs least significant bit (LSB) steganography and encryption (the RSA algorithm) to protect the data. This combination of steganography and cryptography enhances system security.
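The combination described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the payload is encrypted first and then embedded, that the cover image is available as a flat sequence of 8-bit pixel values, and that bits are written most significant bit first. The function names and the toy RSA key (built from the primes 61 and 53) are hypothetical and far too small for real use.

```python
# Toy RSA key (assumption, for illustration only): p = 61, q = 53
N, E, D = 3233, 17, 2753


def rsa_encrypt(message: bytes) -> bytes:
    """Encrypt each byte separately; every ciphertext value fits in 2 bytes."""
    return b"".join(pow(m, E, N).to_bytes(2, "big") for m in message)


def rsa_decrypt(cipher: bytes) -> bytes:
    """Invert rsa_encrypt by decrypting each 2-byte block."""
    return bytes(
        pow(int.from_bytes(cipher[i:i + 2], "big"), D, N)
        for i in range(0, len(cipher), 2)
    )


def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Write the payload bits into the least significant bit of each pixel."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image is too small for the payload")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it
    return bytes(stego)


def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back from the least significant bits of the pixels."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for pixel in pixels[8 * i: 8 * i + 8]:
            value = (value << 1) | (pixel & 1)
        out.append(value)
    return bytes(out)


# Usage: encrypt, embed, then recover the message from the stego pixels.
cover = bytes(range(256)) * 2          # stand-in for real pixel data
secret = b"hi"
cipher = rsa_encrypt(secret)
stego = embed_lsb(cover, cipher)
recovered = rsa_decrypt(extract_lsb(stego, len(cipher)))
```

Because only the lowest bit of each pixel changes, every stego pixel differs from its cover pixel by at most 1, and anyone who extracts the bits still needs the private exponent to read the message.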
One of the main difficulties facing documentary archiving systems for certified documents is checking the stamps, because stamps may contain a complex background and be surrounded by unwanted data. Therefore, the main objective of this paper is to isolate the background and remove the noise that may surround the stamp. Our proposed method comprises four phases. First, we apply the k-means algorithm to cluster the stamp image into a number of clusters and merge them using the ISODATA algorithm. Second, we compute the mean and standard deviation of each remaining cluster to isolate the background cluster from the stamp cluster. Third, a region growing algorithm is applied to segment the image and then choosing the connected regi
Machine learning methods, one of the most important and promising branches of artificial intelligence, are of great importance in all sciences, such as engineering and medicine, and have recently become widely used in the statistical sciences and their various branches, including survival analysis, where they can be considered a new approach to estimating survival alongside the parametric, nonparametric, and semi-parametric methods widely used in statistical research. In this paper, survival estimation based on medical images of patients with breast cancer receiving treatment in Iraqi hospitals is discussed. Three algorithms for feature extraction were explained: The first principal compone
In this paper, we investigate the automatic recognition of emotion in text. We perform experiments with a new method of classification based on the PPM character-based text compression scheme. These experiments involve both coarse-grained classification (whether a text is emotional or not) and fine-grained classification, such as recognising Ekman’s six basic emotions (Anger, Disgust, Fear, Happiness, Sadness, Surprise). Experimental results with three datasets show that the new method significantly outperforms the traditional word-based text classification methods. The results show that the PPM compression-based classification method is able to distinguish between emotional and nonemotional text with high accuracy, between texts invo
Anomaly detection is still a difficult task. To address this problem, we propose to strengthen the DBSCAN algorithm by converting all data to the graph concept frame (CFG). As is well known, DBSCAN groups data points of the same kind into a cluster, while points that fall outside a cluster are treated as noise or anomalies. DBSCAN can therefore detect abnormal points that lie beyond a set threshold (extreme values). However, not all anomalies are of this kind, that is, unusual or far from a specific group; there is also a type of data that does not occur repeatedly but is considered abnormal relative to the known group. The analysis showed DBSCAN using the
This text is one of the confiscated cuneiform texts held by the Iraqi Museum. It bears the museum number (235869) and measures (12.7 x 6 x 2.5 cm). It records incoming quantities of barley. The text is dated to the Ur III period (2112-2004 BC) and belongs to the third year of the reign of King Ibbi-Sin (2028-2004 BC). The principal figure in this text is Ba-a-ga, the cattle fattener, from the city of Iri-Sagrig. It is compared with the published cuneiform texts belonging to his archive, which number (196) texts and cover his activities