Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically categorizes documents into meaningful groups. Clustering is an important task in data mining and machine learning. Its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words using term frequency-inverse document frequency (TFIDF). This representation ignores the relationships between words and their meanings in the document. As a result, the sparsity and semantic problems that are prevalent in textual documents remain unresolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify the sentence structure. The semantics of words are then modeled using a dependency graph. The resulting dependency graph is then used in cluster analysis with the K-means clustering technique. The dependency graph-based clustering results were compared with popular text representation methods, i.e. TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation. The findings indicate that the proposed text representation method leads to more accurate document clustering results.
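The limitation of the bag-of-words TFIDF representation described in the abstract above can be illustrated with a minimal sketch. It is not the study's implementation: the three toy sentences, the helper names `tfidf_vectors` and `cosine`, and the unsmoothed weighting tf × log(N/df) are all assumptions chosen for illustration. The sketch shows that two sentences with opposite meanings but identical words receive identical vectors, while an unrelated sentence shares no terms at all.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Hypothetical helper: whitespace tokenization and unsmoothed IDF are
    simplifying assumptions, not the weighting used in the study.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (illustrative only, not taken from the 20 Newsgroups dataset).
docs = [
    "the dog bit the man",
    "the man bit the dog",    # same words, opposite meaning
    "stocks fell on friday",  # no shared vocabulary
]
vecs = tfidf_vectors(docs)

# Word order and syntactic roles are lost: the first two vectors coincide.
print(vecs[0] == vecs[1])
# Sparsity: documents without shared terms have zero similarity.
print(cosine(vecs[0], vecs[2]))
```

A representation that records who did what to whom, such as a dependency graph, would be able to tell the first two sentences apart, which is the motivation the abstract gives for moving beyond TFIDF.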
This research aims to analyze and simulate real biochemical test data to uncover the relationships among the tests and how each of them affects the others. The data were acquired from a private Iraqi biochemical laboratory. However, these data have many dimensions, a high rate of null values, and a large number of patients. Several experiments were applied to these data, beginning with unsupervised techniques such as hierarchical clustering and k-means, but the results were not clear. A preprocessing step was then performed to make the dataset analyzable by supervised techniques such as Linear Discriminant Analysis (LDA), Classification And Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bayes (NB
This study aimed to reveal the degree to which standards for word problems are met in the mathematics books for the first three grades of the basic stage in Palestine. For this purpose, the researcher prepared an analysis tool and a list of criteria covering two areas: linguistic formulation and mathematical content. Each area had seven items. The results showed that the third-grade mathematics book had the highest degree of availability of the standards at 85.75%, followed by the second-grade mathematics book at 83.12% and the first-grade mathematics book at 80.13%. In light of these results, the researcher recommended developing the language of word problems, taking into account their i
The road network serves as a hub for opportunities in production and consumption, resource extraction, and social cohabitation. In turn, this promotes a higher standard of living and the expansion of cities. This research explores the spatial connectedness of the road network and its effects on travel and urban form in the Al-Kadhimiya and Al-Adhamiya municipalities. Satellite images and paper maps were used to extract information on the existing road network, including road types, conditions, density, and lengths. The spatial structure of the road network was then generated in the ArcGIS software environment. The connectivity of the road pattern was evaluated using graph theory indices. The study demands the abstractio
The aim of the current research is to reveal the effect of using brain-based learning theory strategies on the achievement of Art Education students in the subject of Teaching Methods. An experimental design with two independent, equal groups (experimental and control) was used. The research sample totaled 60 male and female students: 30 in the experimental group and 30 in the control group. The researcher prepared the research tool, a cognitive achievement test consisting of 20 questions, which was characterized by validity and reliability, and the experiment lasted 6 weeks
The most characteristic thing a person does is speak. With words he loves and hates, builds and affirms relationships, and passes from disbelief into faith. He marries with a word and separates with a word. With a kind word he reaches the heights of the heavens and gains the pleasure of God, and many a word a servant speaks earns him God's pleasure or casts him on his face into the fire. Emotions are inflamed and nations are roused with a word, and relations between states, and wars, turn on a word.
What comes out of a person's mouth is an interpreter that expresses what his conscience holds and reveals what lies hidden within him, for it is evidence of
A Botnet is one of many attacks that can execute malicious tasks and that evolve continuously. The current research therefore introduces a comparison framework, called BotDetectorFW, with classification and complexity improvements for the detection of Botnet attacks using the CICIDS2017 dataset. This is a free online dataset consisting of several attacks with high-dimensional features. Feature selection is a significant step for obtaining the smallest set of features by eliminating irrelevant ones, which in turn reduces the detection time. This process is implemented inside BotDetectorFW in two steps: data clustering and five distance measure formulas (cosine, Dice, Driver & Kroeber, overlap, and Pearson correlation
The aim of this study was to determine the effect of using the McCarthy Model (4MAT) on developing creative writing skills and reflective thinking among undergraduate students. A quasi-experimental approach was adopted. To achieve the study objective, the educational content of Teaching Ethics (Approach 401), from the plan for the primary grades teacher preparation program, was taught using a teaching program based on the McCarthy Model (4MAT).
The study relied on an academic achievement test for creative writing skills and a reflective thinking test. The validity and reliability of the study tools were confirmed. The study was applied to a sample consisting of
This paper aims to improve the voltage profile at all weak buses of the power system in the Kurdistan Region using the Static Synchronous Compensator (STATCOM). The system was studied with Power System Simulator for Engineering (PSS/E) software, version 33.0, applying the Newton-Raphson (NR) method. All bus voltages were recorded and compared with the Kurdistan Region grid limits (0.95 ≤ V ≤ 1.05), simulating the power system and finding the optimal size and suitable location of the STATCOM for bus voltage improvement at the weakest buses. The results show that the Soran and New Koya substations are the best placements for adding STATCOM, with sizes of 20 MVAR and 40 MVAR. After adding STATCOM with the sizes [20MVAR and 40MV