Advances in digital technology and the World Wide Web have led to a rapid increase in digital documents used for purposes such as publishing and digital libraries. This growth has raised the need for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems that are prevalent in textual documents remain unresolved. This study reduces these problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis with the K-means clustering technique. The dependency graph-based clustering results were compared with those of two popular text representation methods, TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation method leads to more accurate document clustering.
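As a point of reference, the TFIDF bag-of-words baseline that the dependency graph is compared against can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the four toy documents, the helper names `tfidf_vectors` and `kmeans`, the smoothed IDF formula, and the hand-picked initial centroids are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised documents."""
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for doc in docs for word in set(doc))
    # Assumed smoothing (+1) so that very common words keep a small weight.
    idf = {word: math.log(n_docs / df[word]) + 1.0 for word in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[word] / len(doc) * idf[word] for word in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def kmeans(points, init_centroids, iters=20):
    """Plain K-means with squared Euclidean distance and fixed initial centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus: two space-related and two sport-related documents.
raw_docs = [
    "the rocket launch reached orbit",
    "nasa rocket orbit mission",
    "the team won the hockey game",
    "hockey players won the match",
]
docs = [text.split() for text in raw_docs]
vocab, vectors = tfidf_vectors(docs)
# Seed one centroid from each topic so the run is deterministic.
labels = kmeans(vectors, [vectors[0], vectors[2]])
```

Because each document becomes a vector over the whole vocabulary, most entries are zero and word order is discarded, which is the sparsity and semantic limitation the abstract describes.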
This study aims to draw an analogy between the contents of Quranic text, as meanings, and their representation as visual shapes within ornamental figures in Islamic architecture. The theoretical framework of the study deals with the concept of semantics and its parts, the artistic contents of Quranic texts, and ornamental figures in Islamic architecture. The study procedures covered a population of 69 figures, 5 of which were purposively selected for analysis using a form that had been presented to a number of experts to ensure its validity. The study reached a number of conclusions, the most significant of which are: adopting natural denotation of direct reference in order to link the ornamental figure to the source it was taken
THE PROBLEM OF TRANSLATING METAPHOR IN AN ARTISTIC TEXT (ON THE MATERIAL OF RUSSIAN AND ARABIC LANGUAGES)
Power-electronic converters are essential elements for the effective interconnection of renewable energy sources to the power grid, as well as for integrating energy storage units, vehicle charging stations, microgrids, etc. Converter models that accurately represent their wideband operation and their interconnection with other active and passive grid components and systems are necessary for reliable steady-state and transient analyses under normal or abnormal grid operating conditions. This paper introduces two Laplace domain-based approaches to model buck and boost DC-DC converters for electromagnetic transient studies. The first approach is an analytical one, where the converter is represented by a two-port admittance model via mo
Graphite-coated electrodes (GCE) based on molecularly imprinted polymers were fabricated for the selective potentiometric determination of Risperidone (Ris). The molecularly imprinted (MIP) and non-imprinted (NIP) polymers were synthesized by bulk polymerization using Ris as a template, acrylic acid (AA) and acrylamide (AAm) as monomers, ethylene glycol dimethacrylate (EGDMA) as a cross-linker, and benzoyl peroxide (BPO) as an initiator. The imprinted and non-imprinted membranes were prepared using dioctyl phthalate (DOP) and dibutyl phthalate (DBP) as plasticizers in a PVC matrix. The membranes were coated on graphite electrodes. The MIP electrodes using
This paper is concerned with estimating the unknown parameters of the generalized Rayleigh distribution model based on singly Type-I censored samples. The probability density function of the generalized Rayleigh distribution is defined together with its properties. The maximum likelihood method is used to derive point estimates of all unknown parameters using an iterative method, namely the Newton-Raphson method, and confidence interval estimates are then derived from the Fisher information matrix. Finally, the paper tests whether the current model (GRD) fits a set of real data, and then computes the survival function and hazard function for these data.
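The quantities named in this abstract can be written out as follows. This is a sketch under a stated assumption: the abstract does not give the density, so the two-parameter generalized Rayleigh (Burr Type X) form with shape \alpha and scale \lambda is assumed here, and the symbols n, m, T, and \gamma are introduced only for illustration.

```latex
% Assumed model: two-parameter generalized Rayleigh (Burr Type X),
% shape \alpha > 0, scale \lambda > 0, support x > 0.
F(x;\alpha,\lambda) = \left(1 - e^{-(\lambda x)^{2}}\right)^{\alpha},
\qquad
f(x;\alpha,\lambda) = 2\alpha\lambda^{2} x\, e^{-(\lambda x)^{2}}
                      \left(1 - e^{-(\lambda x)^{2}}\right)^{\alpha-1}

% Singly Type-I censoring: n units on test, stopped at fixed time T,
% m failure times x_1, \dots, x_m observed before T.
L(\alpha,\lambda) \propto \prod_{i=1}^{m} f(x_i;\alpha,\lambda)\,
                          \left[1 - F(T;\alpha,\lambda)\right]^{\,n-m}

% Newton-Raphson iteration on \ell = \log L with \theta = (\alpha,\lambda)^{\top}.
\theta^{(k+1)} = \theta^{(k)} - \left[\nabla^{2}\ell\!\left(\theta^{(k)}\right)\right]^{-1}
                 \nabla\ell\!\left(\theta^{(k)}\right)

% Approximate (1-\gamma) confidence interval from the observed Fisher information
% I(\hat\theta) = -\nabla^{2}\ell(\hat\theta).
\hat\theta_{j} \pm z_{1-\gamma/2}\sqrt{\left[I(\hat\theta)^{-1}\right]_{jj}}

% Survival and hazard functions evaluated at the fitted parameters.
S(x) = 1 - F(x;\hat\alpha,\hat\lambda),
\qquad
h(x) = \frac{f(x;\hat\alpha,\hat\lambda)}{S(x)}
```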
Most medical datasets suffer from missing data, due to the expense of some tests or human error when recording the results. This issue degrades the performance of machine learning models because the values of some features are missing. Therefore, methods for imputing these missing data are needed. In this research, the salp swarm algorithm (SSA) is used for generating and imputing the missing values in the Pima Indian diabetes disease (PIDD) dataset; the proposed algorithm is called ISSA. The obtained results showed that the classification performance of three different classifiers which are support vector machine (SVM), K-nearest neighbour (KNN), and Naïve B
This paper presents a new algorithm in an important research field, semantic word similarity estimation. A new feature-based algorithm is proposed for measuring word semantic similarity for the Arabic language. Arabic is a highly systematic language whose words exhibit elegant and rigorous logic. The semantic similarity score between two Arabic words is calculated as a function of their common and total taxonomical features. An Arabic knowledge source is employed for extracting the taxonomical features as the set of all concepts that subsume the concepts containing the compared words. Previously developed Arabic word benchmark datasets are used for optimizing and evaluating the proposed algorithm. In this paper,
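The idea of scoring similarity from common and total taxonomical features can be illustrated with a small sketch. The abstract does not state the exact function, so a plain ratio of common to total features (a Jaccard ratio) is assumed here as one possible instance; the taxonomy, its English labels, and the helper names `subsumers` and `feature_similarity` are hypothetical.

```python
def subsumers(concept, parents):
    """Collect the concept itself and every concept that subsumes it."""
    seen = {concept}
    stack = [concept]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def feature_similarity(word_a, word_b, parents):
    """Assumed scoring rule: common features divided by total features."""
    features_a = subsumers(word_a, parents)
    features_b = subsumers(word_b, parents)
    return len(features_a & features_b) / len(features_a | features_b)


# Hypothetical toy taxonomy: child concept -> list of direct parent concepts.
toy_taxonomy = {
    "cat": ["feline"],
    "feline": ["mammal"],
    "dog": ["canine"],
    "canine": ["mammal"],
    "mammal": ["animal"],
    "car": ["vehicle"],
    "vehicle": ["artifact"],
}

cat_dog = feature_similarity("cat", "dog", toy_taxonomy)  # share mammal, animal
cat_car = feature_similarity("cat", "car", toy_taxonomy)  # share nothing
cat_cat = feature_similarity("cat", "cat", toy_taxonomy)  # identical feature sets
```

With this rule, words that share more subsuming concepts score closer to 1 and words with disjoint feature sets score 0; the proposed algorithm may weight the features differently.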
In this paper, wavelets were used to study multivariate fractional Brownian motion through the deviations of the random process in order to find an efficient estimator of the Hurst exponent. The results of simulation experiments showed that the proposed estimator performed efficiently. The estimation takes advantage of the stationarity of the detail coefficients from the wavelet transform, as the variance of these coefficients exhibits power-law behavior. Two wavelet filters (Haar and db5) were used to minimize the mean square error of the model.
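The power-law behavior of the detail-coefficient variance suggests a simple log-scale regression. The sketch below is a univariate illustration only, not the paper's multivariate estimator: it assumes the standard scaling Var(d_j) ∝ 2^{j(2H+1)} for fractional Brownian motion, uses only the Haar filter, and fits an unweighted least-squares line. The function names and the level range (3 to 8) are choices made for the example.

```python
import math
import random


def haar_details(signal, levels):
    """Orthonormal Haar DWT: return detail coefficients for levels 1..levels."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        next_approx, detail = [], []
        for x, y in zip(approx[0::2], approx[1::2]):
            next_approx.append((x + y) / math.sqrt(2))
            detail.append((x - y) / math.sqrt(2))
        details.append(detail)
        approx = next_approx
    return details


def estimate_hurst(signal, j_min=3, j_max=8):
    """Regress log2 of detail variance on level j; slope = 2H + 1 (assumed law)."""
    details = haar_details(signal, j_max)
    levels = list(range(j_min, j_max + 1))
    # Detail coefficients have zero mean, so the mean square estimates the variance.
    log_var = [
        math.log2(sum(d * d for d in details[j - 1]) / len(details[j - 1]))
        for j in levels
    ]
    mean_j = sum(levels) / len(levels)
    mean_v = sum(log_var) / len(log_var)
    slope = sum((j - mean_j) * (v - mean_v) for j, v in zip(levels, log_var)) / sum(
        (j - mean_j) ** 2 for j in levels
    )
    return (slope - 1) / 2


# Ordinary Brownian motion (a Gaussian random walk) is fBm with H = 0.5.
random.seed(0)
walk, position = [], 0.0
for _ in range(2 ** 14):
    position += random.gauss(0.0, 1.0)
    walk.append(position)

hurst = estimate_hurst(walk)
```

The finest levels are skipped because discretisation biases the variance there; on the random walk the estimate should land near 0.5, though the exact value depends on the random draw.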
Early detection of brain tumors is critical for enhancing treatment options and extending patient survival. Magnetic resonance imaging (MRI) provides more detailed information, such as greater contrast and clarity, than any other scanning method. Manually segmenting brain tumors from the many MRI images collected in clinical practice for cancer diagnosis is a difficult and time-consuming task. Tumors in brain MRI scans can be detected using algorithms and machine learning technologies, which makes the process easier for doctors, because MRI images can appear healthy even when the person has a tumor, possibly a malignant one. Recently, deep learning techniques based on deep convolutional neural networks have been used to analyze med