Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth creates a need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. In this study, these problems are reduced by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of popular text representation methods, namely TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation method leads to more accurate document clustering.
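The limitation of bag-of-words weighting described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three toy sentences, the hand-written dependency edges (no parser is run), the relation labels, and the use of Jaccard overlap to compare edge sets are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard(a, b):
    """Overlap of two edge sets (an assumed, simple graph similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy corpus (hypothetical): the first two sentences use the same words
# but describe opposite events.
docs = [
    "the dog chased the cat",
    "the cat chased the dog",
    "stocks fell sharply today",
]
vecs = tfidf_vectors(docs)

# Hand-written (head, relation, dependent) edges standing in for parser output.
edges_0 = {("chased", "nsubj", "dog"), ("chased", "dobj", "cat")}
edges_1 = {("chased", "nsubj", "cat"), ("chased", "dobj", "dog")}

bow_sim = cosine(vecs[0], vecs[1])    # same bag of words, so the vectors coincide
graph_sim = jaccard(edges_0, edges_1)  # no shared edges, so the graphs differ

print(f"TFIDF cosine similarity: {bow_sim:.3f}")
print(f"Dependency-edge Jaccard similarity: {graph_sim:.3f}")
```

In this toy case the two sentences are indistinguishable under TFIDF because word order is discarded, while the edge sets keep track of which word is the subject and which is the object. Clustering with K-means would additionally require a numeric encoding of the graphs, which this sketch does not attempt.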

Publication Date

Sun Jan 01 2023

Journal Name

Aip Conference Proceedings

Surface enhanced Raman spectroscopy based sensitive and specific detection of vitamin D3, glycated hemoglobin, and serum lipid profile of breast cancer patients

Rawaa A.

Zainab F.

Maha M. Kadhim

Fadhil Jawad

Considering the rising incidence of breast cancer and the high prevalence of vitamin D3 [25(OH)D3] insufficiency, this investigation aimed to explain the relation between the serum level of [25(OH)D3] (the sunshine vitamin) and breast cancer risk. The current study examined how serum levels of [25(OH)D3], HbA1c%, total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglyceride (TG) affect a woman’s risk of developing breast cancer. The study included 40 healthy volunteers and 69 untreated breast cancer patients with clinical and histological evidence, comprising outpatients and hospitalized patients at the Oncology Center, Medical City / Baghdad - Iraq. Venous blood samp

Publication Date

Fri Apr 01 2016

Journal Name

Al–bahith Al–a'alami

Indicators of Mental image among Students of the University of Baghdad about Iraqi Political Parties-(a research based on a master thesis)

Indicators

of Mental image

Students

University of Baghdad

Political Parties

Iraqi

نور اختياري

باقر موسى

Media studies have paid considerable attention to the mental image, because the image formed in a person's mind is no longer a private picture kept to oneself. It has an outside influence that can sometimes shape the fate of others, whether individuals or groups.
This study aims to identify the image of Iraqi political parties among Iraqi university students and the nature of the views students hold about these parties.
Chapter one covers the research problem, the importance of the study, its goals, and the method used. Chapter two is divided into two sections: section one deals with the concept of the mental i

Publication Date

Sun Jan 01 2017

Journal Name

البحوث التربوية والنفسية

The effectiveness of an educational design based on Herman’s total brain theory on the achievement of chemistry among fifth-grade female students

Basma

Publication Date

Wed Oct 11 2023

Journal Name

Journal Of Educational And Psychological Researches

Motivations of Volunteers in Jordan: An Exploratory Study Based on a Sample of University Students within the Context of their Social Relations

Jordan

volunteering

functions and motivations of volunteering

Mohamed

Naser

Ahmed

Volunteerism is an element of many human cultures. It represents a positive cooperative act between individuals and groups and expresses social value systems. As a social phenomenon, it develops in societies according to numerous circumstances and conditions. This study uses a functional approach that assumes that volunteering performs six functions for volunteers. Namely, we assume that volunteering (1) creates a sense of protection, (2) meets significant cultural values, (3) improves the professional status of volunteers, (4) strengthens their social relationships, (5) helps them achieve a better understanding of life, and finally, (6) enhances their outlook and self-esteem. The central aim of the study is to discuss these fun

Publication Date

Thu May 05 2022

Journal Name

Karbala International Journal Of Modern Science

Heterogeneous catalytic degradation of dye by Fenton-like oxidation over a continuous system based on Box–Behnken design and traditional batch experiments

Fenton-like

Bimetallic nanoparticles

Direct blue 15 dye

Fixed-bed column

Breakthrough curve

Imad

Mohammed A.

In this study, iron was coupled with copper to form a bimetallic compound through a biosynthetic method, which was then used as a catalyst in Fenton-like processes for removing Direct Blue 15 dye (DB15) from aqueous solution. The resulting nanoparticles were characterized using SEM, BET, EDAX, FT-IR, XRD, and zeta potential. Specifically, the green-synthesized iron/copper nanoparticles (G-Fe/Cu NPs) were found to be rounded and spherical, with sizes ranging from 32 to 59 nm and a surface area of 4.452 m²/g. The effect of different experimental factors was studied in both batch and continuous experiments. These factors were H2O2 concentration, G-Fe/Cu NPs amount, pH, initial DB15

Publication Date

Sat Feb 03 2018

Journal Name

Chinese Journal Of Physics

A true random number generator based on the photon arrival time registered in a coincidence window between two single-photon counting modules

True random number generators

Photon arrival times

Raghad Saeed

Shelan

Nahla Qader

Ahmed Ismael

True random number generators are essential components for securing confidential communications. In this paper, a new method is proposed to generate random sequences of numbers based on the difference in the arrival times of photons detected in a coincidence window between two single-photon counting modules.

Publication Date

Thu Apr 03 2025

Journal Name

Engineering, Technology & Applied Science Research

Application of the One-Step Second-Derivative Method for Solving the Transient Distribution in Markov Chain

transient distribution

Chapman-Kolmogorov

differential equation

numerical method

initial value problem

Zeina

Markov chains are an application of stochastic models in operations research, supporting the analysis and optimization of processes with random events and transitions. The method used to obtain the transient solution of a Markov chain problem is an important part of this process. The present paper introduces a novel Ordinary Differential Equation (ODE) approach to solve the Markov chain problem. The probability distribution of a continuous-time Markov chain with an infinitesimal generator at a given time is considered, which is the solution of the Chapman-Kolmogorov differential equation. This study presents a one-step second-derivative method with better accuracy in solving the first-order Initial Value Problem

Publication Date

Wed Jul 17 2019

Journal Name

Advances In Intelligent Systems And Computing

A New Arabic Dataset for Emotion Recognition

emotions recognition

text categorization

machine learning

PPM

WEKA

Arabic corpus

Amer J.

William J.

In this study, we created a new Arabic dataset annotated according to Ekman’s basic emotions (Anger, Disgust, Fear, Happiness, Sadness and Surprise). The dataset is composed of Facebook posts written in the Iraqi dialect. We evaluated its quality using four external judges, which resulted in an average inter-annotator agreement of 0.751. We then explored six different supervised machine learning methods to test the new dataset. We used the Weka standard classifiers ZeroR, J48, Naïve Bayes, Multinomial Naïve Bayes for Text, and SMO. We also used a further compression-based classifier called PPM, which is not included in Weka. Our study reveals that the PPM classifier significantly outperforms other classifiers such as SVM and N

Publication Date

Thu Sep 01 2011

Journal Name

Journal Of Economics And Administrative Sciences

BASES PROOF FOR PERIOD (1.1) FOR CORRELATION CONEFFICIENT

سليم اسماعيل

The concept of the correlation coefficient as a measure linking two variables draws our attention to the subject of statistics at all levels. Moreover, there are three particular points that are usually emphasized, as follows:

(1) The correlation coefficient is a standardized index whose value does not depend on the measurement scales of the original variables.

(2) Its value lies in the interval [-1, 1].

Publication Date

Fri Mar 01 2013

Journal Name

Journal Of Economics And Administrative Sciences

Comparison for estimation methods for the autoregressive approximations

non-invertible process

fractionally integrated noise process

autoregressive approximation

Yule-Walker equations

Least Squares

Least Squares (forward-backward) and Burg’s methods.

جنان عباس

In this study, we compare the autoregressive approximation methods (Yule-Walker equations, Least Squares, Least Squares (forward-backward), and Burg’s (geometric and harmonic) methods) to determine the optimal approximation to the time series generated from the first-order moving-average non-invertible process and the fractionally integrated noise process, with several values of d (d = 0.15, 0.25, 0.35, 0.45) and different sample sizes (small, medium, large) for the two processes. We rely on the figure-of-merit function proposed by Shibata in 1980 to determine the theoretical optimal order according to min
