Fuzzy C means Based Evaluation Algorithms For Cancer Gene Expression Data Clustering

Omar Al-Janabee; Basad Al-Sarray

doi:https://doi.org/10.52866/ijcsm.2022.02.01.004

Details

Publication Date

Mon Feb 21 2022

Journal Name

Iraqi Journal For Computer Science And Mathematics

The influx of data in bioinformatics is primarily in the form of DNA, RNA, and protein sequences, which places a significant burden on scientists and computers. Some genomics studies depend on clustering techniques to group similarly expressed genes into one cluster. Clustering is a type of unsupervised learning that divides unlabeled data into groups: it partitions an input space into several homogeneous zones and can be achieved using a variety of algorithms, such as k-means and fuzzy c-means (FCM). This study used three models to cluster a brain tumor dataset. The first model uses FCM to cluster genes. FCM allows an object to belong to two or more clusters with a membership grade between zero and one, and the membership grades of each gene across all clusters sum to one. This paradigm is useful when dealing with microarray data. The total time required to run the first model is 22.2589 s. The second model combines FCM with particle swarm optimization (PSO) to obtain better results. This hybrid algorithm, FCM–PSO, uses the Davies–Bouldin (DB) index as its objective function. The experimental results show that the proposed hybrid FCM–PSO method is effective; its total run time is 89.6087 s. The third model combines FCM with a genetic algorithm (GA) to obtain better results. This hybrid algorithm also uses the DB index as its objective function. The experimental results show that the proposed hybrid FCM–GA method is effective; its total run time is 50.8021 s. In addition, this study uses cluster validity indexes to determine the best partitioning for the underlying data. Internal validity indexes include the Jaccard, Davies–Bouldin, Dunn, Xie–Beni, and silhouette indexes. External validity indexes include the Minkowski index, the adjusted Rand index, and the percentage of correctly categorized pairings. Experiments conducted on brain tumor gene expression data demonstrate that the techniques used in this study outperform traditional models in terms of stability and biological significance.
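
The FCM membership rule and the DB index described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fcm` and `davies_bouldin`, the fuzzifier `m = 2`, the stopping tolerance, and the synthetic two-blob data are all assumptions chosen for illustration, and the PSO and GA hybrids are not reproduced. The DB index here is computed on hard labels (each point assigned to its highest-membership cluster) using the FCM centers, which is one common variant.

```python
import numpy as np


def fcm(X, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to one.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are means of the data weighted by memberships raised to m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def davies_bouldin(X, labels, centers):
    """Davies-Bouldin index on hard labels; lower values are better."""
    k = centers.shape[0]
    # Mean distance of each cluster's points to its center (scatter).
    # Assumes every cluster has at least one assigned point.
    s = np.array([
        np.linalg.norm(X[labels == i] - centers[i], axis=1).mean()
        for i in range(k)
    ])
    # Pairwise center separation; the diagonal is set to inf so that
    # a cluster is never compared with itself.
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    r = (s[:, None] + s[None, :]) / d
    return r.max(axis=1).mean()


# Synthetic stand-in for expression profiles: two well-separated blobs.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 0.5, (50, 5)),
    rng.normal(5.0, 0.5, (50, 5)),
])
centers, U = fcm(X, n_clusters=2)
labels = U.argmax(axis=1)
db = davies_bouldin(X, labels, centers)
print("row sums of U:", U.sum(axis=1)[:3])
print("DB index:", db)
```

A hybrid such as FCM–PSO or FCM–GA would, under this reading, search over candidate center sets and score each one with a function like `davies_bouldin`; that search loop is left out of this sketch.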

Publication Date

Tue Dec 01 2015

Journal Name

Journal Of Economics And Administrative Sciences

Developing Human Capital according to the Communities of Practice: A comparative study by using Data Envelopment Analysis

رأس المال البشري

مجتمعات الممارسة

تحليل مغلف البيانات

مايكروسوفت اكسل (2010)

كليات جامعة الموصل.

Human Capital (HC)

Communities of Practice (CoPs)

Data Envelopment Analysis (DEA)

Microsoft – Excel (2010)

the Colleges at Mosul University.

عامر عبد الرزاق

The research discusses the need to find innovative structures and methodologies for developing Human Capital (HC) in Iraqi universities. One of the most important of these structures is Communities of Practice (CoPs), which contribute to developing HC through learning, teaching, and training by speeding the conversion of knowledge and creativity into practice. This research used the comparative approach, employing the methodology of Data Envelopment Analysis (DEA) with Excel 2010 Solver as field evidence of the role of CoPs in developing HC. In light of the given information, the researcher relied on archived preliminary data about 23 colleges at Mosul University as a deliberate sample for t

Publication Date

Wed Dec 28 2022

Journal Name

Al–bahith Al–a'alami

Content of Data Journalism in Security Topics - Security Media Cell Model Research extracted from a master’s thesis

Data Journalism

Security Media

Security Media Cell

Murtadha

Layth

At the analytical level, this paper aims to identify the security topics that were covered using data journalism, the methods of expression used in the statements of the Security Media Cell, and the means of clarification used in its data journalism. It also aims to identify the methods the public prefers for presenting press releases and, in particular, to determine the strength of the respondents' attitude towards the data issued by the Security Media Cell. The field study included the distribution of a questionnaire to the public of Baghdad Governorate. The study reached several results, the most important of which is the interest of the Security Media Cell in presenting its data in differ

Publication Date

Sat Mar 26 2022

Journal Name

Journal Of Accounting And Financial Studies ( Jafs )

The Role of Big Data applications in forecasting corporate bankruptcy: Field analysis in the Saudi Business Environment

Big Data

Big Data applications

corporate bankruptcy

ASS.PROF.Dr-Metwaly elsayed

This study aimed to investigate the role of Big Data in forecasting corporate bankruptcy through a field analysis in the Saudi business environment. The study found that Big Data is a recently adopted variable in the business context and has multiple accounting effects and benefits. Among the benefits is forecasting and disclosing corporate financial failures and bankruptcies, which rests on three main elements: the firm's internal control system, external auditing, and financial analysts' forecasts. The study recommends: Since the greatest risk of Big Data is the slow adaptation of accountants and auditors to these technologies, wh

Publication Date

Tue Jun 01 2010

Journal Name

Al-khwarizmi Engineering Journal

Land Use/Cover Change Analysis Using Remote Sensing Data: A Case Study, Zhengzhou Area, Henan Province, China

Bassam F.

In the last two decades, arid and semi-arid regions of China underwent rapid Land Use/Cover Change (LUCC) due to increasing demand for food resulting from a growing population. In this study, we established the land use/cover classification together with its remote sensing characteristics by analyzing the dynamics of LUCC in the Zhengzhou area for the period 1988-2006. A laminar extraction technique was used to identify the typical attributes of land use/cover types. A prominent result of the study is a gradual growth in urbanization accompanied by a gradual reduction in crop field area, due to the progressive economy in Zhengzhou. The results also reflect degradati

Publication Date

Thu Oct 29 2020

Journal Name

Complexity

Training and Testing Data Division Influence on Hybrid Machine Learning Model Process: Application of River Flow Forecasting

Hai

Ali Omran

Ameen Mohammed

Zainab Hasan

Nadhir

Sinan Q.

Reham R.

The hydrological process has a dynamic nature characterised by randomness and complex phenomena. The application of machine learning (ML) models in forecasting river flow has grown rapidly, owing to their capacity to simulate the complex phenomena associated with hydrological and environmental processes. Four different ML models were developed for forecasting river flow in a semiarid region of Iraq. The influence of data division on the ML modeling process was investigated. Three data division scenarios were inspected: 70%–30%, 80%–20%, and 90%–10%. Several statistical indicators were computed to verify the performance of the models. The results revealed the potential of the hybridized s

Publication Date

Tue Dec 01 2020

Journal Name

Journal Of Economics And Administrative Sciences

Use The moment method to Estimate the Reliability Function Of The Data Of Truncated Skew Normal Distribution

Skew Normal Distribution

Truncated Skew Normal Distribution

Reliability Function

Method of Moments.

التَوزيع الطَبيعي المُلتَوي

التَوزيع الطَبيعي المُلتَوي المَبتور

دالة المُعَوَلية

طريقة العُزوم.

Hatem Kareem

Ahmed Dheyab

The estimation of the reliability function depends on the accuracy of the data used to estimate the parameters of the probability distribution. Because some data are skewed, estimating the parameters and calculating the reliability function in the presence of skew requires a distribution that is flexible in dealing with such data. This is the case for the data of the Diyala Company for Electrical Industries, where a positive skew was observed in the data collected from the Power and Machinery Department, which required a distribution that deals with such data and a search for methods that accommodate this problem and lead to accurate estimates of the reliability function,

Publication Date

Tue Oct 23 2018

Journal Name

Journal Of Economics And Administrative Sciences

Processing of missing values in survey data using Principal Component Analysis and probabilistic Principal Component Analysis methods

قتيبة نبيل

بشرى رحيم

The idea of carrying out research on incomplete data came from the circumstances of our dear country and the horrors of war, which resulted in the loss of much important data in all aspects of life: economic, natural, health, scientific, and so on. The reasons for the missing data differ: some lie outside the control of those concerned, while others are deliberate and planned because of cost, risk, or a lack of means for inspection. The missing data in this study were processed using Principal Component Analysis and self-organizing map methods using simulation. The variables of child health and variables affecting children's health were taken into account: breastfeed

Publication Date

Thu Feb 01 2018

Journal Name

Journal Of Economics And Administrative Sciences

Comparison of Slice inverse regression with the principal components in reducing high-dimensions data by using simulation

اختزال الابعاد

الانحدار الشرائحي المعكوس

المركبات الرئيسية.

dimensions reduction

Slice inverse regression

principal components.

عمر عبد المحسن

زينة ابراهيم

This research aims to study dimension-reduction methods that overcome the curse of dimensionality, a problem that must be dealt with directly when traditional methods fail to provide a good estimation of the parameters. Two methods were used to address the problem of high-dimensional data: the non-classical slice inverse regression (SIR) method, together with the proposed weighted standard SIR (WSIR) method, and principal components (PCA), the general method used in reducing dimensions. SIR and PCA are based on linear combinations of a subset of the original explanatory variables, which may suffer from the problem of heterogeneity and the problem of linear

Publication Date

Wed Nov 01 2017

Journal Name

Journal Of Economics And Administrative Sciences

Applied Study on Analysis of Fixed, Random and Mixed Panel Data Models Measured at specific time intervals

Fixed Panel Data

Random Panel Data

Mixed Panel Data

Lagrange multiplier

The Coefficient of determination .

رجاء كامل

This research sought to present a concept of cross-sectional data models, a crucial form of double data that captures the impact of change over time and is obtained from repeated observations of the measured phenomenon in different time periods. The panel data models were defined in their different types (fixed, random, and mixed) and compared by studying and analyzing the mathematical relationship between the influence of time and a set of basic variables, which are the main axes on which the research is based. These are the monthly revenue of the working individual and the profits it generates, which represents the response variable, and its relationship to a set of explanatory variables represented by the

Publication Date

Fri Mar 01 2024

Journal Name

Baghdad Science Journal

Exploring the Challenges of Diagnosing Thyroid Disease with Imbalanced Data and Machine Learning: A Systematic Literature Review

Classification

Deep learning

Imbalanced data

Machine learning

Thyroid disease

Dhekre Saber

Mohd Shahizan

Thyroid disease is a common disease affecting millions worldwide. Early diagnosis and treatment of thyroid disease can help prevent more serious complications and improve long-term health outcomes. However, thyroid disease diagnosis can be challenging due to its variable symptoms and limited diagnostic tests. By processing enormous amounts of data and seeing trends that may not be immediately evident to human doctors, Machine Learning (ML) algorithms may be capable of increasing the accuracy with which thyroid disease is diagnosed. This study seeks to discover the most recent ML-based and data-driven developments and strategies for diagnosing thyroid disease while considering the challenges associated with imbalanced data in thyroid dise
