Adaptation Proposed Methods for Handling Imbalanced Datasets based on Over-Sampling Technique

Liqaa M. Shoohi; Jamila H. Saud

doi:10.23851/mjs.v31i2.740

Details

Publication Date

Wed Apr 15 2020

Journal Name

Al-mustansiriyah Journal Of Science

Volume

31

Imbalanced Datasets

O.S.

SMOTE

Borderline-SMOTE

ADASYN.

Classification of imbalanced data is an important problem. Many classification algorithms have been developed, such as back-propagation (BP) neural networks, decision trees, and Bayesian networks, and they have been used widely in many fields. These algorithms suffer from the problem of imbalanced data, in which some classes contain far more instances than others. Imbalanced data lead to poor performance and bias toward one class at the expense of the others. In this paper, we propose three techniques based on over-sampling (O.S.) for processing an imbalanced dataset, redistributing it, and converting it into a balanced dataset. These techniques are Improved Synthetic Minority Over-Sampling Technique (Improved SMOTE), Borderline-SMOTE + Imbalance Ratio (IR), and Adaptive Synthetic Sampling (ADASYN) + IR. Each technique generates synthetic samples for the minority class to achieve balance between the minority and majority classes, and then calculates the IR between the two classes. Experimental results show that the Improved SMOTE algorithm outperforms the Borderline-SMOTE + IR and ADASYN + IR algorithms because it achieves a higher degree of balance between the minority and majority classes.
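The over-sampling idea in the abstract can be illustrated with a minimal sketch of classic SMOTE-style interpolation followed by an imbalance-ratio check. This is not the paper's Improved SMOTE, whose details the abstract does not give. The function names `imbalance_ratio` and `smote_oversample`, the binary-class setting, and the definition of IR as majority count divided by minority count are all assumptions made for illustration.

```python
import numpy as np


def imbalance_ratio(y, minority_label):
    """IR, assumed here to be (majority count) / (minority count)."""
    n_min = int(np.sum(y == minority_label))
    n_maj = int(np.sum(y != minority_label))
    return n_maj / n_min


def smote_oversample(X, y, minority_label, k=5, seed=0):
    """Classic SMOTE-style sketch: add synthetic minority points until
    the minority class is as large as the rest of the data."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int(np.sum(y != minority_label)) - len(X_min)
    if n_new <= 0 or len(X_min) < 2:
        return X, y  # already balanced, or too few points to interpolate

    k = min(k, len(X_min) - 1)
    # Pairwise Euclidean distances among minority samples only.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its k nearest neighbours,
    # then interpolate at a random position on the segment between them.
    base = rng.integers(0, len(X_min), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[chosen] - X_min[base])

    X_bal = np.vstack([X, synthetic])
    y_bal = np.concatenate([y, np.full(n_new, minority_label, dtype=y.dtype)])
    return X_bal, y_bal


# Toy data (hypothetical): 90 majority points and 10 minority points.
demo_rng = np.random.default_rng(42)
X = np.vstack([demo_rng.normal(0, 1, (90, 2)), demo_rng.normal(3, 1, (10, 2))])
y = np.array([0] * 90 + [1] * 10)

print(imbalance_ratio(y, 1))          # 9.0
X_bal, y_bal = smote_oversample(X, y, minority_label=1)
print(imbalance_ratio(y_bal, 1))      # 1.0
```

Under this definition an IR of 1.0 means the two classes are the same size, so the IR before and after over-sampling gives a simple measure of how much balance a method achieves.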

Publication Date

Fri Jan 01 2016

Journal Name

Ibn Al-haitham Journal For Pure And Applied Science

Genetic-Based Face Retrieval Using Statistical Features

Wathiq N.

Publication Date

Sun Jun 20 2021

Journal Name

Baghdad Science Journal

Reinforcement Learning-Based Television White Space Database

radio propagation

radio spectrum management

reinforcement learning

television white space database

Armie E.

Raine Mattheus

Jerrick Spencer

Xavier Francis

Joshua Vincent

Lawrence

Television white spaces (TVWSs) refer to the unused part of the spectrum under the very high frequency (VHF) and ultra-high frequency (UHF) bands. TVWS are frequencies under licenced primary users (PUs) that are not being used and are available for secondary users (SUs). There are several ways of implementing TVWS in communications, one of which is the use of TVWS database (TVWSDB). The primary purpose of TVWSDB is to protect PUs from interference with SUs. There are several geolocation databases available for this purpose. However, it is unclear if those databases have the prediction feature that gives TVWSDB the capability of decreasing the number of inquiries from SUs. With this in mind, the authors present a reinforcement learning-ba

Publication Date

Fri Jun 18 2021

Journal Name

Periodicals Of Engineering And Natural Sciences (pen)

Quadtree partitioning scheme of color image based

Ghadah

Mohammed

Omar

Publication Date

Tue Jan 02 2018

Journal Name

Journal Of Educational And Psychological Researches

The effectiveness of a proposed program to solve the problem of mixing Arabic letters similar to the voice of students in the second grade primary

The effectiveness of a proposed program

mixing Arabic

Ohood Samiy Hashem

The purpose of this research is to demonstrate the effectiveness of a program that addresses the problem of confusing similar letters in the Arabic language among second-grade primary students. To achieve this goal, the researcher followed the experimental method, which suits the nature of this research, and found statistically significant differences between the pre- and post-tests, indicating the effectiveness of the proposed educational program. At the end of the research, the researcher makes several recommendations, the most important of which are: 1 - Training students in the correct pronunciation of the articulation points, especially in the first three stages of primary education, and the use of direct training

Publication Date

Thu Aug 18 2022

Journal Name

Dental Hypotheses

Microleakage Evaluation of Glass Hybrid Restoration Following Usage of Papain-Based Gel and Ceramic Bur for Caries Removal: An in Vitro Study

Halla

Aseel

Publication Date

Tue Sep 01 2015

Journal Name

Journal Of Engineering

Application of Box-Behnken Method Based ANN-GA to Prediction of wt.% of Doping Elements for Incoloy 800H Coated by Aluminizing-Chromizing

Box-behnken design

GA

ANN

Hot Corrosion

Pack cementation

Abbas Khammas

In this work, an effective procedure combining a Box-Behnken design with an ANN (Artificial Neural Network) and a GA (Genetic Algorithm) has been used to find the optimum wt.% of doping elements (Ce, Y, and Ge) for the doped aluminizing-chromizing of Incoloy 800H. The ANN and the Box-Behnken design method were implemented to minimize the hot corrosion rate k_p (10^-12 g²·cm^-4·s^-1) of Incoloy 800H at 900 °C. The ANN was used to estimate the predicted values of the hot corrosion rate k_p (10^-12 g²·cm^-4·s^-1). The optimal combination of doping-element wt.% for the minimum hot corrosion rate was calculated using genetic alg

Publication Date

Tue Jun 30 2020

Journal Name

Journal Of Economics And Administrative Sciences

Using The Maximum Likelihood And Bayesian Methods To Estimate The Time-Rate Function Of Earthquake Phenomenon

Non-homogeneous Poisson processes

Maximum likelihood

Gibbs algorithm

Metropolis-Hastings algorithm

Bayesian method.

علي محمد

In this research, we study the non-homogeneous Poisson process, an important statistical topic with a role in scientific development because it relates to events that occur in reality and are modeled as Poisson processes, since the occurrence of such events depends on time, whether the rate changes with time or remains constant. Our research explains the non-homogeneous Poisson process and uses one model of such processes, an exponentiated Weibull model with three parameters (α, β, σ), as a function to estimate the time rate of occurrence of earthquakes in Erbil Governorate, as the governorate is adjacent to two countr

Publication Date

Tue Oct 01 2019

Journal Name

Journal Of Economics And Administrative Sciences

Comparison of some robust methods in the presence of problems of multicollinearity and high leverage points

Multiple Linear Regression

Multicollinearity

high leverage point

Jackknife ridge regression

MM-estimator

GM2-estimator.

غفران اسماعيل

سيف الامام سعدي

Abstract

The multiple linear regression model is one of the important regression models used in analysis across different fields of science, such as business, economics, medicine, and social sciences. Multicollinearity is a major problem in multiple linear regression; in its simplest form, it causes the model parameter estimates to lose their desirable statistical properties. Another important problem in regression analysis is the presence of high leverage points in the data, which have undesirable effects on the results of the analysis. In this research, we present some of

Publication Date

Sun Apr 04 2010

Journal Name

Journal Of Educational And Psychological Researches

The effect of two treatment methods in the gaining of fourth grade students in geography object.

two treatment methods in the gaining

عائدة مخلف مهدي القريشي

Research aim:
- The research aimed to investigate the effect of two treatment methods on the achievement of fourth-grade students in the geography subject.
Research hypotheses:
- There are no statistically significant differences at the (0.05) level in the average level of achievement in geography between the first experimental group (strengthening lessons) and the second experimental group (re-teaching).
- There are no statistically significant individual differences at the (0.05) level in the average level of achievement in the geography subject between the second experimental group (re-teaching) and the first experimental group (strengthening lessons).
Research sample: the researcher randomly selected Baghdad

Publication Date

Sat Oct 01 2022

Journal Name

Al–bahith Al–a'alami

METHODS OF MEASURING THE LEVEL OF COMMUNICATION EFFICIENCY IN NEWS HEADLINES: (Drawn from the doctoral thesis)

Methods of Measuring

Communication Efficiency

News Headlines

Akram

Measuring the level of communicative competence in news headlines, and the level of stylistic and semantic processing in their formulation, requires creating a quantitative scale grounded in the principles and standards of scale construction. Judging the scientific character of journalism studies depends on the possibility of quantifying journalistic knowledge, i.e. the ability of this knowledge to shift from qualitative language to its equivalent in the language of numbers.

News headlines and editorial processing are among the forms of journalistic knowledge that should be studied and analyzed stylistically and semantically, with the conclusions drawn and expressed in numbers. Press knowledge is divided into two types:
