Classification of imbalanced data is an important problem. Many classification algorithms have been developed, such as back-propagation (BP) neural networks, decision trees, and Bayesian networks, and they have been used widely in many fields. These algorithms suffer from the problem of imbalanced data, in which some classes contain far more instances than others. Imbalanced data lead to poor performance and bias the classifier toward one class at the expense of the others. In this paper, we propose three techniques based on over-sampling (O.S.) for processing an imbalanced dataset, redistributing it, and converting it into a balanced dataset. These techniques are the Improved Synthetic Minority Over-Sampling Technique (Improved SMOTE), Borderline-SMOTE + Imbalanced Ratio (IR), and Adaptive Synthetic Sampling (ADASYN) + IR. Each technique generates synthetic samples for the minority class to balance the minority and majority classes and then calculates the IR between the two classes. Experimental results show that the Improved SMOTE algorithm outperforms the Borderline-SMOTE + IR and ADASYN + IR algorithms because it achieves a higher degree of balance between the minority and majority classes.
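The over-sampling idea described above can be illustrated with a minimal sketch. It shows only the basic SMOTE interpolation step, not the paper's Improved SMOTE, Borderline-SMOTE, or ADASYN variants. The function names, the toy points, and the definition of IR as the majority count divided by the minority count are assumptions made for illustration.

```python
import math
import random


def imbalance_ratio(labels, minority, majority):
    """Assumed definition: IR = (# majority samples) / (# minority samples)."""
    n_min = sum(1 for y in labels if y == minority)
    n_maj = sum(1 for y in labels if y == majority)
    return n_maj / n_min


def smote_like(minority_points, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_points)
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for p in minority_points if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (made up for illustration): 12 majority points, 4 minority points.
majority = [(float(i), float(i % 4)) for i in range(12)]
minority = [(20.0, 20.0), (21.0, 20.5), (20.5, 21.5), (22.0, 21.0)]
labels = [0] * len(majority) + [1] * len(minority)

ir_before = imbalance_ratio(labels, minority=1, majority=0)
new_points = smote_like(minority, n_new=len(majority) - len(minority))
labels_after = labels + [1] * len(new_points)
ir_after = imbalance_ratio(labels_after, minority=1, majority=0)
```

Under this assumed definition, an IR of 1 means the two classes are the same size, which is the target of the over-sampling step.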
Multiple linear regression is concerned with studying and analyzing the relationship between a dependent variable and a set of explanatory variables. From this relationship, the values of the variables are predicted. In this paper, a multiple linear regression model with three covariates was studied in the presence of autocorrelated errors when the random error follows an exponential distribution. Three methods were compared: generalized least squares, the robust M method, and the robust Laplace method. We ran simulation studies and computed the mean squared error criterion for sample sizes of 15, 30, 60, and 100. We then applied the best method to real experimental data representing the varieties of ...
In this research, we examined factorial experiments and studied the significance of the main effects, the interactions between the factors, and their simple effects, using the F test (ANOVA) to analyze the data from the factorial experiment. Analysis of variance requires several assumptions to hold. When one of these assumptions is violated, the data are transformed so that they satisfy the conditions of the analysis of variance. However, it has been noted that such transformations do not produce accurate results, so we resort to non-parametric tests or methods, which serve as a solution or alternative to the parametric tests; these method...
Porosity is important because it reflects the presence of oil reserves. The quantity of underground reserves and the essential petrophysical parameters, such as permeability and saturation, are directly related to the connected pores. Porosity also informs the selection of perforation intervals and recommendations for drilling additional infill wells. Two distinct methods are used for the estimation. The first is based on conventional equations that use porosity logs. The second relies on statistical methods that build matrices from the rock and fluid composition and solve the equations (matrices) simultaneously. In this approach, the log records are entered as equations, and the matrix is sol...
Radiation therapy plays an important role in improving outcomes in breast cancer cases. To obtain an appropriate estimate of the number of radiation doses given to a patient after tumor removal, several nonparametric regression methods were compared. The kernel method, with the Nadaraya-Watson estimator, was used to estimate the regression function for smoothing the data, with the smoothing parameter h chosen by the Normal Scale Method (NSM), the Least Squares Cross-Validation method (LSCV), and the Golden Rate Method (GRM). These methods were compared by simulation for samples of three sizes. NSM proved to be the best according to the average Mean Squared Error criterion, and LSCV proved to be the best according to the Average of Mean Absolu...
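As a rough illustration of the kernel smoothing described above, the sketch below implements a Nadaraya-Watson estimator with a Gaussian kernel. The Gaussian kernel, the rule-of-thumb bandwidth 1.06 * sd * n^(-1/5) used to stand in for the normal scale method, and the sine-curve toy data are all assumptions; the paper's exact NSM formula and data are not given in the abstract.

```python
import math
import statistics


def gaussian_kernel(u):
    """Standard normal density, used as the kernel (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys, with weights centred at x0."""
    weights = [gaussian_kernel((x0 - x) / h) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def normal_scale_bandwidth(xs):
    """Assumed rule of thumb: h = 1.06 * sd * n^(-1/5)."""
    return 1.06 * statistics.stdev(xs) * len(xs) ** (-0.2)


# Toy data (made up): a noise-free sine curve sampled on [0, 5].
xs = [i / 10 for i in range(51)]
ys = [math.sin(x) for x in xs]

h = normal_scale_bandwidth(xs)
estimate = nadaraya_watson(2.5, xs, ys, h)
flat = nadaraya_watson(1.0, xs, [3.0] * len(xs), h)
```

Because the estimate is a weighted average of the observed responses, it always lies between the smallest and largest observed value; a larger h gives a smoother but more biased curve.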
This study evaluates the suitability of three interpolation methods in terms of their accuracy on climate data for some provinces in the south of Iraq. Two data sets of maximum and minimum temperature in February 2008, from nine meteorological stations located in the south of Iraq, were interpolated using the three methods. ArcGIS was used to produce the spatially distributed temperature data using inverse distance weighting (IDW), ordinary kriging, and spline. Four statistical measures were applied to analyze the results obtained from the three interpolation methods: RMSE, RMSE as a percentage of the mean, model efficiency (E), and bias. These showed that ordinary kriging is the best of the methods for this data, by the results that have b...
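The sketch below illustrates one of the interpolation methods (IDW) and the four accuracy measures named above. It is not the ArcGIS implementation. The station coordinates and temperatures are made up, and reading model efficiency (E) as the Nash-Sutcliffe coefficient and bias as the mean of predicted minus observed are assumptions.

```python
import math


def idw(x, y, stations, power=2.0):
    """Inverse distance weighted estimate at (x, y).
    stations is a list of (sx, sy, value) tuples."""
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a station: return its value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def accuracy_metrics(observed, predicted):
    """RMSE, RMSE as % of the observed mean, efficiency (assumed to be
    Nash-Sutcliffe), and bias (assumed mean of predicted - observed)."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "rmse": rmse,
        "rmse_pct": 100.0 * rmse / mean_obs,
        "efficiency": 1.0 - ss_res / ss_tot,
        "bias": sum(errors) / n,
    }


# Hypothetical stations on a unit square with made-up temperatures.
stations = [(0, 0, 10.0), (1, 0, 12.0), (0, 1, 14.0), (1, 1, 16.0)]
centre = idw(0.5, 0.5, stations)
at_station = idw(0, 0, stations)
metrics = accuracy_metrics([10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 13.0, 16.0])
```

In a study like this one, the predicted values would typically come from cross-validation, where each station is withheld in turn and estimated from the others.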
In this paper, we propose a method for estimating missing values of the explanatory variables in a non-parametric multiple regression model and compare it with arithmetic mean imputation. The idea of the method is to exploit the causal relationship between the variables to obtain an efficient estimate of the missing value. We rely on the Nadaraya-Watson kernel estimator, with the bandwidth estimated by Least Squares Cross-Validation (LSCV), and we use a simulation study to compare the two methods.
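One way to picture the comparison described above is the sketch below. It is an assumed reading of the approach, not the authors' procedure: a missing explanatory value is predicted from the response with a Nadaraya-Watson estimator, the bandwidth is picked from a small grid by leave-one-out least squares cross-validation, and the result is compared with mean imputation. The toy data, the bandwidth grid, and all function names are hypothetical.

```python
import math


def nw(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel."""
    ws = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)


def lscv_score(xs, ys, h):
    """Leave-one-out mean squared prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (ys[i] - nw(xs[i], rest_x, rest_y, h)) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, grid):
    """Pick the grid value with the smallest cross-validation score."""
    return min(grid, key=lambda h: lscv_score(xs, ys, h))


# Toy data (made up): explanatory x is linked to the response by x = 2*y + 1.
response = [0.25 * i for i in range(20)]
true_x = [2.0 * y + 1.0 for y in response]
missing = {5, 12}  # rows where x is treated as missing

obs_y = [y for i, y in enumerate(response) if i not in missing]
obs_x = [x for i, x in enumerate(true_x) if i not in missing]

grid = [0.2, 0.4, 0.8, 1.6]
h_best = select_bandwidth(obs_y, obs_x, grid)

# Kernel imputation uses the relationship; mean imputation ignores it.
kernel_imputed = {i: nw(response[i], obs_y, obs_x, h_best) for i in sorted(missing)}
mean_imputed = sum(obs_x) / len(obs_x)
```

On this toy example the kernel-based values land closer to the true ones than the overall mean does, because the mean ignores where each incomplete row sits along the relationship.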
The purpose of this study is to lay down an organizational framework suited to Iraqi construction organizations that are in charge of carrying out multiple construction projects, one that fulfills management and supervision requirements so that projects are kept under control in terms of cost, time, and quality. Based on the primary information and observed data collected, the study's thesis was formulated as follows: Iraqi construction sector bodies that are in charge of implementing multiple construction projects simultaneously need to reformulate their organizational structure so that it is better suited to the management and control of these projects. This thesis includes a theoretical part presenting the most important resources locally and int...
The world is moving towards a knowledge economy, which depends fundamentally on knowledge and information. Economic units therefore need to develop their financial reporting systems so that they provide investors with useful, timely information that accords with the requirements of the knowledge economy and meets those investors' needs. This research aims to reveal the effects of the knowledge economy on approaches to financial reporting and to propose a financial reporting model for the knowledge economy environment, based on combining the value approach with the events approach using database and communication technology and on providing useful accounting information for all users regardless of ...
The proposed Makhool Dam project is considered one of the strategic projects in Iraq, as it would secure a large reserve of water during flood seasons, increase the storage capacity of the dams in Iraq, and improve food security. The Makhool Dam is located on the Tigris River in Salah al-Din Governorate, 8 km south of the confluence of the Tigris River with the Lower Zab River. The lake area is about 256 km². In this research, a mathematical model was prepared using the HEC-RAS two-dimensional software to analyze the velocity patterns and water depths inside the Makhool Dam reservoir at the highest operational water elevation, based on the designs prepared ...