In this paper, a new variable selection method is presented for selecting the essential variables from large datasets. The new model is a modified version of the Elastic Net model, and it is summarized in an algorithm. It is applied to the Leukemia dataset, which has 3051 variables (genes) and 72 samples. In practice, working with this kind of dataset is difficult because the number of variables is so much larger than the number of samples. The modified model is compared with some standard variable selection methods. The modified Elastic Net model gives the best performance of the methods compared and achieves perfect classification. All calculations for this paper were carried out in R using existing packages.
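For orientation, the standard (unmodified) Elastic Net can be sketched as follows. The abstract does not describe the modification, so this is not the authors' algorithm. It is a minimal coordinate descent for the usual objective (1/2n)||y - Xb||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2), written in Python rather than R. The names `soft_threshold`, `elastic_net_cd`, `lam`, and `alpha` are illustrative assumptions.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for the standard Elastic Net (no intercept).

    X is a list of rows, y a list of responses. alpha = 1 gives the lasso
    penalty, alpha = 0 the ridge penalty. This is NOT the modified model
    from the abstract, only the textbook starting point.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            col_sq = sum(c * c for c in col) / n
            if col_sq == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the partial residual
            # (the residual with coefficient j's contribution added back).
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam * alpha) / (col_sq + lam * (1.0 - alpha))
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new_bj
    return beta
```

Variables whose coefficient is shrunk exactly to zero are dropped, which is how a penalty of this type performs variable selection when there are far more variables than samples.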
The analysis of semi-parametric models is one of the most interesting subjects in recent studies because it gives efficient model estimation. The problem considered here is the case in which the response variable takes one of two values, 0 (no response) or 1 (response), which leads to the logistic regression model.
We compare two methods, Bayesian and . The results were then compared using the MSE criterion.
A simulation was used to study the empirical behavior of the logistic model with different sample sizes and variances. The results show that the Bayesian method is better than the at small sample sizes.
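The simulation design described above (binary responses from a logistic model, estimators compared by MSE over repeated samples) can be sketched roughly as follows. The abstract does not give the estimators or the design details, so this is an assumed setup: a single covariate, no intercept, and the names `simulate`, `mse`, `monte_carlo_mse`, and `sd_x`. The estimator is passed in as a function so that any method can be plugged in.

```python
import math
import random


def sigmoid(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def simulate(n, beta, sd_x, rng):
    """Draw n pairs (x, y) with x ~ N(0, sd_x^2) and P(y = 1 | x) = sigmoid(beta * x)."""
    xs = [rng.gauss(0.0, sd_x) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(beta * x) else 0 for x in xs]
    return xs, ys


def mse(estimates, true_value):
    """Mean squared error of a list of estimates around the true parameter."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def monte_carlo_mse(estimator, beta, n, sd_x, reps=200, seed=0):
    """Apply estimator(xs, ys) to `reps` simulated samples and return its MSE."""
    rng = random.Random(seed)
    estimates = [estimator(*simulate(n, beta, sd_x, rng)) for _ in range(reps)]
    return mse(estimates, beta)
```

Calling `monte_carlo_mse` for each candidate estimator over a grid of `n` and `sd_x` values gives a table of MSE values from which the methods can be ranked at each sample size.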
Sewer sediment deposition is an important issue because it relates to several operational and environmental problems. It concerns municipalities because it affects the sewer system and contributes to sewer failure, which has a catastrophic effect if it occurs in trunks or interceptors. Sewer rehabilitation is a costly process, and choosing the rehabilitation method and the individual sewers to be rehabilitated is complex. For such a complex process, inspection techniques assist in decision-making; however, they may add to the total expenditure of the project because they require special tools and trained personnel. For developing countries, inspection could prevent rehabilitation from proceeding. In this study, the researchers propos
In this paper, the Monte Carlo simulation method was used to compare the robust circular S estimator with the circular least squares method, both when the data contain no outliers and when outliers are present. Contamination was introduced in two ways: first, with high inflection points, representing contamination in the circular independent variable; and second, in the vertical variable, representing contamination in the circular dependent variable. Three comparison criteria were used: the median standard error (Median SE), the median of the mean squared error (Median MSE), and the median of the mean cosines of the circular residuals (Median A(k)). It was concluded that the method of least squares is better than the
This study aims to analyze the spatial distribution of the epidemic's spread and the role of physical, social, and economic characteristics in that spread. A geographically weighted regression (GWR) model was built within a GIS environment using infection data from Iraqi Ministry of Health records covering the 10 months from March to December 2020. The factors adopted in this model are the size of urban interaction areas and human gatherings, movement level and accessibility, and the volume of public services and facilities that attract people. The results show that it would be possible to deal with each administrative unit in proportion to its circumstances in light of the factors that appe
With the development of communication technologies for mobile devices and electronic communications, and with the world's move toward e-government, e-commerce, and e-banking, it has become necessary to protect these activities from intrusion and misuse, so it is important to design powerful and efficient systems for this purpose. In this paper, several varieties of the negative selection algorithm from artificial immune systems are used for misuse-type network intrusion detection: negative selection with real values, negative selection with fixed-radius detectors, and negative selection with variable-sized detectors. In each case the algorithm generates a set of detectors to distinguish the self-samples. Practica
The logistic regression model is one of the oldest and most common regression models, and it is a statistical method used to describe and estimate the relationship between a dependent random variable and explanatory random variables. Several methods are used to estimate this model, including the bootstrap method, which is based on the principle of sampling with replacement: each resample consists of (n) elements drawn at random, with replacement, from the (N) elements of the original data. It is a computational method for assessing the accuracy of estimated statistics, and for this reason it was used here to find more accurate estimates. The ma
Penalized regression models have received considerable attention for variable selection, and they play an essential role in dealing with high-dimensional data. The arctangent penalty, denoted Atan, has recently been used as an efficient method for both estimation and variable selection. However, the Atan penalty is very sensitive to outliers in the response variable and to heavy-tailed error distributions. Least absolute deviation, by contrast, is a good method for achieving robustness in regression estimation. The specific objective of this research is to propose a robust Atan estimator by combining these two ideas. Simulation experiments and real data applications show that the p
In this research, fuzzy nonparametric methods based on several smoothing techniques are applied to real data from the Iraqi stock market, specifically data on the Baghdad Company for Soft Drinks for the period 1/1/2016 to 31/12/2016. A sample of 148 observations was obtained in order to model the relationship between the stock prices (low, high, modal) and the traded value. Comparing the G.O.F. criterion across three techniques, we note that the lowest value was obtained for the K-Nearest Neighbor technique with a Gaussian function.
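One common form of k-nearest-neighbor smoothing with a Gaussian weight function is sketched below for ordinary (non-fuzzy) data. The abstract does not specify the estimator, so the details here are assumptions: the bandwidth at each point is taken to be the distance to the k-th nearest observation, and the fuzzy (low, high, modal) structure is left out. The name `knn_gaussian_smooth` is illustrative.

```python
import math


def knn_gaussian_smooth(x0, xs, ys, k):
    """Weighted average of ys around x0 with a k-NN bandwidth and Gaussian weights.

    The bandwidth h is the distance from x0 to its k-th nearest sample
    point, so it adapts to how dense the data are near x0.
    """
    if not 1 <= k <= len(xs):
        raise ValueError("k must be between 1 and the sample size")
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1]
    if h == 0.0:
        # x0 coincides with at least k sample points: average their responses.
        same = [y for x, y in zip(xs, ys) if x == x0]
        return sum(same) / len(same)
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
```

Evaluating the function on a grid of `x0` values gives the fitted curve, and a goodness-of-fit criterion can then be computed from the fitted and observed values for each choice of `k`.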
For some experiments, it is necessary to know how useful they are in order to decide whether to continue providing them. This is done through the fuzzy regression discontinuity model, where the Epanechnikov kernel and the triangular kernel were used to estimate the model, generating data from a Monte Carlo experiment and comparing the results obtained. It was found that the Epanechnikov kernel has the smallest mean squared error.
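The two kernels, and the way they enter a fuzzy regression discontinuity estimate, can be sketched as follows. The kernel formulas are the standard ones. The estimator is a simplified assumption: it uses kernel-weighted means on each side of the cutoff (local constant fits) and the usual ratio of the jump in the outcome to the jump in the treatment rate, whereas applied work typically uses local linear fits. The names `one_sided_mean`, `fuzzy_rd`, `cutoff`, and `h` are illustrative.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def triangular(u):
    """Triangular kernel: 1 - |u| on [-1, 1], zero elsewhere."""
    return 1.0 - abs(u) if abs(u) <= 1.0 else 0.0


def one_sided_mean(xs, vals, cutoff, h, kernel, right):
    """Kernel-weighted mean of vals using only points on one side of the cutoff."""
    num = den = 0.0
    for x, v in zip(xs, vals):
        if (x >= cutoff) == right:
            w = kernel((x - cutoff) / h)
            num += w * v
            den += w
    if den == 0.0:
        raise ValueError("no observations within the bandwidth on this side")
    return num / den


def fuzzy_rd(xs, ys, ds, cutoff, h, kernel):
    """Ratio of the outcome jump to the treatment-rate jump at the cutoff.

    xs: running variable, ys: outcomes, ds: treatment indicators (0 or 1).
    """
    jump_y = (one_sided_mean(xs, ys, cutoff, h, kernel, True)
              - one_sided_mean(xs, ys, cutoff, h, kernel, False))
    jump_d = (one_sided_mean(xs, ds, cutoff, h, kernel, True)
              - one_sided_mean(xs, ds, cutoff, h, kernel, False))
    if jump_d == 0.0:
        raise ValueError("treatment probability does not jump at the cutoff")
    return jump_y / jump_d
```

Running `fuzzy_rd` with `epanechnikov` and with `triangular` on the same simulated samples, and averaging the squared errors against the known effect, reproduces the kind of comparison the abstract describes.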