This research deals with a shrinkage method for principal components, similar to the one used in multiple regression, the Least Absolute Shrinkage and Selection Operator (LASSO). The goal is to form uncorrelated linear combinations from only a subset of the explanatory variables, which may suffer from multicollinearity, instead of using all K of them. The shrinkage forces some coefficients to equal zero by constraining them with a tuning parameter t, which balances bias against variance on the one hand and keeps the percentage of variance explained by the components at an acceptable level on the other. This is shown using the MSE criterion in the regression case and the percentage of explained variance in the principal components case.
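As a rough illustration of how an L1-type restriction drives some coefficients to exactly zero, the sketch below applies soft-thresholding to a single vector of component loadings and then rescales it to unit length. This is a simplified stand-in, not the constrained procedure described in the abstract: the function names, the penalty `lam` (playing a role loosely analogous to the tuning parameter t), and the example loadings are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(coefs, lam):
    """Shrink each coefficient toward zero by lam.

    Coefficients whose absolute value is at most lam become exactly zero.
    """
    coefs = np.asarray(coefs, dtype=float)
    return np.sign(coefs) * np.maximum(np.abs(coefs) - lam, 0.0)


def sparse_loading(loadings, lam):
    """Soft-threshold a loading vector, then rescale it to unit length.

    Hypothetical helper: a crude approximation of a sparse component,
    not the optimization problem solved in the paper.
    """
    shrunk = soft_threshold(loadings, lam)
    norm = np.linalg.norm(shrunk)
    return shrunk / norm if norm > 0 else shrunk


# Made-up loadings for K = 5 explanatory variables.
loadings = np.array([0.70, -0.55, 0.40, 0.15, -0.05])
sparse = sparse_loading(loadings, lam=0.2)

# Number of variables that still contribute to the component.
n_active = int(np.count_nonzero(sparse))
```

A larger `lam` removes more variables from the component but typically lowers the variance it explains, which is the trade-off the tuning parameter is meant to control.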
Currently, one of the topical areas of application of machine learning methods is the prediction of material characteristics. The aim of this work is to develop machine learning models for determining the rheological properties of polymers from experimental stress relaxation curves. The paper presents an overview of the main directions of metaheuristic approaches (local search, evolutionary algorithms) to solving combinatorial optimization problems. Metaheuristic algorithms for solving some important combinatorial optimization problems are described, with special emphasis on the construction of decision trees. A comparative analysis of algorithms for solving the regression problem in CatBoost Regressor has been carried out. The object of
Coagulation is the most important process in drinking water treatment. Alum coagulant increases aluminum residuals, which many studies have linked to Alzheimer's disease. It is therefore very important to use it at the optimal dose. In this paper, four sets of experiments were done to determine the relationship between raw water characteristics (turbidity, pH, alkalinity, temperature) and the optimum dose of alum [ .14 O], in order to form a mathematical equation that could replace the need for jar test experiments. The experiments were performed under different conditions and in different seasons. The optimal dose in every set was determined and used to build a gene expression programming (GEP) model. The models were co
The analysis of semi-parametric models is one of the most interesting subjects in recent studies because it gives efficient model estimation. The problem considered here arises when the response variable takes one of two values, either 0 (no response) or 1 (response), which is called the logistic regression model.
We compare two methods, Bayesian and . The results were then compared using the MSE criterion.
A simulation was used to study the empirical behavior of the logistic model with different sample sizes and variances. The results using show that the Bayesian method is better than the at small sample sizes.
In this article, four samples of HgBa2Ca2Cu2.4Ag0.6O8+δ were prepared and irradiated with different doses of gamma radiation: 6, 8, and 10 Mrad. The effects of gamma irradiation on the structure of the HgBa2Ca2Cu2.4Ag0.6O8+δ samples were characterized using X-ray diffraction. It was concluded that there effect on structure by gamma irradiation. The Scherrer, crystallization, and Williamson equations were applied to the X-ray diffraction patterns for all gamma doses to calculate crystal size, strain, and degree of crystallinity. I
This research analyzed murder crime data in Iraq in its temporal and spatial dimensions. It then focused on building a new model, with an algorithm that combines the characteristics of time series and spatial series, so that it can predict more accurately than the other models it is compared with. We call this the Combined Regression (CR) model. It merges the time series regression model and the spatial regression model into a single model that can analyze data in both its temporal and spatial dimensions. Several models were used for comparison with the integrated model, namely Multiple Linear Regression (MLR), Decision Tree Regression (DTR), Random Forest Reg
Regression testing is expensive and therefore calls for optimization. Typically, optimizing test cases means selecting a reduced set or subset of test cases, or prioritizing the test cases so that potential faults are detected at an earlier phase. Many earlier studies relied on heuristic-dependent mechanisms to attain optimality while reducing or prioritizing test cases. Nevertheless, those studies lacked systematic procedures for handling the issue of tied test cases. Moreover, evolutionary algorithms such as the genetic process often help in reducing test cases, together with a concurrent decrease in computational runtime. However, when examining the fault detection capacity along with other parameters is required, the method falls sh
Day-to-day management of water treatment plants carries considerable operational risk, so water companies, under increased pressure to remain competitive, are looking for ways to predict how the treatment processes may be improved. This study focused on the mathematical modeling of water treatment processes, with the primary motivation of providing tools that can predict treatment performance and so enable better control of uncertainty and risk. The research included choosing the most important variables affecting quality standards using the correlation test. According to this test, it was found that the important parameters of raw water: Total Hardn
Water quality planning relies on Biochemical Oxygen Demand (BOD), but BOD testing takes five days. Particle Swarm Optimization (PSO) is increasingly used for water resource forecasting. This work designed a PSO technique for estimating daily BOD at the inlet of the Al-Rustumiya wastewater treatment facility. The Al-Rustumiya wastewater treatment plant provided 702 plant-scale data sets from 2012 to 2022. The PSO model uses daily data on water quality parameters, including chemical oxygen demand (COD), chloride (Cl-), suspended solids (SS), total dissolved solids (TDS), and pH, to determine how each variable affects the daily incoming BOD. PSO and multiple linear regression (MLR) findings are compared, and their perfor
Regression testing is a crucial phase in the software development lifecycle that ensures new changes or updates to the software system do not introduce defects or adversely affect existing functionality. However, as software systems grow in complexity, the number of test cases in the regression suite can become large, which results in more testing time and resource consumption. In addition, the presence of redundant and faulty test cases may reduce the efficiency of the regression testing process. Therefore, this paper presents a new Hybrid Framework to Exclude Similar & Faulty Test Cases in Regression Testing (ETCPM) that utilizes automated code analysis techniques and historical test execution data to
Regression analysis is a cornerstone of statistics and mostly relies on the ordinary least squares method. As is well known, this method requires several conditions to hold in order to work accurately; otherwise the results can be unreliable, and the failure of certain conditions can make it impossible to complete the analysis. Among those conditions is the absence of multicollinearity, and we detect that problem among the independent variables using the Farrar-Glauber test. A further condition is the linearity of the data, and when this last condition fails, resort has been made to the