Feature selection (FS) is the process of deciding which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that helps machine learning classifiers reduce error rates, computation time, and overfitting while improving classification accuracy. It has demonstrated its efficacy in many domains, including text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, a systematic, comprehensive review was conducted of the available studies on metaheuristic algorithms used for FS to improve TC. This paper contributes to the existing body of knowledge by answering four research questions (RQs): 1) What are the different FS approaches that apply metaheuristic algorithms to improve TC? 2) Does applying metaheuristic algorithms for TC lead to better accuracy than typical FS methods? 3) How effective are modified and hybridized metaheuristic algorithms for text FS problems? 4) What are the gaps in the current studies and their future directions? These RQs led to a study of recent works on metaheuristic-based FS methods, their contributions, and their limitations. A final list of thirty-seven (37) related articles was extracted and investigated in line with our RQs to generate new knowledge in the domain of study. Most of the reviewed papers addressed TC using metaheuristic algorithms based on wrapper and hybrid FS approaches. Future research should focus on hybrid FS approaches, as they handle complex optimization problems well and can open new research opportunities in this rapidly developing field.
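The wrapper FS idea the review centers on can be illustrated with a toy sketch: a binary feature mask is searched by a simple hill-climbing metaheuristic (a deliberately minimal stand-in for the genetic, swarm, and other algorithms surveyed), and each candidate subset is scored by the accuracy of a classifier trained on it. The nearest-centroid classifier and the synthetic data below are illustrative assumptions, not taken from any reviewed paper.

```python
import random

def accuracy(mask, X, y):
    """Wrapper evaluation: nearest-centroid accuracy on the selected features."""
    feats = [i for i, m in enumerate(mask) if m]
    if not feats:
        return 0.0
    # per-class centroids over the selected features only
    cents = {}
    for cls in set(y):
        rows = [X[j] for j in range(len(X)) if y[j] == cls]
        cents[cls] = [sum(r[i] for r in rows) / len(rows) for i in feats]
    correct = 0
    for xj, yj in zip(X, y):
        v = [xj[i] for i in feats]
        pred = min(cents,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(v, cents[c])))
        correct += pred == yj
    return correct / len(y)

def hill_climb_fs(X, y, iters=200, seed=0):
    """Flip one feature in/out per step; keep the move if accuracy does not drop."""
    rng = random.Random(seed)
    n = len(X[0])
    mask = [rng.random() < 0.5 for _ in range(n)]
    best = accuracy(mask, X, y)
    for _ in range(iters):
        i = rng.randrange(n)
        mask[i] = not mask[i]
        score = accuracy(mask, X, y)
        if score >= best:
            best = score          # accept the flip
        else:
            mask[i] = not mask[i]  # revert the flip
    return mask, best

# synthetic data: two informative features, two noise features
X = [[1, 0, 0.3, 0.9], [0.9, 0.1, 0.7, 0.2], [0.1, 1, 0.5, 0.4], [0, 0.9, 0.2, 0.8]]
y = [0, 0, 1, 1]
mask, score = hill_climb_fs(X, y)
print(mask, score)
```

Real metaheuristic FS replaces the single-flip move with population-based operators (crossover, velocity updates, etc.), but the wrapper loop, namely evaluating each candidate subset by training accuracy, is the same.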
The purpose of this study is to measure the level of quality control of some crude oil products in Iraqi refineries, and how close it is to international standards, through the application of statistical methods to the quality control of oil products in Iraqi refineries. The responses of the study sample, a group of employees of Iraqi refineries (the Al-Dora, Al-Nasiriyah, and Al-Basra refineries), were analyzed with respect to the principles of quality management control and according to different personal characteristics (gender, age, academic qualification, number of years of experience, job level). To achieve the objectives of the study, a questionnaire of twelve (12) items was used to collect preliminary information …
As a result of global development and openness, companies can now provide their services beyond the spatial boundaries they once set for themselves, and improved means of communication have turned the world into a large global market that accommodates all products, from different regions, of the same type and production field. This has produced competition between companies and a race to obtain the largest market share, which secures the largest profits. It is therefore natural for companies' advertising to shift from promoting a single product to competitive advertising that calls on the recipient to abandon the competing product and switch to the advertised one.
In this paper, the methods of weighted residuals, namely the Collocation Method (CM), the Least Squares Method (LSM), and the Galerkin Method (GM), are used to solve the thin film flow (TFF) equation. The weighted residual methods were implemented to obtain an approximate solution to the TFF equation. The accuracy of the obtained results is checked by calculating the maximum error remainder functions (MER). Moreover, the outcomes were compared with the 4th-order Runge-Kutta method (RK4), and good agreement was achieved. All the evaluations were carried out using the computer algebra system Mathematica® 10.
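The collocation variant of the weighted-residual idea can be sketched on a simpler test problem than the TFF equation itself. The ODE below, the quadratic trial function, and the two collocation points are illustrative assumptions: we take y' + y = 0 with y(0) = 1 (exact solution e^(-x)), substitute the trial y ≈ 1 + a1·x + a2·x², and force the residual R(x) = y' + y to vanish at two points, which yields a 2×2 linear system for the coefficients.

```python
import math

def solve_2x2(A, b):
    """Cramer's rule for a 2x2 linear system."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det)

def residual_coeffs(x):
    # Residual of y' + y for y = 1 + a1*x + a2*x^2:
    #   R(x) = 1 + a1*(1 + x) + a2*(2x + x^2)
    # Setting R(x) = 0 gives the row (1+x, 2x+x^2) and right-hand side -1.
    return (1 + x, 2 * x + x * x), -1.0

pts = (1 / 3, 2 / 3)                       # collocation points (assumed choice)
rows, rhs = zip(*(residual_coeffs(x) for x in pts))
a1, a2 = solve_2x2(list(rows), list(rhs))

approx = lambda x: 1 + a1 * x + a2 * x * x
# maximum error over a grid, analogous to the MER check in the abstract
max_err = max(abs(approx(x) - math.exp(-x)) for x in [i / 10 for i in range(11)])
print(a1, a2, max_err)
```

LSM and GM differ only in the weighting: instead of point evaluation, the residual is weighted by its own coefficient derivatives (LSM) or by the trial basis functions (GM) and integrated over the domain.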
Classification of imbalanced data is an important issue. Many algorithms have been developed for classification, such as Back Propagation (BP) neural networks, decision trees, and Bayesian networks, and have been used repeatedly in many fields. These algorithms suffer from the problem of imbalanced data, where some classes have far more instances than others. Imbalanced data result in poor performance and a bias toward the majority class at the expense of the other classes. In this paper, we propose three techniques based on Over-Sampling (O.S.) for processing an imbalanced dataset, redistributing it, and converting it into a balanced dataset. These techniques are the Improved Synthetic Minority Over-Sampling Technique (Improved SMOTE), Border…
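The baseline SMOTE mechanism that the proposed techniques build on can be sketched as follows. This is a minimal illustration of the standard algorithm, not the paper's improved variants: each synthetic sample is interpolated between a random minority point and one of its k nearest minority neighbours; the toy points and k = 2 are assumptions.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_new):
        p = rng.choice(minority)
        # k nearest minority neighbours of p (excluding p itself)
        nbrs = sorted((q for q in minority if q is not p),
                      key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)))[:k]
        q = rng.choice(nbrs)
        t = rng.random()                 # interpolation factor in [0, 1)
        out.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return out

minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
synthetic = smote(minority, n_new=5)
print(synthetic)
```

Because every synthetic point lies on a segment between two minority points, the new samples stay inside the minority region rather than duplicating existing instances, which is what distinguishes SMOTE from naive random over-sampling.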
Mixture experiments suffer from high correlation and linear multicollinearity between the explanatory variables, owing to the unit-sum constraint and the interactions between the components of the model, which strengthens the links between the explanatory variables; this is indicated by the variance inflation factor (VIF). The L-pseudo-component transformation is used to reduce the correlation between the components of the mixture.
To estimate the parameters of the mixture model, we used methods that introduce bias in order to reduce variance, such as the Ridge Regression method and the Least Absolute Shrinkage and Selection Operator (LASSO) method …
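The bias-variance trade-off behind ridge regression can be shown numerically with its closed form, beta = (X'X + lam·I)^(-1) X'y, on a nearly collinear two-predictor design; the data below are an invented illustration, not the mixture data of the study, and LASSO is omitted because it has no closed form (it is usually fit by coordinate descent).

```python
def ridge_2d(X, y, lam):
    """Closed-form ridge estimate for two predictors (lam=0 gives OLS)."""
    # build X'X + lam*I and X'y
    s11 = sum(r[0] * r[0] for r in X) + lam
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X) + lam
    t1 = sum(r[0] * yi for r, yi in zip(X, y))
    t2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = s11 * s22 - s12 * s12
    return ((t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det)

# nearly collinear predictors: the second column almost equals the first
X = [(1.0, 1.01), (2.0, 1.98), (3.0, 3.02), (4.0, 3.97)]
y = [2.0, 4.0, 6.0, 8.0]
b_ols = ridge_2d(X, y, lam=0.0)    # unpenalized least squares
b_ridge = ridge_2d(X, y, lam=1.0)  # penalized: coefficients shrink toward zero
print(b_ols, b_ridge)
```

The ridge estimate has a smaller coefficient norm than OLS, which is exactly the bias that buys reduced variance under multicollinearity.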
Cefixime is an antibiotic useful for treating a variety of microorganism infections. In the present work, two rapid, specific, inexpensive, and nontoxic methods were proposed for cefixime determination: an area-under-curve spectrophotometric method and an HPLC method, for the micro-quantification of cefixime in highly pure form and in a local market formulation. The area under the curve (first technique) was used to quantify the cefixime peak with a UV-visible spectrophotometer. The HPLC (second technique) depended on the separation of cefixime by a C18 column, 250 mm (length) × 4.6 mm (diameter), using 50% methanol (organic modifier) and 50% deionized water as the mobile phase. Isocratic flow at a rate of 1 mL/min was applied; the temperature …
Fractal geometry is receiving increasing attention as a quantitative and qualitative model for describing natural phenomena, and it can form the basis of an effective classification technique when applied to satellite images. In this paper, a satellite image taken by QuickBird that contains different visible classes is used. After pre-processing, this image passes through two stages: segmentation and classification. The segmentation hybridizes two methods to produce effective results: the Quadtree method operating inside the Horizontal-Vertical method. The hybrid method segments the image into two rectangular blocks, either horizontally or vertically, depending on a spectral uniformity criterion …
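The quadtree component of such a scheme can be sketched in a few lines. This is a deliberately simplified stand-in for the paper's hybrid method: a square block is kept as a leaf when its pixel values are uniform enough (here, max minus min within an assumed threshold) and otherwise split into four quadrants recursively; the 4×4 test image is invented.

```python
def quadtree(img, x, y, size, thresh, out):
    """Recursively split a square block until each piece is spectrally uniform."""
    vals = [img[y + j][x + i] for j in range(size) for i in range(size)]
    if size == 1 or max(vals) - min(vals) <= thresh:
        out.append((x, y, size))   # uniform leaf block
        return
    h = size // 2                  # split into four quadrants
    for dx, dy in ((0, 0), (h, 0), (0, h), (h, h)):
        quadtree(img, x + dx, y + dy, h, thresh, out)

# 4x4 test image: uniform left half, noisy right half
img = [
    [10, 10, 200, 40],
    [10, 10, 90, 130],
    [10, 10, 5, 250],
    [10, 10, 60, 170],
]
blocks = []
quadtree(img, 0, 0, 4, 20, blocks)
print(blocks)
```

The uniform left half survives as two 2×2 leaves while the noisy right half is split down to single pixels, so the block list adapts its resolution to local spectral uniformity.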
Medical images have recently played a significant role in the diagnosis and detection of various diseases. Medical imaging provides a means of direct visualization through the human body, making it possible to observe small anatomical changes and biological processes associated with different biological and physical parameters. To achieve a more accurate and reliable diagnosis, a variety of computer-aided detection (CAD) and computer-aided diagnosis (CADx) approaches have been established to help with the interpretation of medical images. CAD has become one of the major research subjects in diagnostic radiology and medical imaging. In this work we study the improvement in the detection accuracy of a CAD system when comb…
This research studies the linear regression model when the random errors are autocorrelated and normally distributed. Linear regression analysis models the relationship between variables, and through this relationship the value of one variable can be predicted from the values of the others. Four methods (the least squares method, the unweighted average method, Theil's method, and the Laplace method) were compared using the mean square error (MSE) and simulation, and the study included four sample sizes (15, 30, 60, 100). The results showed that the least squares method is best. The four methods were then applied to data on buckwheat production and cultivated area in the provinces of Iraq for the years 2010, 2011, 2012, …
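A simulation comparison of this kind can be sketched for two of the four estimators: the OLS slope and Theil's estimator (the median of pairwise slopes). The true model y = 2 + 3x + e, the error variance, and the number of replications below are illustrative assumptions, not the study's design.

```python
import random
import statistics

def ols_slope(xs, ys):
    """Ordinary least squares slope estimate."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def theil_slope(xs, ys):
    """Theil's estimator: median of all pairwise slopes."""
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i in range(len(xs)) for j in range(i + 1, len(xs))]
    return statistics.median(slopes)

rng = random.Random(1)
n, reps, true_b = 15, 300, 3.0      # smallest sample size from the study
mse = {"OLS": 0.0, "Theil": 0.0}
for _ in range(reps):
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 + true_b * x + rng.gauss(0, 1) for x in xs]
    mse["OLS"] += (ols_slope(xs, ys) - true_b) ** 2 / reps
    mse["Theil"] += (theil_slope(xs, ys) - true_b) ** 2 / reps
print(mse)
```

Under normal errors OLS is the efficient estimator, which is consistent with the study's finding that least squares performed best; robust alternatives like Theil's method mainly pay off when the errors are heavy-tailed or contaminated by outliers.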