Mixed-effects conditional logistic regression is evidently more effective in the study of qualitative differences in longitudinal pollution data as well as their implications on heterogeneous subgroups. This study seeks that conditional logistic regression is a robust evaluation method for environmental studies, thru the analysis of environment pollution as a function of oil production and environmental factors. Consequently, it has been established theoretically that the primary objective of model selection in this research is to identify the candidate model that is optimal for the conditional design. The candidate model should achieve generalizability, goodness-of-fit, parsimony and establish equilibrium between bias and variability. In the practical sphere it is however more realistic to capture the most significant parameters of the research design through the best fitted candidate model for this research. Simulation studies demonstrate that the mixed-effects conditional logistic regression is more accurate for pollution studies, with fixed-effects conditional logistic regression models potentially generating flawed conclusions. This is because mixed-effects conditional logistic regression provides detailed insights on clusters that were largely overlooked by fixed-effects conditional logistic regression.
In this paper, we will study non parametric model when the response variable have missing data (non response) in observations it under missing mechanisms MCAR, then we suggest Kernel-Based Non-Parametric Single-Imputation instead of missing value and compare it with Nearest Neighbor Imputation by using the simulation about some difference models and with difference cases as the sample size, variance and rate of missing data.
Business organizations have faced many challenges in recent times, most important of which is information technology, because it is widely spread and easy to use. Its use has led to an increase in the amount of data that business organizations deal with an unprecedented manner. The amount of data available through the internet is a problem that many parties seek to find solutions for. Why is it available there in this huge amount randomly? Many expectations have revealed that in 2017, there will be devices connected to the internet estimated at three times the population of the Earth, and in 2015 more than one and a half billion gigabytes of data was transferred every minute globally. Thus, the so-called data mining emerged as a
... Show MoreThe efficiency of management is determining factor for the success or failure of agricultural projects generally and Livestock particularly achieving its objectives. Therefore, this research came to diagnose the most important variables that determine the efficiency of management using the probability regression models to measure the probability of management efficient of broilers production projects using random sample included (60) broilers projects represented 11.6% of Baghdad province (research community) in 2016. After estimating the relationship between the management efficiency (descriptive dependent variable) and the independent variables affecting it (age, educational level, production index (PI), experience). The results
... Show MoreIn this research, we find the Bayesian formulas and the estimation of Bayesian expectation for product system of Atlas Company. The units of the system have been examined by helping the technical staff at the company and by providing a real data the company which manufacturer the system. This real data include the failed units for each drawn sample, which represents the total number of the manufacturer units by the company system. We calculate the range for each estimator by using the Maximum Likelihood estimator. We obtain that the expectation-Bayesian estimation is better than the Bayesian estimator of the different partially samples which were drawn from the product system after it checked by the
... Show MoreIn this paper, we will discuss the performance of Bayesian computational approaches for estimating the parameters of a Logistic Regression model. Markov Chain Monte Carlo (MCMC) algorithms was the base estimation procedure. We present two algorithms: Random Walk Metropolis (RWM) and Hamiltonian Monte Carlo (HMC). We also applied these approaches to a real data set.
This research aims to analyze and simulate biochemical real test data for uncovering the relationships among the tests, and how each of them impacts others. The data were acquired from Iraqi private biochemical laboratory. However, these data have many dimensions with a high rate of null values, and big patient numbers. Then, several experiments have been applied on these data beginning with unsupervised techniques such as hierarchical clustering, and k-means, but the results were not clear. Then the preprocessing step performed, to make the dataset analyzable by supervised techniques such as Linear Discriminant Analysis (LDA), Classification And Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bays (NB
... Show MoreAmong the metaheuristic algorithms, population-based algorithms are an explorative search algorithm superior to the local search algorithm in terms of exploring the search space to find globally optimal solutions. However, the primary downside of such algorithms is their low exploitative capability, which prevents the expansion of the search space neighborhood for more optimal solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA is limited in terms of its premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and exploring the global regions in the search space. On the
... Show MoreResearchers have increased interest in recent years in determining the optimum sample size to obtain sufficient accuracy and estimation and to obtain high-precision parameters in order to evaluate a large number of tests in the field of diagnosis at the same time. In this research, two methods were used to determine the optimum sample size to estimate the parameters of high-dimensional data. These methods are the Bennett inequality method and the regression method. The nonlinear logistic regression model is estimated by the size of each sampling method in high-dimensional data using artificial intelligence, which is the method of artificial neural network (ANN) as it gives a high-precision estimate commensurate with the dat
... Show MoreNonlinear time series analysis is one of the most complex problems ; especially the nonlinear autoregressive with exogenous variable (NARX) .Then ; the problem of model identification and the correct orders determination considered the most important problem in the analysis of time series . In this paper , we proposed splines estimation method for model identification , then we used three criterions for the correct orders determination. Where ; proposed method used to estimate the additive splines for model identification , And the rank determination depends on the additive property to avoid the problem of curse dimensionally . The proposed method is one of the nonparametric methods , and the simulation results give a
... Show More