Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, rather than of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of the sparse distribution, a two-stage approach has been proposed in the literature. In the first stage, haplotypes are computationally inferred from genotypes and then grouped by haplotype coclassification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, it is labeled a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of false haplotypes, which hampers the detection of rare but true haplotypes. To address this issue, we propose an alternative approach. In Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from the risk genotypes estimated in Stage 1. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared with the existing multiple Z-test procedure, we find that the proposed procedure can increase the power of genome-wide association studies.
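The Stage 1 idea of fitting a finite mixture by EM can be illustrated with a minimal sketch. This is not the authors' model: it assumes a two-component binomial mixture over per-genotype case counts (a "non-risk" and a "risk" component), and it substitutes a simple median split for the paper's data partition-based initialization. The function names and the toy counts are hypothetical.

```python
import math


def binom_logpmf(k, n, p):
    """Log of the binomial pmf, with p clamped away from 0 and 1."""
    p = min(max(p, 1e-9), 1 - 1e-9)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def em_two_component(counts, n_iter=200, tol=1e-8):
    """Fit a two-component binomial mixture by EM.

    counts: list of (cases, total) pairs, one per genotype (assumed layout).
    Returns mixing weights w, case probabilities p, and responsibilities.
    """
    # Stand-in initialization: split genotypes at the median case proportion.
    props = sorted(k / n for k, n in counts)
    median = props[len(props) // 2]
    low = [(k, n) for k, n in counts if k / n < median] or counts
    high = [(k, n) for k, n in counts if k / n >= median] or counts
    p = [sum(k for k, _ in low) / sum(n for _, n in low),
         sum(k for k, _ in high) / sum(n for _, n in high)]
    w = [0.5, 0.5]
    prev = -math.inf
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each genotype.
        resp = []
        loglik = 0.0
        for k, n in counts:
            logs = [math.log(w[j]) + binom_logpmf(k, n, p[j]) for j in (0, 1)]
            m = max(logs)
            norm = m + math.log(sum(math.exp(v - m) for v in logs))
            loglik += norm
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: responsibility-weighted updates of weights and probabilities.
        for j in (0, 1):
            rj = sum(r[j] for r in resp)
            w[j] = max(rj / len(counts), 1e-12)
            denom = sum(r[j] * n for r, (_, n) in zip(resp, counts))
            if denom > 0:
                p[j] = sum(r[j] * k for r, (k, _) in zip(resp, counts)) / denom
        if loglik - prev < tol:
            break
        prev = loglik
    return w, p, resp


# Toy data: (cases, total carriers) per genotype; values are made up.
counts = [(45, 50), (40, 50), (90, 100), (25, 50), (50, 100), (20, 50), (30, 60)]
w, p, resp = em_two_component(counts)
# Genotypes whose responsibility for the higher-p component exceeds 0.5
# would be the candidate "risk genotypes" in this simplified setting.
risk = [i for i, r in enumerate(resp) if r[1] > 0.5]
```

EM can converge to a local optimum, so the result depends on the starting values, which is why the choice of initialization matters.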
It is well known that the rate of penetration (ROP) is a key quantity for drilling engineers because it is directly related to the final well cost. Reducing non-productive time by optimizing the drilling processes or drilling parameters is therefore a target of interest for all oil companies. These drilling parameters include mechanical parameters (RPM, WOB, flow rate, SPP, torque and hook load) and travel transit time. The main challenge in prediction is the complex interconnection between the drilling parameters, so artificial intelligence techniques were used in this study to predict ROP from operational drilling parameters and formation characteristics. In the current study, three AI techniques were used: neural network, fuzzy i
The research aims to develop physical exercises with auxiliary training tools that build the explosive power of the arms and legs, and then to determine their effect on the accuracy of free-throw and jump-shot shooting in advanced basketball players. The researchers identified a problem: these players are weak in free throws and in two-point jump shots as a result of adopting unsound physical and technical positions, which led to a lack of focus and accuracy and thus negatively affected free-throw and jump-shot technique. Most teams use traditional exercises without auxiliary training tools, and this topic gave researchers the
The goal of this research is to develop a numerical model that can be used to simulate the sedimentation process under two scenarios: first, with the flocculation unit in service, and second, with the flocculation unit out of commission. The general equations of flow and sediment transport were solved using the finite difference method and then coded in Matlab. The study found that the removal efficiencies of the coded model and the operational model were very close for each particle size dataset, with a difference of +3.01%, indicating that the model can be used to predict the removal efficiency of a rectangular sedimentation basin. The study also revealed
This research focuses on estimating the parameters of the Hypoexponential distribution using the maximum likelihood method and a genetic algorithm. Several criteria, including the mean squared error (MSE), were adopted to compare the methods in a simulation study.
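As a rough illustration of maximum likelihood estimation for this distribution, the sketch below assumes the two-phase Hypoexponential case with distinct rates, whose density is f(x) = λ1·λ2/(λ2 − λ1) · (exp(−λ1·x) − exp(−λ2·x)). It maximizes the log-likelihood with a coarse grid search, which stands in for both the numerical optimizer and the genetic algorithm of the study. The true rates, sample size, and grid are arbitrary assumptions.

```python
import math
import random


def hypoexp_loglik(data, lam1, lam2):
    """Log-likelihood of a two-phase hypoexponential sample (lam1 != lam2)."""
    lo, hi = sorted((lam1, lam2))
    const = math.log(lo) + math.log(hi) - math.log(hi - lo)
    # exp(-lo*x) - exp(-hi*x) = exp(-lo*x) * (1 - exp(-(hi - lo)*x))
    return sum(const - lo * x + math.log1p(-math.exp(-(hi - lo) * x))
               for x in data)


# Simulate data as the sum of two independent exponential phases.
random.seed(0)
true_rates = (1.0, 3.0)  # assumed values, for illustration only
data = [random.expovariate(true_rates[0]) + random.expovariate(true_rates[1])
        for _ in range(300)]

# Coarse grid search over ordered rate pairs (a < b).
grid = [0.1 * i for i in range(1, 51)]
best_ll, lam1_hat, lam2_hat = max(
    (hypoexp_loglik(data, a, b), a, b) for a in grid for b in grid if a < b
)
```

Repeating the simulation and fit over many replicates and averaging the squared errors of `lam1_hat` and `lam2_hat` would give an MSE of the kind the abstract uses for comparison.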
This study examined the correlation between binder-level fatigue properties and mixture-level cracking resistance in asphalt binders modified with five Nanomaterials (NMs): Nano-Silica (NS), Nano-Alumina (NA), and Nano-Titanium dioxide (NT) at 2%, 4%, and 6% as well as Nano-Zinc oxide (NZ) and Carbon Nanotubes (CNTs) at 1%, 2%, and 3%. Modified binders were subjected to Rolling Thin-Film Oven Test (RTFOT) and Pressure Aging Vessel (PAV) aging and tested at 25 °C using the Linear Amplitude Sweep (LAS) test to determine fatigue life (Nf) and the fatigue parameter G*.sin δ. The corresponding asphalt mixtures were evaluated using the IDEAL-CT test. The results indicated strong correlations between binder and mixture performance for
In this research, a factorial experiment (4*4) was studied, applied in a randomized complete block design with a given number of observations. The design of experiments is used to study the effect of treatments on experimental units and thus obtain data representing the observations of the experiment. Differences in applying these treatments under different environmental and experimental conditions cause noise that affects the observed values and thus increases the mean square error of the experiment. To reduce this noise, multiple wavelet reduction was used as a filter for the observations by suggesting an improved threshold that takes into account the different transformation levels based on the logarithm of the b
Corpus linguistics is a methodology for studying language through corpus-based research. It differs from the traditional (prescriptive) approach to studying a language in its insistence on the systematic study of authentic examples of language in use (the descriptive approach). A “corpus” is a large, structured, machine-readable collection of naturally occurring linguistic data, either written texts or transcriptions of recorded speech, which can be used as a starting point for linguistic description or as a means of verifying hypotheses about a language. In the past decade, interest has grown tremendously in the use of language corpora for language education. The ways in which corpora have been employed in language pedago
Transmitting and receiving data consume the most resources in Wireless Sensor Networks (WSNs). The energy supplied by the sensor node's battery is the most important resource affecting a WSN's lifespan. Because sensor nodes run on a limited battery, saving energy is necessary. Data aggregation is a procedure for eliminating redundant transmissions; it provides fused information to the base stations, which in turn improves energy efficiency and increases the lifespan of energy-constrained WSNs. In this paper, a Perceptually Important Points Based Data Aggregation (PIP-DA) method for Wireless Sensor Networks is suggested to reduce redundant data before sending them to the
Objective: This research investigates real breast cancer data for Iraqi women. The data were acquired manually from several Iraqi hospitals for early detection of breast cancer. Data mining techniques are used to discover hidden knowledge, unexpected patterns, and new rules in the dataset, which involves a large number of attributes. Methods: Data mining techniques handle redundant or simply irrelevant attributes in order to discover interesting patterns. The dataset is processed with the Weka (Waikato Environment for Knowledge Analysis) platform. The OneR technique is used as a machine learning classifier to evaluate the worth of each attribute according to the class value. Results: The evaluation is performed using