This research analyzes and simulates real biochemical test data to uncover the relationships among the tests and how each test affects the others. The data were acquired from a private biochemical laboratory in Iraq. The dataset has many dimensions, a high rate of null values, and a large number of patients. Several experiments were carried out on the data, beginning with unsupervised techniques such as hierarchical clustering and k-means, but the results were not clear. A preprocessing step was then performed to make the dataset analyzable by supervised techniques: Linear Discriminant Analysis (LDA), Classification and Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bayes (NB), and Support Vector Machine (SVM). Among the six supervised algorithms, CART gave clear results with high accuracy. It is worth noting that preprocessing this type of data takes considerable effort: the raw dataset had a null-value ratio of 94.8%, which fell to 0% after preprocessing. To apply the CART algorithm, several selected tests were treated as classes; the tests to treat as classes were chosen according to the accuracy they achieved. As a result, physicians can trace and connect test results with each other, which extends the impact on patients' health.
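The pipeline described in this abstract (measure the null ratio, remove nulls, then split on a test treated as the class) can be illustrated with a minimal Python sketch. The abstract does not say how nulls were removed or which CART implementation was used, so median imputation, the helper names `null_ratio`, `impute_median`, `gini`, and `best_split`, and the toy values below are all assumptions made for illustration only. The sketch finds a single Gini-based split, which is the building block of a CART tree, not a full tree.

```python
from statistics import median

def null_ratio(rows):
    """Fraction of cells that are None across the whole table."""
    cells = [v for row in rows for v in row]
    return sum(v is None for v in cells) / len(cells)

def impute_median(rows):
    """Replace None with the column median (assumed strategy, not from the paper).
    Raises statistics.StatisticsError if a column has no observed values."""
    cols = list(zip(*rows))
    fills = [median([v for v in col if v is not None]) for col in cols]
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature index, threshold, weighted Gini) of the best single split.
    Candidate thresholds are midpoints between consecutive distinct values."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

# Hypothetical toy table: two test results per patient, with missing cells.
# The labels stand in for a third test that is treated as the class.
raw = [[85, 28], [None, 30], [95, None], [160, 40], [None, 42], [170, None]]
labels = ["normal", "normal", "normal", "high", "high", "high"]

clean = impute_median(raw)
print("null ratio before:", null_ratio(raw))
print("null ratio after: ", null_ratio(clean))
print("best split (feature, threshold, weighted Gini):", best_split(clean, labels))
```

A full CART tree would apply `best_split` recursively to each side of the split until the nodes are pure or a depth limit is reached.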
Many people take protein supplements in an effort to gain muscle. However, there is some controversy as to whether this is really effective, and there is evidence suggesting that consuming high levels of protein may in fact have negative side effects on health. The current study included 29 young Iraqis building muscle, in two groups (those taking protein supplements and those not taking them; age range 17-31 years). The cases were selected from family, friends, college students, and gyms between November 2014 and March 2015. A careful history was obtained from each volunteer, including age, duration of sports activity, type of supplements, and family history of diseases. Some biochemical parameters, such as glucose, urea, uric acid, creatinine, bilirubin, serum protei
Multilocus haplotype analysis of candidate variants with genome-wide association studies (GWAS) data may provide evidence of association with disease, even when the individual loci themselves do not. Unfortunately, when a large number of candidate variants are investigated, identifying risk haplotypes can be very difficult. To meet this challenge, a number of approaches have been put forward in recent years. However, most of them are not directly linked to the disease penetrances of haplotypes and thus may not be efficient. To fill this gap, we propose a mixture model-based approach for detecting risk haplotypes. Under the mixture model, haplotypes are clustered directly according to their estimated d
Support vector machines (SVMs) are supervised learning models that analyze data for classification or regression. For classification, an SVM selects an optimal hyperplane that separates two classes. SVMs have very good accuracy and are extremely robust compared with some other classification methods such as logistic regression, random forest, k-nearest neighbor, and the naïve model. However, working with large datasets can cause problems such as long run times and inefficient results. In this paper, the SVM is modified by using a stochastic gradient descent process. The modified method, stochastic gradient descent SVM (SGD-SVM), is checked using two simulation datasets. Since the classification of different ca
Deficiencies in revenue-related accounting standards, both American and international, prompted the issuance of International Financial Reporting Standard IFRS 15, "Revenue from Contracts with Customers", as part of the convergence plan between the FASB and the International Accounting Standards Board (IASB), in line with the requirements of the joint project between the two boards. The standard aims to define the basis for reporting useful information to users of financial statements about the nature, amount, timing, and uncertainty of the revenues and cash flows arising from a contract with a customer. The standard is base
Survival analysis is a type of data analysis that describes the time until the occurrence of an event of interest, such as death or another event that is important in determining what will happen to the phenomenon studied. There may be more than one endpoint for the event, in which case the setting is called competing risks. The purpose of this research is to apply the dynamic approach to the analysis of discrete survival time in order to estimate the effect of covariates over time, and to model the nonlinear relationship between the covariates and the discrete hazard function using the multinomial logistic model and the multivariate Cox model. For the purpose of conducting the estimation process for both the discrete
In developing countries, individual students and researchers cannot afford the high subscription prices of international publishers such as JSTOR and ELSEVIER; therefore, the governments and/or universities of those countries aim to purchase one global subscription to the international publishers and provide their educational resources at a lower price, or even free of charge, to all students and researchers of those institutions. To realize this concept, we must build a system that sits between the publishers and the users (students or researchers) and acts as a gatekeeper and a director of information: this system must register its users and must have adequate security to e