Big data analysis is essential for modern applications in areas such as healthcare, assistive technology, intelligent transportation, environment and climate monitoring. Traditional algorithms in data mining and machine learning do not scale well with data size. Mining and learning from big data need time and memory efficient techniques, albeit the cost of possible loss in accuracy. We have developed a data aggregation structure to summarize data with large number of instances and data generated from multiple data sources. Data are aggregated at multiple resolutions and resolution provides a trade-off between efficiency and accuracy. The structure is built once, updated incrementally, and serves as a common data input for multiple mining and learning algorithms. Data mining algorithms are modified to accept the aggregated data as input. Hierarchical data aggregation serves as a paradigm under which novel …
Spatial data analysis is performed in order to remove the skewness, a measure of the asymmetry of the probablitiy distribution. It also improve the normality, a key concept of statistics from the concept of normal distribution “bell shape”, of the properties like improving the normality porosity, permeability and saturation which can be are visualized by using histograms. Three steps of spatial analysis are involved here; exploratory data analysis, variogram analysis and finally distributing the properties by using geostatistical algorithms for the properties. Mishrif Formation (unit MB1) in Nasiriya Oil Field was chosen to analyze and model the data for the first eight wells. The field is an anticline structure with northwest- south
... Show MoreIn this research, several estimators concerning the estimation are introduced. These estimators are closely related to the hazard function by using one of the nonparametric methods namely the kernel function for censored data type with varying bandwidth and kernel boundary. Two types of bandwidth are used: local bandwidth and global bandwidth. Moreover, four types of boundary kernel are used namely: Rectangle, Epanechnikov, Biquadratic and Triquadratic and the proposed function was employed with all kernel functions. Two different simulation techniques are also used for two experiments to compare these estimators. In most of the cases, the results have proved that the local bandwidth is the best for all the
... Show MoreAmplitude variation with offset (AVO) analysis is an 1 efficient tool for hydrocarbon detection and identification of elastic rock properties and fluid types. It has been applied in the present study using reprocessed pre-stack 2D seismic data (1992, Caulerpa) from north-west of the Bonaparte Basin, Australia. The AVO response along the 2D pre-stack seismic data in the Laminaria High NW shelf of Australia was also investigated. Three hypotheses were suggested to investigate the AVO behaviour of the amplitude anomalies in which three different factors; fluid substitution, porosity and thickness (Wedge model) were tested. The AVO models with the synthetic gathers were analysed using log information to find which of these is the
... Show MoreThe influx of data in bioinformatics is primarily in the form of DNA, RNA, and protein sequences. This condition places a significant burden on scientists and computers. Some genomics studies depend on clustering techniques to group similarly expressed genes into one cluster. Clustering is a type of unsupervised learning that can be used to divide unknown cluster data into clusters. The k-means and fuzzy c-means (FCM) algorithms are examples of algorithms that can be used for clustering. Consequently, clustering is a common approach that divides an input space into several homogeneous zones; it can be achieved using a variety of algorithms. This study used three models to cluster a brain tumor dataset. The first model uses FCM, whic
... Show MoreThe aim of this study is to estimate the parameters and reliability function for kumaraswamy distribution of this two positive parameter (a,b > 0), which is a continuous probability that has many characterstics with the beta distribution with extra advantages.
The shape of the function for this distribution and the most important characterstics are explained and estimated the two parameter (a,b) and the reliability function for this distribution by using the maximum likelihood method (MLE) and Bayes methods. simulation experiments are conducts to explain the behaviour of the estimation methods for different sizes depending on the mean squared error criterion the results show that the Bayes is bet
... Show MoreMerging biometrics with cryptography has become more familiar and a great scientific field was born for researchers. Biometrics adds distinctive property to the security systems, due biometrics is unique and individual features for every person. In this study, a new method is presented for ciphering data based on fingerprint features. This research is done by addressing plaintext message based on positions of extracted minutiae from fingerprint into a generated random text file regardless the size of data. The proposed method can be explained in three scenarios. In the first scenario the message was used inside random text directly at positions of minutiae in the second scenario the message was encrypted with a choosen word before ciphering
... Show MoreIn the last two decades, networks had been changed according to the rapid changing in its requirements. The current Data Center Networks have large number of hosts (tens or thousands) with special needs of bandwidth as the cloud network and the multimedia content computing is increased. The conventional Data Center Networks (DCNs) are highlighted by the increased number of users and bandwidth requirements which in turn have many implementation limitations. The current networking devices with its control and forwarding planes coupling result in network architectures are not suitable for dynamic computing and storage needs. Software Defined networking (SDN) is introduced to change this notion of traditional networks by decoupling control and
... Show MoreIn many oil-recovery systems, relative permeabilities (kr) are essential flow factors that affect fluid dispersion and output from petroleum resources. Traditionally, taking rock samples from the reservoir and performing suitable laboratory studies is required to get these crucial reservoir properties. Despite the fact that kr is a function of fluid saturation, it is now well established that pore shape and distribution, absolute permeability, wettability, interfacial tension (IFT), and saturation history all influence kr values. These rock/fluid characteristics vary greatly from one reservoir region to the next, and it would be impossible to make kr measurements in all of them. The unsteady-state approach was used to calculate the relat
... Show MoreThe development of information systems in recent years has contributed to various methods of gathering information to evaluate IS performance. The most common approach used to collect information is called the survey system. This method, however, suffers one major drawback. The decision makers consume considerable time to transform data from survey sheets to analytical programs. As such, this paper proposes a method called ‘survey algorithm based on R programming language’ or SABR, for data transformation from the survey sheets inside R environments by treating the arrangement of data as a relational format. R and Relational data format provide excellent opportunity to manage and analyse the accumulated data. Moreover, a survey syste
... Show More