Big data analysis has important applications in many areas such as sensor networks and connected healthcare. High volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provides a manageable data structure to hold a scalable summarization of data for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarization of big data and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms such as decision tree and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provide su the optimal split value.
The data presented in this paper are related to the research article entitled “Novel dichloro(bis{2-[1-(4-methylphenyl)-1H-1,2,3-triazol-4-yl-κN3 ]pyridine-κN})metal(II) coordination compounds of seven transition metals (Mn, Fe, Co, Ni, Cu, Zn and Cd)” (Conradie et al., 2018) [1]. This paper presents characterization and structural data of the 2-(1-(4-methyl-phenyl)-1H-1,2,3-triazol-1-yl)pyridine ligand (L2 ) (Tawfiq et al., 2014) [2] as well as seven dichloro(bis{2- [1-(4-methylphenyl)-1H-1,2,3-triazol-4-yl-κN3 ]pyridine-κN})metal (II) coordination compounds, [M(L2 )2Cl2], all containing the same ligand but coordinated to different metal ions. The data illustrate the shift in IR, UV/VIS, and NMR (for diamagnetic complexes) peaks wh
... Show MoreCadmium sulfide (CdS) nanocrystalline thin films have been prepared by chemical bath deposition (CBD) technique on commercial glass substrates at 70ºC temperature. Cadmium chloride (CdCl2) as a source of cadmium (Cd), thiourea (CS(NH2)2) as a source of sulfur and ammonia solution (NH4OH) were added to maintain the pH value of the solution at 10. The characterization of thin films was carried out through the structural and optical properties by X-ray diffraction (XRD) and UV-VIS spectroscopy. A UV-VIS optical spectroscopy study was carried out to determine the band gap of the nanocrystalline CdS thin film and it showed a blue shift with respect to the bulk value (from 3.9 - 2.4eV). In present w
... Show MoreToday, the role of cloud computing in our day-to-day lives is very prominent. The cloud computing paradigm makes it possible to provide demand-based resources. Cloud computing has changed the way that organizations manage resources due to their robustness, low cost, and pervasive nature. Data security is usually realized using different methods such as encryption. However, the privacy of data is another important challenge that should be considered when transporting, storing, and analyzing data in the public cloud. In this paper, a new method is proposed to track malicious users who use their private key to decrypt data in a system, share it with others and cause system information leakage. Security policies are also considered to be int
... Show MoreIn data mining, classification is a form of data analysis that can be used to extract models describing important data classes. Two of the well known algorithms used in data mining classification are Backpropagation Neural Network (BNN) and Naïve Bayesian (NB). This paper investigates the performance of these two classification methods using the Car Evaluation dataset. Two models were built for both algorithms and the results were compared. Our experimental results indicated that the BNN classifier yield higher accuracy as compared to the NB classifier but it is less efficient because it is time-consuming and difficult to analyze due to its black-box implementation.
Reliable data transfer and energy efficiency are the essential considerations for network performance in resource-constrained underwater environments. One of the efficient approaches for data routing in underwater wireless sensor networks (UWSNs) is clustering, in which the data packets are transferred from sensor nodes to the cluster head (CH). Data packets are then forwarded to a sink node in a single or multiple hops manners, which can possibly increase energy depletion of the CH as compared to other nodes. While several mechanisms have been proposed for cluster formation and CH selection to ensure efficient delivery of data packets, less attention has been given to massive data co
In this study, we made a comparison between LASSO & SCAD methods, which are two special methods for dealing with models in partial quantile regression. (Nadaraya & Watson Kernel) was used to estimate the non-parametric part ;in addition, the rule of thumb method was used to estimate the smoothing bandwidth (h). Penalty methods proved to be efficient in estimating the regression coefficients, but the SCAD method according to the mean squared error criterion (MSE) was the best after estimating the missing data using the mean imputation method
Modern civilization increasingly relies on sustainable and eco-friendly data centers as the core hubs of intelligent computing. However, these data centers, while vital, also face heightened vulnerability to hacking due to their role as the convergence points of numerous network connection nodes. Recognizing and addressing this vulnerability, particularly within the confines of green data centers, is a pressing concern. This paper proposes a novel approach to mitigate this threat by leveraging swarm intelligence techniques to detect prospective and hidden compromised devices within the data center environment. The core objective is to ensure sustainable intelligent computing through a colony strategy. The research primarily focusses on the
... Show MoreThis study aim to identify the concept of web based information systems since its one of the important topics that is usually omitted by our organizations, in addition to, designing a web based information system in order to manage the customers data of Al- Rasheed bank, as a unified information system that is specialized to the banking deals of the customers with the bank, and providing a suggested model to apply the virtual private network as a tool that is to protect the transmitted data through the web based information system.
This study is considered important because it deals with one of the vital topics nowadays, namely: how to make it possible to use a distributed informat
... Show MoreChemical pollution is a very important issue that people suffer from and it often affects the nature of health of society and the future of the health of future generations. Consequently, it must be considered in order to discover suitable models and find descriptions to predict the performance of it in the forthcoming years. Chemical pollution data in Iraq take a great scope and manifold sources and kinds, which brands it as Big Data that need to be studied using novel statistical methods. The research object on using Proposed Nonparametric Procedure NP Method to develop an (OCMT) test procedure to estimate parameters of linear regression model with large size of data (Big Data) which comprises many indicators associated with chemi
... Show More