Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have too little data, or inadequate data, to train DL frameworks. Labeled data usually has to be produced by manual labeling, which typically involves human annotators with extensive background knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework is usually fed a significant amount of labeled data so that it can learn representations automatically. Ultimately, a larger amount of data tends to produce a better DL model, although performance is also application dependent. This issue is the main barrier that leads many applications to dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that overcome three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity; for each application, several alternatives are proposed to generate more data. The applications include Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
This research studies fuzzy sets, one of the most modern concepts applied in various practical and theoretical areas and in many fields of life. It addresses the fuzzy random variable, whose values are not real numbers but fuzzy numbers, because it expresses vague or uncertain phenomena whose measurements are not precise. Fuzzy data were presented for the binocular test and for the analysis of variance method for fuzzy random variables. This method depends on a number of assumptions, which prevents its use when those assumptions are not satisfied.
Among the metaheuristic algorithms, population-based algorithms are explorative search algorithms that are superior to local search algorithms in exploring the search space to find globally optimal solutions. However, the primary downside of such algorithms is their low exploitative capability, which prevents the expansion of the search space neighborhood for more optimal solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA is prone to premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and to explore the global regions of the search space. On the ...
Great scientific progress has led to the widespread accumulation of information in large databases. It is therefore important to review and organize this vast amount of data in order to extract hidden information, or to classify the data according to their relationships with one another, so that they can be used for technical purposes.
Data mining (DM) is well suited to this task. This research examines the K-Means algorithm for clustering data in a practical application, in which the effect on the variables can be observed by changing the sample size (n) and the number of clusters (K) ...
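The kind of experiment described above, running K-Means while varying the sample size n and the number of clusters K, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names (`sq_dist`, `kmeans`, `synthetic_sample`), the synthetic three-group data, and the use of the within-cluster sum of squares (WCSS) as the observed quantity are all hypothetical choices made for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd-style K-Means; returns (centers, labels, wcss)."""
    rng = random.Random(seed)
    # Initial centers: k data points picked without replacement.
    centers = rng.sample(points, k)

    def assign(cs):
        # Index of the nearest center for every point.
        return [min(range(k), key=lambda j: sq_dist(p, cs[j])) for p in points]

    for _ in range(iters):
        labels = assign(centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New center is the coordinate-wise mean of the cluster members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers

    labels = assign(centers)
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, wcss


def synthetic_sample(n, seed=1):
    """Hypothetical 2-D data: n points cycling over three Gaussian groups."""
    rng = random.Random(seed)
    means = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    return [tuple(rng.gauss(m, 1.0) for m in means[i % 3]) for i in range(n)]


if __name__ == "__main__":
    # Observe how the fit changes with sample size n and cluster count K.
    for n in (30, 300):
        data = synthetic_sample(n)
        for k in (2, 3, 4):
            _, _, wcss = kmeans(data, k)
            print(f"n={n:4d}  K={k}  WCSS per point={wcss / n:.3f}")
```

Dividing WCSS by n makes runs with different sample sizes comparable; any other clustering quality measure could be substituted.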
The main objective of this study is to experimentally investigate the effect of the CMC polymeric drag reducer on the pressure drop occurring along the annulus of the wellbore during drilling operations, and to determine the optimum polymer concentration that gives the minimum pressure drop. A flow loop was designed for this purpose, 14 m long, with a transparent test section and a differential pressure transmitter that senses and measures the pressure losses along the test section. The experimental results show that increasing the polymer concentration helps to reduce the pressure drop in the annulus, and that the optimum polymer concentration, giving the maximum drag reduction, is 0.8 kg/m³. Also increasing in flow rate a ...
Big data of different types, such as texts and images, are rapidly generated from the internet and other applications. Dealing with this data using traditional methods is not practical, since it comes in various sizes and types and with varying processing-speed requirements. Data analytics has therefore become an essential tool for big data applications, because it analyzes and extracts only the meaningful information. This paper presents several innovative methods that use data analytics techniques to improve the analysis process and data management. Furthermore, this paper discusses how the revolution of data analytics based on artificial intelligence algorithms might provide ...
Today, problems of spatial data integration have been further complicated by the rapid development of communication technologies and the increasing number of data sources available on the World Wide Web. Web-based geospatial data sources can be managed by different communities, and the data themselves can vary with respect to quality, coverage, and purpose. Integrating such multiple geospatial datasets remains a challenge for geospatial data consumers. This paper concentrates on the integration of geometric and classification schemes for official data, such as Ordnance Survey (OS) national mapping data, with volunteered geographic information (VGI) data, such as the data derived from the OpenStreetMap (OSM) project. Useful descriptions o ...
External audit and the social responsibility of profit organizations. Recent times have been characterized by the activities of large economic organizations, because of the many transactions between these organizations and the different financial markets and the development of market techniques.
This encourages businessmen to increase their investment efforts in these markets. Because accounting, in general terms, is the language of these organizations' activities and translates them into factual numbers, there is a need for accounting records of certain aspects of these organizations' behavior and of their harmonization with their objectives.
In this respect the audit function comes to che ...
This paper presents the grey model GM(1,1), a first-order, single-variable model that is the basis of grey system theory. The research deals with the properties of the grey model and with a set of methods for estimating the parameters of GM(1,1): the least squares method (LS), the weighted least squares method (WLS), the total least squares method (TLS), and the gradient descent method (DS). These methods were compared using two criteria, the mean square error (MSE) and the mean absolute percentage error (MAPE). After comparison by simulation, the best method was applied to real data representing the consumption rate of two types of oil, heavy fuel oil (HFO) and diesel fuel (D.O), and several tests have been applied to ...
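As an illustration of the least squares case only, the sketch below fits the standard textbook GM(1,1) model, x0(k) + a·z(k) = b with z(k) the mean of consecutive cumulative sums, and scores the fit with MSE and MAPE. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`fit_gm11`, `predict_gm11`, `mse`, `mape`) and the sample series are hypothetical, the series is assumed to be positive with a nonzero development coefficient a, and the WLS, TLS, and gradient descent variants are not shown.

```python
import math


def fit_gm11(x0):
    """Least squares estimate of (a, b) in the GM(1,1) model x0(k) + a*z(k) = b."""
    # 1-AGO: accumulated (cumulative sum) series x1.
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values z(k): mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    m = len(z)
    # Ordinary least squares for y = slope*z + intercept, with slope = -a and intercept = b.
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    intercept = (sy - slope * sz) / m
    return -slope, intercept


def predict_gm11(first_value, a, b, steps):
    """Fitted/forecast values of the original series for k = 0 .. steps-1 (requires a != 0)."""
    c = first_value - b / a
    # Time response of the accumulated series, then inverse accumulation (differencing).
    x1_hat = [c * math.exp(-a * k) + b / a for k in range(steps)]
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]


def mse(actual, predicted):
    """Mean square error."""
    return sum((x - p) ** 2 for x, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((x - p) / x) for x, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical series, used only to exercise the functions.
    series = [100.0 * 1.1 ** k for k in range(6)]
    a, b = fit_gm11(series)
    fitted = predict_gm11(series[0], a, b, len(series))
    print(f"a={a:.6f}  b={b:.4f}  MSE={mse(series, fitted):.6f}  MAPE={mape(series, fitted):.4f}%")
```

Passing a `steps` value larger than the series length extends the same formula into a forecast.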
Pharmaceutical-instigated pollution is a major concern, especially in relation to aquatic environments and drugs such as meropenem antibiotics. Adsorbents, such as multi-walled carbon nanotubes, offer potential as a means of removing polluting meropenem antibiotics and other similar compounds from water. In order to evaluate the effectiveness of multi-walled carbon nanotubes in this capacity, various experimental parameters have been investigated, including contact time, initial concentration, pH, temperature, and the dose of adsorbent. The Langmuir and the Freundlich isotherm models have been used. The data obtained using a modified Langmuir model have been consistent with the experimental ones; the best pH value has been obtained to have the ...
Background: One of the drawbacks of vital teeth bleaching is color stability. The aim of the present study was to evaluate the effects of tea and tomato sauce on the color stability of bleached enamel in association with the application of MI Paste Plus (CPP-ACPF). Materials and Methods: Sixty enamel samples were bleached with 10% carbamide peroxide for two weeks, then divided into three groups (A, B and C) of 20 samples each. After bleaching, the samples of each group were subdivided into two subgroups (n=10). While subgroups A1, B1 and C1 were kept in distilled water, A2, B2, and C2 were treated with MI Paste Plus. Then, the samples were immersed in different solutions as follows: A1 and A2 in distilled water (control); B1 and B2 in black ...