Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have small or inadequate data to train DL frameworks. Usually, manual labeling is needed to provide labeled data, which typically involves human annotators with a vast background of knowledge. This annotation process is costly, time-consuming, and error-prone. Usually, every DL framework is fed by a significant amount of labeled data to automatically learn representations. Ultimately, a larger amount of data would generate a better DL model and its performance is also application dependent. This issue is the main barrier for many applications dismissing the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey on state-of-the-art techniques to deal with training DL models to overcome three challenges including small, imbalanced datasets, and lack of generalization. This survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to address the issue of lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). Then, these solutions were followed by some related tips about data acquisition needed prior to training purposes, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, several alternatives are proposed in order to generate more data in each application including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical system, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview on strategies to tackle data scarcity in DL.
An oil spill is a leakage of pipelines, vessels, oil rigs, or tankers that leads to the release of petroleum products into the marine environment or on land that happened naturally or due to human action, which resulted in severe damages and financial loss. Satellite imagery is one of the powerful tools currently utilized for capturing and getting vital information from the Earth's surface. But the complexity and the vast amount of data make it challenging and time-consuming for humans to process. However, with the advancement of deep learning techniques, the processes are now computerized for finding vital information using real-time satellite images. This paper applied three deep-learning algorithms for satellite image classification
... Show MoreThis paper addresses the nature of Spatial Data Infrastructure (SDI), considered as one of the most important concepts to ensure effective functioning in a modern society. It comprises a set of continually developing methods and procedures providing the geospatial base supporting a country’s governmental, environmental, economic, and social activities. In general, the SDI framework consists of the integration of various elements including standards, policies, networks, data, and end users and application areas. The transformation of previously paper-based map data into a digital format, the emergence of GIS, and the Internet and a host of online applications (e.g., environmental impact analysis, navigation, applications of VGI dat
... Show MoreThe importance of Public Relations activity has increased during the last half of the last century as a specialized and modern administrative function in most institutions. It has, moreover, become an integral part of activities of those institutions of various types, due to its pivotal role in building its reputation and drawing a good mental image among its audiences, as well as its influential and basic role in maintaining communication and the communication between its members at its various levels and their job tasks to ensure the greatest amount of understanding and to enhance trust between them. This is why public relations activity has become indispensable in all institutions, and without it, it is difficult to achieve any coordi
... Show MoreBreast cancer is highlighted in recent research as one of the most prevalent types of cancer. Timely identification is essential for enhancing patient results and decreasing fatality rates. Utilizing computer-assisted detection and diagnosis early on may greatly improve the chances of recovery by accurately predicting outcomes and developing suitable treatment plans. Grading breast cancer properly, especially evaluating nuclear atypia, is difficult owing to faults and inconsistencies in slide preparation and the intricate nature of tissue patterns. This work explores the capability of deep learning to extract characteristics from histopathology photos of breast cancer. The research introduces a new method called SMOTE-based Convolut
... Show More