Data scarcity is a major challenge when training deep learning (DL) models. DL demands large amounts of data to achieve exceptional performance. Unfortunately, many applications have only small or inadequate datasets for training DL frameworks. Manual labeling is usually needed to provide labeled data, and it typically requires human annotators with extensive domain knowledge; this annotation process is costly, time-consuming, and error-prone. Every DL framework must be fed a significant amount of labeled data to learn representations automatically, and in general more data yields a better DL model, although performance is also application dependent. This issue is the main barrier that leads many applications to dismiss DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are reviewed, including Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Networks (PINNs), and the Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, and for each application several alternatives are proposed for generating more data, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors' knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
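As an illustration of one of the surveyed techniques, the sketch below shows the interpolation idea behind SMOTE-style oversampling, which DeepSMOTE applies in the latent space of an encoder. This is a minimal NumPy sketch, not the implementation from any surveyed paper; the neighbourhood size `k` and the sampling loop are illustrative assumptions.

```python
import numpy as np

def smote_oversample(X_minority, n_new, k=5, rng=None):
    """SMOTE-style oversampling: synthesize new minority samples by
    interpolating between a sample and one of its k nearest neighbours.
    DeepSMOTE applies the same idea in an autoencoder's latent space."""
    rng = np.random.default_rng(rng)
    n = len(X_minority)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)
        # distances from sample i to every other minority sample
        d = np.linalg.norm(X_minority - X_minority[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]   # skip the sample itself
        j = rng.choice(neighbours)
        lam = rng.random()                    # interpolation factor in [0, 1)
        synthetic.append(X_minority[i] + lam * (X_minority[j] - X_minority[i]))
    return np.array(synthetic)

# usage: grow a toy minority class of 20 points by 80 synthetic samples
X_min = np.random.default_rng(0).normal(size=(20, 8))
X_aug = smote_oversample(X_min, n_new=80, k=5, rng=1)
print(X_aug.shape)  # (80, 8)
```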
NiO nanoparticles were synthesized by a chemical method and characterized by XRD, which gave a crystallite size of 11.72 nm, and by FESEM imaging, which gave a grain size of 13 nm; NiO microparticles were also used. Both forms of NiO were employed as additives to evaluate the possibility of producing photodegradable polymers, and the practical application of solid-phase photocatalytic degradation of polyvinyl chloride (PVC-NiO composite films) was investigated. PVC has a negative environmental impact because the polymer degrades slowly, yet it has a wide range of industrial applications and its consumption shows no sign of diminishing. Thus, modified PVC-NiO micro- and nano-composites were synthesized and studied at irradiation times of 0, 50, 100, 150, 200, 250, and 300 hours.
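For context, XRD crystallite sizes such as the 11.72 nm reported here are conventionally estimated with the Scherrer equation, D = Kλ/(β cos θ). The sketch below is a generic illustration of that calculation; the shape factor, Cu Kα wavelength default, and the peak values in the usage line are typical assumptions, not this study's data.

```python
import math

def scherrer_size(fwhm_deg, two_theta_deg, wavelength_nm=0.15406, K=0.9):
    """Crystallite size D = K*lambda / (beta * cos(theta)).
    beta is the diffraction peak FWHM in radians; the wavelength
    defaults to Cu K-alpha."""
    beta = math.radians(fwhm_deg)            # FWHM converted to radians
    theta = math.radians(two_theta_deg / 2)  # Bragg angle is half of 2-theta
    return K * wavelength_nm / (beta * math.cos(theta))

# usage with illustrative (assumed) peak values
print(f"D = {scherrer_size(fwhm_deg=0.75, two_theta_deg=43.3):.2f} nm")
```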
The aim of this research is to identify learning difficulties and their role in children's perception of self-concept. The researcher adopted a descriptive analytical approach in this study. A questionnaire was designed by the researcher to collect related information such as biodata and the family, health, diagnostic, and behavioral patterns of each case. In addition, the researcher adopted the learning difficulties scale for elementary school students prepared by Zaidan Ahmed Al-Sartawi (1995), as well as the student rating scale for screening learning difficulties in primary school students by Michael Best, which was translated into Arabic by Saeed Abdullah Debis. The researcher also adopted the Self-Concept scale
The objective of the present study is to determine the nature and direction of the correlation between mathematical excellence and learning styles as defined by the Entwistle model in fifth-grade scientific-track female students. The two researchers implemented a descriptive correlational approach to accomplish the research objectives. A scale was developed to assess the learning styles of the female students in the sample in accordance with the Entwistle model (knowledge, understanding, application, analysis, synthesis, evaluation, systematic thinking, and creativity), and the research community comprised the female students of the scientific fifth grade in the morning preparatory and secondary schools of the General Directorate
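A descriptive correlational design of this kind typically reports a Pearson coefficient, whose sign gives the direction and whose magnitude gives the strength of the relationship. The sketch below is a generic illustration with made-up paired scores, not the study's data.

```python
import numpy as np

# hypothetical paired scores: mathematical excellence vs. learning-style scale
math_scores = np.array([72, 85, 90, 64, 78, 88, 95, 70])
style_scores = np.array([60, 75, 82, 58, 70, 80, 91, 66])

# Pearson correlation: covariance normalized by both standard deviations
r = np.corrcoef(math_scores, style_scores)[0, 1]
print(f"Pearson r = {r:.3f}")  # sign = direction, magnitude = strength
```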
An analytical approach based on field data was used to determine the strength capacity of large-diameter bored piles. The deformations and settlements were also evaluated for both vertical and lateral loadings. The analytical predictions were compared with field data obtained from a prototype test pile at the Tharthar-Tigris canal bridge and were found to be in acceptable agreement, with a deviation of 12%.
Following ASTM standard D1143M-07e1 (2010), a test schedule of five loading cycles was proposed for vertical loads, along with a series of cyclic loads to simulate horizontal loading. The load test results and analytical data of 1.95
A two-time-step stochastic multi-variable, multi-site hydrological forecasting model was developed and verified using a case study. The philosophy of this model is to use the cross-variable correlations, cross-site correlations, and two-step time-lag correlations simultaneously to estimate the model parameters, which are then modified using the mutation process of a genetic algorithm optimization model. The objective function to be minimized is the Akaike test value. The case study covers four variables and three sites. The variables are monthly air temperature, humidity, precipitation, and evaporation; the sites are Sulaimania, Chwarta, and Penjwin, located in northern Iraq. The model performance was
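The Akaike value cited as the objective presumably refers to the Akaike Information Criterion (AIC). The sketch below shows a generic least-squares form of AIC together with a simple mutation step of the kind a genetic algorithm applies to candidate parameter vectors; the functional forms, mutation rate, and toy data are assumptions for illustration, not the paper's model.

```python
import numpy as np

def aic_least_squares(residuals, n_params):
    """AIC for a Gaussian least-squares fit: n*ln(RSS/n) + 2k."""
    n = len(residuals)
    rss = np.sum(np.square(residuals))
    return n * np.log(rss / n) + 2 * n_params

def mutate(params, rate=0.1, scale=0.05, rng=None):
    """GA mutation: perturb a random subset of parameters with Gaussian noise."""
    rng = np.random.default_rng(rng)
    mask = rng.random(params.shape) < rate
    return params + mask * rng.normal(0.0, scale, params.shape)

# usage: score a candidate parameter vector, then mutate it
rng = np.random.default_rng(0)
obs = rng.normal(size=120)              # stand-in for an observed series
pred = obs + rng.normal(0, 0.3, 120)    # stand-in for model forecasts
theta = rng.normal(size=10)             # candidate parameter vector
print(aic_least_squares(obs - pred, n_params=theta.size))
theta_new = mutate(theta, rng=1)        # candidate for the next generation
```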
It has increasingly been recognised that future developments in geospatial data handling will centre on geospatial data on the web: Volunteered Geographic Information (VGI). The evaluation of VGI data quality, including positional and shape similarity, has become a recurrent subject in the scientific literature over the last ten years. The OpenStreetMap (OSM) project is the most popular of the leading VGI platforms. It is an online geospatial database that produces and supplies free, editable geospatial datasets worldwide. The goal of this paper is to present a comprehensive overview of the quality assurance of OSM data. In addition, the credibility of open-source geospatial data is discussed, highlighting
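Positional similarity in such assessments is often summarized as the root-mean-square error of offsets between matched OSM features and reference features. The sketch below is a generic illustration with hypothetical coordinate pairs, not this paper's procedure.

```python
import numpy as np

def positional_rmse(osm_xy, ref_xy):
    """RMSE of planar offsets between matched OSM and reference points,
    assuming both datasets share the same projected coordinate system."""
    offsets = np.linalg.norm(osm_xy - ref_xy, axis=1)  # per-point distance
    return float(np.sqrt(np.mean(offsets ** 2)))

# usage with hypothetical matched points (projected coordinates in metres)
osm = np.array([[4500.2, 3120.9], [4512.7, 3098.4], [4531.0, 3140.2]])
ref = np.array([[4498.9, 3122.1], [4511.5, 3099.0], [4533.2, 3139.0]])
print(f"positional RMSE = {positional_rmse(osm, ref):.2f} m")
```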
Psychological research centers help indirectly connect professionals across the fields of human life, the job environment, family life, and psychological support for psychiatric patients. This research aims to detect job apathy patterns from the behavior of employee groups at the University of Baghdad and the Iraqi Ministry of Higher Education and Scientific Research. This investigation presents an approach using data mining techniques to acquire new knowledge, and it differs from statistical studies in supporting the researchers' evolving needs. These techniques manipulate redundant or irrelevant attributes to discover interesting patterns. The principal issue is identifying several important and affective questions taken from
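The abstract does not name a specific data mining algorithm, so the sketch below illustrates one step it alludes to, screening out redundant or irrelevant attributes, using mutual information scores from scikit-learn. The dataset, labels, and threshold are all assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

# hypothetical survey matrix: rows = employees, columns = questionnaire items
rng = np.random.default_rng(0)
X = rng.integers(1, 6, size=(200, 10)).astype(float)  # Likert-style answers
y = (X[:, 2] + X[:, 7] > 6).astype(int)               # stand-in apathy label

# score each attribute's relevance to the label; drop low-information ones
scores = mutual_info_classif(X, y, random_state=0)
keep = scores > 0.01                                   # assumed threshold
print("kept attributes:", np.where(keep)[0])
```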
Ground-based active optical sensors (GBAOS) have been successfully used in agriculture to predict crop yield potential (YP) early in the season and to refine nitrogen (N) rates for optimal crop yield. However, the models were found to be weak or inconsistent due to environmental variation, especially rainfall. The objectives of the study were to evaluate whether GBAOS could predict YP across multiple locations, soil types, cultivation systems, and rainfall differences. This study was carried out from 2011 to 2013 on corn (Zea mays L.) in North Dakota, and in 2017 on potatoes in Maine. Six N rates were used on 50 sites in North Dakota and 12 N rates on two sites, one dryland and one irrigated, in Maine. The two active GBAOS used for this study were the GreenSeeker and Holland
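GBAOS yield-potential models of this kind commonly regress yield on an in-season estimated yield (INSEY) index, sensor NDVI divided by the days (or growing degree days) from planting to sensing. The exponential form and the coefficients below are illustrative assumptions, not the study's fitted model.

```python
import numpy as np

def insey(ndvi, days_from_planting):
    """In-season estimated yield index: sensor NDVI per day of growth."""
    return ndvi / days_from_planting

def yield_potential(ndvi, days, a=0.5, b=250.0):
    """Common exponential YP model: YP = a * exp(b * INSEY).
    Coefficients a and b are placeholders; they must be fitted
    per region, crop, and season."""
    return a * np.exp(b * insey(ndvi, days))

# usage: NDVI of 0.65 sensed 55 days after planting (hypothetical reading)
print(f"YP = {yield_potential(0.65, 55):.2f} Mg/ha")
```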
Spatial data observed on a group of areal units is common in scientific applications. The usual hierarchical approach for modeling this kind of dataset is to introduce a spatial random effect with an autoregressive prior. However, the usual Markov chain Monte Carlo scheme for this hierarchical framework requires the spatial effects to be sampled from their full conditional posteriors one by one, resulting in poor mixing. More importantly, it makes the model computationally inefficient for datasets with a large number of units. In this article, we propose a Bayesian approach that uses the spectral structure of the adjacency matrix to construct a low-rank expansion for modeling spatial dependence. We propose a pair of computationally efficient estimation
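The low-rank idea here is to expand the spatial effect in a few leading eigenvectors of the areal adjacency matrix, so that only q ≪ n basis coefficients need to be sampled instead of n unit-level effects. The sketch below shows that construction on a toy lattice; the choice of rank, the chain graph, and the use of the largest eigenvalues are assumptions for illustration, not the article's exact estimator.

```python
import numpy as np

def lowrank_spatial_basis(A, q):
    """Return the q leading eigenvectors of a symmetric adjacency matrix A.
    The spatial effect is modeled as phi = U @ alpha with alpha of length q,
    reducing the sampling dimension from n areal units to q coefficients."""
    vals, vecs = np.linalg.eigh(A)        # eigh: symmetric eigendecomposition
    order = np.argsort(vals)[::-1]        # sort eigenvalues descending
    return vecs[:, order[:q]]

# usage: adjacency of a 1-D chain of 30 areal units (toy example)
n, q = 30, 5
A = np.zeros((n, n))
idx = np.arange(n - 1)
A[idx, idx + 1] = A[idx + 1, idx] = 1.0   # neighbouring units share an edge
U = lowrank_spatial_basis(A, q)
alpha = np.random.default_rng(0).normal(size=q)
phi = U @ alpha                            # low-rank spatial random effect
print(U.shape, phi.shape)                  # (30, 5) (30,)
```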