Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender offers a wide range of applications, spanning from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing precise age remains intricate due to age ambiguity. Specifically, utterances from individuals at adjacent ages are frequently indistinguishable. Addressing this, we propose a novel, end-to-end approach that deploys Mozilla's Common Voice dataset to transform raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These are then channeled into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results from the Common Voice dataset underscore the efficacy of our approach, showcasing an accuracy of 87% for male speakers, 91% for female speakers and 89% overall accuracy, and an accuracy of 99.1% for gender prediction.
Eye Detection is used in many applications like pattern recognition, biometric, surveillance system and many other systems. In this paper, a new method is presented to detect and extract the overall shape of one eye from image depending on two principles Helmholtz & Gestalt. According to the principle of perception by Helmholz, any observed geometric shape is perceptually "meaningful" if its repetition number is very small in image with random distribution. To achieve this goal, Gestalt Principle states that humans see things either through grouping its similar elements or recognize patterns. In general, according to Gestalt Principle, humans see things through genera
... Show MoreThe aim of this research is to diagnose the attention deficit hyperactivity disorder among primary school pupils in Baquba city of Diyala province. The sample of the study consisted of (25) male and female pupils. The American Guide of Attention Deficit Hyperactivity Scale (DSM-IV, 1994) was used in this study in addition to Conner’s (1996) scale to measure the attention deficit hyperactivity disorder for teachers and parents. The result revealed that (19) male and female pupils diagnosed with attention deficit hyperactivity to various degrees.
Attention-Deficit Hyperactivity Disorder (ADHD), a neurodevelopmental disorder affecting millions of people globally, is defined by symptoms of hyperactivity, impulsivity, and inattention that can significantly affect an individual's daily life. The diagnostic process for ADHD is complex, requiring a combination of clinical assessments and subjective evaluations. However, recent advances in artificial intelligence (AI) techniques have shown promise in predicting ADHD and providing an early diagnosis. In this study, we will explore the application of two AI techniques, K-Nearest Neighbors (KNN) and Adaptive Boosting (AdaBoost), in predicting ADHD using the Python programming language. The classification accuracies obtained w
... Show MoreClinical keratoconus (KCN) detection is a challenging and time-consuming task. In the diagnosis process, ophthalmologists must revise demographic and clinical ophthalmic examinations. The latter include slit-lamb, corneal topographic maps, and Pentacam indices (PI). We propose an Ensemble of Deep Transfer Learning (EDTL) based on corneal topographic maps. We consider four pretrained networks, SqueezeNet (SqN), AlexNet (AN), ShuffleNet (SfN), and MobileNet-v2 (MN), and fine-tune them on a dataset of KCN and normal cases, each including four topographic maps. We also consider a PI classifier. Then, our EDTL method combines the output probabilities of each of the five classifiers to obtain a decision b
Wildfire risk has globally increased during the past few years due to several factors. An efficient and fast response to wildfires is extremely important to reduce the damaging effect on humans and wildlife. This work introduces a methodology for designing an efficient machine learning system to detect wildfires using satellite imagery. A convolutional neural network (CNN) model is optimized to reduce the required computational resources. Due to the limitations of images containing fire and seasonal variations, an image augmentation process is used to develop adequate training samples for the change in the forest’s visual features and the seasonal wind direction at the study area during the fire season. The selected CNN model (Mob
... Show MoreIn pre- Islamic poetry, there are a lot of words that indicate
peacefulness of one sort of another, in addition to the inspirations of semantic
modeling in which the poet sets himself in various horizons.
Among these words: brother, comrade, friend, companion, lover,
people, prince, home, land, country, blessing, honesty, contract, company,
justice, thankfulness, forgiveness, pardoning, guest, goodness, faithfulness,
silence, death, peace,….
In addition, there are their derivatives from various aspects that indicate
peacefulness either directly or indirectly.
The paper pays attention to the polysemous words Harry Potter (HP). In this story, the present study exams some picking polysemic words to the extent that the translators of HP prevail to render the proposed significance as per the setting of the first content. Obviously, the picking translators in this examination were not mindful of the wonder of polysemy in the HP. They embrace a strict interpretation methodology to pass on the greater part of the polysemic sense. The method of data collection is divided into two stages. Firstly, determining the situational context of the fantasy and identifying the polysemic sense to clearly make all the contextual meanings of the source text. Secondly, reviewing the selected translation to
... Show MoreSpeech recognition is a very important field that can be used in many applications such as controlling to protect area, banking, transaction over telephone network database access service, voice email, investigations, House controlling and management ... etc. Speech recognition systems can be used in two modes: to identify a particular person or to verify a person’s claimed identity. The family speaker recognition is a modern field in the speaker recognition. Many family speakers have similarity in the characteristics and hard to identify between them. Today, the scope of speech recognition is limited to speech collected from cooperative users in real world office environments and without adverse microphone or channel impairments.
The growth of developments in machine learning, the image processing methods along with availability of the medical imaging data are taking a big increase in the utilization of machine learning strategies in the medical area. The utilization of neural networks, mainly, in recent days, the convolutional neural networks (CNN), have powerful descriptors for computer added diagnosis systems. Even so, there are several issues when work with medical images in which many of medical images possess a low-quality noise-to-signal (NSR) ratio compared to scenes obtained with a digital camera, that generally qualified a confusingly low spatial resolution and tends to make the contrast between different tissues of body are very low and it difficult to co
... Show More