Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing a precise age remains difficult because of age ambiguity: utterances from individuals of adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach, built on Mozilla's Common Voice dataset, that transforms raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These embeddings are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset underscore the efficacy of our approach, showing an accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, as well as an accuracy of 99.1% for gender prediction.
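The two loss functions named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names, the Gaussian soft-label construction (`soft_age_target`, `sigma`), and the default `gamma=2.0` are all assumptions chosen for illustration. The idea is that focal loss down-weights examples the model already classifies confidently, while a KL divergence loss against a smoothed target tolerates predictions that fall on adjacent age classes.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example; gamma=0 reduces to cross-entropy."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def soft_age_target(true_class, n_classes, sigma=1.0):
    """Hypothetical soft label: a Gaussian bump centred on the true age class."""
    weights = [math.exp(-((k - true_class) ** 2) / (2 * sigma ** 2))
               for k in range(n_classes)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(target, probs, eps=1e-12):
    """KL(target || probs) between two discrete distributions."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, probs) if t > 0)

# Example: 8 assumed age classes, true class 3, arbitrary model scores.
probs = softmax([0.1, 0.3, 1.2, 2.0, 1.1, 0.2, 0.0, -0.5])
print(focal_loss(probs, target=3))
print(kl_divergence(soft_age_target(3, 8), probs))
```

With a soft target, probability mass placed on classes 2 or 4 is penalized less than mass placed on distant classes, which is one way to reflect the ambiguity between adjacent ages.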
A case-control study was performed to examine age, gender, and ABO blood groups in 1014 Iraqi hospitalized cases with Coronavirus disease 2019 (COVID-19) and 901 blood donors (control group). The infection was molecularly diagnosed by detecting coronavirus RNA in nasal swabs of patients.
Mean age was significantly elevated in cases compared to controls (48.2 ± 13.8
In the present paper, three reliable iterative methods are presented and implemented to solve the 1D, 2D, and 3D Fisher's equation. The Daftardar-Jafari method (DJM), the Temimi-Ansari method (TAM), and the Banach contraction method (BCM) are applied to obtain exact and numerical solutions of Fisher's equations. These reliable iterative methods have many advantages: they are free of derivatives, they overcome the difficulty that arises when calculating the Adomian polynomials to deal with nonlinear terms in the Adomian decomposition method (ADM), they do not require calculating a Lagrange multiplier as in the variational iteration method (VIM), and there is no need to construct a homotopy as in the Homotopy perturbation method (H
Compressing speech reduces data storage requirements and, in turn, the time needed to transmit digitized speech over long-haul links such as the internet. To obtain the best performance in speech compression, wavelet transforms require filters that combine a number of desirable properties, such as orthogonality and symmetry. The MCT basis functions are derived from the GHM basis functions using 2D linear convolution. The fast computation algorithms introduced here add desirable features to the current transform. We further assess the performance of the MCT in a speech compression application. This paper discusses the effect of using the DWT and the MCT (in one and two dimensions) on speech compression. DWT and MCT performances in terms of comp
Classifying overlapping objects is one of the main challenges faced by researchers who work in object detection and recognition. Most available algorithms can only classify or recognize objects that are either individually separated from each other or appear as a single object in a scene, but not overlapping kitchen utensil objects. In this project, the Faster R-CNN and YOLOv5 algorithms were proposed to detect and classify overlapping objects in a kitchen area. YOLOv5 and Faster R-CNN were applied to overlapping objects, where the filters or kernels in a dedicated layer of the applied models are expected to be able to separate the overlapping objects. A kitchen utensil benchmark image database and
In this paper, two main stages for image classification are presented. The training stage consists of collecting images of interest and applying BOVW to these images (feature extraction and description using SIFT, and vocabulary generation), while the testing stage classifies a new unlabeled image using the nearest neighbor classification method on the feature descriptors. The supervised bag of visual words gives good results, which are shown clearly in the experimental part, where unlabeled images are classified even though only a small number of images are used in the training process.
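The two-stage pipeline described in this abstract can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the paper's code: SIFT extraction is skipped and descriptors are assumed to be already available as short lists of floats, the vocabulary is built with a plain k-means (the abstract does not name the clustering method), and nearest neighbor matching is assumed to operate on the resulting word histograms. All function names and the toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def build_vocabulary(descriptors, k, iterations=10):
    """Plain k-means over all training descriptors (needs len >= k)."""
    step = max(1, len(descriptors) // k)
    centers = [list(descriptors[i * step]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def bovw_histogram(descriptors, vocabulary):
    """Normalized count of how often each visual word occurs in one image."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def classify(descriptors, vocabulary, train_hists, train_labels):
    """Label of the training image whose histogram is closest."""
    h = bovw_histogram(descriptors, vocabulary)
    best = min(range(len(train_hists)), key=lambda i: sq_dist(h, train_hists[i]))
    return train_labels[best]

# Toy training set: one image per class, 2-D stand-ins for SIFT descriptors.
train = {"cup": [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]],
         "fork": [[10.0, 10.1], [9.8, 10.0], [10.1, 9.9]]}
labels = list(train)
vocab = build_vocabulary([d for lab in labels for d in train[lab]], k=2)
hists = [bovw_histogram(train[lab], vocab) for lab in labels]
prediction = classify([[0.05, 0.05], [0.1, 0.0]], vocab, hists, labels)
```

In a real setting the descriptors would be 128-dimensional SIFT vectors and the vocabulary would contain hundreds of words, but the structure of the two stages stays the same.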
Background: MicroRNAs (miRNAs) are small noncoding RNAs that post-transcriptionally regulate gene expression by targeting specific mRNAs. The main objective of this study was to measure the levels of salivary miRNAs (hsa-miR-200a, hsa-miR-125a, and hsa-miR-93) in both oral squamous cell carcinoma patients and healthy controls, and to assess their association with age, gender, and tumor grade. Materials and methods: The levels of three salivary microRNAs, namely hsa-miR-200a, hsa-miR-125a, and hsa-miR-93, were measured in the saliva of patients with oral squamous cell carcinoma and of healthy controls using reverse transcription, preamplification, and quantitative PCR. General information from each patient, including age, sex, and tumor grade, were record