BEYOND WORDS: HARNESSING SPEECH SOUND FOR SPEAKER AGE AND GENDER DETECTION USING 1D CNN ARCHITECTURE WITH SELF-ATTENTION MECHANISM

Umniah Hameed jaid

doi:10.5455/jjcit.71-1703265368

Details

Publication Date

Mon Jan 01 2024

Journal Name

Jordanian Journal Of Computers And Information Technology

Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing a precise age remains difficult because of age ambiguity: utterances from individuals of adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach that transforms raw audio from Mozilla's Common Voice dataset into high-quality feature representations using Wav2Vec2.0 embeddings. These embeddings are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset underscore the efficacy of our approach, with an age-estimation accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
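As a rough illustration of the two loss functions named in the abstract, the sketch below implements focal loss and a KL divergence loss against softened age labels in plain Python. The Gaussian softening of the labels, the `gamma` and `sigma` values, and all function names are assumptions made for illustration; the abstract does not say how the authors configured either loss.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: cross-entropy scaled by (1 - p_t)^gamma.

    Confidently classified samples (p_t near 1) contribute very little,
    so training concentrates on hard, ambiguous samples.
    gamma=2.0 is an assumed default, not a value taken from the paper.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def soft_age_labels(target, num_classes, sigma=1.0):
    """Assumed label softening: a Gaussian centred on the true age class.

    Neighbouring age classes receive part of the probability mass, which is
    one common way to encode that adjacent ages are hard to tell apart.
    """
    weights = [
        math.exp(-((k - target) ** 2) / (2.0 * sigma ** 2))
        for k in range(num_classes)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence_loss(logits, target_dist, eps=1e-12):
    """KL(target || predicted) between a soft label and the model output."""
    probs = softmax(logits)
    return sum(
        q * math.log((q + eps) / (p + eps))
        for q, p in zip(target_dist, probs)
        if q > 0.0
    )


if __name__ == "__main__":
    logits = [0.2, 1.5, 2.0, 0.3]  # hypothetical scores for 4 age classes
    print("focal loss:", focal_loss(logits, target=2))
    print("KL loss:", kl_divergence_loss(logits, soft_age_labels(2, 4)))
```

With `gamma=0`, the focal loss reduces to ordinary cross-entropy, which makes it easy to compare the two settings on the same predictions.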

Publication Date

Tue Dec 01 2020

Journal Name

Egyptian Journal Of Medical Human Genetics

Association between ABO blood groups and susceptibility to COVID-19: profile of age and gender in Iraqi patients

Ali J. R.

...Show More Authors

Abstract

Background

A case-control study was performed to examine age, gender, and ABO blood groups in 1014 Iraqi hospitalized cases with Coronavirus disease 2019 (COVID-19) and 901 blood donors (control group). The infection was molecularly diagnosed by detecting coronavirus RNA in nasal swabs of patients.

Results

Mean age was significantly elevated in cases compared to controls (48.2 ± 13.8 vs. 29.9 ± 9.0 years; probability [p] < 0.001). Receiver operating characteristic anal

Publication Date

Sat Sep 30 2023

Journal Name

Wasit Journal Of Computer And Mathematics Science

Real time handwriting recognition system using CNN algorithms

Maryam

...Show More Authors

Abstract: The growing use of digital technologies across various sectors and daily activities has made handwriting recognition a popular research topic. Handwriting remains relevant, yet people still need to convert handwritten copies into digital versions that can be stored and shared. Handwriting recognition is the computer's ability to identify and understand legible handwritten input from various sources, including documents, photographs, and others. Handwriting recognition poses a complex challenge because of the diversity of handwriting styles among individuals, especially in real-time applications. In this paper, an automatic system was designed for handwriting recognition

Publication Date

Wed Oct 09 2024

Journal Name

Engineering, Technology & Applied Science Research

Improving Pre-trained CNN-LSTM Models for Image Captioning with Hyper-Parameter Optimization

CNN pre-trained models

LSTM

activation function

hyper-parameters

overfitting

Nuha M.

Nada

...Show More Authors

The task of image captioning, which comprises automatic text generation that describes an image's visual information, has become feasible with developments in object recognition and image classification. Deep learning has received much interest from the scientific community and can be very useful in real-world applications. The proposed image captioning approach uses pre-trained Convolutional Neural Network (CNN) models combined with Long Short-Term Memory (LSTM) to generate image captions. The process includes two stages. The first stage entails training the CNN-LSTM models using baseline hyper-parameters, and the second stage encompasses training CNN-LSTM models by optimizing and adjusting the hyper-parameters of

Publication Date

Wed Jan 01 2020

Journal Name

Arab Journal Of Basic And Applied Sciences

Reliable iterative methods for 1D Swift–Hohenberg equation

Majeed A.

Othman Mahdi

...Show More Authors

Publication Date

Thu Jul 03 2025

Journal Name

2025 3rd International Conference On Cyber Resilience (ICCR)

Fine-Grained Emotion Recognition from Short Video Clips Using CNN-LSTM with Facial Action Heatmaps

Zahraa Haimeed

Haneen Siraj

Enas Ahmed

Mina Taha

Nourhan Ahmed

Azhaar A.

Wael Yahya

Inbithaq Ahmed

...Show More Authors

Publication Date

Mon Jan 04 2021

Journal Name

IIUM Engineering Journal

RELIABLE ITERATIVE METHODS FOR SOLVING 1D, 2D AND 3D FISHER’S EQUATION

Othman Mahdi

Majeed

...Show More Authors

In the present paper, three reliable iterative methods are given and implemented to solve the 1D, 2D and 3D Fisher's equation. The Daftardar-Jafari method (DJM), Temimi-Ansari method (TAM) and Banach contraction method (BCM) are applied to obtain exact and numerical solutions of Fisher's equations. The reliable iterative methods have many advantages: they are free of derivatives, they avoid the difficulty of calculating the Adomian polynomial boundaries to deal with nonlinear terms in the Adomian decomposition method (ADM), they do not require calculating the Lagrange multiplier as in the Variational iteration method (VIM), and there is no need to create a homotopy as in the Homotopy perturbation method (H

Publication Date

Mon Dec 31 2012

Journal Name

Al-Khwarizmi Engineering Journal

Speech Compression Using Multecirculerletet Transform

Sound

Speech Compression

MCT

DWT

Sulaiman

Ali. K.

...Show More Authors

Compressing speech reduces data storage requirements, which in turn reduces the time needed to transmit digitized speech over long-haul links such as the Internet. To obtain the best performance in speech compression, wavelet transforms require filters that combine a number of desirable properties, such as orthogonality and symmetry. The MCT basis functions are derived from the GHM basis functions using 2D linear convolution. The fast computation algorithm methods introduced here add desirable features to the current transform. We further assess the performance of the MCT in a speech compression application. This paper discusses the effect of using DWT and MCT (one and two dimension) on speech compression. DWT and MCT performances in terms of comp

Publication Date

Mon Jan 04 2021

Journal Name

Multimedia Tools And Applications

Attention enhancement system for college students with brain biofeedback signals based on virtual reality

Marwan Kadhim Mohammed

TianHan

Rana Kadhim

Song

...Show More Authors

Publication Date

Thu Dec 15 2016

Journal Name

Research Journal Of Applied Sciences, Engineering And Technology

Building Words Dictionary List Using Symbol Enumeration and Hashing Methodology

Safa

Loay

...Show More Authors

Publication Date

Fri May 29 2026

Journal Name

Journal Of Baghdad College Of Dentistry

Salivary microRNAs (hsa-miR-200a, hsa-miR-125a and hsa-miR-93) in relation to age, gender and histopathological parameters.

Shaimaa H

Raja H

Ban A

...Show More Authors

Background: MicroRNAs (miRNAs) are small noncoding RNAs that post-transcriptionally regulate gene expression by targeting specific mRNAs. The main objective of this study was to measure the levels of salivary hsa-miR-200a, hsa-miR-125a and hsa-miR-93 in both oral squamous cell carcinoma patients and healthy controls, and to assess their association with age, gender and tumor grade. Materials and methods: The levels of three salivary microRNAs, namely hsa-miR-200a, hsa-miR-125a and hsa-miR-93, were measured in the saliva of patients with oral squamous cell carcinoma and healthy controls using reverse transcription, preamplification and quantitative PCR. General information from each patient, including age, sex and tumor grade, were record
