This study explores the challenges in Artificial Intelligence (AI) systems in generating image captions, a task that requires effective integration of computer vision and natural language processing techniques. A comparative analysis between traditional approaches such as retrieval- based methods and linguistic templates) and modern approaches based on deep learning such as encoder-decoder models, attention mechanisms, and transformers). Theoretical results show that modern models perform better for the accuracy and the ability to generate more complex descriptions, while traditional methods outperform speed and simplicity. The paper proposes a hybrid framework that combines the advantages of both approaches, where conventional methods produce an initial description, which is then contextually, and refined using modern models. Preliminary estimates indicate that this approach could reduce the initial computational cost by up to 20% compared to relying entirely on deep models while maintaining high accuracy. The study recommends further research to develop effective coordination mechanisms between traditional and modern methods and to move to the experimental validation phase of the hybrid model in preparation for its application in environments that require a balance between speed and accuracy, such as real-time computer vision applications.
This paper presents a proposed method for (CBIR) from using Discrete Cosine Transform with Kekre Wavelet Transform (DCT/KWT), and Daubechies Wavelet Transform with Kekre Wavelet Transform (D4/KWT) to extract features for Distributed Database system where clients/server as a Star topology, client send the query image and server (which has the database) make all the work and then send the retrieval images to the client. A comparison between these two approaches: first DCT compare with DCT/KWT and second D4 compare with D4/KWT are made. The work experimented over the image database of 200 images of 4 categories and the performance of image retrieval with respect to two similarity measures namely Euclidian distance (ED) and sum of absolute diff
... Show MoreText based-image clustering (TBIC) is an insufficient approach for clustering related web images. It is a challenging task to abstract the visual features of images with the support of textual information in a database. In content-based image clustering (CBIC), image data are clustered on the foundation of specific features like texture, colors, boundaries, shapes. In this paper, an effective CBIC) technique is presented, which uses texture and statistical features of the images. The statistical features or moments of colors (mean, skewness, standard deviation, kurtosis, and variance) are extracted from the images. These features are collected in a one dimension array, and then genetic algorithm (GA) is applied for image clustering.
... Show MoreAn experimental study on a KIA pride (SAIPA 131) car model with scale of 1:14 in the wind tunnel was made beside the real car tests. Some of the modifications to passive flow control which are (vortex generator, spoiler and slice diffuser) were added to the car to reduce the drag force which its undesirable characteristic that increase fuel consumption and exhaust toxic gases. Two types of calculations were used to determine the drag force acting on the car body. Firstly, is by the integrating the values of pressure recorded along the pressure taps (for the wind tunnel and the real car testing), secondly, is by using one component balance device (wind tunnel testing) to measure the force. The results show that, the avera
... Show MoreAbstract:
The research aims to monitor the image of the man in the group (The Cart and the Rain) by the storyteller (Badiaa Amin); With the aim of highlighting an aspect of feminist writing, especially with regard to the relationship of women to men, and determining the form adopted by the storyteller in drawing the features of men.
The research used the descriptive-analytical method in the space of its textual formation, which aims to stand on the text and deconstruct its narrative significance. To provide a comprehensive picture of it.
Three images of the man appeared in the group's stories, represented by (the authoritarian, the negative, and the positive), and the image of the authoritarian ma
... Show MoreInformation hiding strategies have recently gained popularity in a variety of fields. Digital audio, video, and images are increasingly being labelled with distinct but undetectable marks that may contain a hidden copyright notice or serial number, or even directly help to prevent unauthorized duplication. This approach is extended to medical images by hiding secret information in them using the structure of a different file format. The hidden information may be related to the patient. In this paper, a method for hiding secret information in DICOM images is proposed based on Discrete Wavelet Transform (DWT). Firstly. segmented all slices of a 3D-image into a specific block size and collecting the host image depend on a generated key
... Show MoreDigital image is widely used in computer applications. This paper introduces a proposed method of image zooming based upon inverse slantlet transform and image scaling. Slantlet transform (SLT) is based on the principle of designing different filters for different scales.
First we apply SLT on color image, the idea of transform color image into slant, where large coefficients are mainly the signal and smaller one represent the noise. By suitably modifying these coefficients , using scaling up image by box and Bartlett filters so that the image scales up to 2X2 and then inverse slantlet transform from modifying coefficients using to the reconstructed image .
&nbs
... Show MoreSteganography is a mean of hiding information within a more obvious form of
communication. It exploits the use of host data to hide a piece of information in such a way
that it is imperceptible to human observer. The major goals of effective Steganography are
High Embedding Capacity, Imperceptibility and Robustness. This paper introduces a scheme
for hiding secret images that could be as much as 25% of the host image data. The proposed
algorithm uses orthogonal discrete cosine transform for host image. A scaling factor (a) in
frequency domain controls the quality of the stego images. Experimented results of secret
image recovery after applying JPEG coding to the stego-images are included.