This study explores the challenges that Artificial Intelligence (AI) systems face in generating image captions, a task that requires effective integration of computer vision and natural language processing techniques. It presents a comparative analysis of traditional approaches (such as retrieval-based methods and linguistic templates) and modern deep learning approaches (such as encoder-decoder models, attention mechanisms, and transformers). Theoretical results show that modern models perform better in accuracy and in the ability to generate more complex descriptions, while traditional methods are faster and simpler. The paper proposes a hybrid framework that combines the advantages of both approaches: conventional methods produce an initial description, which is then contextually refined using modern models. Preliminary estimates indicate that this approach could reduce the initial computational cost by up to 20% compared to relying entirely on deep models while maintaining high accuracy. The study recommends further research to develop effective coordination mechanisms between traditional and modern methods, and to move to experimental validation of the hybrid model in preparation for its application in environments that require a balance between speed and accuracy, such as real-time computer vision applications.
B Saleem, H Alwan, L Khalid, Journal of Engineering, 2011 - Cited by 2
Understanding the energy requirements of corn silage harvesters and applying precision agriculture techniques are essential for efficient and productive agricultural practices. This article reviews previous studies on the energy requirements of different corn silage harvesting machines, and presents methods for measuring corn silage productivity directly in the field and monitoring it using microcontrollers and artificial intelligence techniques. Corn silage is made by cutting green fodder plants into small pieces, which is done with special machines called corn silage harvesters. The purpose of harvesting corn silage is to efficiently collect and store as many digestible nutrien
This paper compares the direct and indirect georeferencing techniques in photogrammetry based on a simulation model. A flight plan is designed consisting of three strips, with nine overlapping images per strip, taken by a Canon 500D digital camera with a resolution of 15 megapixels.
The triangulation computations are carried out using ERDAS LPS software, and the direct measurements are taken directly on the simulated model as a substitute for the GPS/INS used in a real case. Two computational tests have been implemented to evaluate the positional accuracy for the whole model, and the Root Mean Square Error (RMSE) values for 30 check points show that th
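As an illustration of the accuracy measure named above, the following is a minimal sketch of how an RMSE over check points can be computed. The function names, the per-axis breakdown, and the combined 3D value are assumptions made for illustration; the abstract does not say how the authors aggregated the errors.

```python
import math

def rmse_per_axis(measured, reference):
    """Return (rmse_x, rmse_y, rmse_z) over paired 3D check points.

    Hypothetical helper: `measured` and `reference` are equal-length
    lists of (x, y, z) tuples in the same units.
    """
    if len(measured) != len(reference) or not measured:
        raise ValueError("need equal-length, non-empty point lists")
    n = len(measured)
    squared_sums = [0.0, 0.0, 0.0]
    for m, r in zip(measured, reference):
        for axis in range(3):
            squared_sums[axis] += (m[axis] - r[axis]) ** 2
    return tuple(math.sqrt(s / n) for s in squared_sums)

def rmse_total(measured, reference):
    """Combine the per-axis RMSE values into one 3D positional RMSE."""
    rx, ry, rz = rmse_per_axis(measured, reference)
    return math.sqrt(rx ** 2 + ry ** 2 + rz ** 2)
```

Calling `rmse_per_axis` with the coordinates estimated by each georeferencing technique and the known check-point coordinates gives one error triple per technique, which can then be compared directly.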
Generally, direct measurement of the soil compression index (Cc) is expensive and time-consuming. To save time and effort, indirect methods of obtaining Cc may be an inexpensive option. Usually, the indirect methods are based on a correlation with more easily measured descriptive variables such as liquid limit, soil density, and natural water content. This study used the ANFIS and regression methods to obtain Cc indirectly. To this end, 177 undisturbed samples were collected from the cohesive soil in Sulaymaniyah Governorate in Iraq. Results of this study indicated that ANFIS models outperformed the regression method in estimating Cc with R² of 0.66 and 0.48 for both ANFIS and Regre
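The comparison above rests on the coefficient of determination. A minimal sketch of that metric follows, assuming the standard definition (one minus the ratio of residual to total sum of squares); the function name is hypothetical, and the abstract does not state which variant of R² the authors report.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    `observed` holds measured values (for example laboratory Cc values);
    `predicted` holds the model estimates for the same samples.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need equal-length, non-empty sequences")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot
```

A value closer to 1 means the model explains more of the variance in the measured values, which is the sense in which a higher R² indicates a better estimate.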
One of the significant stages in computer vision is image segmentation, which is fundamental for different applications such as robot control and military target recognition, as well as image analysis in remote sensing applications. Studies have dealt with improving the classification of all types of data, whether text, audio, or images. In one of the latest studies, researchers built a simple, effective, and high-accuracy model capable of classifying emotions from speech data, while several other studies dealt with improving text clustering. In this study, we seek to improve classification in image segmentation using a novel approach that depends on two methods for segmenting the images. The first
Building a system to identify individuals through their speech recordings can find application in diverse areas, such as telephone shopping, voice mail, and security control. However, building such systems is a tricky task because of the vast range of differences in the human voice. Selecting strong features is therefore crucial for the recognition system. A speaker recognition system based on new spin-image descriptors (SISR) is proposed in this paper. In the proposed system, circular windows (spins) are extracted from the frequency domain of the spectrogram image of the sound, and a run-length matrix is then built for each spin to serve as a basis for feature extraction. Five different descriptors are generated fro
HM Al-Dabbas, RA Azeez, AE Ali, Iraqi Journal of Science, 2023