Image captioning, the automatic generation of text that describes an image's visual content, has become feasible with advances in object recognition and image classification. Deep learning has received much interest from the scientific community and can be very useful in real-world applications. The proposed image captioning approach combines pre-trained Convolutional Neural Network (CNN) models with Long Short-Term Memory (LSTM) networks to generate image captions. The process includes two stages. In the first stage, the CNN-LSTM models are trained using baseline hyper-parameters; in the second stage, they are trained again after optimizing and adjusting the hyper-parameters of the first stage. Improvements include the use of a new activation function, regular parameter tuning, and an improved learning rate in the later stages of training. The experimental results on the Flickr8k dataset showed a noticeable improvement in the second stage, with clear gains in the evaluation metrics BLEU-1 to BLEU-4, METEOR, and ROUGE-L. These gains confirmed the effectiveness of the changes and highlighted the importance of hyper-parameter tuning in improving the performance of CNN-LSTM models in image captioning tasks.
With the rapid development of smart devices, people's lives have become easier, especially for people with visual impairments or other special needs. Recent achievements in machine learning and deep learning help people identify and recognise their surrounding environment. In this study, the efficiency and high performance of deep learning architectures are used to build an image classification system for both indoor and outdoor environments. The proposed methodology starts with collecting two datasets (indoor and outdoor) from several separate datasets. In the second step, the collected dataset is split into training, validation, and test sets. The pre-trained GoogleNet and MobileNet-V2 models are trained using the indoor and outdoor se
Searching a binary codebook built by combining Block Truncation Coding (BTC) and Vector Quantization (VQ), i.e. a full codebook search for each input image vector to find the best-matched code word, takes a long time. Therefore, in this paper, after designing a small binary codebook, we adopted a new method that rotates each binary code word in this codebook through 90° to 270° in steps of 90°. Then, we organized the code words by angle into four types of binary codebooks (i.e. Pour when , Flat when , Vertical when, or Zigzag). The proposed scheme was used for decreasing the time of the coding procedure, with very small distortion per block, by designing s
The effect of using three different interpolation methods (nearest neighbour, linear and non-linear) on a 3D sinogram to restore the data missing when the angular difference is greater than 1° (the 1° case being considered the optimum 3D sinogram) is presented. Two reconstruction methods are adopted in this study: the back-projection method and the Fourier slice theorem method. The results show that the second reconstruction method is promising when combined with linear interpolation, provided the angular difference is less than 20°.
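As an illustration of the linear case, the following is a minimal sketch of filling in missing projection angles, not the authors' implementation. It assumes a single 2D sinogram slice stored as one row per measured angle, sampled at a regular angular step starting from 0°; the function name `fill_sinogram_linear` and its parameters are hypothetical.

```python
import numpy as np


def fill_sinogram_linear(sparse, step_deg):
    """Linearly interpolate a sparse sinogram along its angle axis.

    Assumptions (not from the source): `sparse` has shape
    (n_angles, n_detectors), with rows measured at 0, step_deg,
    2 * step_deg, ... degrees. Returns one row per whole degree from 0
    up to the last measured angle.
    """
    sparse = np.asarray(sparse, dtype=float)
    measured = np.arange(sparse.shape[0]) * step_deg
    target = np.arange(measured[-1] + 1)
    # np.interp handles 1-D data, so each detector column is
    # interpolated over the angle axis separately.
    columns = [
        np.interp(target, measured, sparse[:, j])
        for j in range(sparse.shape[1])
    ]
    return np.stack(columns, axis=1)


# Two projections taken 4 degrees apart, two detector bins each.
dense = fill_sinogram_linear([[0.0, 10.0], [4.0, 20.0]], step_deg=4)
```

Here `dense` has five rows (0° to 4°), and the intermediate rows lie on the straight line between the two measured projections. A full 3D sinogram could be handled by applying the same function slice by slice.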
The recent emergence of sophisticated Large Language Models (LLMs) such as GPT-4, Bard, and Bing has revolutionized the domain of scientific inquiry, particularly in the realm of large pre-trained vision-language models. This pivotal transformation is opening new frontiers in various fields, including image processing and digital media verification. At the heart of this evolution, our research focuses on the rapidly growing area of image authenticity verification, a field gaining immense relevance in the digital era. The study is specifically geared towards addressing the emerging challenge of distinguishing between authentic images and deep fakes – a task that has become critically important in a world increasingly reliant on digital med
This article investigates how an appropriate chaotic map (Logistic, Tent, Henon, Sine...) should be selected for image encryption, taking into consideration its advantages and disadvantages. Does the selection of an appropriate map depend on the image properties? The proposed system shows that relevant properties of the image influence the evaluation of the selected chaotic map. The first chapter discusses the main principles of chaos theory and its applicability to image encryption, including various kinds of chaotic maps and their mathematics. This research also explores the factors that determine the security and efficiency of such a map. Hence the approach offers a practical standpoint, to the extent that certain chaos maps will bec
Semantic segmentation is an exciting research topic in medical image analysis because it aims to detect objects in medical images. In recent years, approaches based on deep learning have shown more reliable performance than traditional approaches in medical image segmentation. The U-Net network is one of the most successful end-to-end convolutional neural networks (CNNs) presented for medical image segmentation. This paper proposes a multiscale residual dilated convolutional neural network (MSRD-UNet) based on U-Net. MSRD-UNet replaces the traditional convolution block with a novel deeper block that fuses multi-layer features using dilated and residual convolution. In addition, the squeeze-and-excitation attention mechanism (SE) and the s
Azo dyes like methyl orange (MO) are very toxic compounds whose recalcitrant properties make their removal from textile-industry wastewater a significant issue. The present study aimed to investigate their removal using aluminum and Ni foam (NiF) anodes together with Fe foam cathodes in an electrocoagulation (EC) system. Preliminary experiments were conducted using two Al anodes, two NiF anodes, or Al-NiF anodes to assess their advantages and drawbacks. It was concluded that the Al-NiF anodes were very effective in removing MO dye, without the long treatment time or the Ni leaching observed with the Al-Al or NiF-NiF anodes, respectively. The structure and surface morphology of the NiF electrode were inves
This study conducts a systematic comparative critical discourse analysis of news reports from prominent American (CNN) and Russian (RT) media sources covering the Russia-Ukraine conflict. Utilizing the theoretical frameworks of Norman Fairclough's multidimensional model and Teun van Dijk's socio-cognitive approach, the research examines the underlying ideological assumptions and discursive strategies employed by the two contrasting news channels. Quantitative analysis of discursive techniques and linguistic features provides insights into how each channel selectively utilizes language to convey distinct ideological positions. The findings demonstrate how media discourse constructs and normalizes particular ideological representations of pol
The active contour model has been used extensively in the segmentation and analysis of images. It has been employed effectively to identify contours in object recognition, computer graphics and vision, and the processing of both ordinary images and medical images such as Magnetic Resonance Images (MRI), X-rays, and ultrasound images. Kass, Witkin and Terzopoulos developed this energy-minimizing “Active Contour Model” (also known as the snake) in 1987. Snakes are curves defined in the image domain that can be moved by external forces arising from the image data and by internal forces arising from the curve itself. The present s