The need for an efficient method to find the most relevant document for a given search query has become crucial due to the exponential growth in the number of documents readily available on the web. The vector space model (VSM), a widely used model in information retrieval, represents documents as vectors in space and weights their terms via a popular weighting scheme known as term frequency-inverse document frequency (TF-IDF). In this research, a method is proposed to retrieve the most relevant document by representing documents and queries as vectors of average term frequency-inverse sentence frequency (TF-ISF) weights instead of vectors of TF-IDF weights; two simple and effective similarity measures, Cosine and Jaccard, were used. Using the MS MARCO dataset (Microsoft-curated data of Bing queries), this article analyzes and assesses the retrieval effectiveness of the TF-ISF weighting scheme. The results show that the TF-ISF model with the Cosine similarity measure retrieves more relevant documents. The model was evaluated against the conventional TF-IDF technique and performs significantly better on the MS MARCO data.
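A minimal sketch of the weighting and similarity measures described above. It assumes TF-ISF mirrors TF-IDF with sentences playing the role of documents, and that a document's vector is the average of its per-sentence weights; the function names and the simple whitespace tokenizer are illustrative, not the paper's implementation.

```python
import math
from collections import Counter

def tf_isf(doc_sentences):
    """Average TF-ISF weights for a document split into sentences.

    ISF(t) = log(N / n_t), where N is the number of sentences and
    n_t the number of sentences containing term t (TF-IDF with
    sentences in place of documents).
    """
    n = len(doc_sentences)
    sent_tokens = [s.lower().split() for s in doc_sentences]
    sf = Counter()                      # sentence frequency per term
    for toks in sent_tokens:
        sf.update(set(toks))
    weights = {}
    for toks in sent_tokens:
        tf = Counter(toks)
        for t, f in tf.items():
            isf = math.log(n / sf[t])
            weights[t] = weights.get(t, 0.0) + (f / len(toks)) * isf
    return {t: w / n for t, w in weights.items()}   # average over sentences

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def jaccard(u, v):
    """Set-based Jaccard similarity over the terms of the two vectors."""
    a, b = set(u), set(v)
    return len(a & b) / len(a | b) if a | b else 0.0
```

Either similarity can then be used to rank documents against a query vector built the same way.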
Zernike Moments have been widely used in shape-based image retrieval studies because of their powerful shape representation. However, their strengths and weaknesses have not been clearly highlighted in previous studies, so this representational power could not be fully exploited. In this paper, a method to fully capture the shape representation properties of Zernike Moments is implemented and tested on a single object in binary and grey-level images. The proposed method works by determining the boundary of the shape object and then resizing the object shape to the boundary of the image. Three case studies were conducted. Case 1 applies Zernike Moments to the original shape object image. In Case 2, the centroid of the s
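The normalization step described above can be sketched as follows: locate the object's bounding box and rescale the crop to fill the image frame, so the subsequent Zernike Moments see a position- and scale-normalized shape. This is an illustrative NumPy-only sketch (nearest-neighbour resize, nonzero pixels assumed to belong to the object), not the paper's implementation.

```python
import numpy as np

def normalize_shape(img, out_size=64):
    """Crop the object's bounding box and rescale it to the image boundary.

    Assumes `img` is a 2-D array in which nonzero pixels belong to the
    single object of interest.
    """
    ys, xs = np.nonzero(img)
    crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # nearest-neighbour resize of the crop to (out_size, out_size)
    ri = (np.arange(out_size) * crop.shape[0] / out_size).astype(int)
    ci = (np.arange(out_size) * crop.shape[1] / out_size).astype(int)
    return crop[np.ix_(ri, ci)]
```

Moments computed on the normalized image are then insensitive to where in the frame the object originally sat and how large it was.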
Document source identification in printer forensics involves determining the origin of a printed document based on characteristics such as the printer model, serial number, defects, or unique printing artifacts. This process is crucial in forensic investigations, particularly in cases involving counterfeit documents or unauthorized printing. However, consistent pattern identification across various printer types remains challenging, especially when efforts are made to alter printer-generated artifacts. Machine learning models are often used in these tasks, but selecting discriminative features while minimizing noise is essential. Traditional KNN classifiers require a careful selection of distance metrics to capture relevant printing
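The sensitivity of KNN to the choice of distance metric can be illustrated with a small pluggable-metric classifier. The feature vectors, labels, and metric names here are hypothetical; the sketch only shows the mechanism, not the paper's forensic feature set.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3, metric="euclidean"):
    """Classify a feature vector (e.g. printing-artifact features)
    by majority vote among its k nearest training neighbours."""
    if metric == "euclidean":
        d = np.linalg.norm(X_train - x, axis=1)
    elif metric == "manhattan":
        d = np.abs(X_train - x).sum(axis=1)
    else:
        raise ValueError(f"unknown metric: {metric}")
    nearest = np.argsort(d)[:k]
    labels, counts = np.unique(np.asarray(y_train)[nearest],
                               return_counts=True)
    return labels[np.argmax(counts)]
```

Swapping the metric changes which artifact dimensions dominate the neighbourhood, which is exactly why metric selection matters in this task.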
In the present investigation, bed porosity and solid holdup in a viscous three-phase inverse fluidized bed (TPIFB) are determined for aqueous solutions of carboxymethyl cellulose (CMC) using polyethylene and polypropylene particles of low density and 5 mm diameter in a vertical Perspex column of 9.2 cm inner diameter and 200 cm height. The effects of gas velocity Ug, liquid velocity UL, liquid viscosity μL, and particle density ρs on bed porosity BP and solid holdup εs were determined. Bed porosity increases with increasing gas velocity, liquid velocity, and liquid viscosity. Solid holdup decreases with increasing gas, liquid
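The two reported quantities are linked by the standard phase-holdup balance for a three-phase bed, εg + εl + εs = 1, so bed porosity (the void fraction available to gas and liquid) follows directly from the solid holdup. This is a generic identity, not a correlation from the study.

```python
def bed_porosity(solid_holdup):
    """Bed porosity = fraction of bed volume not occupied by solids."""
    return 1.0 - solid_holdup

def holdup_balance(eps_g, eps_l, eps_s, tol=1e-9):
    """Check the three-phase identity: gas + liquid + solid holdups = 1."""
    return abs(eps_g + eps_l + eps_s - 1.0) < tol
```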
Signature verification involves vague situations in which a signature could resemble many reference samples or might differ because of handwriting variance. By representing the features and similarity score produced by the matching algorithm as fuzzy sets and capturing the degrees of membership, non-membership, and indeterminacy, a neutrosophic engine can significantly aid signature verification by addressing the inherent uncertainties and ambiguities present in signatures. However, type-1 neutrosophic logic assigns these membership functions fixed values, which may not adequately capture the varying degrees of uncertainty in signature characteristics. Type-1 neutrosophic representation is also unable to adjust to various
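The type-1 neutrosophic representation described above can be sketched as a fixed (truth, indeterminacy, falsity) triple attached to a match score, with a crisp decision rule on top. The thresholds and field names are illustrative assumptions, not the paper's parameters; the fixed values are precisely the limitation the abstract criticizes.

```python
from dataclasses import dataclass

@dataclass
class NeutrosophicScore:
    truth: float          # degree the signature matches (membership)
    indeterminacy: float  # degree the evidence is ambiguous
    falsity: float        # degree the signature does not match (non-membership)

def verify(score, t_min=0.6, i_max=0.3, f_max=0.3):
    """Accept only when truth dominates and both indeterminacy and
    falsity stay below their (fixed, type-1) thresholds."""
    return (score.truth >= t_min and
            score.indeterminacy <= i_max and
            score.falsity <= f_max)
```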
This research studies dimension-reduction methods that overcome the curse of dimensionality, which arises when traditional methods fail to provide good parameter estimates, so the problem must be dealt with directly. Two approaches were used to handle high-dimensional data: the non-classical sliced inverse regression (SIR) method, together with a proposed weighted standard SIR (WSIR) method, and principal component analysis (PCA), the general method used for dimension reduction. Both SIR and PCA are based on linear combinations of a subset of the original explanatory variables, which may suffer from the problems of heterogeneity and linear
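Of the two families mentioned, PCA is the easiest to sketch: centre the data and project onto the top-k right singular vectors, which are the directions of maximal variance. This is a generic SVD-based PCA, not the paper's SIR/WSIR procedure.

```python
import numpy as np

def pca_reduce(X, k):
    """Project X (n samples x p features) onto its top-k principal components.

    The principal directions are the rows of Vt from the SVD of the
    centred data matrix.
    """
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```

SIR differs in that it slices the response variable and works with inverse regression curves, but both ultimately return linear combinations of the explanatory variables.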
significant bits either in the spatial domain or the frequency domain, sequentially or pseudo-randomly, through the cover media. Based on this fact, statistical steganalysis uses different techniques to detect the hidden message. A steganographic scheme is proposed in which the hidden message is embedded in the second least significant bits in the frequency domain of the cover media, in order to avoid detection of the hidden message by known statistical steganalysis techniques.
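The core bit manipulation of such a scheme can be sketched as follows: write each message bit into the second least significant bit (bit index 1) of integer frequency-domain coefficients, e.g. quantized DCT values. The sequential embedding order and the coefficient array here are illustrative assumptions; the idea is only that avoiding the first LSB sidesteps steganalysis tuned to first-LSB statistics.

```python
import numpy as np

def embed_bits_2lsb(coeffs, bits):
    """Embed message bits into the 2nd LSB of integer coefficients
    (assumed to be quantized frequency-domain values)."""
    out = coeffs.copy()
    flat = out.ravel()
    for i, b in enumerate(bits):
        # clear bit 1, then set it to the message bit
        flat[i] = (flat[i] & ~0b10) | (int(b) << 1)
    return out

def extract_bits_2lsb(coeffs, n):
    """Read back the first n embedded bits from the 2nd LSB."""
    return [int(c >> 1) & 1 for c in coeffs.ravel()[:n]]
```

Each modified coefficient changes by at most 2, which keeps the distortion in the reconstructed cover small.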