When optimizing the performance of neural network-based chatbots, the choice of optimizer is one of the most important decisions. Optimizers control how model parameters such as weights and biases are adjusted to minimize a loss function during training. Adaptive optimizers such as ADAM have become a standard choice and are widely used because the magnitude of their parameter updates is invariant to rescaling of the gradient, but they often generalize poorly. Alternatively, Stochastic Gradient Descent (SGD) with momentum and ADAMW, an extension of ADAM, offer several advantages. This study compares the effects of these optimizers on the chatbot CST dataset. Each optimizer is evaluated by its sparse categorical loss during training and its BLEU score at inference, using a neural generative model with attention based on an additive scoring function. Despite memory constraints that limited ADAMW to ten epochs, this optimizer showed promising results compared to configurations using early stopping. SGD achieved higher BLEU scores, indicating better generalization, but was very time-consuming. The results highlight the importance of balancing optimization performance against computational efficiency, positioning ADAMW as a promising alternative when training efficiency and generalization are primary concerns.
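The two update rules compared in this abstract can be sketched for a single scalar parameter. This is a minimal illustration of the textbook forms of SGD with momentum and ADAMW, not the study's training code; the function names and all hyperparameter defaults (`lr`, `momentum`, `beta1`, `beta2`, `eps`, `weight_decay`) are assumptions chosen for the example.

```python
import math


def sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update for a scalar parameter (hypothetical defaults)."""
    velocity = momentum * velocity + grad  # accumulate a running direction
    w = w - lr * velocity                  # step size scales with the gradient
    return w, velocity


def adamw_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter (hypothetical defaults).

    t is the 1-based step counter used for bias correction.
    """
    w = w - lr * weight_decay * w              # decoupled weight decay
    m = beta1 * m + (1 - beta1) * grad         # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)  # normalized step
    return w, m, v
```

On the first step from zero state, the adaptive update divides the gradient by its own magnitude, so multiplying the gradient by 1000 leaves the step size at roughly `lr`, whereas the SGD step grows by the same factor of 1000. This is the scale invariance the abstract refers to.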
Abstract
The current research problem draws on a variety of motivations related to serving the private health sector, which faces strong competition from its internal and external environments. In this regard, private medical clinics increasingly seek to attract and retain customers through the quality of the health services they offer. They look for innovative and effective marketing methods to improve performance and remain competitive, relying on the physical evidence of the product as a component of the services marketing mix, and in particular on its role in packaging and supporting the health service with concrete evidence that affects the customer an
The aim of this research is to measure and analyze the gap between actual practice and the requirements of the environmental management system in the middle refineries company/refinery cycle according to ISO 14001:2015, to measure the availability of a clean production strategy, and to test the relationship and impact between the availability of the standard's requirements and a clean production strategy in the company.
The research problem was framed as two questions: To what extent are the requirements of the environmental management system applied according to ISO 14001:2015 in the middle refineries company? To what extent are the required clean production strategies ava
This research describes the results of applying artificial neural networks with a modified activation function to the online and offline identification of a four-degrees-of-freedom (4-DOF) Selective Compliance Assembly Robot Arm (SCARA) manipulator. The proposed identification strategy consists of a feed-forward neural network with a modified activation function that operates in parallel with the SCARA robot model. Feed-forward neural networks (FFNN) trained online and offline are used, without requiring any prior knowledge of the system to be identified. The activation function used in the hidden layer of the FFNN is a modified version of the wavelet function. This approach ha
This paper aims to prove an existence theorem for a Volterra-type equation in a generalized G-metric space, called the -metric space, in which a fixed-point theorem is discussed together with its application. First, a new contraction of Hardy-Rogers type is presented, and then a fixed-point theorem is established for these contractions in the setting of -metric spaces. As an application, an existence result for a Volterra integral equation is obtained.
The aim of the research was to prepare Pilates exercises using the barrel ladder apparatus and to identify the effect of Pilates exercises on agility, coordination, and motor sequences of third-year female students in artistic gymnastics. The researcher adopted the experimental method to achieve the objectives of the study and to verify its hypotheses, as it is suitable for the nature and problem of the research. In selecting the research population, the researcher carefully chose the sample using a purposive method, clarifying its elements and constituent units. The research population consisted of third-year female students at the College of Physical Education and Sports Sciences for Women / University of Baghdad, with a total of 20 stud
This paper applies queuing theory together with the particle swarm algorithm (also called swarm intelligence) to solve the queuing problem at the General Commission for Taxes / Karkh Center branch, in the service stage of the Department of Calculators, which is staffed by six employees. A single-service-channel M/M/1 queuing model was chosen according to the nature of the department's work, divided according to the letter system assigned to each employee. The collected data consisted of times (arrival time, service time, departure time)
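For readers unfamiliar with the M/M/1 model named above, its standard steady-state measures follow directly from the arrival rate and the service rate. The sketch below shows only these textbook formulas; it does not reproduce the paper's particle swarm step or its data, and the function name and the rates used in the note are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state measures of an M/M/1 queue.

    arrival_rate (lambda) and service_rate (mu) are customers per unit time.
    The queue is stable only when lambda < mu.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # server utilization
    if rho >= 1:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),                    # L
        "mean_in_queue": rho ** 2 / (1 - rho),                # Lq
        "mean_time_in_system": 1 / (service_rate - arrival_rate),  # W
        "mean_wait_in_queue": rho / (service_rate - arrival_rate), # Wq
    }
```

As an illustration with made-up rates, `mm1_metrics(2, 4)` describes a server that is busy half the time, with one customer in the system on average.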
Licorice is considered an important plant nutritionally, medically, and economically, as it is rich in phytochemicals, vitamins, and minerals and is widely available. Research has indicated the presence of many nutrients (proteins, carbohydrates, vitamins, and minerals) as well as glycyrrhizin, which is responsible for the sweet taste and makes it possible to use licorice as a natural intense sweetener with few calories for sweetening many foods. This research aims to study the stability of glycyrrhizin under various manufacturing conditions (thermal treatment, food pH, and microwaves), so three factorial experiments were carried out to determine its stability, as follows: 100 °C - 121 °C - Microwa
The Impact of Intellectual trends on the nature of the Economic Structure of Iraq
The present study discusses problem-based learning in the Iraqi classroom. This learner-centered method aims to involve all learners in collaborative activities. To fulfill the aims of the study, the following hypothesis is tested: "There are no statistically significant differences between the achievements of the experimental group and the control group." Thirty learners were selected as the sample of the present study. The Mann-Whitney test for two independent samples is used to analyze the results. The analysis shows that members of the experimental group, who were taught according to problem-based learning, obtained higher scores than members of the control group, who were taught according to the traditional method. This
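The Mann-Whitney test mentioned above ranks the pooled scores of both groups and compares the rank sums. The following is a minimal sketch of the U statistic only, assuming the usual average-rank treatment of ties; it computes no p-value, uses none of the study's data, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    a, b = list(group_a), list(group_b)
    ranks = average_ranks(a + b)
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b
```

With toy inputs such as `mann_whitney_u([1, 2, 3], [4, 5, 6])`, every score in the first group lies below every score in the second, so the first U value is zero. In practice the smaller U is compared against a critical value or converted to a p-value.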