Prediction the Performance of Full Scale Wastewater Treatment Plant with AB Process Using Artificial Neural Network and Genetic Algorithm
Farzaneh Mohammadi^{1}, Zeynab Yavari^{2}, Farideh Mohammadi^{3}, Somaye Rahimi^{4}
^{1} Department of Environmental Health Engineering, School of Health, Isfahan University of Medical Sciences, Isfahan, Iran
^{2} Department of Environmental Health Engineering, School of Abarkouh Paramedical, Genetic and Environmental Adventures Research Center, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
^{3} Department of Textile Engineering, Isfahan University of Technology, Isfahan, Iran
^{4} Department of Environmental Health, Firoozabad Branch, Islamic Azad University, Firoozabad, Iran
Date of Submission: 01-Dec-2020
Date of Decision: 26-Dec-2020
Date of Acceptance: 28-Jan-2021
Date of Web Publication: 13-Dec-2022
Correspondence Address: Dr. Zeynab Yavari, School of Abarkouh Paramedical, Yazd, Iran
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/ijehe.ijehe_52_20
Aim: A reliable model of a wastewater treatment plant (WWTP) is essential for predicting its performance and for providing a basis for process control, which in turn helps minimize operating costs. In recent years, computer-based methods have been applied to many environmental problems. Artificial neural networks (ANNs) and genetic algorithms (GAs) are suited to modeling WWTP processes because of their accuracy and their promising record in engineering applications. Materials and Methods: This study applied a multilayer feed-forward back-propagation neural network and a GA to predict and optimize the performance of the second phase of the Isfahan North WWTP. Experimental results describing the performance of the plant over 6 years were used for modeling. Results: A three-layer neural network was developed as a predictive model and trained with the Levenberg–Marquardt algorithm. Chemical oxygen demand (COD), biochemical oxygen demand (BOD), total suspended solids (TSS), total Kjeldahl nitrogen (TKN), and total phosphorus (TP) were used as both model inputs and outputs. Network performance was evaluated with the correlation coefficient (R) and the mean squared error. The model results were highly consistent with the experimental results. The GA gave optimal input values of 324.36, 457.37, 359.11, 60.09, and 14.15 mg/l for BOD, COD, TSS, TKN, and TP, respectively. Conclusion: The combination of ANN and GA provides a powerful analysis tool for modeling and optimizing the nonlinear relationships between parameters in WWTPs and could be used for their proper design and operation.
Keywords: Artificial neural network, genetic algorithm, modeling, wastewater
How to cite this article: Mohammadi F, Yavari Z, Mohammadi F, Rahimi S. Prediction the Performance of Full Scale Wastewater Treatment Plant with AB Process Using Artificial Neural Network and Genetic Algorithm. Int J Env Health Eng 2022;11:18 
How to cite this URL: Mohammadi F, Yavari Z, Mohammadi F, Rahimi S. Prediction the Performance of Full Scale Wastewater Treatment Plant with AB Process Using Artificial Neural Network and Genetic Algorithm. Int J Env Health Eng [serial online] 2022 [cited 2023 May 29];11:18. Available from: https://www.ijehe.org/text.asp?2022/11/1/18/363370 
Introduction   
In recent years, computer simulations have been widely used to solve environmental problems. Proper operation and control of a biological wastewater treatment plant (WWTP) is very difficult because of variations in the composition, strength, and discharge of raw wastewater and because of the complex nature of the treatment processes.^{[1]} The characteristics of wastewater differ from area to area depending on people's lifestyle, so the operation of a WWTP in each region depends on the experience and knowledge of the local engineers. Moreover, a lack of adequate information about the appropriate range of input variables affects the effluent quality of the treatment plant.^{[2]} Traditional modeling techniques for biological processes combine mass and energy balance equations with equations for microbial growth, substrate utilization, and product formation. However, microbial reactions and environmental interactions involve nonlinear, time-varying relationships of a complex nature. Using these methods to predict the operating variables of WWTPs is time-consuming and creates barriers to optimal process control.^{[3]}
The artificial neural network (ANN) technique is a good alternative to traditional models because it has a high potential for interpreting nonlinear and complex relationships between variables. Over the past decade, ANN models have been introduced in different fields of environmental engineering.^{[4],[5]}
There are a number of key variables in wastewater treatment, and the performance of the process can be understood by studying them. These variables include chemical oxygen demand (COD), biochemical oxygen demand (BOD), and total suspended solids (TSS). Performance modeling has shown that the neural network is a powerful tool for predicting how well WWTPs remove these variables.^{[6]}
In this study, a multilayer ANN model was designed to predict the performance of the Isfahan North WWTP in removing the main variables BOD, COD, TSS, total Kjeldahl nitrogen (TKN), and total phosphorus (TP) in the AB process, which is a two-stage activated sludge process. Data from 5 years of operation were used to build the model, which was then tested on data from the final year of operation. Using the results of this modeling, the plant operators will be able to predict the quality of the effluent from the characteristics of the incoming wastewater. Moreover, the values of the input parameters that give the best performance of the WWTP were determined using a genetic algorithm (GA).
Materials and Methods   
Artificial neural network theory
An ANN is an information processing system inspired by the function of the brain's nerve cells. The purpose of a neural network is to compute the output variables of a process from the input variables through calculations within the network.^{[7]}
The neural network is trained to perform a specific task. During training, the output values predicted by the model are compared with the target vector, which consists of the experimental results. The connection values (weights) between the elements are corrected according to the calculated error. This continues until the network achieves good predictive ability and the model outputs agree closely with the target vector. These concepts are shown in [Figure 1]a.
Figure 1: Neural network training structure (a); the structure of a neural network neuron (b)
Trained neural networks are used to perform complex functions in areas such as pattern recognition, identification, classification, speech, vision, and control systems, and to solve complex problems that are difficult for traditional computers and humans. There are different types of learning algorithms. One of the most common classes of training algorithms for feed-forward neural networks is back propagation (BP).^{[5]} The main component of a neural network is the neuron, or node, shown in [Figure 1]b.
The input and output variables are denoted a_{1} to a_{n} and O_{j}, respectively. Each node receives many input signals and combines them to produce a single output signal. W_{1j} to W_{nj} are the weights given to the inputs of neuron j. Weights are adaptive factors within the network that define the intensity of each input signal to the neuron. Each input (a_{1}, a_{2},…, a_{n}) is multiplied by its respective weight (W_{1j}, W_{2j},…, W_{nj}) and the products are added together (a_{1} × W_{1j} + a_{2} × W_{2j} +… + a_{n} × W_{nj}); the transfer function then computes the output signal from this sum. Another input to the node is b_{j}, which is called the internal threshold or bias. It is a randomly selected value that makes the outputs more accurate (Eq. 1).
$$u_{j} = \sum_{i=1}^{n} a_{i} W_{ij} + b_{j} \qquad (1)$$
The final output of the node (O_{j}) is obtained by applying a mathematical operation, called the transfer function, to u_{j}. A transfer function converts u_{j} into O_{j} in a linear or nonlinear manner. Three common types of transfer function are given in Eqs. (2–4).^{[8]}
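• Sigmoid transfer function (Eq. 2)
$$O_{j} = \frac{1}{1 + e^{-u_{j}}} \qquad (2)$$
• Hyperbolic tangent transfer function (Eq. 3)
$$O_{j} = \frac{e^{u_{j}} - e^{-u_{j}}}{e^{u_{j}} + e^{-u_{j}}} \qquad (3)$$
• Linear transfer function (Eq. 4)
$$O_{j} = u_{j} \qquad (4)$$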
In this study, the Neural Network Toolbox of MATLAB R2013a was used for modeling.
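The single-neuron computation described above (a weighted sum of the inputs plus a bias, followed by a transfer function) can be sketched in a few lines of Python. This is only an illustration and not the MATLAB toolbox code used in the study; the function names and the numerical values of the inputs, weights, and bias are hypothetical.

```python
import math

def sigmoid(u):
    """Sigmoid transfer function; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def hyperbolic_tangent(u):
    """Hyperbolic tangent transfer function; output lies in (-1, 1)."""
    return math.tanh(u)

def linear(u):
    """Linear transfer function; passes u through unchanged."""
    return u

def neuron_output(inputs, weights, bias, transfer):
    """Compute u_j = sum(a_i * W_ij) + b_j, then apply the transfer function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    u = sum(a * w for a, w in zip(inputs, weights)) + bias
    return transfer(u)

# Hypothetical values chosen only for illustration.
a = [0.5, -1.0, 2.0]   # inputs a_1..a_n
w = [0.4, 0.3, -0.1]   # weights W_1j..W_nj
b = 0.1                # bias b_j

for transfer in (sigmoid, hyperbolic_tangent, linear):
    print(transfer.__name__, neuron_output(a, w, b, transfer))
```

With these values the weighted sum is about -0.2, so the linear neuron returns roughly -0.2 while the sigmoid and hyperbolic tangent neurons squash the same sum into their bounded output ranges.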
Theory of genetic algorithm
A GA is a heuristic search method that mimics the process of natural selection and is based on Darwin's theory of evolution. GAs were first developed by John Holland as a tool for finding the best solutions to complex problems. The procedure starts by creating a population of chromosomes. The individuals in the population are then subjected to an evolutionary process, each step of which is called a generation. Each member of the population is evaluated against a set of predefined quality criteria by means of a fitness function. Individuals are selected for the next generation according to their fitness. Fitter individuals may therefore be selected several times for the new population and have more chances to reproduce, whereas less fit individuals fade out and have little or no chance to be selected and reproduce. The share of fitter chromosomes thus increases until one of the stopping criteria is fulfilled, at which point the best solution found is returned.^{[9],[10]}
In a GA, a chromosome is often represented as a simple string; it is a set of parameters that defines a proposed solution to the problem the GA is trying to solve. The fitness function evaluates the quality of a chromosome as a solution to the problem. Selection is carried out so that the population of chromosomes is steered toward better solutions. Recombination is the process in which fit chromosomes, called parents, are selected and the next generation is produced from them. The major components of recombination are crossover and mutation. Crossover exchanges parts of the chromosomes of two parents so that one or two new members, called children, are produced. Mutation acts on a single chromosome and produces a new member by changing some of its components. The stopping criteria are based on the number of generations produced or on reaching the desired fitness.^{[11]} In this study, the MATLAB R2013a Toolbox software was used to optimize the process.
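The two recombination operators described above can be illustrated with a short Python sketch. It assumes a chromosome is a list of five real-valued genes scaled to the range 0–1; the function names, the mutation rate, the mutation scale, and the parent values are all hypothetical and are not taken from the MATLAB toolbox used in the study.

```python
import random

def scattered_crossover(parent_a, parent_b, rng):
    """Build one child by taking each gene from either parent at random."""
    return [ga if rng.random() < 0.5 else gb for ga, gb in zip(parent_a, parent_b)]

def gaussian_mutation(chromosome, bounds, rng, rate=0.2, scale=0.1):
    """Perturb each gene with probability `rate`, then clip it to its bounds."""
    mutant = []
    for gene, (low, high) in zip(chromosome, bounds):
        if rng.random() < rate:
            gene += rng.gauss(0.0, scale * (high - low))
        mutant.append(min(max(gene, low), high))
    return mutant

# Hypothetical parents and bounds, for illustration only.
rng = random.Random(42)
bounds = [(0.0, 1.0)] * 5
parent_a = [0.1, 0.2, 0.3, 0.4, 0.5]
parent_b = [0.9, 0.8, 0.7, 0.6, 0.55]

child = scattered_crossover(parent_a, parent_b, rng)
mutant = gaussian_mutation(child, bounds, rng)
print("child :", child)
print("mutant:", mutant)
```

Crossover only recombines genes that already exist in the parents, whereas mutation introduces new gene values; clipping keeps every mutated gene inside its allowed range.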
Data collection
An ANN was employed to model the second phase of the Isfahan North WWTP. The Isfahan North WWTP [Figure 2]a has two phases. The first phase has been in operation since 1987, with a nominal capacity of 400,000 people and a nominal flow of 65,000 m^{3}/day; its process is conventional activated sludge. The second phase of the plant was started in 2008 with a nominal capacity of 800,000 people. Its process is two-stage activated sludge (AB) [Figure 2]b, comprising a pumping station, screening, grit chamber, first-stage aeration and sedimentation (A), second-stage aeration and sedimentation (B), and a chlorination basin. Sludge treatment consists of three stages: thickening, digestion, and dewatering. Measured values of BOD, COD, TSS, TKN, and TP over 6 years (2012–2018) were collected at the treatment plant. The results of 5 years were used to build the optimal neural network model and the results of 1 year were used to simulate the model.
Figure 2: Aerial view of the 1^{st} and 2^{nd} phases of the Isfahan North wastewater treatment plant (a); flow diagram of the second phase of the wastewater treatment plant using the AB process (b)
Results   
As mentioned above, BOD, COD, TSS, TKN, and TP data for 2012–2018 were obtained from the operators of the North WWTP. Because the modeling relied on the results of experiments conducted by the operators, it was necessary to check that the data were normally distributed and to remove outliers. Normality was checked with the Kolmogorov–Smirnov test in SPSS Version 26 (IBM Corp, Chicago, USA). The results are shown in [Table 1] and [Figure 3].
Figure 3: Error bar graph based on the error of standard deviation for input and output variables
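As a rough illustration of the normality check mentioned above, the following Python sketch computes the Kolmogorov–Smirnov statistic D, the largest gap between the empirical distribution of a sample and a normal distribution fitted to it. It is not the SPSS procedure used in the study: it returns only D, with no p-value or significance correction, and the sample values are hypothetical rather than plant data.

```python
import math
import statistics

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def ks_statistic_normal(sample):
    """Largest gap D between the empirical CDF and a fitted normal CDF."""
    data = sorted(sample)
    n = len(data)
    mean = statistics.mean(data)
    std = statistics.stdev(data)
    d = 0.0
    for i, x in enumerate(data):
        cdf = normal_cdf(x, mean, std)
        # Compare against the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

# Hypothetical concentrations in mg/l, for illustration only.
sample = [212.0, 198.5, 225.3, 240.1, 205.7, 219.9, 231.4, 208.2, 222.6, 215.0]
print("D =", ks_statistic_normal(sample))
```

A small D indicates that the sample follows the fitted normal curve closely; a strongly skewed sample or one with extreme outliers produces a larger D.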
The results of 60 trials conducted over 5 years at the treatment plant were used to develop the ANN model, and the results of 12 trials from the final year were set aside for simulation at the end of the study. The features of the network are shown in [Table 2].
In the feed-forward back-propagation (FFBP) network, the input and output matrices (5 × 60) comprised five variables: BOD, COD, TSS, TKN, and TP. Of the data available for network design, 70% were allocated to training, 15% to validation, and 15% to testing. The software assigned the data to these sets at random. Network performance was assessed with the mean squared error (MSE) and the correlation coefficient (R), whose equations are given in Eqs. (5, 6).
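$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{model,i} - y_{obs,i} \right)^{2} \qquad (5)$$
$$R = \frac{\sum_{i=1}^{n} \left( y_{model,i} - y_{model,mean} \right) \left( y_{obs,i} - y_{obs,mean} \right)}{\sqrt{\sum_{i=1}^{n} \left( y_{model,i} - y_{model,mean} \right)^{2} \sum_{i=1}^{n} \left( y_{obs,i} - y_{obs,mean} \right)^{2}}} \qquad (6)$$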
In the equations above, n is the number of data points, y_{model, i} and y_{obs, i} are the output value predicted by the model and the measured output value, respectively, and y_{obs, mean} and y_{model, mean} are the mean of the output values measured in the laboratory and the mean of the values predicted by the model.^{[12]}
The Levenberg–Marquardt back-propagation (LMBP) algorithm was used to train the network. [Figure 1]a illustrates the stages of this algorithm. In general, for function-approximation problems with fewer than 100 network parameters, the LMBP algorithm is fast and accurate, and in many studies it has reached the lowest error.^{[13]}
During training, the output predicted by the model is compared with the expected output and the MSE is calculated. If the MSE is higher than the prescribed error, the error is propagated back from the output to the input and the weights are modified accordingly. This is repeated until either the error reaches the allowed limit or the maximum specified number of iterations is reached. The weights are modified in each iteration according to the following equation (Eq. 7):
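The random 70/15/15 partition and the two performance measures described above can be sketched in Python as follows. This is an illustration under assumptions: the function names and the seed are hypothetical, the exact partition produced by the software in the study is unknown, and the correlation coefficient is taken to be the Pearson form suggested by the definitions of the model and observed means.

```python
import math
import random

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle sample indices and cut them into training, validation, and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

def mean_squared_error(y_model, y_obs):
    """Average squared difference between predicted and measured values."""
    return sum((m - o) ** 2 for m, o in zip(y_model, y_obs)) / len(y_obs)

def correlation_coefficient(y_model, y_obs):
    """Pearson correlation between predicted and measured values (assumed form of R)."""
    n = len(y_obs)
    mean_model = sum(y_model) / n
    mean_obs = sum(y_obs) / n
    cov = sum((m - mean_model) * (o - mean_obs) for m, o in zip(y_model, y_obs))
    ss_model = sum((m - mean_model) ** 2 for m in y_model)
    ss_obs = sum((o - mean_obs) ** 2 for o in y_obs)
    return cov / math.sqrt(ss_model * ss_obs)

train_idx, val_idx, test_idx = split_indices(60)
print(len(train_idx), len(val_idx), len(test_idx))  # 42 9 9

# Hypothetical predicted and measured values, for illustration only.
predicted = [21.0, 30.5, 18.2, 25.1]
measured = [20.0, 31.0, 19.0, 24.0]
print("MSE =", mean_squared_error(predicted, measured))
print("R   =", correlation_coefficient(predicted, measured))
```

For the 60 trials used in network design, this split yields 42 training, 9 validation, and 9 test samples.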
[INLINE:7]
Where η is the training rate and α is the momentum coefficient, both in the range 0–1.
The MSE is monitored during the training period. In the first phase of training, the error declines until the network reaches its minimum error; as training continues beyond this point, the error begins to increase. At this stage, training is stopped and the weights at the minimum error are restored.^{[14]}
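For orientation, one widely used textbook form of a back-propagation weight update with a training rate η and a momentum coefficient α is given below. It is stated here as an assumption: the symbols E (the network error) and t (the iteration index) are introduced only for this illustration, and the exact expression used by the authors in Eq. 7 may differ.

```latex
\Delta W_{ij}(t) = -\eta \, \frac{\partial E}{\partial W_{ij}} + \alpha \, \Delta W_{ij}(t-1),
\qquad
W_{ij}(t+1) = W_{ij}(t) + \Delta W_{ij}(t)
```

In this form, η scales the step taken along the negative error gradient, while α carries over a fraction of the previous weight change, which tends to smooth the trajectory of the weights.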
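The stopping rule described above can be sketched as a small Python function that scans a sequence of validation errors. The function name, the `patience` parameter (how many consecutive epochs without improvement are tolerated), and the error values are hypothetical; in a full training loop the weights saved at the best epoch would be restored when training stops.

```python
def early_stopping_epoch(val_errors, patience):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors."""
    best_epoch = 0
    best_error = float("inf")
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            # New minimum: this is the epoch whose weights would be kept.
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training here.
            return best_epoch, epoch
    return best_epoch, len(val_errors) - 1

# Hypothetical validation MSE per epoch, for illustration only.
val_errors = [0.90, 0.50, 0.30, 0.25, 0.27, 0.30, 0.33, 0.35]
print(early_stopping_epoch(val_errors, patience=3))  # (3, 6)
```

Here the validation error is lowest at epoch 3 and has not improved for three further epochs by epoch 6, so training stops there and the epoch-3 weights are the ones to keep.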
The best neural network architecture is shown in [Figure 4]a. Other results of the multilayer perceptron neural network (MLPNN) are shown in [Figure 4]b and [Figure 4]c.
Figure 4: Designed neural network architecture (a), Regression graph (b), Network performance graph (c)
The designed network was then used to simulate the data obtained in 2018, in order to evaluate the predictive ability of the ANN model on data that played no part in network design. The results of this simulation are shown in [Figure 5].
The results of the 72 experiments performed at the plant over the 6-year study period were introduced as input to the GA. Each chromosome in the GA contains five components: BOD, COD, TSS, TKN, and TP. The aim is to determine the optimum input values to the plant such that all output parameters are minimal. The fitness function of the GA in this problem is the neural network model developed to predict the output values of the plant. The other GA settings were: 100 generations, the Rank scaling function, the Stochastic uniform selection function, an elite count of 2, a crossover fraction of 0.8, the Constraint dependent mutation function, and the Scattered crossover function. With these settings, the GA was able to optimize the input parameters so as to minimize the output parameters. The results of the GA are presented in [Table 3].
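The overall optimization loop can be sketched in Python as below, reusing the settings named above (100 generations, an elite count of 2, a crossover fraction of 0.8, and scattered crossover). Everything else is an assumption made for illustration: `surrogate_model` is a hypothetical stand-in for the trained ANN and not the plant model, the inputs are scaled to the range 0–1, the five outputs are combined into one objective by simple summation, tournament selection replaces stochastic uniform selection with rank scaling, and the population size of 50 and the mutation scale are arbitrary.

```python
import random

CENTERS = [0.3, 0.5, 0.4, 0.6, 0.2]  # arbitrary values, not from the study

def surrogate_model(x):
    """Hypothetical stand-in for the trained ANN (5 inputs -> 5 outputs)."""
    return [(xi - c) ** 2 for xi, c in zip(x, CENTERS)]

def fitness(x):
    """Assumed scalar objective: sum of predicted outputs (lower is better)."""
    return sum(surrogate_model(x))

def tournament(population, rng):
    """Pick the fitter of two random individuals (substitute selection rule)."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) < fitness(b) else b

def run_ga(n_vars=5, pop_size=50, generations=100, elite_count=2,
           crossover_fraction=0.8, seed=1):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_vars)] for _ in range(pop_size)]
    n_cross = round(crossover_fraction * (pop_size - elite_count))
    for _generation in range(generations):
        population.sort(key=fitness)
        # Elites pass to the next generation unchanged.
        next_gen = [ind[:] for ind in population[:elite_count]]
        # Scattered crossover: each gene comes from one of two selected parents.
        for _child in range(n_cross):
            p1, p2 = tournament(population, rng), tournament(population, rng)
            next_gen.append([g1 if rng.random() < 0.5 else g2
                             for g1, g2 in zip(p1, p2)])
        # Remaining slots are filled by Gaussian mutation, clipped to [0, 1].
        while len(next_gen) < pop_size:
            parent = tournament(population, rng)
            next_gen.append([min(1.0, max(0.0, g + rng.gauss(0.0, 0.05)))
                             for g in parent])
        population = next_gen
    return min(population, key=fitness)

best = run_ga()
print([round(g, 2) for g in best], fitness(best))
```

Because the elites are copied unchanged, the best fitness value can never get worse from one generation to the next. Replacing `surrogate_model` with a trained network's prediction function would give the same overall structure as the approach described in the text.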
[Figure 6]a presents the best and average fitness values in each generation. [Figure 6]b shows the best fitness value in the current generation.
Figure 6: Output genetic algorithm graphs. (a) Fitness graph, (b) Best individuals, (c) Variables optimized by genetic algorithm
Discussion   
To determine the optimal network, that is, the one with the lowest error and the highest correlation coefficient, networks with different numbers of hidden layers and neurons were built. As shown in [Figure 4]a, the best neural network with these features has 3 layers, with 10 neurons in the hidden layer. As shown in [Figure 4]b, the outputs for the training, validation, and testing sets agree well with the target vectors, with correlation coefficients >0.95. The correlation coefficient for the entire data set was 0.98. Hence, the developed neural network has a high ability to predict the performance of the plant.
In the network performance graph [Figure 4]c, the MSE of the network starts from a large value and gradually decreases, which shows that the network is learning. The graph includes three lines because the 60 input and target vectors were divided randomly into three sets: training (70%), validation (15%), and test (15%). The validation set is used to preserve the generalization ability of the network. Training continues as long as the network error on the validation set keeps decreasing, so the minimum squared error on the validation curve is identified and overfitting of the network to the training set is prevented.
Nasr et al.^{[15]} developed a three-layer network with neuron numbers of 10, 30, and 3 and attained a correlation coefficient of 0.90 for the variables BOD, COD, and TSS at the EL-AGAMY WWTP, which uses the SBR process. Mjalli et al.^{[16]} developed three separate three-layer networks with neuron numbers of 20, 10, and 1 for the variables BOD, COD, and TSS, respectively, at a treatment plant in Doha East; the correlation coefficients for these parameters were 0.924, 0.924, and 0.839, respectively.
Once the network has been trained, it can be used to simulate new data. As mentioned earlier, treatment plant data for 2012–2017 were used to develop the optimized neural network and the experimental results of 2018 were reserved for simulation. The values of BOD, COD, TSS, TKN, and TP measured during 2018 were therefore given to the optimized neural network as input. The predicted output values of these parameters are compared with the output values measured at the WWTP in [Figure 5].
The methodology developed in this study combines a GA and an ANN for multi-objective optimization in existing WWTPs. According to the results, WWTP performance can be predicted and optimized from several years of data on the main wastewater parameters using a combination of a multilayer neural network and a GA, without the large number of kinetic and stoichiometric coefficients needed in biochemical models such as ASMs. In this approach, in which wastewater treatment is modeled with data-based methods, even the characteristics of the input wastewater can be optimized so that the treatment plant achieves its best performance. Beyond the question of whether biochemical models are applicable in the studied field, a further constraint on choosing such models is that they must be able to solve multi-objective problems, which is complicated and time-consuming. The combination of ANN and GA was found to be the best solution for tackling such problems because of the way it works and because it requires very little prior knowledge of the problem to be solved.
Conclusion   
The results of this study demonstrate the ability of an ANN to predict the performance of a full-scale WWTP. The correlation coefficient of 0.98 indicates that the ANN model is reliable and accurate. The simulation showed high consistency between the modeling results and the experimental results. The GA gave optimal input values of 324.36, 457.37, 359.11, 60.09, and 14.15 mg/l for BOD, COD, TSS, TKN, and TP, respectively. It appears that an ANN combined with a GA could be applied to improve the operation of WWTPs. The performance of a WWTP can also be predicted and managed with data-based modeling without knowing all of the kinetic and stoichiometric coefficients.
Financial support and sponsorship
The authors wish to acknowledge the Vice Chancellery of Research of Isfahan University of Medical Sciences for financial support (Research Project #198151, ethics code IR.MUI.RESEARCH.REC.1398.498). In addition, the authors are grateful to the Isfahan North WWTP for technical and logistic assistance during this work.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Falås P, Baillon-Dhumez A, Andersen HR, Ledin A, la Cour Jansen J. Suspended biofilm carrier and activated sludge removal of acidic pharmaceuticals. Water Res 2012;46:1167-75.
2.  Bina B, Mohammadi F, Amin MM, Pourzamani HR, Yavari Z. Evaluation of the effects of AlkylPhenolic compounds on kinetic coefficients and biomass activity in MBBR by means of respirometric techniques. Chin J Chem Eng 2018;26:822-9.
3.  Berardi C, Fibbi D, Coppini E, Renai L, Caprini C, Scordo CV, et al. Removal efficiency and mass balance of polycyclic aromatic hydrocarbons, phthalates, ethoxylated alkylphenols and alkylphenols in a mixed textile-domestic wastewater treatment plant. Sci Total Environ 2019;674:36-48.
4.  Mohammadi N, Mirabedini SJ. Comparison of particle swarm optimization and backpropagation algorithms for training feedforward neural network. J Math Comput Sci 2018;12:113-23.
5.  Tarentino AL, Maley F. A comparison of the substrate specificities of endo-beta-N-acetylglucosaminidases from Streptomyces griseus and Diplococcus pneumoniae. Biochem Biophys Res Commun 1975;67:455-62.
6.  
7.  Boeri C, Chiappa C, Galli F, De Berardinis V, Bardelli L, Carcano G, et al. Machine learning techniques in breast cancer prognosis prediction: A primary evaluation. Cancer Med 2020;9:3234-43.
8.  Karimi H, Ghaedi M. Application of artificial neural network and genetic algorithm to modelling and optimization of removal of methylene blue using activated carbon. Ind Eng Chem 2014;20:2471-6.
9.  Ding Y, Zhang W, Yu L, Lu K. The accuracy and efficiency of GA and PSO optimization schemes on estimating reaction kinetic parameters of biomass pyrolysis. Energy 2019;176:582-8.
10.  Mohammadi F, Bina B, Amin MM, Pourzamani HR, Yavari Z, Shams MR. Evaluation of the effects of AlkylPhenolic compounds on kinetic parameters in a moving bed biofilm reactor. Can J Chem Eng 2018;96:1762-9.
11.  Li Z, Zhang F, Luo X, Chang W, Cai Y, Zhong W, et al. Material removal mechanism of laser-assisted grinding of RB-SiC ceramics and process optimization. Eur Ceram Soc 2019;39:705-17.
12.  Mohammadi F, Samaei MR, Azhdarpoor A, Teiri H, Badeenezhad A, Rostami S. Modelling and optimizing pyrene removal from the soil by phytoremediation using response surface methodology, artificial neural networks, and genetic algorithm. Chemosphere 2019;237:124486.
13.  Mammadli S. Financial time series prediction using artificial neural network based on Levenberg-Marquardt algorithm. Procedia Comput Sci 2017;120:602-7.
14.  Bahrami M, Akbari M, Bagherzadeh SA, Karimipour A, Afrand M, Goodarzi M. Develop 24 dissimilar ANNs by suitable architectures & training algorithms via sensitivity analysis to better statistical presentation: Measure MSEs between targets & ANN for Fe–CuO/Eg–Water nanofluid. Phys A Stat Mech its Appl 2019;519:159-68.
15.  Nasr MS, Moustafa MA, Seif HA, El Kobrosy G. Application of artificial neural network (ANN) for the prediction of EL-AGAMY wastewater treatment plant performance-EGYPT. Alexandria Eng J 2012;51:37-43.
16.  Mjalli FS, Al-Asheh S, Alfadala HE. Use of artificial neural network black-box modelling for the prediction of wastewater treatment plants performance. Environ Manage 2007;83:329-38.
