BIO Web Conf.
Volume 31, 2021
VI International Scientific Conference “Problems of Industrial Botany of Industrially Developed Regions” 2021
Number of page(s): 4
Published online: 13 August 2021
Methodology for assessing water quality using a neural network
1 Federal Research Center for Information and Computing Technologies, Kemerovo Branch in cooperation with the Institute of Water and Environmental Problems SB RAS, Russia
2 Federal Research Center for Information and Computing Technologies, Kemerovo Branch, Russia
* Corresponding author: email@example.com
A methodology for assessing the quality of surface water and groundwater is proposed that uses artificial intelligence methods, in particular a fully connected neural network. Preliminary testing on water bodies showed a sufficiently high model accuracy.
© The Authors, published by EDP Sciences, 2021
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The neural network was trained by supervised learning. The correct answers (target labels) were the results of processing hydrochemical data with the method of assessing water quality by associative indicators [3 – 6].
In the method used to produce the correct answers, water quality is assessed with two formulas, (1) and (2). First, the normalized water indicators (NI) are determined (formula 1), and then the associative indicators of water composition (AI) are determined (formula 2).
where AINj is the associative indicator of water composition; NIi is the normalized water indicator for the arithmetic mean concentration of ingredient i; N is the number of ingredients.
The water quality class is selected from Table 1 according to the value of the AI indicator. Because the Kemerovo region has practically no waters corresponding to the first class of water quality, the author transformed Table 1 to suit the neural network, so that it contains only six quality classes, in contrast to the previous methods.
To improve the performance of the neural network, the quality class numbers are best supplied as vectors consisting of 1 and 0. For this, the LabelBinarizer encoding function was imported from the Scikit-learn library. As a result, each quality class was converted to a vector with a single 1 in the position of that class and 0 elsewhere.
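The encoding step can be illustrated with a minimal sketch. It assumes the six quality classes are numbered 1 to 6; the four sample labels are hypothetical and serve only to show the shape of the output.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Assumption: the six quality classes of Table 1 are numbered 1..6.
quality_classes = np.array([1, 2, 3, 4, 5, 6])

encoder = LabelBinarizer()
encoder.fit(quality_classes)

# Hypothetical class labels for four water samples (illustration only).
sample_labels = np.array([2, 2, 5, 1])

# Each row of one_hot has a single 1 in the column of its class.
one_hot = encoder.transform(sample_labels)

# inverse_transform recovers the original class numbers.
decoded = encoder.inverse_transform(one_hot)

print(one_hot)
print(decoded)
```

Fitting the encoder on all six classes, rather than on the labels present in one data set, keeps the vector length at six even when some classes do not occur in the samples.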
The parameters of the neural network (number of layers, number of neurons in each layer, activation functions, optimizer, and batch size) were selected experimentally. The resulting architecture has three fully connected (Dense) layers: the first two layers have 64 neurons each and the relu activation function, and the last layer has 6 neurons and the softmax activation function (a screenshot of the program is shown in Fig. 1). With this architecture, the accuracy was 98.96 % on the training sample and 96.63 % on the test sample (the screenshot is shown in Fig. 2).
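The shape of this architecture can be sketched as a plain NumPy forward pass. This is not the authors' Keras program: the number of input features (`n_features`) is a hypothetical value, since the source does not state it, and random weights stand in for trained ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical number of hydrochemical indicators per sample (not given in the source).
n_features = 10
# Layer widths from the text: 64 (relu), 64 (relu), 6 (softmax).
layer_sizes = [n_features, 64, 64, 6]

# Random weights stand in for trained ones; biases start at zero.
weights = [rng.normal(0.0, 0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


def forward(x):
    """Two relu layers followed by a softmax output layer."""
    h = relu(x @ weights[0] + biases[0])
    h = relu(h @ weights[1] + biases[1])
    return softmax(h @ weights[2] + biases[2])


# A batch of four synthetic, already normalized samples.
batch = rng.normal(size=(4, n_features))
probs = forward(batch)

# The most probable output gives the quality class (numbered from 1 here).
predicted_class = probs.argmax(axis=1) + 1

n_params = sum(w.size + b.size for w, b in zip(weights, biases))
print(probs.shape, predicted_class, n_params)
```

Each output row is a probability distribution over the six quality classes, which is why the targets are supplied as six-element vectors of 1 and 0.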
Below is an algorithm for assessing water quality using the created neural network, tested on hydrochemical data for the Chumysh River (Kemerovo Region) for 2011–2018:
- load a table with hydrochemical data for the Chumysh River;
- normalize the hydrochemical data (subtract the mean and divide by the standard deviation);
- run the trained neural network;
- obtain the result of the neural network (the quality class of each sample and a generalized result in the form of the number of samples per class):
  - one class: 1 sample;
  - class 2: 17 samples;
  - another class: 4 samples;
- choose the quality class that contains the most samples (in our case, class 2);
- determine the name of the water quality from Table 1 according to the quality class.
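The normalization and class-selection steps above can be sketched as follows. The input table is synthetic, and the predicted labels are placeholders for the network output: only the sample counts (1, 17, and 4) and class 2 for the largest group come from the text, while labels 1 and 3 for the other two groups are assumptions. The sketch also assumes the mean and standard deviation are computed from the table being assessed.

```python
from collections import Counter

import numpy as np


def standardize(data):
    """Subtract the column mean and divide by the column standard deviation."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (data - mean) / std


def majority_class(predicted):
    """Return the most frequent class and the number of samples per class."""
    counts = Counter(int(c) for c in predicted)
    best, _ = counts.most_common(1)[0]
    return best, dict(counts)


# Synthetic stand-in for the hydrochemical table: 22 samples, 10 indicators.
rng = np.random.default_rng(0)
table = rng.uniform(0.0, 5.0, size=(22, 10))
normalized = standardize(table)

# Placeholder for the network output; labels 1 and 3 are assumed.
predicted = np.array([1] * 1 + [2] * 17 + [3] * 4)

overall, per_class = majority_class(predicted)
print(per_class, overall)
```

In practice `predicted` would come from applying the trained network to `normalized` and taking the most probable class for each sample.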
As a result of the data analysis, the water quality in the Chumysh River for 2011–2018 was assessed as “little contaminated” (class 2).
The same algorithm can be used to assess the water quality of any water body.
Table 1. Water quality classes.
Fig. 1. Screenshot of the neural network architecture for water quality assessment.
Fig. 2. Screenshot of the graph of training accuracy versus the number of training epochs.
To analyze the quality of surface water and groundwater in the Kemerovo region, a methodology for assessing water quality using a fully connected neural network is proposed. In preliminary testing on water bodies, the model reached an accuracy of 96.63 % on the test sample, which indicates the adequacy of the proposed method. The results obtained by the neural network coincide with the results of traditional methods, such as the method for assessing water quality using the specific combinatorial index of water pollution (UCIPI), and can be used to assess the quality of techno-natural waters.
- TensorFlow library, https://www.tensorflow.org/
- Keras library, https://riptutorial.com/ru/keras/
- E.L. Schastlivtsev, N.I. Yukina, I.E. Kharlampenkov, Information-analytical system of geoecological monitoring of water resources of the coal-mining region, Bull. of KuzGTU, 2 (114):157–164 (2016)
- E.L. Schastlivtsev, Technogenic impact of coal mining enterprises on the environment (on the example of Kuzbass). Abstract dis. Doctor of Technical Sciences (Barnaul, 2006)
- V.P. Potapov, V.P. Mazikin, E.L. Schastlivtsev, N.Yu. Vashlaeva, Geoecology of coal-mining regions of Kuzbass (Novosibirsk: Nauka, 2005)
- V.A. Kovalev, V.P. Potapov, E.L. Schastlivtsev, Yu.I. Shokin, Modeling of geoecological systems of coal mining regions (KuzGTU. - Novosibirsk: Publishing house of the SB RAS, 2015)