Issue | BIO Web Conf., Volume 130, 2024: International Scientific Conference on Biotechnology and Food Technology (BFT-2024)
---|---
Article Number | 03007
Number of page(s) | 10
Section | Water Environmental Biotechnology
DOI | https://doi.org/10.1051/bioconf/202413003007
Published online | 09 October 2024
Optimizing water quality classification using random forest and machine learning
1 Reshetnev Siberian State University of Science and Technology, Krasnoyarsk, Russia
2 Bauman Moscow State Technical University, Artificial Intelligence Technology Scientific and Education Center, Moscow, Russia
3 Krasnoyarsk State Agrarian University, Krasnoyarsk, Russia
* Corresponding author: vasi4244@gmail.com
Water is among the most precious and essential natural resources. Industrialization and growing human activity over recent decades have significantly affected the state of water resources, and effective water quality monitoring has become a priority for cities worldwide. Modern technologies such as cloud computing, artificial intelligence, remote sensing, and the Internet of Things provide new opportunities to enhance water resource monitoring systems. This paper explores the application of the random forest model to water quality classification based on chemical attributes. The study comprises three experiments: using the full set of features, excluding the pH feature, and using only the three most significant features. The random forest model trained on the full set of features achieved 100% accuracy. With the pH feature excluded, accuracy fell to 76%, which highlights the importance of this feature while showing that the other parameters can partly compensate for it. Using only the three most significant features (pH, conductivity, and nitrate), the model again achieved 100% accuracy. These results indicate that reducing the feature set without significant loss of accuracy is a promising way to improve water quality monitoring and assessment: it lowers data collection time and cost while maintaining high predictive accuracy. The findings confirm that machine learning, particularly random forest models, can be used effectively for water quality classification, ultimately supporting better management and conservation of water resources.
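The three-experiment design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data. The samples are synthetic, the labelling rule and its thresholds are assumptions, and the `turbidity` and `hardness` columns are hypothetical extras; only pH, conductivity, and nitrate are named in the abstract. The forest is a small standard-library stand-in (bootstrap samples, random feature subsets, Gini splits over a few sampled thresholds) for what would normally be a library implementation such as scikit-learn's `RandomForestClassifier`. The top three features are hardcoded here rather than derived from feature importances.

```python
import random
from collections import Counter

# Hypothetical feature names; only pH, conductivity and nitrate appear in the abstract.
FEATURES = ["pH", "conductivity", "nitrate", "turbidity", "hardness"]

def make_data(n, rng):
    """Synthetic samples with an ASSUMED labelling rule (not the paper's data)."""
    data = []
    for _ in range(n):
        row = {"pH": rng.uniform(5.5, 9.5), "conductivity": rng.uniform(100, 1000),
               "nitrate": rng.uniform(0, 15), "turbidity": rng.uniform(0, 10),
               "hardness": rng.uniform(50, 300)}
        ok = 6.5 <= row["pH"] <= 8.5 and row["conductivity"] <= 800 and row["nitrate"] <= 10
        data.append((row, int(ok)))
    return data

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(X, y, rng, n_try, depth=0, max_depth=8):
    """Grow one tree; each node considers n_try random features and 10 sampled thresholds."""
    if depth == max_depth or len(set(y)) == 1:
        return majority(y)  # leaf: a class label
    best = None  # (weighted Gini, feature index, threshold)
    for f in rng.sample(range(len(X[0])), n_try):
        values = [row[f] for row in X]
        for t in rng.sample(values, min(10, len(values))):
            left = [lab for v, lab in zip(values, y) if v <= t]
            right = [lab for v, lab in zip(values, y) if v > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return majority(y)
    _, f, t = best
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li], rng, n_try, depth + 1, max_depth),
            build_tree([X[i] for i in ri], [y[i] for i in ri], rng, n_try, depth + 1, max_depth))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are labels
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, rng, n_trees=25):
    n_try = max(1, round(len(X[0]) ** 0.5))  # about sqrt(number of features) per split
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_try))
    return forest

def predict_forest(forest, row):
    return majority([predict_tree(tree, row) for tree in forest])  # majority vote

def evaluate(columns, train, test, seed=1):
    """Train on the chosen columns only and return accuracy on the held-out set."""
    rng = random.Random(seed)
    X_train = [[row[c] for c in columns] for row, _ in train]
    y_train = [label for _, label in train]
    forest = fit_forest(X_train, y_train, rng)
    hits = sum(predict_forest(forest, [row[c] for c in columns]) == label
               for row, label in test)
    return hits / len(test)

data_rng = random.Random(42)
train, test = make_data(500, data_rng), make_data(300, data_rng)
results = {
    "all": evaluate(FEATURES, train, test),
    "no_pH": evaluate([f for f in FEATURES if f != "pH"], train, test),
    "top3": evaluate(["pH", "conductivity", "nitrate"], train, test),
}
for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

Because the synthetic label depends directly on pH, dropping that column should lower accuracy in this toy setting. The exact figures it prints depend on the assumed rule and seeds and are not comparable to the accuracies reported in the paper.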
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.