Issue | BIO Web Conf., Volume 141, 2024: IX International Scientific Conference on Agricultural Science 2024 “Current State, Problems and Prospects for the Development of Agricultural Science” (AGRICULTURAL SCIENCE 2024)
---|---
Article Number | 04050
Number of page(s) | 14
Section | Agriculture and Agri-food Systems
DOI | https://doi.org/10.1051/bioconf/202414104050
Published online | 21 November 2024
Machine learning in environmental sustainability factor analysis in the agricultural sector
1 Moscow Timiryazev Agricultural Academy, Russian State Agrarian University, 127550 Moscow, Russia
2 Reshetnev Siberian State University of Science and Technology, 660037 Krasnoyarsk, Russia
3 Bauman Moscow State Technical University, 105005 Moscow, Russia
* Corresponding author: ankoz9@yandex.ru
The study applies several data analysis methods to clarify the relationships between variables and to improve prediction accuracy. Correlation analysis was the primary tool: it measures the degree of association between two variables, that is, how changes in one relate to changes in the other, and it provided the foundation for the subsequent analysis. Factor analysis was then used to identify latent factors that explain the relationships between observed variables and to reduce the number of variables by grouping them, which simplified interpretation and helped isolate the key components affecting the data. Logistic regression was used to build the models. It models the probability of a specific event from independent variables by means of the logistic function, which allows categorical outcomes to be classified and predicted. To improve the performance of the logistic regression model, a Weight of Evidence (WoE) analysis was conducted. WoE converts categorical and continuous variables into numerical form, which simplifies interpretation and improves the model’s predictive capability. It helps to identify significant factors, strengthens the linear relationship between predictors and the dependent variable, and reduces the impact of outliers, which is particularly important in areas such as credit scoring. The model based on correlation and factor analysis explained 27.51% of the information on the training set and 76.04% on the test set.
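As an illustration of the WoE step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a binary outcome (1 = event, 0 = non-event) and the convention WoE = ln(share of non-events / share of events) per category; the opposite sign convention is also in use. The function names, the `smoothing` parameter, and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def weight_of_evidence(categories, outcomes, smoothing=0.5):
    """Return {category: WoE} for a binary outcome (1 = event, 0 = non-event).

    Assumed convention: WoE = ln(share of non-events / share of events).
    `smoothing` is added to every count so that empty cells do not cause
    a division by zero or log(0); this is an illustrative choice.
    """
    events = defaultdict(float)
    non_events = defaultdict(float)
    for cat, y in zip(categories, outcomes):
        if y == 1:
            events[cat] += 1
        else:
            non_events[cat] += 1

    cats = set(events) | set(non_events)
    total_events = sum(events[c] for c in cats) + smoothing * len(cats)
    total_non_events = sum(non_events[c] for c in cats) + smoothing * len(cats)

    woe = {}
    for cat in cats:
        share_events = (events[cat] + smoothing) / total_events
        share_non_events = (non_events[cat] + smoothing) / total_non_events
        woe[cat] = math.log(share_non_events / share_events)
    return woe


def logistic(z):
    """Logistic function: maps a linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical data: a categorical predictor and a binary outcome.
sample_categories = ["a", "a", "a", "b", "b", "b"]
sample_outcomes = [0, 0, 1, 1, 1, 0]
sample_woe = weight_of_evidence(sample_categories, sample_outcomes)
```

Each category is replaced by its WoE value before fitting, so the logistic regression sees a single numeric predictor whose scale is already expressed in log-odds terms.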
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.