A Robust Multi-Validation Approach for Evaluating Machine Learning-Based Intrusion Detection Models

Samuel Aleksander Mandowen; Alexey Mikhailovich Vulfin; Vladimir Ivanovich Vasilyev; Emil Ramilevich Khairullin; Jonathan Kiwasi Wororomi

doi:10.1051/bioconf/202621303001

Open Access

Issue		BIO Web Conf., Volume 213, 2026: The 1st Papua International Conference on Biodiversity, Natural Sciences, and Technology (PICoBNST 2025)


Article Number		03001
Number of page(s)		15
Section		Interdisciplinarity in Sciences and Technology
DOI		https://doi.org/10.1051/bioconf/202621303001
Published online		27 January 2026

BIO Web of Conferences 213, 03001 (2026)


Samuel Aleksander Mandowen¹*, Alexey Mikhailovich Vulfin², Vladimir Ivanovich Vasilyev³, Emil Ramilevich Khairullin⁴ and Jonathan Kiwasi Wororomi⁵

¹⁻⁴ Department of Computer Technology and Information Security, Ufa University of Science and Technology, Ufa, Russia
⁵ Department of Statistics, Cenderawasih University, Papua, Indonesia

* Corresponding author: Samuel Aleksander Mandowen

Abstract

Intrusion Detection Systems (IDS) play a vital role in protecting modern networks from cyber threats by detecting abnormal or malicious traffic behavior. Machine Learning (ML) techniques have been applied extensively to improve automation, scalability, and detection accuracy. However, most ML-based IDS studies still rely on a single validation scheme, such as a basic train-test split or Simple K-Fold Cross-Validation, which can lead to biased estimates, undetected overfitting, and poor generalization across datasets. This research presents a Multi-Validation Evaluation Framework that integrates six complementary validation techniques: three single-validation methods (Hold-Out, Simple K-Fold, and Stratified K-Fold) and three multi-validation methods (Repeated K-Fold, Bootstrapping, and Nested Cross-Validation), to ensure fair, consistent, and statistically reproducible assessment. The framework was validated on two benchmark datasets, NSL-KDD and UNSW-NB15, using five ML models: Random Forest, Extreme Gradient Boosting, Decision Tree, K-Nearest Neighbors, and Linear Support Vector Classifier. Model performance was evaluated using Accuracy, Precision, Recall, F1-Score, ROC-AUC, and PR-AUC, with results reported as mean ± standard deviation. The results show that Random Forest achieves the highest accuracy (99.56% and 94.69%) and ROC-AUC (>0.989) on both datasets. The multi-validation techniques reduced metric variance by up to 40% while keeping mean accuracy steady, indicating more stable and repeatable estimates. Statistical tests (Wilcoxon, Friedman, and Nemenyi) showed significant differences in performance (p < 0.001). The proposed method provides a robust, comprehensive, and scientifically valid framework for evaluating ML-based IDS models.
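The practice of reporting a metric as mean ± standard deviation over repeated, stratified resampling can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, synthetic one-feature data in place of NSL-KDD or UNSW-NB15, and a toy threshold classifier in place of the five models. All function names (`make_data`, `stratified_kfold`, `fit_threshold`, `accuracy`, `repeated_stratified_cv`) are hypothetical.

```python
import random
import statistics


def make_data(n_benign=400, n_attack=100, seed=0):
    """Synthetic, imbalanced one-feature data (assumption: not a real IDS dataset)."""
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(n_benign)]
    X += [rng.gauss(2.5, 1.0) for _ in range(n_attack)]
    y = [0] * n_benign + [1] * n_attack
    return X, y


def stratified_kfold(labels, k, rng):
    """Yield (train_idx, test_idx) pairs that preserve class proportions per fold."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


def fit_threshold(X, y, train):
    """Toy classifier: threshold at the midpoint of the two class means."""
    m0 = statistics.fmean(X[i] for i in train if y[i] == 0)
    m1 = statistics.fmean(X[i] for i in train if y[i] == 1)
    return (m0 + m1) / 2


def accuracy(X, y, test, thr):
    """Fraction of test samples whose thresholded prediction matches the label."""
    return sum((X[i] > thr) == bool(y[i]) for i in test) / len(test)


def repeated_stratified_cv(X, y, k=5, repeats=10, seed=0):
    """Collect one accuracy score per fold, over `repeats` reshuffled partitions."""
    scores = []
    for r in range(repeats):
        rng = random.Random(seed + r)
        for train, test in stratified_kfold(y, k, rng):
            thr = fit_threshold(X, y, train)
            scores.append(accuracy(X, y, test, thr))
    return scores


X, y = make_data()
single = repeated_stratified_cv(X, y, repeats=1)   # one stratified 5-fold pass
multi = repeated_stratified_cv(X, y, repeats=10)   # ten reshuffled passes

for name, scores in (("single pass", single), ("repeated", multi)):
    print(f"{name}: {statistics.fmean(scores):.4f} +/- {statistics.stdev(scores):.4f}")
```

A single pass yields five fold scores, while the repeated variant yields fifty, so the reported mean and spread depend less on one particular shuffle. Bootstrapping and Nested Cross-Validation, which the framework also includes, are not covered by this sketch.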

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
