Depression, anxiety, and stress disorders detection in students during the Covid-19 pandemic using Naïve Bayes algorithm

During the Covid-19 pandemic, students in Indonesia carried out online learning from home as a social measure during the pandemic. This online learning process is still considered less effective and efficient, and it has burdened some students, especially those with homework during the online learning period. This has a psychological impact on students, such as the emergence of depression, anxiety, and stress. Psychological disorders stem not only from academic pressure; factors within the students themselves also affect mental health. The results of a survey on mental health during the pandemic conducted by the Association of Indonesian Mental Medicine Specialists (PDSKJI) showed that 64.8% of respondents experienced psychological problems in the age group of 19-24 years and over 60 years. In this study the author builds a test system for depression, anxiety, and stress disorders in students. The results of this test are the severity of each psychological disorder and treatment recommendations based on the test results. The psychological scale used in this study is the DASS-42 (Depression, Anxiety, and Stress Scale), which has 42 statements and 3 categories of disorders, namely depression, anxiety, and stress. Each category has 5 levels, namely normal, mild, moderate, severe, and very severe. The Test System for Depression, Anxiety, and Stress Disorders for Students uses the Naïve Bayes method and obtains an accuracy of 86.44% on the dataset, so it can be said that this system runs according to its purpose.


Introduction
The Covid-19 pandemic has had a major impact on all fields, one of which is education. The Indonesian government has applied an online/distance learning policy since March 2020 to stop the spread of the Covid-19 outbreak. Online learning has advantages and disadvantages in its application. The Association of Indonesian Mental Medicine Specialists (PDSKJI) conducted a survey on mental health during the Covid-19 pandemic. The results of this self-examination showed that 64.8% of respondents experienced psychological problems, with 64.8% reporting anxiety, 61.5% depression, and 74.8% trauma. Most psychological problems are found in the age groups of 17-29 years and above 60 years [1].
For students, this pandemic causes stress and anxiety because it changes the lecture process and daily life. Therefore, a test is needed to determine the level of depression, anxiety, and stress in students during the Covid-19 period, so that they can take preventive measures early, before the condition becomes more severe. The Depression, Anxiety and Stress Scale (DASS) is a self-assessment scale used to measure a person's negative emotions. During DASS development, not only the factor structure but also the relative performance of each item was found to be nearly the same in clinical and non-clinical samples [2]. However, the main purpose of measurement with the DASS in this study is to determine the severity of symptoms of depression, anxiety, and stress and to recommend some treatment. To implement the Depression, Anxiety, and Stress Disorder Test for Students, the author uses the Naïve Bayes algorithm. With this application, students and the general public can carry out an initial screening, find out their level of depression, anxiety, and stress based on the symptoms experienced, and receive treatment recommendations that help with first treatment.

Corresponding author: annisarahmadani@student.telkomuniversity.ac.id

Related Work
The Naïve Bayes algorithm has been used in many studies of health problems. Triyanna Widiyaningtyas, Ilham A Zaeni, and Nadiratin Jamilah (2020) developed a method to diagnose the symptoms of fever in two diseases. They used the Naïve Bayes algorithm to classify the fever symptoms. The algorithm was tested using k-fold cross-validation with k equal to 10, and it was evaluated by calculating accuracy, precision, and recall from the prediction results. The results showed an average accuracy of 94%, precision of 90%, and recall of 92%. This shows that the Naïve Bayes algorithm performs well in diagnosing fever in patients [3].
There are several studies on the prediction of anxiety and depression in elderly patients using machine learning. Arkaprabha Sau and Ishita Bhakta (2017) developed a predictive model for automatic diagnosis of anxiety and depression in geriatric patients. The data used in that study are socio-demographic and health factors of the patients. Geriatric patients were classified into two groups using the Hospital Anxiety and Depression Scale (HADS), and the classification was performed with ten algorithms: Bayesian Network, Logistic, Multilayer Perceptron, Naïve Bayes, Random Forest, Random Tree, J48, Sequential Minimal Optimization, Random Subspace, and K Star. The results of the ten classifiers were evaluated, and Random Forest (RF) obtained a prediction accuracy of 91% with a false positive rate of only 10%. This accuracy was tested by 10-fold cross-validation [4].
Another study, by Setiyo Budiyanto and Harry Candra Sihombing (2019), measured the tendency toward depression and anxiety through social media using a closed-loop method with text mining of Facebook posts. The preprocessing stages include text extraction, the Naïve Bayes model is used for text classification, and symptoms of depression and anxiety are measured using the Depression, Anxiety, Stress Scale (DASS)-21. The training data consist of 22,934 Facebook posts, and the result is an analysis that maps users' social demographics to common triggers of depression and anxiety, such as sadness, illness, household affairs, children's education, and others [5].
Apart from predicting anxiety and depression through social media, there is a study on the assessment of anxiety, depression, and stress using machine learning by Prince Kumara and Shruti Garg (2020), in which the instruments used are the DASS-21 and DASS-42. The five severity levels of anxiety, stress, and depression were predicted using eight machine learning algorithms. The methods used are divided into four different categories: Bayes, neural network, lazy tree, the K-star hybrid technique and the random forest method. All methods were applied to two different databases, DASS-42 and DASS-21. The results showed that the Radial Basis Function Network (RBFN) performed best for depression in both datasets. Random forest reached 100 percent for anxiety in the DASS-21 database [6].
In a study by Anu Priya and Shruti Garg (2019), anxiety, depression, and stress were predicted using machine learning algorithms. The scale used is the DASS-21, and five severity levels are predicted using five different machine learning algorithms: Naïve Bayes, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, and Decision Tree. Naïve Bayes was found to have the highest accuracy, at 85%, 73%, and 74% for depression, anxiety, and stress, respectively [7].

Research Method
This section discusses the methods used in this study.

Depression, Anxiety, Stress Scale (DASS-42)
The Depression, Anxiety, Stress Scale (DASS) is a measuring tool developed by Lovibond and Lovibond in 1995. The DASS test consists of 42 statement items covering three psychological scales, namely depression, anxiety, and stress. Each psychological scale consists of 14 items, which are further divided into sub-scales of 2 to 5 items that are assumed to measure the same thing [8]. DASS-42 severity levels are defined relative to the population mean scores obtained from a large and relatively heterogeneous sample. A person who has significant symptoms of depression, anxiety, and stress should still be referred to a psychologist.
In the DASS-42 standard, the distribution of items/symptoms that affect each disorder can be seen in Table 1; when the sample data were collected, the items were presented in Indonesian [9]. Each item has four possible answers with different weights, namely never: 0, sometimes: 1, quite often: 2, very often: 3. Table 2 shows the assignment of items to the depression, anxiety, and stress scales, each consisting of 14 items. After the final score is calculated, it is labeled according to the severity level, namely "Normal", "Mild", "Medium", "Severe", and "Very Severe". Table 3 gives the assessment indicators of severity [10]:
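The scoring and labeling steps described above can be sketched as follows. This is a minimal illustration, not the system's actual code: Tables 2 and 3 are not reproduced in this text, so the item allocation and the severity cut-offs below are the commonly published DASS-42 values and are assumed to match them. The names `DASS42_ITEMS`, `SEVERITY_UPPER_BOUNDS`, `score_dass42`, and `severity` are hypothetical.

```python
# Item numbers per scale: the commonly published DASS-42 allocation,
# ASSUMED to match Table 2 (which is not reproduced in this text).
DASS42_ITEMS = {
    "depression": [3, 5, 10, 13, 16, 17, 21, 24, 26, 31, 34, 37, 38, 42],
    "anxiety": [2, 4, 7, 9, 15, 19, 20, 23, 25, 28, 30, 36, 40, 41],
    "stress": [1, 6, 8, 11, 12, 14, 18, 22, 27, 29, 32, 33, 35, 39],
}

# Inclusive upper score bound of each level: the commonly published DASS-42
# cut-offs, ASSUMED to match Table 3. Scores above the last bound are "Very Severe".
SEVERITY_UPPER_BOUNDS = {
    "depression": [(9, "Normal"), (13, "Mild"), (20, "Medium"), (27, "Severe")],
    "anxiety": [(7, "Normal"), (9, "Mild"), (14, "Medium"), (19, "Severe")],
    "stress": [(14, "Normal"), (18, "Mild"), (25, "Medium"), (33, "Severe")],
}


def score_dass42(answers):
    """Sum the 0-3 answer weights of the 14 items belonging to each scale.

    `answers` maps an item number (1-42) to its answer weight (0-3).
    """
    if set(answers) != set(range(1, 43)):
        raise ValueError("expected answers for items 1..42")
    if any(weight not in (0, 1, 2, 3) for weight in answers.values()):
        raise ValueError("each answer weight must be 0, 1, 2, or 3")
    return {
        scale: sum(answers[item] for item in items)
        for scale, items in DASS42_ITEMS.items()
    }


def severity(scale, score):
    """Map a scale score to its severity label."""
    for upper_bound, label in SEVERITY_UPPER_BOUNDS[scale]:
        if score <= upper_bound:
            return label
    return "Very Severe"


# Hypothetical respondent who answers "sometimes" (weight 1) to every item.
answers = {item: 1 for item in range(1, 43)}
scores = score_dass42(answers)
levels = {scale: severity(scale, value) for scale, value in scores.items()}
```

Under these assumed cut-offs, the same raw score can map to different levels on different scales: a score of 14 is "Medium" for depression and anxiety but "Normal" for stress.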

Naïve Bayes
Bayes' theorem is an approach to handling uncertainty that is measured by probability. It is used for classification under the assumption that the predictors are independent. In other words, the Naïve Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. The Naïve Bayes model is easy to build and well suited to very large data sets. Despite being a very simple classification method, it performs well even in complicated scenarios [11].
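The Naïve Bayes algorithm is suitable for classifying datasets of nominal and numeric types. If the dataset is of numeric type, the Gaussian distribution is used. The Gaussian distribution is given in equation 1 [12]:

$$P(X_i = x_i \mid Y = y_j) = \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} \exp\left(-\frac{(x_i - \mu_{ij})^2}{2\sigma_{ij}^2}\right) \tag{1}$$

where $\mu_{ij}$ and $\sigma_{ij}$ are the mean and standard deviation of attribute $X_i$ within class $y_j$.

Calculate the mean value according to equation 2:

$$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{2}$$

Description: µ: arithmetic mean (mean). xi: sample value i. n: number of samples.

Calculate the standard deviation value according to equation 3:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}} \tag{3}$$

Description: σ: standard deviation. µ: arithmetic mean (mean). xi: sample value i. n: number of samples.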
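Equations 1 to 3 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the sample values are hypothetical, and the standard deviation is assumed to use the sample form with denominator n − 1.

```python
import math


def mean(values):
    """Equation 2: arithmetic mean of the sample values."""
    return sum(values) / len(values)


def sample_std(values):
    """Equation 3: standard deviation (ASSUMED sample form, denominator n - 1)."""
    mu = mean(values)
    squared_deviations = sum((x - mu) ** 2 for x in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


def gaussian_density(x, mu, sigma):
    """Equation 1: Gaussian likelihood of value x given a class mean and std."""
    coefficient = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coefficient * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Hypothetical values of one numeric attribute within one class.
class_values = [2, 4, 4, 4, 5, 5, 7, 9]
mu = mean(class_values)
sigma = sample_std(class_values)
likelihood = gaussian_density(6, mu, sigma)
```

In a Naïve Bayes classifier, one such likelihood is computed per attribute and per class, and the likelihoods of a class are multiplied together with that class's prior probability.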
System Design and Overview

System Overview
The user first enters a name and then takes the Depression, Anxiety, Stress Scale (DASS)-42 test. The results of the DASS-42 test are processed using the Naïve Bayes method to determine the severity of depression, anxiety, and stress. The system then displays the severity of depression, anxiety, and stress, together with treatment recommendations for the user.

Treatment and Rules mapping
Table 4 lists the treatments recommended by this system. These treatments were compiled through discussions with psychologists, and those recommended to users can be done alone, without the need for a therapist. However, such treatment is only a temporary first measure, because it does not eliminate the root of the user's problem. The rules used for treatment recommendations, also obtained through discussions with psychologists, are listed in Table 5.

Preprocessing
The data are obtained from openpsychometrics.org; the dataset has 172 columns and 39,975 rows. Data cleaning is then carried out: the value E in Figure 1, which represents the position of each question among the 42 questions, and the value I, which is the recording time, are omitted. Only the value A, the collection of answers from respondents, is kept. Figure 2 shows the result of data cleaning. Figure 3 shows the result of scoring the symptoms of depression, anxiety, and stress, labeled score_D, score_A, and Score_S. Each symptom then receives a category label that contains the severity of the psychological symptoms: "Normal", "Ringan" (Mild), "Sedang" (Medium), "Parah" (Severe), and "Sangat Parah" (Very Severe). After that, each category is mapped to a treatment using the rules in Table 5.
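The cleaning step described above, which keeps only the A values, can be sketched as a column filter. The column naming pattern (`Q1A`, `Q1I`, `Q1E`, ...) is an assumption about the raw file, and `keep_answer_columns` and the sample row are hypothetical; the values are made up for illustration.

```python
import re

# ASSUMED column naming: Q<item>A = answer, Q<item>I and Q<item>E = the
# per-question values that the cleaning step omits.
ANSWER_COLUMN = re.compile(r"^Q(\d+)A$")


def keep_answer_columns(row):
    """Return {item_number: answer} using only the answer (A) columns of a raw row."""
    answers = {}
    for column_name, value in row.items():
        match = ANSWER_COLUMN.match(column_name)
        if match:
            answers[int(match.group(1))] = int(value)
    return answers


# Hypothetical fragment of one raw row (two questions plus an unrelated column).
raw_row = {
    "Q1A": "4", "Q1I": "28", "Q1E": "3890",
    "Q2A": "1", "Q2I": "2", "Q2E": "2122",
    "country": "ID",
}
cleaned = keep_answer_columns(raw_row)
```

Matching on the column name rather than on column position keeps the filter independent of the order in which the columns appear in the file.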

SMOTE
Because the DASS-42 dataset is imbalanced, with class sizes that differ greatly, the SMOTE method is adopted. After oversampling with SMOTE, every class contains 4,683 samples. Figure 4 shows the class imbalance before SMOTE is applied, and Figure 5 shows the balanced classes. Once all classes are balanced, the Naïve Bayes model is built. Figure 6 shows the Naïve Bayes workflow: first the training data are read; then, if the data are numeric, the mean and standard deviation of each numeric parameter are computed. The probability value is found by dividing the number of matching records in a category by the total number of records in that category [14]. Table 6 is an example case that is worked through by manual calculation of the Naïve Bayes algorithm. It uses 8 training records and 1 test record with two classes, namely "IF" and "IG". The probability of each class is calculated, along with the mean and standard deviation of each attribute using equations 2 and 3. Table 7 gives the resulting mean values and Table 8 the standard deviations. From these results, the highest probability value is obtained for the "IF" class, so it can be concluded that the test record belongs to the "IF" class.
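The oversampling step described at the start of this subsection can be illustrated with a simplified sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the implementation used in the study; the function name `smote_sketch`, the parameter defaults, and the sample values are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # k nearest minority neighbours of the base sample (Euclidean distance).
        others = [p for i, p in enumerate(minority) if i != base_index]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority class with 3 samples (score_D, score_A, score_S),
# oversampled to match a majority class of 7 samples.
minority = [[30.0, 22.0, 35.0], [28.0, 20.0, 36.0], [33.0, 25.0, 38.0]]
majority_size = 7
new_points = smote_sketch(minority, majority_size - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point is an interpolation between two existing samples, it always lies within the range already covered by the minority class.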

Testing and Result
This section describes the results and discussion of the tests carried out to determine the success of the system. The tests cover the Naïve Bayes algorithm and the results of the interface implementation.

Naïve Bayes Algorithm Testing
The Naïve Bayes model is tested using data partitioning, which divides the dataset into two parts, namely training data and test data. The test also determines the random state value that gives the best data partition. Data partitioning was carried out five times, and the results can be seen in Table 10. The accuracy obtained by this system is 84.66%, with a partition of 90% training data (526,838 records) and 10% test data (58,537 records) and a random state of 34.
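The partitioning and accuracy calculation described above can be sketched as follows. This is a minimal illustration with hypothetical function names. The seed plays the role of the random state, but seeding Python's `random` module does not reproduce the partition produced by any particular machine learning library.

```python
import random


def train_test_split_sketch(rows, test_fraction=0.10, random_state=34):
    """Shuffle the rows reproducibly and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(random_state).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that equal the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical data: 100 record identifiers split 90% / 10%.
rows = list(range(100))
train_rows, test_rows = train_test_split_sketch(rows)

# Hypothetical predictions for four test records, three of them correct.
example_accuracy = accuracy(["Mild", "Normal", "Severe", "Mild"],
                            ["Mild", "Normal", "Mild", "Mild"])
```

Fixing the seed makes the partition repeatable, so accuracies obtained with different partition ratios or seeds can be compared across runs.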

Conclusion
The conclusion of this research is that the Test System for Depression, Anxiety and Stress Disorders in Students runs according to its purpose, namely determining the user's level of depression, anxiety, and stress and recommending treatment based on the results of the Depression, Anxiety, Stress Scale (DASS)-42 test. The method used is Naïve Bayes, which obtains an accuracy of 86.44% with a partition of 90% training data and 10% test data.
Description for equation 1: Xi: attribute i. xi: value of attribute i. Y: the class being sought. yj: the subclass of Y being sought.

Fig. 8. DASS-42 test results page.
Figure 7 is the Depression, Anxiety, Stress Scale (DASS)-42 test page, which contains 5 question items per page; the user fills in the responses according to the circumstances experienced. Figure 8 shows the results page. This page displays the test results in the form of the severity of depression, anxiety, and stress, as well as treatment recommendations for the user's first treatment.