Predicting piperine content in Javanese long pepper using fluorescence imaging and a machine learning model

The conventional method for determining piperine content involves a series of labor-intensive steps, including drying the pepper samples, grinding them, and then extracting them using high-grade ethanol through a reflux method. While effective, this process is time-consuming and resource-intensive, posing limitations in terms of efficiency and the ability to address potential variations. Therefore, there is an urgent need to explore more efficient and rapid approaches for accurately measuring and predicting piperine content, such as machine learning. This research aims to explore the potential of using fluorescence imaging methods and ANN models to increase the efficiency of measuring piperine content in Javanese long pepper. We propose a machine learning approach using UV-induced fluorescence imaging of Javanese long pepper. UV LEDs (365 nm) induced fluorescence, with color variation indicating piperine content. An artificial neural network (ANN) model, trained on color texture features from fluorescence images, predicted piperine content, achieving an R² value of 0.88025 with ten features selected using the One-R attribute evaluator. The final ANN, configured with the 'trainoss' learning function, the 'tansig' activation function, a learning rate of 0.1, and a 10-10-40-1 topology, demonstrated a testing R² of 0.8943 and an MSE of 0.0875. LED-induced fluorescence enhances the prediction of piperine content by machine learning. This research contributes to more efficient piperine content measurement methods.


Introduction
Piperine, a type of alkaloid amide, exhibits a wide range of properties with positive impacts on health, including antioxidant, anticancer, anti-inflammatory, antihypertensive, hepatoprotective, and neuroprotective properties, as well as the ability to enhance bioavailability and fertility [1,2]. Piperine has been demonstrated to offer therapeutic benefits to individuals suffering from various conditions such as diabetes, obesity, arthritis, oral cancer, breast cancer, multiple myeloma, metabolic syndrome, hypertension, Parkinson's disease, Alzheimer's disease, cerebral stroke, cardiovascular diseases, kidney diseases, inflammatory disorders, and rhinopharyngitis. Furthermore, piperine is known to enhance the therapeutic efficacy of drugs, vaccines, and nutrients by inhibiting specific metabolic enzymes [3]. Recently, there has been a significant increase in attention towards the utilization of functional foods in various disease management efforts [4]. Javanese long pepper, an Indonesian spice widely used in the preparation of traditional beverages such as Jamu [5], contains piperine as its major constituent, which holds significant medical value [6]. The conventional method for determining piperine content involves a series of labor-intensive steps, including drying the pepper samples, grinding them, and then extracting them using high-grade ethanol through a reflux method. While effective, this process is time-consuming and resource-intensive, posing limitations in terms of efficiency and the ability to address potential variations. Therefore, there is an urgent need to explore more efficient and rapid approaches for accurately measuring and predicting piperine content.
The development of efficient and accurate testing technology has been a primary focus across various domains. The application of machine learning in image analysis has demonstrated significant performance enhancements, with artificial neural network (ANN) models capable of addressing multi-dimensional and multi-variate challenges [7]. ANNs operate analogously to biological neural networks, with interconnected processing units governed by calibrated weights. The utilization of ANNs presents several advantages, particularly in predictive modeling of content. In a recent study by Rohmatullah [8], the use of ANN models for quantifying piperine content yielded a high validation R-value of 0.9457. This underscores the substantial potential of image analysis in ANN modeling for content prediction. Furthermore, the information available for predicting piperine content can be influenced by the type of imagery employed, such as fluorescence imaging. Fluorescence imaging allows for the observation of ions, small biological molecules, biological macromolecules, and intracellular microenvironments at the cellular level when the fluorescent material specifically binds to the target. This capability makes fluorescence imaging a powerful tool for in situ and real-time visualization of important analytes and biological sensitivity, enabling the detection and monitoring of biochemical substances without damaging biological samples [9][10][11]. The adoption of LED lamps as an alternative excitation source for inducing fluorescence signals has been gaining popularity. With LEDs' capability to generate varied fluorescence signals, the acquired images provide richer information from each image sample [12,13]. The utilization of this technology is anticipated to enhance accuracy and precision in ANN modeling within the context of piperine content measurement.
In this study, we employed fluorescence imaging with UV light generated by a 365 nm LED lamp, because UV light in the 365 nm range can excite a wide range of fluorophores for imaging samples [14]. UV light generally has a shorter wavelength compared to visible light, which can lead to better penetration into biological tissues. This can be advantageous when imaging thick samples or tissues. Using 365 nm UV light can help reduce background noise from autofluorescence, improving the signal-to-noise ratio in the fluorescence images [14,15]. UV LEDs at 365 nm are readily available and can be integrated into compact and cost-effective imaging systems. This makes them practical for various laboratory and field applications [16,17]. We collected color texture features [18,19] from fluorescence images and piperine content values from three color classes (green, orange, and red) of Javanese long pepper as the dataset for ANN modeling. This research is based on supervised learning using an ANN, where input and output data are used to build a mathematical model that can make predictions and classifications based on pre-existing data. After being trained with sufficient data, the ANN can adapt to the color changes and piperine content of Javanese long pepper. ANNs have generalization capabilities that allow them to make good predictions, and they are often robust to noisy or incomplete data. The data scientist acts as a supervisor who trains the algorithm, and training data are needed to predict and classify Javanese long pepper. Color features such as RGB, HSL, HSV, LAB, and gray, together with texture features associated with each color type (including entropy, energy, contrast, homogeneity, sum mean, variance, correlation, maximum probability, inverse difference moment, and cluster tendency), were used as the input dataset in the input layer. Meanwhile, the piperine content values obtained from laboratory testing were used as the data to be predicted in the output layer. We also conducted a sensitivity analysis to obtain the best model by varying relevant ANN parameters. This research aims to explore the potential of using fluorescence imaging methods and ANN models to increase the efficiency of measuring piperine content in Javanese long pepper in a non-destructive and fast way.

Sample preparation
The materials used in this research were Javanese Long Peppers obtained from Pandean Region, Nguling Timur Village, sub-district of Nguling, Pasuruan Regency, Malang, East Java, Indonesia. These peppers were selected based on three different levels of ripeness determined by their colors: green, orange, and red. Fifteen peppers were chosen at each ripeness level, resulting in a total of 45 Javanese Long Peppers. The selected peppers were then cleaned using a brush.

Piperine analysis
The acquired pepper samples were subsequently dried in an oven (MMM medcenter ecocell 55-Gemini BV) at a temperature of 70 °C until they reached equilibrium moisture content. They were then pulverized using a mortar and pestle. Next, the samples were extracted using 99.7% P.A ethanol through a reflux method for 3 hours. The filtrate was passed through microfiltration paper into a 100 mL volumetric flask and adjusted to the mark using ethanol. A 5 mL aliquot of this solution was diluted with ethanol in a 50 mL volumetric flask, and a 5 mL aliquot of the resulting solution was diluted again with ethanol in a 25 mL volumetric flask. The absorbance value was then measured at a wavelength of 343 nm. Quantification was performed using an external calibration curve of a piperine standard, and the concentration was expressed as % w/v. The piperine content can be calculated using the following formula (SNI 0005:
where A is the absorbance of the material solution, m is the mass of the sample (g), w is the moisture content, and 1238 is the absorbance at 343 nm of a 1% piperine solution in a 1 cm cell.
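The calculation described above can be illustrated with a minimal sketch. It is not the SNI formula itself, which is not reproduced in the text; it assumes a Beer-Lambert relation with the specific absorbance of 1238, the two dilution steps described (5 mL to 50 mL, then 5 mL to 25 mL), a 100 mL initial extract, and moisture content given as a fraction. The function name, its parameters, and the choice to report the result relative to dry sample mass are all hypothetical.

```python
def piperine_percent(absorbance, sample_mass_g, moisture_fraction,
                     e1pct_1cm=1238.0, initial_volume_ml=100.0,
                     dilutions=((5.0, 50.0), (5.0, 25.0))):
    """Estimate piperine content (% of dry sample mass) from absorbance at 343 nm.

    Assumptions (not taken from the standard): moisture is a fraction in [0, 1),
    and the result is reported relative to the dry mass of the sample.
    """
    # Specific absorbance: a 1% (1 g per 100 mL) solution in a 1 cm cell
    # reads e1pct_1cm, so the measured solution holds this many g per 100 mL.
    conc_g_per_100ml = absorbance / e1pct_1cm

    # Undo each dilution step (aliquot volume -> final volume).
    dilution_factor = 1.0
    for aliquot_ml, final_ml in dilutions:
        dilution_factor *= final_ml / aliquot_ml

    # Mass of piperine in the whole initial extract.
    piperine_g = conc_g_per_100ml * dilution_factor * initial_volume_ml / 100.0

    # Express as a percentage of the dry sample mass.
    dry_mass_g = sample_mass_g * (1.0 - moisture_fraction)
    return 100.0 * piperine_g / dry_mass_g


# Hypothetical reading: A = 0.619 for a 1 g sample with 10% moisture.
print(round(piperine_percent(0.619, 1.0, 0.10), 3))
```

With the two dilution steps above, the overall dilution factor is 10 × 5 = 50, which is why it appears explicitly in the code rather than being folded into a single constant.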

Image pre-processing
The data pre-processing performed includes background removal, cropping, and converting the file extension into bitmap format. Data pre-processing was carried out with the assistance of Adobe Photoshop 2021. The background removal process involved selecting the desired object and eliminating the background. The cropping process was performed by adjusting the size of the object.

Feature extraction
The feature extraction process is a crucial step in image analysis and was performed using Visual Basic 6.0. In this application, the focus is on extracting both texture and color features from the images. Texture features are extracted using Haralick texture analysis, which includes parameters such as entropy, energy, homogeneity, sum mean, variance, contrast, correlation, maximum probability, inverse difference moment, and cluster tendency. We employ the RGB, HSV, HSL, gray, and L*a*b* color models for color analysis. The result is 120 combined texture and color features, consisting of the Haralick texture parameters derived from each predetermined color component. These texture metrics provide valuable insights into image properties, such as texture patterns and color relationships. The results are represented as numerical data and stored in Excel files for further analysis. The values of the texture features are obtained from the calculations in [20], where P(i, j) is the element at the (i, j)th position of the normalized co-occurrence matrix, and μ and σ are the mean and standard deviation of the pixel elements. Ten texture features are extracted at a distance (d = 1) and angle (θ = 0).
Feature selection is the process of selecting the most relevant features from a set of extracted features, and it is carried out as a data pre-processing step in data mining. Its main purposes are to reduce data dimensionality, to prevent overfitting (characterized by a high validation MSE), to reduce training time, and to improve model accuracy [21,22]. It is also an important stage for speeding up the modelling process and facilitating the design of tools. In this research, feature selection is carried out using the Weka application. Three types of attribute selection methods are employed in this study: One-R (One Rule), the Chi-Square (χ²) test, and ReliefF feature selection.
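To make the texture step concrete, the following is a minimal sketch of a co-occurrence matrix at distance 1 and angle 0, with a few of the listed features computed from it. It is not the Visual Basic implementation used in the study. The function names are hypothetical, the matrix is built without symmetrization, entropy uses the natural logarithm, and homogeneity uses the 1 + |i - j| denominator; each of these is an assumption, since the definitions in [20] are not reproduced here.

```python
import math


def glcm_horizontal(image, levels):
    """Normalized co-occurrence matrix P(i, j) for d = 1, angle = 0.

    `image` is a list of rows of integer gray levels in [0, levels).
    Each pixel is paired with its right-hand neighbor (non-symmetric count).
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
    total = sum(sum(r) for r in counts)
    return [[c / total for c in r] for r in counts]


def texture_features(p):
    """Compute a subset of Haralick-style features from a normalized GLCM."""
    feats = {"energy": 0.0, "entropy": 0.0, "contrast": 0.0,
             "homogeneity": 0.0, "max_probability": 0.0}
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            feats["energy"] += pij ** 2
            if pij > 0:
                feats["entropy"] -= pij * math.log(pij)  # natural log (assumption)
            feats["contrast"] += (i - j) ** 2 * pij
            feats["homogeneity"] += pij / (1 + abs(i - j))  # one common variant
            feats["max_probability"] = max(feats["max_probability"], pij)
    return feats


# Tiny two-level example image (hypothetical data).
example = [[0, 0, 1],
           [0, 0, 1]]
print(texture_features(glcm_horizontal(example, levels=2)))
```

In the study, the same kind of computation would be repeated for each color component (for example R, G, B, or L*, a*, b*), which is how a small set of texture measures multiplies into many features.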

Artificial Neural Network Model
The dataset consisted of 180 fluorescence images, with input data derived from color texture features of pepper images and output data representing the piperine content in Javanese Long Pepper. The ANN topology planning involved two main phases: preprocessing and sensitivity analysis, which included training and validation datasets. During the preprocessing phase, the dataset was divided into training data (70%) and validation data (30%). Data normalization was performed to bring all features within a uniform range of -1 to 1, an essential step in ANNs due to their non-linearity. This process ensures that measurements are on the same scale as the actual system, enhancing training efficiency. The sensitivity analysis in the second phase encompassed varying parameters, such as learning rate (0.1, 0.5, 0.9), momentum (0.1, 0.5, 0.9), hidden layer node count (10, 20, 30, 40), number of hidden layers (1 and 2), and activation functions. Multiple combinations were tested to identify the ANN configuration with the lowest Mean Squared Error (MSE) in validation data.
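The preprocessing and parameter ranges described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the helper names are hypothetical, the split is a simple seeded shuffle, and enumerating the full Cartesian product of settings is an assumption (the study may have varied parameters in stages rather than exhaustively).

```python
import itertools
import random


def scale_minus1_1(column):
    """Min-max scale one feature column to the range [-1, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature: map to the midpoint
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


# Parameter values listed in the text.
learning_rates = (0.1, 0.5, 0.9)
momenta = (0.1, 0.5, 0.9)
node_counts = (10, 20, 30, 40)

# One or two hidden layers, each with a node count from the list.
hidden_layouts = [(n,) for n in node_counts] + list(
    itertools.product(node_counts, repeat=2))

# Full grid of candidate configurations (activation functions omitted here).
grid = list(itertools.product(learning_rates, momenta, hidden_layouts))

train_idx, val_idx = split_indices(180)
print(len(train_idx), len(val_idx), len(grid))
```

In practice, the minimum and maximum used for scaling are usually taken from the training split only and then reused for the validation split, so that the validation data do not influence the preprocessing.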

Results and discussion
Based on Table 1, it can be concluded that the lowest MSE (Mean Squared Error) value was obtained through the use of the One-R attribute evaluator with features 1-10. This indicates that the texture and color features used as input in the ANN model are the top-ranking features according to the One-R attribute evaluator. These selected features have an MSE value of 0.0095 during training and 0.2688 during validation, with an R value of 0.982 during training and 0.8802 during validation. The R value represents the correlation coefficient, which reflects the relationship between the selected texture and color features and the piperine content in Javanese Long Pepper. The results show that the selected features have a very strong correlation with the piperine content in Javanese Long Pepper. Detailed information about the texture and color features used as input in the ANN model can be found in Table 2.
The results of feature selection were utilized as input for the selection of texture and color features to be used in the ANN modeling. The selection of texture and color features was carried out through trial and error using the ANN model, which in this study featured 2 hidden layers, each with 10 nodes, an output layer with 1 node, the trainlm learning function, the tansig activation function for both the hidden and output layers, a learning rate of 0.1, and a momentum of 0.9.
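The two metrics reported above can be computed as in the following minimal sketch. It assumes that R is the Pearson correlation between measured and predicted piperine values, as is typical for regression plots; the function names and the sample numbers are hypothetical.

```python
import math


def mse(actual, predicted):
    """Mean squared error between measured and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical measured and predicted piperine values.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.2, 1.9, 3.3, 3.8]
print(mse(measured, predicted), pearson_r(measured, predicted))
```

A low MSE together with an R close to 1 on the validation data, rather than on the training data alone, is what indicates that a configuration generalizes.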
Analysis of ANN modelling result
The lowest validation MSE is achieved when using the 'Trainoss' learning function (Fig. 2). Therefore, in this sensitivity analysis, we selected 'Trainoss' as the optimal learning function. Based on Table 3, the lowest validation MSE was achieved when the tansig activation function was applied to all layers (input, hidden, and output). Consequently, it was decided to employ the tansig activation function for all layers. The utilization of tansig as the activation function for the input, hidden, and output layers yielded a training MSE of 0.01, a validation MSE of 0.0933, a training R-value of 0.9794, and a validation R-value of 0.8993. The lowest validation MSE was achieved with a learning rate and momentum of 0.1 (Table 4). The learning rate and momentum in artificial neural networks can significantly influence the convergence speed during the learning process. Furthermore, the configuration with the lowest validation MSE consisted of 2 hidden layers, with 10 nodes in the first hidden layer and 40 nodes in the second hidden layer. Implementing multiple hidden layers in artificial neural networks can enhance the performance of the ANN model. Additionally, the number of nodes also plays a crucial role in determining ANN performance. Too few nodes can lead to underfitting, while an excessive number of nodes can result in overfitting.
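The selected structure can be illustrated with a minimal forward-pass sketch of a network with 10 inputs, hidden layers of 10 and 40 nodes, and 1 output, using a hyperbolic tangent activation in every layer (MATLAB's tansig is mathematically equivalent to tanh). The weights here are random placeholders and no training is performed; the function names and the initialization range are assumptions for illustration only.

```python
import math
import random


def tansig(x):
    """Hyperbolic tangent sigmoid; equivalent to 2 / (1 + exp(-2x)) - 1."""
    return math.tanh(x)


def init_layers(sizes, seed=0):
    """Create random (weights, biases) pairs for consecutive layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one normalized feature vector through all layers."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            tansig(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def count_parameters(layers):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


layers = init_layers([10, 10, 40, 1])
output = forward(layers, [0.0] * 10)  # ten normalized features in [-1, 1]
print(count_parameters(layers), output)
```

Because the output layer also uses tansig, predictions lie in [-1, 1] and must be mapped back to the piperine scale by inverting the normalization applied to the targets.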

Conclusion
In this study, successful prediction of piperine content in Javanese long pepper was achieved using fluorescence imaging by selecting 10 key features through the One-R feature selection method, with an R² value of 0.88025. Experimental results identified an ANN structure with the highest performance, denoted as 10-10-40-1 (10 inputs, 10 nodes in the first hidden layer, 40 nodes in the second hidden layer, and 1 output), with a learning rate of 0.1, momentum of 0.1, and employing the 'trainoss' learning function and 'tansig' activation function in each layer, with a testing R² of 0.8943 and MSE of 0.0875. This research provides evidence that fluorescence imaging and ANN modeling hold significant potential in predicting piperine content in Javanese long pepper. The implications of these findings are the development of a rapid and non-destructive piperine content measurement method for Javanese long pepper, which holds crucial relevance in the food and pharmaceutical industries.

Fig. 3
Fig. 3 illustrates the regression plots for both the training and validation datasets. The R-value signifies the correlation between the input variables and the output variable. A value of R approaching 1 indicates a stronger correlation. In the training dataset, using the ANN topology of 10-10-40-1, a learning rate and momentum of 0.1, the trainoss learning function, and the tansig activation function in the hidden and output layers resulted in an R-value of 0.97926. Meanwhile, in the validation dataset, the same ANN topology and parameter settings produced an R-value of 0.89427. The R-values represent the correlation coefficients, which measure the relationship between the selected color and texture features and the piperine content in Javanese long pepper. This analysis reveals that features such as a* Maximum Probability, L* Maximum Probability, a* Energy, b* Energy, L* Energy, Red Maximum Probability, Light Maximum Probability, Saturation (HSV) Maximum Probability, Saturation (HSL) Maximum Probability, and Blue Maximum Probability exhibit a highly robust correlation with the piperine content in Javanese long pepper.

Table 1.
Attribute feature selection results.
The use of 'Trainoss' resulted in a training MSE of 0.01, a validation MSE of 0.1116, a training R value of 0.9792, and a validation R value These R values depict correlation coefficients, which measure the relationship between the selected texture and color features with the piperine content in Javanese long pepper.

Table 2.
Selected texture and color features.

Table 3.
The sensitivity analysis results on activation functions.

Table 4.
The results of sensitivity analysis on the learning rate, momentum, number of nodes, and hidden layer parameters.