Influence of unregulated storage conditions on physico-chemical, organoleptic and NIR spectral characteristics of yellow cheese

In the present work, software and hardware tools are proposed for determining the change in the main characteristics of Bulgarian yellow cheese during storage in conditions not regulated by the manufacturers. NIR images in the 800-1100 nm range of yellow cheese samples from three manufacturers were obtained using a GT-903 video camera with the IR filter removed from the camera lens. Several physicochemical characteristics of the product were determined: active acidity, electrical conductivity and total dissolved solids. Data from an organoleptic evaluation of the product are presented. Informative wavelengths were selected from the spectral features using ABC-XYZ analysis. Spectral indices calculated as ratios of the reflectance coefficients at the selected wavelengths were defined and used to predict the storage characteristics of yellow cheese. It was found that the shelf life of yellow cheese can be predicted with an accuracy of up to 95%, and the active acidity with an accuracy of up to 88%, depending on the manufacturer. The results can be used for analyses of yellow cheese during storage and applied in automatic measurement and control systems, as well as in advisory systems for evaluating the quality of yellow cheese at the different stages of its production, transport and storage.


Introduction
Kashkaval is a type of yellow cheese produced mainly in Bulgaria and some other countries of the Balkan Peninsula. It is produced mainly from cow's or sheep's milk, as well as from a mixture of the two.
A study of the available literature [1] shows that the main problem with Bulgarian yellow cheese is its resistance to spoilage, which depends on the production technology, together with preventing the development of microorganisms harmful to human health.
Spectral analysis in the visible (VIS) and near infrared (NIR) region of the spectrum is widely used for noncontact and non-destructive determination of main quality characteristics of Bulgarian yellow cheese.
Atanassova et al. [2] indicate that, using NIR spectral characteristics (900-1700 nm) in selected spectral ranges, the protein content, dry matter and titratable acidity of cheese can be determined with an accuracy of 95% and a ratio between the standard error of calibration and the standard deviation of the data sample of not more than 3. Prediction accuracy was low for NaCl in cheese. In a subsequent study, Veleva-Doneva et al. [3] indicate that, for characterizing cheese samples from reduced spectral data obtained in the VIS and NIR range (600-2500 nm), the SIMCA classifier is more accurate than ANFIS and statistical models, but better accuracy is achieved only by using a larger number of principal components. This increases the number of computational operations, the analysis time and the hardware resources required. Mladenov [4], and after him Vasilev [5], analyzed the possibilities of applying reduced data from NIR spectral characteristics (900-2500 nm) to cheese purchased from the commercial network. Regression models were proposed that predict key indicators of product quality and safety with sufficient accuracy. A disadvantage of these studies is that, due to the complexity of the measurements, cheese from only one Bulgarian producer was analyzed. These authors found that data in the visible range of the spectrum (380-780 nm) are not appropriate for predicting changes in the surface characteristics of cheese, especially in the early stage of mold development. Vasilev et al. [6] developed models for predicting the storage day of cheese from three Bulgarian producers. For this purpose, the authors used vectors of color components and spectral indices obtained from spectral features in the VNIR region (600-1100 nm). With reduced data from the selected feature vectors, they achieved an accuracy of 62%.
Near-infrared spectroscopy is presented as a quick and sufficiently accurate method for determining changes in the main characteristics of cheese during storage. The prediction accuracy depends mainly on the method used to reduce the data from the spectral features. One obstacle to the wider application of spectral analysis in automated systems for the early diagnosis of undesirable changes in cheese is that it relies on spectrophotometers operating in the VIS (400-1200 nm) and NIR (900-2500 nm) spectral ranges. These devices are suited to laboratory analysis and are not widely available; from the user's point of view, their relatively high cost is also a problem. Bosakova-Ardenska et al. [7] propose a partial solution to these problems, using a consumer camera and an open-source software product to analyze different types of cheese. A disadvantage of that study is that the analyses are made only in the visible region of the spectrum. With the proposed methods and technical means, the surface characteristics of the product can be assessed with sufficient accuracy, but changes in its composition only to a limited degree. The application of NIR spectral characteristics to evaluating changes in the main characteristics of cheese therefore needs to be analyzed. A solution for work in the NIR, in the 800-1100 nm range, is proposed by Vilaseca et al. [8]: data in this spectral range can be obtained with affordable video cameras from which the infrared filter has been removed.
The aim of the present work is to predict the change in the basic physicochemical characteristics of Bulgarian yellow cheese during storage in conditions not regulated by the manufacturers, using data from NIR spectral characteristics in the 800-1100 nm range.

Material and methods
Yellow cheese from three Bulgarian producers, designated M1, M2 and M3, was used. The cheese is distributed in the commercial network of Yambol, Bulgaria. The product was stored for 16 days under conditions not regulated by the manufacturers: in a room at 20 ± 2 ℃ and a relative air humidity of 45 ± 5%. These conditions differ significantly from the standard storage conditions of -2 to +4 ℃. The yellow cheese contains cow's milk, rennet enzyme, salt and calcium dichloride.
The preparation of the samples for determining the physico-chemical characteristics was done according to the following methodology: the cheese raw material was dissolved in distilled water, heated to 70℃, in a ratio of 1/10. After the mixture was cooled to room temperature, three consecutive measurements of each characteristic were made, and their mean value and standard deviation were determined. Active acidity (pH), electrical conductivity (EC, µS/cm) and total dissolved solids (TDS, ppm) were measured.
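As a minimal illustration of the averaging step described above, the mean and standard deviation of three consecutive readings can be computed as follows. The readings are hypothetical and are not data from this study; the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not state which form was applied.

```python
from statistics import mean, stdev


def summarize(readings):
    """Return (mean, sample standard deviation) of repeated readings."""
    return mean(readings), stdev(readings)


# Hypothetical pH readings for one cheese sample on one storage day.
ph_readings = [5.42, 5.45, 5.40]
ph_mean, ph_sd = summarize(ph_readings)
print(f"pH = {ph_mean:.2f} ± {ph_sd:.3f}")
```

The same helper applies unchanged to the EC and TDS readings.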
The sensory parameters of the yellow cheese samples from the three producers were evaluated: taste and smell, consistency, cut surface and general appearance. An evaluation scale was developed in accordance with BNS 14:2010. A 5-point scale was used, so the maximum score a product can receive is 5. Nine teachers and students of the Faculty of Engineering and Technology, Yambol, Bulgaria, specialists in the field of food technology, took part in the evaluation.
NIR images of yellow cheese were obtained using a GT-903 video camera (Z Top International Co. Ltd., Shenzhen Guangdong, PR China) with the IR filter removed from the camera lens. An STSN-120IR2835 diode strip (Shenzhen Suntech Company Ltd., Shenzhen, PR China) was used to illuminate the photographed object. The LEDs have their highest emission intensity at 850 nm. The capturing distance from the camera to the subject was 20 cm. Images were obtained in RGBNIR mode and consist of three matrices: RNIR, GNIR and BNIR.
Spectral characteristics in the NIR region (800-1100 nm) were obtained according to the methodology published by Vilaseca et al. [8]. Indices reflecting the water content of the studied objects are commonly calculated in this spectral range. The calculations were made for a 2° observer and illuminant D65 (average daylight with a UV component, 6500 K).
ABC and XYZ analyses were applied to select informative wavelengths from the spectral features [9]. The ABC analysis was implemented in the following sequence: (1) the sum of the reflectance values at each wavelength of the spectral characteristics over the measurement period was determined, and the data were sorted in ascending order; (2) the share in the total value was calculated as the ratio between the sum for a particular wavelength and the sum of all sums; (3) the share in the total quantity was determined as the cumulative sum of the shares in the total value, each value being obtained by adding the current share to the preceding cumulative sum. The data were then divided into three groups, A, B and C: wavelengths with a share in the total quantity of 0-80% were assigned to group A, 80-95% to group B and 95-100% to group C. Fig. 1.a) graphically presents the grouping of wavelengths by ABC analysis.
At the next stage, the data were analyzed by XYZ analysis: (1) the arithmetic mean of the reflectance values at each wavelength over the entire measurement period was determined, and the data were sorted in ascending order; (2) the standard deviation (SD) was determined; (3) the coefficient of variation (CV), which is the ratio between the standard deviation and the arithmetic mean, was calculated. The data were grouped into 0-20% for X, 20-50% for Y and 50-100% for Z. Fig. 1.b) graphically presents the grouping of the data according to the coefficient of variation in the XYZ analysis.
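For illustration only, the ABC and XYZ grouping steps can be sketched in Python (the study itself used Matlab). The function names, the dictionary layout and the reflectance values are hypothetical. Two further assumptions are made: the sample standard deviation is used, and the X, Y and Z limits are read as thresholds on the coefficient of variation.

```python
from statistics import mean, stdev


def abc_groups(spectra):
    """Map each wavelength to 'A', 'B' or 'C' by cumulative share of summed reflectance.

    spectra: {wavelength_nm: [reflectance on each measurement day]}
    """
    sums = {wl: sum(values) for wl, values in spectra.items()}
    total = sum(sums.values())
    groups, cumulative = {}, 0.0
    for wl in sorted(sums, key=sums.get):  # ascending order, as in the text
        cumulative += sums[wl] / total     # share in the total quantity
        if cumulative <= 0.80:
            groups[wl] = "A"
        elif cumulative <= 0.95:
            groups[wl] = "B"
        else:
            groups[wl] = "C"
    return groups


def xyz_groups(spectra):
    """Map each wavelength to 'X', 'Y' or 'Z' by coefficient of variation (SD / mean)."""
    groups = {}
    for wl, values in spectra.items():
        cv = stdev(values) / mean(values)
        groups[wl] = "X" if cv <= 0.20 else "Y" if cv <= 0.50 else "Z"
    return groups


# Hypothetical reflectance values at three wavelengths over three days.
spectra = {
    850: [0.10, 0.30, 0.80],
    900: [0.50, 0.55, 0.60],
    950: [0.40, 0.60, 0.90],
}
abc = abc_groups(spectra)
xyz = xyz_groups(spectra)
print(abc)
print(xyz)
```

With so few wavelengths no item falls in group B; on a full spectrum sampled at many wavelengths all three groups would normally be populated.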
The next step is to combine the data from the ABC and XYZ analyses, as shown in Table 1. For the study of cheese over the storage period, the combined group A-Z is the informative one, since it contains the wavelengths at which the spectral characteristics show the largest coefficient of variation over the period under consideration.
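The combination step can be sketched as a simple pairing of the two labels for each wavelength, followed by a filter on the combined group A-Z. The labels and helper names below are hypothetical and serve only to show the mechanics.

```python
def combine_abc_xyz(abc, xyz):
    """Pair the ABC and XYZ labels of each wavelength into a combined group such as 'A-Z'."""
    return {wl: f"{abc[wl]}-{xyz[wl]}" for wl in abc if wl in xyz}


def informative_wavelengths(abc, xyz, target="A-Z"):
    """Return the sorted wavelengths that fall in the target combined group."""
    combined = combine_abc_xyz(abc, xyz)
    return sorted(wl for wl, group in combined.items() if group == target)


# Hypothetical labels for five wavelengths (nm).
abc_labels = {850: "A", 870: "A", 900: "A", 950: "B", 1000: "C"}
xyz_labels = {850: "Z", 870: "Z", 900: "X", 950: "Z", 1000: "Y"}

selected = informative_wavelengths(abc_labels, xyz_labels)
print(selected)
```

Only wavelengths that are in group A and in group Z at the same time are kept, which is why the combined selection is much smaller than group Z alone.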
Basic spectral indices defined by Ju et al. [10] and Mendiguren et al. [11] were used. An advantage of these indices is that they are not calculated at fixed spectral wavelengths; instead, six informative wavelengths (λ, nm) must be selected for the specific product under investigation. The calculated indices can then be used as input for classification, regression and clustering. These indices have the form:
For the selection of informative features, feature selection for regression by neighbourhood component analysis, FSRNCA [12], was used. This algorithm is suitable for evaluating feature significance for distance-based models and is applicable to feature selection for regression analysis.
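To make the idea of an index that is not tied to fixed wavelengths concrete, the sketch below computes two generic ratio-type indices from reflectance values at wavelengths passed as arguments. These are common textbook forms (a simple ratio and a normalized difference) and are not necessarily the indices used in this work; the function names and reflectance values are hypothetical.

```python
def simple_ratio(reflectance, wl_num, wl_den):
    """Ratio of the reflectance at two selected wavelengths."""
    return reflectance[wl_num] / reflectance[wl_den]


def normalized_difference(reflectance, wl_a, wl_b):
    """Difference of two reflectance values divided by their sum."""
    a, b = reflectance[wl_a], reflectance[wl_b]
    return (a - b) / (a + b)


# Hypothetical reflectance values at two selected wavelengths (nm).
spectrum = {850: 0.60, 970: 0.40}

sr = simple_ratio(spectrum, 850, 970)
nd = normalized_difference(spectrum, 850, 970)
print(round(sr, 3), round(nd, 3))
```

Because the wavelengths are parameters, the same functions can be evaluated at whichever wavelengths the selection step returns for a given product.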
To determine the predictive ability of the selected features, principal component regression (PCR) [5] and partial least squares regression (PLSR) [13] were used. The performance of PCR and PLSR was evaluated by the coefficient of determination (R²), the sum of squared errors (SSE) and the root mean squared error (RMSE).
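The three evaluation measures can be written out as follows. This is a sketch under stated assumptions: RMSE is taken as the square root of SSE divided by the number of observations (some software divides by the residual degrees of freedom instead), and the observed and predicted values are hypothetical.

```python
from math import sqrt


def regression_metrics(observed, predicted):
    """Return R², SSE and RMSE for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
    return {"R2": 1.0 - sse / sst, "SSE": sse, "RMSE": sqrt(sse / n)}


# Hypothetical storage days and model predictions.
observed_days = [0, 4, 8, 12, 16]
predicted_days = [1, 3, 8, 13, 15]

metrics = regression_metrics(observed_days, predicted_days)
print(metrics)
```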
A regression model of a type commonly used in food data processing was applied [14,15]. The model was assessed by its coefficient of determination (R²), its coefficients and their standard errors (SE), p-values and Fisher's criterion (F), which was compared with its critical value (Fcr) at the given degrees of freedom (DF). The degree of influence of the model coefficients and their significance (p-value) were estimated. Matlab 2017b (The Mathworks Inc., Natick, MA, USA) was used to create the regression models. All data were processed at a significance level of α = 0.05.
Fig. 4 shows the overall consumer organoleptic evaluation (OA) of cheese from the three manufacturers. For all three samples (M1, M2 and M3) stored in conditions not regulated by the manufacturers, the consumer evaluation decreases significantly after 6 days, and this tendency is more pronounced for sample M2. After day ten of storage, all samples have low consumer ratings.
Fig. 5 shows the averaged spectral characteristics in the NIR range for cheese from the three manufacturers over a period of sixteen days. In the first four to five days of storage, the spectral characteristics overlap strongly, which corresponds to the close values of the physico-chemical characteristics measured in the first days of storage. In the following days, the characteristics become separable. The overlap is strongest for manufacturer M2 and weakest for M3. For all three manufacturers, separation of the characteristics is observed in the spectral region 850-970 nm. For this reason, spectral indices calculated in this region are expected to be more informative and to show a stronger relationship with the change in the main characteristics of cheese during storage.
In the data from all three manufacturers, group Z of the XYZ analysis contains the most wavelengths, indicating that they have the highest values of the coefficient of variation. When group Z is combined with group A of the ABC analysis, the number of informative wavelengths is significantly reduced. This is because the spectral features are separable mainly between the initial and final stages of cheese storage; the separability is not constant over the entire storage period, which is accounted for by the ABC analysis. The smallest number of informative wavelengths was selected for yellow cheese from producer M2 (3 wavelengths), followed by M1 (8 wavelengths). The largest number was selected for yellow cheese from M3, with a total of 15 wavelengths (Table 2).
The first 6 spectral wavelengths from group A-Z that are informative for cheese from all three producers were selected. These wavelengths have the following values:
Informative spectral indices suitable for predicting the basic characteristics of cheese from the three producers were then selected. Features with weighting coefficients above 0.6 were retained. The same features are informative for all four characteristics: day of storage, pH, EC and TDS. In summary, for the three manufacturers the informative features are those calculated as ratios between the reflectance values of the spectral characteristics at the selected wavelengths. Indices that reflect the reflectance at a single wavelength are not informative for the products considered. Generalized feature vectors for the three cheese producers were compiled, such as:

Results and discussion
PCR and PLSR were used to predict the characteristics of yellow cheese from the three producers. The highest values of the coefficient of determination were obtained for day of storage and pH of the cheese, and the corresponding error values are significantly lower than those for EC and TDS. In the prediction of EC and TDS, the errors are high, at levels above 10%. The reduced data from the feature vectors for the three manufacturers therefore show the highest predictive ability for storage day and pH.
Predictive models have been created for these characteristics. Predictability of EC and TDS is low. Therefore, no predictive models were constructed for these characteristics.
Regression models were created for predicting the day of storage and the active acidity of yellow cheese stored under conditions not regulated by the manufacturers. Data from the selected spectral indices, reduced to two principal components, were used.
The models for predicting the day of storage have the form:
Fig. 6 shows an overview of the models for predicting the storage day of cheese from the three manufacturers. The first two principal components, PC1 and PC2, were used as independent variables and the day of storage (D) as the dependent variable. These variables are plotted as axes in a three-dimensional coordinate system. The resulting model plots are oriented so that the deviation of the points along the vertical axis is minimized. In the M1 model, the day of storage takes its largest values when PC1 is around 5. In the M2 model, D takes its largest values when PC2 is at its upper levels, around 5. In the M3 model, D takes its largest values when PC1 is at its upper levels.
The coefficient of determination R² reaches values of 0.7-0.9. According to Fisher's criterion, with degrees of freedom DF = (3, 396), F >> Fcr = 2.63. The low error values (SE = 0.2-1.6) indicate that the obtained models have sufficient accuracy.
Fig. 7 shows an overview of the models for predicting the pH of cheese from the three manufacturers. The first two principal components, PC1 and PC2, were used as independent variables and active acidity as the dependent variable. These variables are plotted as axes in a three-dimensional coordinate system. The resulting model plots are oriented so that the deviation of the points along the vertical axis is minimized. In the models for producers M1 and M2, pH takes its largest values when PC2 is at its upper levels.
For producer M3, the dependent variable is at its upper levels, with PC2 close to 0. When predicting the day of storage, the coefficient of determination has larger values, compared to that for pH. On the other hand, the storage day prediction errors are higher than those for pH. It can be considered that the day of storage and the active acidity of cheese from three manufacturers are predicted with the same satisfactory accuracy.
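The relation F >> Fcr reported for the storage-day models can be checked against the reported range of R² with a short calculation. The sketch assumes that F is the overall regression F statistic of a least-squares model with an intercept, so that F = (R²/DF1) / ((1 - R²)/DF2); the text does not state this explicitly. The function name is hypothetical, while DF = (3, 396), Fcr = 2.63 and R² = 0.7-0.9 are taken from the text.

```python
def f_statistic(r2, df_model, df_resid):
    """Overall regression F statistic computed from the coefficient of determination."""
    return (r2 / df_model) / ((1.0 - r2) / df_resid)


F_CRITICAL = 2.63  # critical value reported in the text for DF = (3, 396)

for r2 in (0.7, 0.9):  # lower and upper ends of the reported R² range
    f_value = f_statistic(r2, df_model=3, df_resid=396)
    print(f"R2 = {r2}: F = {f_value:.0f}, exceeds Fcr: {f_value > F_CRITICAL}")
```

Under this assumption, F is about 308 at the lower end of the R² range and about 1188 at the upper end, both far above the critical value.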
Predictive models are also suitable for detecting relationships between characteristics of cheese that are difficult to measure and those that are easy to measure [4]. The present work updates and supplements the data on predicting the storage day of cheese. With appropriate regression models built on reduced data from the spectral indices developed here, an accuracy of up to 95% was achieved. This improves on the accuracy reported by Vasilev et al. [16], who indicated a maximum accuracy of 89% in predicting the storage day of cheese. It is also up to 33 percentage points higher than the accuracy of storage-day prediction reported by Vasilev et al. [6] (62%), who used data in the visible region of the spectrum. The results supplement the data of Atanassova et al. [2], who indicated that, using NIR spectral characteristics (900-1700 nm) in selected spectral ranges, the basic technological characteristics of cheese can be determined with an accuracy of 95%. An advantage of the results presented here is that they are obtained with available technical means and reduced data from vectors of spectral indices defined by appropriate methods.

Conclusion
In the present work, software and hardware tools are proposed with which the change in the main characteristics of Bulgarian yellow cheese during storage in conditions not regulated by the manufacturers can be determined with sufficient accuracy, using reduced data from NIR spectral characteristics in the 800-1100 nm range. The research supplements the results available in the literature on forming a complex assessment of the changes in the characteristics of cheese during storage.
Informative wavelengths of the spectral characteristics in the 800-1100 nm range were selected using ABC-XYZ analysis.
Spectral indices calculated as ratios of reflectance at selected spectral wavelengths were defined. The most informative of these indices were selected and used to predict the basic characteristics of cheese during storage. The resulting prediction accuracy was found to be up to 33 percentage points higher than that reported in the available literature. It was also found that the direct use of reflectance values in the spectral range 800-1100 nm is inappropriate, because they do not describe the changes in the main technological characteristics of cheese during storage with sufficient accuracy.
Regression models were created for the automated prediction of the active acidity of cheese during storage; they can be used to predict the change in this characteristic as a function of storage time. A comparative analysis was made of the predictive regression models for the storage day of yellow cheese and for its active acidity. These models can be used both by producers and by consumers, because their use requires only affordable technical means. Further research can address the application of the NIR analysis results in automatic measurement and control systems, as well as in advisory systems for evaluating the quality of cheese at the different stages of its production, transport and storage.