Creating a digital platform with a deep neural network for detecting plant diseases using information technology

This article is devoted to the detection of plant diseases using a digital platform built around a deep neural network. The goal of the work is to create a publicly available platform for detecting plant diseases, based on a deep neural network model trained on 45 classes across 15 crops (apple, corn, blueberry, rice, cherry, grape, peach, orange, bell pepper, potato, raspberry, soybean, strawberry, tomato and tea). Digital image processing is proposed as the means of detecting diseases. Studies of many plant species have shown that this method has high potential for assessing the yield and quality of plants and is superior to traditional methods. Using the ready-made Plant Disease Expert image dataset from Kaggle, an EfficientNetB3 model was built that reached an average accuracy of 98.1% in identifying plant diseases. The article is supplied with figures and tables, as well as a detailed description of each stage of the study.


Introduction
Plant diseases pose a serious threat to the economy and food security worldwide [7], and deep learning with neural networks can help solve this problem. Studies of transfer learning methods that used databases such as PlantVillage (an open database which at that time contained 54,306 images of 14 crop species) to diagnose plant diseases [2,3,5,8] reported detection accuracy above 96%. Despite the large number of studies, there are currently no information platforms where a user can obtain a prediction of a plant's current state from an image. The only analogue capable of recognizing plant diseases is the Plantix website, but testing on a set of 80 images showed that its detection accuracy is only slightly above 16%.
In a separate study on corn disease detection, accuracy exceeded 99% on PlantVillage images but was below 55% on a test sample collected from the Internet. This is due to the synthetic nature of PlantVillage images, which share the same lighting, background and leaf orientation. For further research, an alternative to PlantVillage was found in the Plant Disease Expert image set [12], hosted by the Kaggle data science community. To detect and prevent crop diseases more effectively, it is necessary not only to create a good model but also to provide all the conditions needed for its use [13]. Therefore, this work proposes the development of a dedicated platform, HealthsyPlant, which will use modern service-organization and deep learning technologies to provide a new level of service to the farming community.

Materials and methods
In most areas, the problem of image classification is solved by deep neural networks trained on large amounts of data and then fine-tuned to a particular dataset. A study of publicly available transfer learning models showed that the ResNet50 architecture [6] reaches a classification accuracy of 98.4% on the PlantVillage test dataset, but only an unacceptable 46% on a self-collected dataset. Further analysis of this discrepancy revealed that the results are related to the type of images used: PlantVillage photographs are collected and processed under specially controlled conditions, which makes them more synthetic and different from real images.
When searching for an alternative image base, the Plant Disease Expert dataset [12] was found. It has the necessary imperfections: varying illumination, rotation, framing and poor focus, that is, everything a neural network has to cope with under real "field" conditions. Another advantage of this dataset is its size: 199,672 images, on average about 13,000 per crop and about 3,000 per class (see Figure 1). Note that crops do not have an equal number of disease classes; the number varies from 1 to 8.
A convolutional network can be scaled in three ways. Width scaling increases the number of channels (or neurons in a layer), which lets layers learn finer-grained features but can make it harder to learn complex ones. Depth scaling increases the number of layers, allowing the network to learn more complex features; however, vanishing gradients make deep networks difficult to train, and although batch normalization and skip connections ease this problem, the accuracy gain diminishes quickly as depth grows. Resolution scaling increases the number of pixels in the input image, allowing the network to find smaller structures through additional image detail; like the other kinds of scaling, however, it provides limited accuracy gains on its own.
Convolution operations are the most computationally expensive part of a CNN. The number of floating-point operations (FLOPs) of a convolution is approximately proportional to the depth $d$, the square of the width $w$ and the square of the resolution $r$, that is, FLOPs $\propto d \cdot w^2 \cdot r^2$: doubling the depth doubles the number of FLOPs, while doubling the width or the resolution quadruples it. Based on these observations, a simple scaling technique was proposed that uses a single compound coefficient $\phi$ describing the amount of available resources. With depth, width and resolution multipliers $d = \alpha^{\phi}$, $w = \beta^{\phi}$ and $r = \gamma^{\phi}$, scaling the network increases the number of FLOPs by a factor of $d \cdot w^2 \cdot r^2 = (\alpha \cdot \beta^2 \cdot \gamma^2)^{\phi}$. The restriction $\alpha \cdot \beta^2 \cdot \gamma^2 \approx 2$ was introduced to ensure that the number of FLOPs does not increase by more than about $2^{\phi}$.
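As a rough check of the proportionality described above, the sketch below counts multiply-accumulate operations (used here as a proxy for FLOPs) for a toy stack of identical convolution layers. The layer shapes (10 layers, 224-pixel input, 32 channels, 3x3 kernels) and the function names are assumptions chosen for illustration, not values from this study.

```python
def conv_macs(height, width, in_channels, out_channels, kernel_size=3):
    """Approximate multiply-accumulate count of one stride-1, same-padded conv layer."""
    return height * width * in_channels * out_channels * kernel_size * kernel_size


def network_macs(num_layers, resolution, channels, kernel_size=3):
    """Cost of a toy network made of num_layers identical square conv layers."""
    per_layer = conv_macs(resolution, resolution, channels, channels, kernel_size)
    return num_layers * per_layer


# Hypothetical baseline network (illustrative shapes only).
base = network_macs(num_layers=10, resolution=224, channels=32)

# Double one dimension at a time and compare with the baseline.
deeper = network_macs(num_layers=20, resolution=224, channels=32)
wider = network_macs(num_layers=10, resolution=224, channels=64)
sharper = network_macs(num_layers=10, resolution=448, channels=32)

print("depth x2      ->", deeper / base)   # 2.0
print("width x2      ->", wider / base)    # 4.0
print("resolution x2 ->", sharper / base)  # 4.0
```

Width enters quadratically because both the input and the output channel counts of each layer grow, and resolution enters quadratically because the feature map grows in both spatial directions.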
EfficientNet [4] is a family of neural networks developed by researchers at Google Brain; its models are built by compound scaling and optimized for both computational efficiency and accuracy. The efficiency and accuracy of the architecture are achieved through a combination of optimizations, such as scaling the width and depth of the network and searching for the best hyperparameters using grid search and AutoML methods, among others [14].
The researchers took the baseline EfficientNetB0 network and ran a grid search for the optimal scaling parameters with the compound coefficient $\phi$ fixed at one. The optimal parameters found were $\alpha = 1.2$, $\beta = 1.1$ and $\gamma = 1.15$. Then, keeping $\alpha$, $\beta$ and $\gamma$ fixed, the researchers increased $\phi$, the amount of available resources, to create the larger models EfficientNetB1 to EfficientNetB7. The results of the studies showed that EfficientNetB3, EfficientNetB5 and EfficientNetB7 (see Figure 3) reach roughly equal accuracy, the best among all models of the EfficientNet family; however, the latter two require significantly more time and resources.
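The arithmetic behind these coefficients can be sketched as follows. The function name and the printed range of phi values are illustrative assumptions; the mapping of alpha to depth, beta to width and gamma to resolution follows the EfficientNet formulation, and the multipliers of the released B1 to B7 models may be rounded differently from these raw powers.

```python
# Scaling bases reported in the text: alpha (depth), beta (width), gamma (resolution).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scaling(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Return (depth, width, resolution, FLOPs) multipliers for a compound coefficient phi."""
    depth_mult = alpha ** phi
    width_mult = beta ** phi
    resolution_mult = gamma ** phi
    # FLOPs grow linearly with depth and quadratically with width and resolution.
    flops_mult = (alpha * beta ** 2 * gamma ** 2) ** phi
    return depth_mult, width_mult, resolution_mult, flops_mult


for phi in range(0, 4):
    d, w, r, f = compound_scaling(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{f:.2f} (bound 2^phi = {2 ** phi})")
```

With these values, alpha * beta^2 * gamma^2 is about 1.92, so each unit increase of phi keeps the FLOPs growth just under a factor of two.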
Due to limited computing resources and the desire to obtain a good result in a short time, this work gave preference to the EfficientNetB3 architecture. Other architectures with similar training times were considered (mainly architectures of the ResNet family, one of which was mentioned earlier). Among them, EfficientNetB3 showed the best result (see Figure 4).
The HealthsyPlant platform (see Figure 5) is a set of related services and tools available to users through a public web service, while the TensorFlow model runs on a virtual server or GPU cluster. Users can upload photos of diseased plants through the website interface to identify the cause of the disease and view its description. They can also check whether the disease has been recognized correctly and learn how to treat it. Experts can review user requests and verify that recognition is correct, and can request that their own images or user images be added to the image database. They may also request a change to a disease description or initiate retraining of the model on new images. Data analysts can add new images to the database, initiate model retraining and obtain various statistics about platform users.
A separate study was conducted to improve the classification accuracy on test images. It compared different estimators: logistic regression, a support vector machine with a cosine-similarity kernel, a decision tree, a random forest, gradient boosting and a simple single-layer perceptron with one input layer and one output layer ending in a softmax activation. As a result, the single-layer perceptron trained with the Adam optimizer [1] showed the best classification accuracy on test images, reaching 98.1%. After training and testing a model based on ResNet50, it was not possible to obtain an accuracy above 65%.
Due to the lack of non-synthetic image datasets, an important decision was not just to train a deep neural network for plant disease recognition, but to deploy a public platform with an integrated neural network model and databases. This makes it possible to collect and accumulate a database of user-uploaded images for research, which will be used to further train the model and develop the project.
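To make the best-performing estimator from the comparison in this section concrete, here is a minimal pure-Python sketch of a single-layer perceptron with a softmax output trained by Adam. It assumes the estimator operates on feature vectors produced by the convolutional backbone, which the text does not state explicitly; the 2-D toy features, labels, learning rate and epoch count are hypothetical stand-ins, not the data or settings of this study.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """Class probabilities; each weight row ends with a bias term."""
    xb = list(x) + [1.0]
    return softmax([sum(w * xi for w, xi in zip(row, xb)) for row in weights])


def predict(weights, x):
    probs = predict_proba(weights, x)
    return max(range(len(probs)), key=probs.__getitem__)


def train_softmax_adam(features, labels, num_classes, epochs=100, lr=0.05,
                       beta1=0.9, beta2=0.999, eps=1e-8):
    """Full-batch cross-entropy training of a softmax layer with Adam."""
    size = len(features[0]) + 1  # inputs plus bias
    weights = [[0.0] * size for _ in range(num_classes)]
    m = [[0.0] * size for _ in range(num_classes)]  # first-moment estimates
    v = [[0.0] * size for _ in range(num_classes)]  # second-moment estimates
    n = len(features)
    for step in range(1, epochs + 1):
        grads = [[0.0] * size for _ in range(num_classes)]
        for x, y in zip(features, labels):
            xb = list(x) + [1.0]
            probs = predict_proba(weights, x)
            for c in range(num_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                for j in range(size):
                    grads[c][j] += err * xb[j] / n
        for c in range(num_classes):
            for j in range(size):
                g = grads[c][j]
                m[c][j] = beta1 * m[c][j] + (1 - beta1) * g
                v[c][j] = beta2 * v[c][j] + (1 - beta2) * g * g
                m_hat = m[c][j] / (1 - beta1 ** step)  # bias correction
                v_hat = v[c][j] / (1 - beta2 ** step)
                weights[c][j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return weights


# Hypothetical, well-separated 2-D "feature vectors" for three classes.
FEATURES = [(2.0, 0.1), (2.2, -0.1), (1.8, 0.0),
            (-1.0, 1.7), (-1.2, 1.9), (-0.8, 1.6),
            (-1.0, -1.7), (-0.9, -1.9), (-1.1, -1.6)]
LABELS = [0, 0, 0, 1, 1, 1, 2, 2, 2]

weights = train_softmax_adam(FEATURES, LABELS, num_classes=3)
accuracy = sum(predict(weights, x) == y for x, y in zip(FEATURES, LABELS)) / len(LABELS)
print("training accuracy on the toy data:", accuracy)
```

In a real pipeline the same layer would normally be expressed in a deep learning framework and evaluated on held-out images; the hand-written update loop is kept only to show what Adam does at each step.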
HealthsyPlant users can run recognition tasks through the web portal, but a mobile application would be the most convenient way. Therefore, to make the platform more accessible and convenient, it is planned to develop an application for the iOS and Android mobile operating systems, which will make it easier for users to send photos of diseased plants for analysis. In addition, the creation of an NLP (Natural Language Processing) model will make it possible to identify plant diseases from a text description.
The developed platform was compared with several commercial AutoML platforms: Google Cloud Vision [9], Microsoft Custom Vision [11] and IBM Watson Visual Recognition [10]. The test set consisted of 30 images that had been used to train the model, 30 images that had not been used for training, and 20 images of crop diseases taken in the field. The comparison showed that the new model has a detection level similar to that of models created on the commercial platforms (see Table 1).
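Since the comparison is reported as numbers of correctly recognized images, the conversion to accuracy can be sketched as below. The subset sizes (30, 30 and 20 images) come from the text; the subset names, the function name and, above all, the example counts are hypothetical placeholders and are not results from this study.

```python
# Subset sizes taken from the text: 30 training images, 30 unseen images, 20 field images.
SUBSET_SIZES = {"seen_in_training": 30, "unseen": 30, "field": 20}


def detection_accuracy(correct_counts, subset_sizes=SUBSET_SIZES):
    """Return (per-subset accuracy, overall accuracy) from counts of correct predictions."""
    if set(correct_counts) != set(subset_sizes):
        raise ValueError("correct_counts must cover exactly the known subsets")
    per_subset = {}
    for name, size in subset_sizes.items():
        correct = correct_counts[name]
        if not 0 <= correct <= size:
            raise ValueError(f"{name}: {correct} correct out of {size} is impossible")
        per_subset[name] = correct / size
    overall = sum(correct_counts.values()) / sum(subset_sizes.values())
    return per_subset, overall


# HYPOTHETICAL counts for illustration only; NOT results from the paper.
example_counts = {"seen_in_training": 30, "unseen": 27, "field": 15}
per_subset, overall = detection_accuracy(example_counts)

for name, acc in per_subset.items():
    print(f"{name}: {acc:.1%}")
print(f"overall: {overall:.1%}")
```

Reporting the three subsets separately matters here because images seen during training tend to inflate the overall figure, while the field images are closest to real use.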

Discussion
As mentioned earlier, the only alternative to our solution is the Plantix platform, whose result on our test set of 70 images is unsatisfactory: 46%. However, it is worth taking into account that this platform is associated with a large agricultural community, which allows its developers to obtain a large number of images to improve the quality of their models.
The developed HealthsyPlant platform (see Figure 5), intended to facilitate the detection and prevention of agricultural plant diseases, is ready for use as a web portal with a database of 15 crops and 45 classes, 199,672 images in total. It is based on the EfficientNetB3 convolutional neural network architecture (see Figure 3), which made it possible to achieve high accuracy in a short period of time and with low computing power.
The comparative analysis (see Table 1) confirmed that the model of the developed platform detects diseases well, even better than some well-known AutoML products. Further development will expand the image database and improve the web portal by adding disease identification from a text description and a mobile application.

Conclusion
In conclusion, deep learning for the detection of plant diseases has high potential and can be actively introduced into various areas of activity. The implementation of such projects will make it possible to find solutions to problems associated with plant diseases quickly and efficiently, and thereby help protect the environment in the regions where they are implemented.

Fig. 1. Images of disease classes from the HealthsyPlant database.
Convolutional neural networks (CNNs) have three dimensions: width, depth, and resolution. Depth is determined by the number of layers, width by the number of channels (for example, three channels for RGB), and resolution by the number of pixels in the image. Scaling each of these dimensions can improve the accuracy of a CNN (see Figure 2).

Fig. 4. Comparison of convolutional neural network architectures at similar training times.

Table 1. Comparison of detection accuracy, by number of correctly recognized images, of HealthsyPlant and AutoML models.