Increasing agricultural yield through plant disease detection (Image Classification using ResNet50)

I have always been interested in solving real-world issues and overcoming challenges using data. For my final project at IronHack Data Analytics (Barcelona), I decided to analyze food insecurity trends, understand the production and yield of crops, and finally propose a solution that could help address one of the most important issues of our time.

Where does our food come from?

Let’s start by looking at the production of crops in countries/regions in 2019. [Colour scale: Green (relatively lower production), Yellow (relatively mid-level production), and Red (relatively high production)]

Production of food per country/region in 2019 (in tonnes)

Based on the findings from the map, it is evident that almost every country is contributing to crop production. And as expected, bigger countries in terms of land area are leaders in crop production.

Comparison of Food production per region

This pattern holds when we compare production over a 15-year period. We can also see that production levels have been increasing consistently across countries/regions.

Comparison of Production per crop

The same pattern appears when we look at production by crop type. Production has been increasing over the same period, and the same crops remain the largest by production throughout.

Yield of crops in Countries/Regions in 2019

However, when we look at the yield of crops in countries/regions, measured in hectograms per hectare (hg/ha), the map looks different from the production map.

Comparison of Yield per region

Specifically, we notice that yield is quite high in advanced economies, which makes sense given that these countries tend to have more government support for farmers, better technologies, and more preventive measures. We should also note that yield is considerably lower in some other countries even though their production is high.
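To make the difference between production and yield concrete, here is a minimal Python sketch. It assumes yield is defined as production divided by harvested area, which is the usual convention for figures reported in hectograms per hectare. The function name and the two sets of numbers are hypothetical and chosen only for illustration.

```python
HECTOGRAMS_PER_TONNE = 10_000  # 1 tonne = 1,000 kg = 10,000 hg


def yield_hg_per_ha(production_tonnes: float, area_harvested_ha: float) -> float:
    """Return crop yield in hectograms per hectare (hg/ha)."""
    if area_harvested_ha <= 0:
        raise ValueError("area_harvested_ha must be positive")
    return production_tonnes * HECTOGRAMS_PER_TONNE / area_harvested_ha


# Hypothetical figures, for illustration only (not taken from the article's data).
large_producer = yield_hg_per_ha(production_tonnes=100_000_000,
                                 area_harvested_ha=40_000_000)
small_producer = yield_hg_per_ha(production_tonnes=10_000_000,
                                 area_harvested_ha=2_000_000)

print(f"Large producer yield: {large_producer:,.0f} hg/ha")
print(f"Small producer yield: {small_producer:,.0f} hg/ha")
```

In this made-up example, the first region produces ten times more food but gets only half as much out of each hectare, which is the kind of gap the yield map reveals and the production map hides.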

What are the impacts on humans due to these lower yields?

One of the key things we can analyze is the trend in food insecurity around the world. Even though food production has been increasing over time and more land is being used to grow food, lower yields mean we are still unable to meet food demand.

Number of Undernourished People (in millions)

For instance, the number of undernourished people (in millions) is projected to keep growing until 2030. Even though the number of undernourished people is decreasing in Asia and developed nations, the number is projected to keep increasing in Latin America and Africa. It’s projected that the highest number of undernourished people in the world in 2030 will be from Africa.

Projected hunger in the world due to effects of COVID-19

This trend is exacerbated by the effects of COVID-19. In each of the three scenarios projected by the FAO, world hunger continues to grow and ends up higher than the pre-COVID-19 projections.

On top of this, lower yield also has a huge impact on farmers in developing economies. In some countries, farmer suicides are on the rise, and studies attribute them mainly to two causes: debt and crop failure. In most cases, low agricultural yield leaves farmers unable to pay back their loans, forcing them to accumulate even more debt.

In summary, these impacts of lower yield in crop production have real effects on human lives and our future generations.

Causes of lower yield:

Now that we have established the toll that lower yield takes on human lives, it is important to investigate its causes.

Broadly, these causes can be classified into two major categories:

  1. Abiotic Factors: Drought/Water scarcity, War and Violence, Lack of government support, Climate change/rise in global temperatures, etc.
  2. Biotic Factors: Pests, Viruses, Pathogens, and Weeds.

In the case of my project, I chose to investigate biotic factors and in particular the actual percentage loss in yield due to diseases caused by pathogens and viruses.

Comparison of Yield and Actual % Loss Due to Biotic Factors

The main reason for my choice is evident from the chart above. When we compare the yield of two crops (maize and wheat, each measured over two time periods) with the actual percentage loss in yield, we notice that yield has increased and that the loss due to pests and weeds has been decreasing. This makes sense, as wider adoption of insecticides, fungicides, and weed control has reduced the losses in these crops. However, the loss in yield due to plant diseases caused by pathogens and viruses continues to increase.

Commonalities in plant diseases due to pathogens:

From the images below, we can see that images of leaves are a promising input, as the patterns and spots on a leaf can help us identify which disease the plant is suffering from. For instance, Apple scab shows up as black spots and Corn common rust as rusty dots. In the case of Tomato blights, we can see wilting of the leaves and dark brown/black clusters on them.

Since we can identify these diseases by eye, we can also feed the images to a deep learning model and have it predict the type of disease. In this case, more than 52,000 images were used, spread across 38 classes (covering both healthy and unhealthy leaves).

In the first two iterations, the model learned from grayscale images; in the last two, color images were used. The first three iterations used Sequential(), and the last one used ResNet50().
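The article does not show the model code, so the following is only a rough sketch of what a ResNet50-based classifier for 38 leaf classes could look like. It assumes that Sequential() and ResNet50() refer to the Keras API shipped with TensorFlow. The input size, the ImageNet starting weights, the frozen backbone, the optimizer, and the function name are all assumptions made for illustration, not details taken from the project.

```python
import tensorflow as tf

NUM_CLASSES = 38             # number of leaf classes mentioned in the article
IMAGE_SHAPE = (224, 224, 3)  # assumed input size; 3 channels for color images


def build_resnet50_classifier(weights="imagenet"):
    """Build a ResNet50 backbone with a new 38-way softmax head (a sketch)."""
    base = tf.keras.applications.ResNet50(
        include_top=False,   # drop the original 1000-class ImageNet head
        weights=weights,     # pass None to start from random weights instead
        input_shape=IMAGE_SHAPE,
        pooling="avg",       # global average pooling: one feature vector per image
    )
    base.trainable = False   # assumption: freeze the backbone, train only the head

    inputs = tf.keras.Input(shape=IMAGE_SHAPE)
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels 0..37
        metrics=["accuracy"],
    )
    return model
```

Training would then be a call along the lines of `model.fit(train_ds, validation_data=val_ds, epochs=25)`, where `train_ds` and `val_ds` are hypothetical datasets of (image, label) batches. When starting from ImageNet weights, images are normally passed through `tf.keras.applications.resnet50.preprocess_input` first.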

The model improvement across iterations is summarized as follows:

Note: These are just the best ones I had from each iteration. There were a few iterations that did not have good predictive ability.

Still, over these iterations the model improved significantly, and the ResNet50() iteration in particular achieved an accuracy of about 94% after 25 epochs.
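For readers unfamiliar with the metric, accuracy here is the share of images whose predicted class matches the true class. A minimal sketch, with a hypothetical helper name and made-up labels:

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up labels for four leaf images, for illustration only.
true_labels = ["Apple scab", "Corn common rust", "Tomato blight", "Healthy"]
predicted_labels = ["Apple scab", "Corn common rust", "Healthy", "Healthy"]

print(f"Accuracy: {accuracy(true_labels, predicted_labels):.0%}")
```

An accuracy of about 94% therefore means that roughly 94 out of every 100 evaluated leaf images were assigned the correct class.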

