Small space, big solution

Farmers need an answer, but limited phone memory is the biggest challenge in implementing an AI-based solution.
In India, there are 1 billion handsets, but only 450 million of them are smartphones.

We work with cotton farmers. Not only is cotton one of the biggest cash crops in the world, it is also plagued with its own risks. Over the past few years, pests have ravaged farms, at times destroying over 50% of the total yield. Farmers are at a loss. Traditional methods have stopped being as effective. AI could be an answer. But there are challenges.

First, let’s take a quick look at the headline numbers around India. 

  • India is a land where 1.4 billion people coexist. 
  • 1 billion handsets
  • But just 450 million smartphones. 
  • This doesn’t mean that 450 million people have smartphones; it means many people own multiple handsets. 

Among smartphones, Apple holds just 2.6% of India’s smartphone market share. What does that tell us? There are only a handful of premium phone users in India, and these users aren’t in small-town India. 

Almost 50% of phones cost less than Rs 10,000 (~$130). Let’s take a look at our user. Our typical user is male, around 35 years old. He uses WhatsApp to forward messages and YouTube to watch videos when he can. That is the extent of his interaction with the internet; he is not digitally literate. So, there are three major challenges:

  1. Limited computation capacity
  2. Data connectivity
  3. Digitally semi-literate user

Our solution uses a simple smartphone and pest traps. The biggest problem facing cotton farmers across the world is knowing how much pesticide is needed to ward off attacks, and whether it is already too late to spray in the first place. Our solution asks the farmer to take a picture of a pest trap and receive actionable advice in return. 

What happens behind the scenes? Using features generated from a convolutional neural network, our system first determines whether the image is valid. If it is, those features are then used to detect and count the number of pests that are present. Using this count and with the help of guidelines set by entomologists, the system recommends an action to the farmer—to spray or not to spray pesticide. We train our models on a data set of around 4,000 images we collected in collaboration with smallholder farmers all over the country.
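The decision step at the end of this pipeline is simple enough to sketch. The threshold value below is a hypothetical placeholder; the real cutoffs come from the entomologists' guidelines, which the post does not reproduce.

```python
# Sketch of the final recommendation step. The economic threshold here is a
# made-up placeholder, not the actual entomologist guideline.
def recommend_action(pest_count: int, economic_threshold: int = 8) -> str:
    """Map a pest-trap count to an action for the farmer."""
    if pest_count >= economic_threshold:
        return "spray"
    return "do not spray"

print(recommend_action(3))   # count below threshold
print(recommend_action(12))  # count at or above threshold
```

The validation head runs first so that a blurry or off-target photo is rejected before a count, and therefore a recommendation, is ever produced.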

But our model size was almost 250MB, and since the initial deployment was web-based, it required a steady internet connection on the user’s side. During our field experiments, we realized that network connectivity became a major bottleneck to a great user experience. One way to address this problem is by doing all the computation offline. However, we cannot simply store a 250MB model on even high-end phones. There needed to be a solution. If you need context for the problem we faced, this exchange from Apollo 13 should set the mood.

Now that you have context, let’s get to some brass tacks. 

Begin the begin

Compute resources are largely constrained by two factors. The first is mobile network coverage. Many smallholder farms in India are located in areas that are poorly connected. And these conditions hamper the user experience. The second constraint comes from integrating products into partner interfaces. Solutions need to be mindful of size and quality of service guarantees. Evidence of these challenges was seen during our data collection effort. 

Our solution was placed on 17 phones that belonged to a mix of extension workers and farmers. The experiment ran for 60 days across 25 farms, providing 89 recommendations based on pink bollworm catch. The average per image upload time was approximately 38.5 seconds (95% CI [33.6, 43.3]), which represented 48% of task completion time. For users, this was especially problematic in cases where they had network connectivity in a location different from where they took the photo. In these conditions, the users had to travel to an area with connectivity to complete the task. This inconvenience was exacerbated in cases where a retake was required: a farmer would take a photo in the field, walk to an area with connectivity to perform the upload, then be forced to walk back into the field for a retake upon learning that the image was invalid. In addition to the log analysis, qualitative follow-ups showed that these upload times would be a barrier to adoption.


We first established a target model size by examining the size of our partner’s app and understanding the general sizes of other agri-tech apps. General sizes were found to be between 15 and 20 MB. From there, we decided a model size of 5MB would be a good target. Getting to this size was essentially an exercise in model compression.
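The back-of-the-envelope arithmetic behind that target is worth making explicit: a parameter stored in float32 costs 4 bytes and one in half precision costs 2, so the budgets work out roughly as follows (ignoring file-format overhead).

```python
MB = 1024 * 1024
orig_bytes, target_bytes = 250 * MB, 5 * MB

# Rough parameter budgets, ignoring container and metadata overhead.
params_fp32 = orig_bytes // 4    # original model stored in float32
params_fp16 = target_bytes // 2  # target model stored in half precision

print(f"original: ~{params_fp32 / 1e6:.1f}M parameters")
print(f"target:   ~{params_fp16 / 1e6:.1f}M parameters")
print(f"compression factor: {orig_bytes / target_bytes:.0f}x")
```

A 50x reduction in size is far more than architecture swaps alone typically buy, which foreshadows why the steps below ended where they did.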

Stumble 1

The easy thing to do would be to tell you what we finally did. We will, but in a bit. But every wrong solution is a path to the right one. So let’s start with how each step led us to our ultimate goal. 

As mentioned earlier, our multi-task learning model consists of a shared base network that computes features from the given image; those features, in turn, feed into the detection and validation heads. It is important to note that the base network takes up almost 80% of the compute time in SSD, the single-stage object detection model from which our multi-task network was eventually developed. So, as a first step, we tried to use a smaller base network. MobileNets have been shown to effectively reduce model size by using depthwise separable convolutions instead of regular convolutions throughout the network, while still maintaining reasonable accuracy on classification tasks. We tried using both MobileNet v1 and v2 in place of our base network, which was a VGGNet. Even though the accuracy drop was not huge, we realized that we were still very far from our goal of a 5MB model, and the accuracy drops that would accumulate on the way there made this approach infeasible. 
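The savings MobileNets offer come from factoring each standard convolution into a per-channel (depthwise) step followed by a 1x1 (pointwise) step. The parameter arithmetic is easy to verify; the layer dimensions below are illustrative, not taken from our network.

```python
def standard_conv_params(k, c_in, c_out):
    # A k x k kernel spans all input channels, once per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # One k x k filter per input channel, then a 1x1 pointwise convolution.
    return k * k * c_in + c_in * c_out

# An illustrative mid-network layer: 3x3 kernels, 256 -> 256 channels.
std = standard_conv_params(3, 256, 256)        # 589,824 parameters
sep = depthwise_separable_params(3, 256, 256)  # 67,840 parameters
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

Roughly an 8-9x reduction per layer: substantial, but an order of magnitude short of the ~50x we needed overall.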

The answer

We adopted a version of filter-level pruning known as iterative pruning, specifically using the technique mentioned here. The idea is to iteratively prune a fixed number of filters followed by training the pruned model until we achieve our desired memory size.

The figure above shows how one filter gets pruned as described in this paper. Let’s look at two consecutive layers, L and L+1. The number of input channels for layer L is K and the number of output channels is N. Since the output of layer L serves as the input for layer L+1, the number of input channels for layer L+1 is also N, while the number of output channels here is M. Pruning the filter at index i from layer L would reduce the number of output channels of layer L, and hence, the number of input channels in layer L+1, by one. Thus, for pruning the chosen filter, we also need to update all the layers which branch out from layer L.
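The bookkeeping described above can be sketched framework-agnostically. Assuming weights stored numpy-style in (out_channels, in_channels, kH, kW) layout, pruning filter i means deleting along axis 0 of layer L and axis 1 of layer L+1:

```python
import numpy as np

# Toy weight tensors in (out_channels, in_channels, kH, kW) layout.
K, N, M, k = 16, 32, 64, 3
layer_L  = np.random.randn(N, K, k, k)  # layer L:   K inputs -> N outputs
layer_L1 = np.random.randn(M, N, k, k)  # layer L+1: N inputs -> M outputs

i = 7  # index of the filter chosen for pruning
layer_L  = np.delete(layer_L,  i, axis=0)  # drop output channel i of layer L
layer_L1 = np.delete(layer_L1, i, axis=1)  # drop the matching input channel

print(layer_L.shape)   # (31, 16, 3, 3)
print(layer_L1.shape)  # (64, 31, 3, 3)
```

If layer L feeds more than one consumer, the same axis-1 deletion has to be applied to every branch, which is exactly the complication discussed next.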

There are always speed bumps

We adapted this technique for our model, which consists of both a detection head and a classification head. This had its own challenges. If you look at the figure above, there are several branches that snake through the network. So if, say, a layer in the base network is pruned, the corresponding branches in the box-predicting layers and the values associated with them also need to be pruned. Furthermore, if the layer being pruned is the last layer of the shared base network, which is not only a box-predicting layer but also the input to both the detection and image-validation branches, the corresponding values in the first layer of both heads have to be pruned accordingly. 

It is a time-consuming task. We pruned 1,024 filters in each iteration, followed by 30 epochs of training until ~80% of the total number of filters in the original model were pruned. The weights of the final pruned model were saved in half-precision to further reduce the memory footprint on disk.
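A toy version of one such pruning iteration is sketched below. We use L1-norm ranking as the filter-selection criterion, a common choice for filter-level pruning; treat that criterion, and all the dimensions, as assumptions for illustration. The fine-tuning step is elided as a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((512, 64, 3, 3)).astype(np.float32)

def prune_lowest_l1(w, n_prune):
    """Drop the n_prune filters with the smallest L1 norm (filter-level pruning)."""
    norms = np.abs(w).sum(axis=(1, 2, 3))  # one L1 norm per output filter
    keep = np.argsort(norms)[n_prune:]     # indices of the filters to keep
    return w[np.sort(keep)]                # preserve original filter order

for _ in range(3):  # the real run used ~15 iterations of 1,024 filters each
    weights = prune_lowest_l1(weights, 64)
    # ... fine-tune the pruned model for 30 epochs here ...

print(weights.shape[0])  # 512 - 3*64 = 320 filters remain

# Saving in half precision halves the on-disk footprint.
print(weights.astype(np.float16).nbytes / weights.nbytes)  # 0.5
```

The alternation matters: fine-tuning after each round lets the surviving filters absorb the work of the pruned ones before the next cut.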

The figure above details the outcome of this effort. Iteration zero represents the original, unmodified model we began with; its size after quantization was approximately 132MB. Each subsequent iteration comes after a round of pruning. The required model size of 5MB was reached after 15 iterations, with the mean absolute error between the predicted and true counts increasing only from 0.91 to 1.02. In addition to the reduction in size, the compressed model was also less compute-intensive, as seen from the drop in multiply-accumulate operations (MACs).
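A rough intuition for why MACs fall so sharply: when a pruned layer loses output channels, the next layer loses the matching input channels, so a layer's MAC count shrinks with the product of the two. The layer dimensions below are made up for illustration.

```python
def conv_macs(h_out, w_out, k, c_in, c_out):
    # Multiply-accumulate operations for one conv layer (stride 1, same padding).
    return h_out * w_out * k * k * c_in * c_out

full   = conv_macs(38, 38, 3, 512, 512)
pruned = conv_macs(38, 38, 3, 102, 102)  # ~80% of channels gone on both sides
print(f"MAC reduction for this layer: {full / pruned:.0f}x")
```

Pruning 80% of filters thus cuts a layer's compute by roughly a factor of 25 where both its input and output widths shrink, which is why the compressed model is faster as well as smaller.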

Lessons we learned

Our typical user is probably more sensitive to small changes in experience than the average smartphone user elsewhere in the world. Solutions need to be not only easy to use but also lightweight and able to work offline. This was a core principle, and it drove our approach of doing as much computation offline as possible. Offline inference also allows better management of data privacy as the solution scales, because there is finer control over what user data leaves the phone. We achieved this by compressing our model to the smallest size possible with minimal loss in performance. It is important to quantify this need early on, to set the right expectations with stakeholders in terms of latency, model size, and accuracy. 

We wrote a paper on our entire journey, our learnings, and the technology. It was accepted at KDD 2020 in the Applied Data Science track. You can read it here. There is a video as well.

Rameshwar Bhaskaran and Siddharth Bhatia were interns assigned to this project.

  • Wadhwani AI

    We are an independent, nonprofit institute developing AI-based solutions in healthcare and agriculture to bring about sustainable social impact at scale.
