Tuberculosis

As an official AI partner of the Central TB Division (CTD), we are developing multiple interventions across the TB care cascade and helping India’s National TB Elimination Programme (NTEP) become AI-ready.

Automated interpretation of LPA test results

We use AI to interpret results of the LPA test to determine drug resistance to TB. Each LPA strip encodes the drug resistance pattern of the patient via a series of activated (dark) and inactivated (light) bands corresponding to different regions of the genome of the Tuberculosis bacterium. The AI problem consists of identifying this band pattern and then employing a set of rules to determine the specific type of drug resistance.

The challenging aspects of this problem lie in visually isolating each strip on a (possibly crumpled) piece of paper and then identifying the bands and their sequence. This is carried out using both classical computer vision techniques that detect activated bands via edge detection and match the band pattern against reference templates. Most importantly, the models need to work in a resource-constrained environment without impacting clinical outcomes. We are  working to develop end-to-end supervised deep learning techniques to address this multi-task problem, with a human-in-the-loop as an integral part of the AI ecosystem, along with novel data augmentation techniques.

Prediction of risk for Loss to Follow Up (LFU)

Our solution is a risk prediction project. Using a set of patient indicators like age, gender, location, time interval between diagnosis and treatment initiation, etc., collected at treatment initiation time from TB patients, we carry out advance prediction of whether or not the patient will eventually complete treatment. The AI employs an ensemble of models trained on Nikshay data corresponding to treatment outcomes for roughly half a million TB patients across the country. The data is feature-engineered, in the sense that a vast number of categorical indicators are encoded in different ways. New features, for example those that represent patient migratory behavior, are created through application of specific data transformations. These models were rigorously evaluated on a dataset of roughly half a million patients as part of a blind, prospective evaluation process.

The principal AI challenge here lies in early prediction of treatment dropoff (also referred to as Lost to Follow Up or LFU). Treatment dropoff is governed by a number of factors, many of them dynamic in nature, and it is therefore expected that early prediction of dropoff is a difficult problem. Our AI solution, however, yields accuracies that are more than twice as high as those of the best rule-based decision making systems, and fair across a number of important cohorts like gender, public vs. private treatment facilities, and month of treatment initiation. We are also working on interpretability methods to ensure that predictions are explainable in terms of the underlying indicators.

TB Ultrasound (USG)

Our USG solution aims to demonstrate that abnormal features found in chest ultrasound scans can be used as signals to diagnose TB using AI. Building upon previous works, we first show that the chest ultrasound scans of TB patients indeed contain distinctive features that are discernible to the radiologist’s eye. We formulate the AI task to detect these abnormal features in an automated fashion and predict the likelihood of the patient being TB-positive. Our current AI proof-of-concept uses deep learning  to identify abnormal features within individual frames of a USG video scan, followed by frame-level aggregation to make a video- or patient-level prediction for TB.

A challenging aspect of this solution is to ensure that data collection for building the AI model happens in an unbiased way, i.e.,USG scans are collected in the same protocol regardless of whether the patient has TB or not. We accomplish this by recommending  ‘lawn-mower’ style complete-chest scans for all patients, as well as localized scans that are carried out in equal proportion for TB and non-TB subjects.

ML Engineer

ROLES AND RESPONSIBILITIES

An ML Engineer at Wadhwani AI will be responsible for building robust machine learning solutions to problems of societal importance; usually under the guidance of senior ML scientists, and in collaboration with dedicated software engineers. To our partners, a Wadhwani AI solution is generally a decision making tool that requires some piece of data to engage. It will be your responsibility to ensure that the information provided using that piece of data is sound. This not only requires robust learned models, but pipelines over which those models can be built, tweaked, tested, and monitored. The following subsections provide details from the perspective of solution design:

Early stage of proof of concept (PoC)

  • Setup and structure code bases that support an interactive ML experimentation process, as well as quick initial deployments
  • Develop and maintain toolsets and processes for ensuring the reproducibility of results
  • Code reviews with other technical team members at various stages of the PoC
  • Develop, extend, adopt a reliable, colab-like environment for ML

Late PoC

This is early to mid-stage of AI product development

  • Develop ETL pipelines. These can also be shared and/or owned by data engineers
  • Setup and maintain feature stores, databases, and data catalogs. Ensuring data veracity and lineage of on-demand pulls
  • Develop and support model health metrics

Post PoC

Responsibilities during production deployment

  • Develop and support A/B testing. Setup continuous integration and development (CI/CD) processes and pipelines for models
  • Develop and support continuous model monitoring
  • Define and publish service-level agreements (SLAs) for model serving. Such agreements include model latency, throughput, and reliability
  • L1/L2/L3 support for model debugging
  • Develop and support model serving environments
  • Model compression and distillation

We realize this list is broad and extensive. While the ideal candidate has some exposure to each of these topics, we also envision great candidates being experts at some subset. If either of those cases happens to be you, please apply.

DESIRED QUALIFICATIONS

Master’s degree or above in a STEM field. Several years of experience getting their hands dirty applying their craft.

Programming

  • Expert level Python programmer
  • Hands-on experience with Python libraries
    • Popular neural network libraries
    • Popular data science libraries (Pandas, numpy)
  • Knowledge of systems-level programming. Under the hood knowledge of C or C++
  • Experience and knowledge of various tools that fit into the model building pipeline. There are several – you should be able to speak to the pluses and minuses of a variety of tools given some challenge within the ML development pipeline
  • Database concepts; SQL
  • Experience with cloud platforms is a plus
mle

ML Scientist

ROLES AND RESPONSIBILITIES

As an ML Scientist at Wadhwani AI, you will be responsible for building robust machine learning solutions to problems of societal importance, usually under the guidance of senior ML scientists. You will participate in translating a problem in the social sector to a well-defined AI problem, in the development and execution of algorithms and solutions to the problem, in the successful and scaled deployment of the AI solution, and in defining appropriate metrics to evaluate the effectiveness of the deployed solution.

In order to apply machine learning for social good, you will need to understand user challenges and their context, curate and transform data, train and validate models, run simulations, and broadly derive insights from data. In doing so, you will work in cross-functional teams spanning ML modeling, engineering, product, and domain experts. You will also interface with social sector organizations as appropriate.  

REQUIREMENTS

Associate ML scientists will have a strong academic background in a quantitative field (see below) at the Bachelor’s or Master’s level, with project experience in applied machine learning. They will possess demonstrable skills in coding, data mining and analysis, and building and implementing ML or statistical models. Where needed, they will have to learn and adapt to the requirements imposed by real-life, scaled deployments. 

Candidates should have excellent communication skills and a willingness to adapt to the challenges of doing applied work for social good. 

DESIRED QUALIFICATIONS

  • B.Tech./B.E./B.S./M.Tech./M.E./M.S./M.Sc. or equivalent in Computer Science, Electrical Engineering, Statistics, Applied Mathematics, Physics, Economics, or a relevant quantitative field. Work experience beyond the terminal degree will determine the appropriate seniority level.
  • Solid software engineering skills across one or multiple languages including Python, C++, Java.
  • Interest in applying software engineering practices to ML projects.
  • Track record of project work in applied machine learning. Experience in applying AI models to concrete real-world problems is a plus.
  • Strong verbal and written communication skills in English.
mls