From 2D videos to 3D babies: Introduction to the anthropometry journey 

We document how a simple 2D video can help estimate the weight of a baby.
An AI-based anthropometry tool can help ASHAs measure babies accurately.

A UNICEF report estimates that 50% of low birth weight babies fall through the cracks and are never identified. This amounts to about 10 million babies per year. Low birth weight (LBW) babies are prone to several developmental disorders as they grow up. But if these babies are identified in time, their quality of life can be improved through timely care.

At Wadhwani Institute for Artificial Intelligence (Wadhwani AI), we believe in empowering the ASHA (Accredited Social Health Activist) workers who care for these babies and their mothers. But first, the ASHAs need to find the babies and weigh them accurately. Our answer is an AI-based anthropometry tool, which uses state-of-the-art reconstruction methods to recover the 3D pose and shape of a baby. This helps the healthcare worker measure medically important criteria such as weight, height, chest circumference, head circumference, and arm circumference to identify at-risk neonates.

Our approach recovers a 3D mesh of the baby (parameterized by pose and shape) from a monocular RGB video. We use a custom deformable neonate model, built from 3D scans we collected, to compress the representation of the mesh to 92 dimensions. Once we recover the 3D mesh, calculating measurements is straightforward: given that babies have roughly constant density, weight can be calculated from the estimated volume. To train our reconstruction algorithms, we use a combination of 2D and 3D, synthetic and real data, together with the deformable neonate model that enables the compact representation of the 3D mesh.
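Once a watertight mesh is available, the volume-to-weight step can be sketched in a few lines. This is an illustrative sketch, not our production code: the function names and the density constant are assumptions, and the density of a neonate is simply approximated as that of water.

```python
import numpy as np

def mesh_volume(vertices, faces):
    # Volume of a closed triangle mesh via the divergence theorem:
    # sum the signed volumes of tetrahedra formed by each face and the origin.
    v0 = vertices[faces[:, 0]]
    v1 = vertices[faces[:, 1]]
    v2 = vertices[faces[:, 2]]
    return abs(np.einsum('ij,ij->i', v0, np.cross(v1, v2)).sum()) / 6.0

def weight_from_mesh(vertices, faces, density=1000.0):
    # density in kg/m^3; ~1000 (close to water) is an assumed placeholder,
    # not a value taken from the article.
    return density * mesh_volume(vertices, faces)

# Sanity check on a unit cube (volume 1 m^3): 8 vertices, 12 triangles
# with consistent outward winding.
cube_v = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],
                   [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]], dtype=float)
cube_f = np.array([[0, 2, 1], [0, 3, 2], [4, 5, 6], [4, 6, 7],
                   [0, 1, 5], [0, 5, 4], [3, 7, 6], [3, 6, 2],
                   [0, 4, 7], [0, 7, 3], [1, 2, 6], [1, 6, 5]])
volume = mesh_volume(cube_v, cube_f)
```

Because the neonate model compresses the mesh to 92 pose and shape parameters, the vertices above would in practice be generated from those parameters rather than predicted directly.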

How do we do this?

The first question we tried to answer: can we collect large amounts of 3D data? Collecting accurate 3D data is hard and expensive. It typically involves a large, immobile multi-camera system and requires a trained technician to operate. Hence, collecting large datasets with 3D ground truth is not feasible. Instead, we took the route of creating a large 2D dataset and using it as a proxy to help our models learn 3D reconstruction. Recent research on 3D reconstruction of adults has shown that fairly accurate reconstructions are possible using large amounts of 2D data in tandem with small amounts of 3D data.

But even large 2D video datasets pertinent to our problem are difficult to come by and, to our knowledge, no such dataset exists. So we started a large-scale data collection exercise spanning four states across India, with appropriate ethics committee approvals. Our data collectors are a mix of ASHAs, ANMs (Auxiliary Nurse Midwives), and nurses collecting videos from homes, primary healthcare centres, and hospitals respectively, spanning around 50 locations in the country. They undergo appropriate training and periodic retraining to ensure a minimum bar on data quality.

Collecting data 

We collect 2D data in the form of a video using a generic low-cost smartphone accessible to our data collectors. For each baby, we record height, weight, chest circumference, head circumference, and arm circumference: the target variables we wish to estimate during evaluation.

A baby is placed on a flat surface with a reference object next to the baby. We ensure that we capture different profiles of the babies (including extreme side angles) by moving the camera in an arc-like motion above the baby. This entire process is encapsulated in 10-15 second videos. 

We then obtain manual keypoint and segmentation mask annotations (proxies for 3D pose and shape) for the videos. Our researchers train a team of annotators, who annotate frames sampled from each video; annotations for the remaining frames are obtained by interpolation.
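The interpolation step for keypoints can be sketched as below. This is a minimal illustration assuming simple linear interpolation between annotated frames; the function name and the exact scheme used in our pipeline are assumptions.

```python
import numpy as np

def interpolate_keypoints(annotated_frames, annotated_kps, query_frames):
    # annotated_frames: sorted (N,) frame indices with manual annotations
    # annotated_kps:    (N, K, 2) pixel keypoints for those frames
    # query_frames:     (M,) frame indices to fill in
    # Returns (M, K, 2) keypoints, linearly interpolated per coordinate.
    n = len(annotated_frames)
    flat = annotated_kps.reshape(n, -1)  # (N, K*2): one column per coordinate
    cols = [np.interp(query_frames, annotated_frames, flat[:, j])
            for j in range(flat.shape[1])]
    return np.stack(cols, axis=1).reshape(len(query_frames), -1, 2)

# Example: one keypoint annotated at frames 0 and 10, filled in for all 11 frames.
kps = interpolate_keypoints(
    np.array([0, 10]),
    np.array([[[0.0, 0.0]], [[10.0, 20.0]]]),
    np.arange(11))
```

Linear interpolation works here because the camera moves smoothly in a short arc, so keypoints change gradually between sampled frames.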

We use the keypoints and segmentation masks through a re-projection loss: we project the 3D predictions back onto 2D and devise loss functions that compare the ground truth annotations with the re-projected 3D keypoints and segmentation masks. This re-projection loss is what allows us to build our algorithm with limited 3D scans while using the 2D videos as a proxy.
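For the keypoint term, the idea can be sketched with a weak-perspective camera, a common choice in this literature; the actual camera model and loss weighting we use are not specified in this post, so treat the details below as assumptions.

```python
import numpy as np

def reproject(joints_3d, scale, trans):
    # Weak-perspective projection: drop depth, then scale and translate
    # into image coordinates. (Assumed camera model, for illustration.)
    return scale * joints_3d[:, :2] + trans

def reprojection_loss(joints_3d, joints_2d_gt, scale, trans):
    # Mean squared distance between re-projected 3D keypoints and the
    # annotated 2D keypoints.
    diff = reproject(joints_3d, scale, trans) - joints_2d_gt
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

# Example: a single predicted 3D joint and its 2D annotation.
loss = reprojection_loss(
    np.array([[1.0, 2.0, 5.0]]),      # predicted 3D keypoint (x, y, z)
    np.array([[2.0, 4.0]]),           # annotated 2D keypoint
    scale=2.0, trans=np.array([0.0, 0.0]))
```

A segmentation term works analogously, comparing the rendered silhouette of the predicted mesh against the annotated mask, which ties the 2D annotations to shape as well as pose.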

This dataset has now become the bedrock for all our research in anthropometry. 

  • Wadhwani AI

    We are an independent, nonprofit institute developing AI-based solutions in healthcare and agriculture to bring about sustainable social impact at scale.

ML Engineer


An ML Engineer at Wadhwani AI will be responsible for building robust machine learning solutions to problems of societal importance, usually under the guidance of senior ML scientists and in collaboration with dedicated software engineers. To our partners, a Wadhwani AI solution is generally a decision-making tool that requires some piece of data to engage with. It will be your responsibility to ensure that the information provided using that piece of data is sound. This requires not only robust learned models, but also pipelines over which those models can be built, tweaked, tested, and monitored. The following subsections provide details from the perspective of solution design:

Early stage of proof of concept (PoC)

  • Set up and structure code bases that support an interactive ML experimentation process, as well as quick initial deployments
  • Develop and maintain toolsets and processes to ensure the reproducibility of results
  • Conduct code reviews with other technical team members at various stages of the PoC
  • Develop, extend, and adopt a reliable, Colab-like environment for ML

Late PoC

This is the early-to-mid stage of AI product development

  • Develop ETL pipelines, which may also be shared with and/or owned by data engineers
  • Set up and maintain feature stores, databases, and data catalogs, ensuring data veracity and lineage for on-demand pulls
  • Develop and support model health metrics

Post PoC

Responsibilities during production deployment

  • Develop and support A/B testing. Set up continuous integration and deployment (CI/CD) processes and pipelines for models
  • Develop and support continuous model monitoring
  • Define and publish service-level agreements (SLAs) for model serving. Such agreements include model latency, throughput, and reliability
  • L1/L2/L3 support for model debugging
  • Develop and support model serving environments
  • Model compression and distillation

We realize this list is broad and extensive. While the ideal candidate has some exposure to each of these topics, we also envision great candidates being experts at some subset. If either of those cases happens to be you, please apply.


Master’s degree or above in a STEM field, with several years of hands-on experience applying their craft.


  • Expert-level Python programmer
  • Hands-on experience with Python libraries
    • Popular neural network libraries
    • Popular data science libraries (pandas, NumPy)
  • Knowledge of systems-level programming; under-the-hood knowledge of C or C++
  • Experience and knowledge of various tools that fit into the model-building pipeline. There are several: you should be able to speak to the pluses and minuses of a variety of tools given some challenge within the ML development pipeline
  • Database concepts; SQL
  • Experience with cloud platforms is a plus

ML Scientist


As an ML Scientist at Wadhwani AI, you will be responsible for building robust machine learning solutions to problems of societal importance, usually under the guidance of senior ML scientists. You will participate in translating a problem in the social sector to a well-defined AI problem, in the development and execution of algorithms and solutions to the problem, in the successful and scaled deployment of the AI solution, and in defining appropriate metrics to evaluate the effectiveness of the deployed solution.

In order to apply machine learning for social good, you will need to understand user challenges and their context, curate and transform data, train and validate models, run simulations, and broadly derive insights from data. In doing so, you will work in cross-functional teams spanning ML modeling, engineering, product, and domain experts. You will also interface with social sector organizations as appropriate.  


Associate ML scientists will have a strong academic background in a quantitative field (see below) at the Bachelor’s or Master’s level, with project experience in applied machine learning. They will possess demonstrable skills in coding, data mining and analysis, and building and implementing ML or statistical models. Where needed, they will have to learn and adapt to the requirements imposed by real-life, scaled deployments. 

Candidates should have excellent communication skills and a willingness to adapt to the challenges of doing applied work for social good. 


  • B.Tech./B.E./B.S./M.Tech./M.E./M.S./M.Sc. or equivalent in Computer Science, Electrical Engineering, Statistics, Applied Mathematics, Physics, Economics, or a relevant quantitative field. Work experience beyond the terminal degree will determine the appropriate seniority level.
  • Solid software engineering skills across one or multiple languages, including Python, C++, and Java.
  • Interest in applying software engineering practices to ML projects.
  • Track record of project work in applied machine learning. Experience in applying AI models to concrete real-world problems is a plus.
  • Strong verbal and written communication skills in English.