About the Project

SocialCops is a data intelligence startup based in Delhi. As part of the application process for their Data for Impact Fellowship, I was given this task, which I had to complete in 7 days. The whole project was done using R and Excel.

  1. Problem Statement
  2. Data Structuring
  3. Background Information on NRHM

Problem Statement

The main problem statement was: "What analyses should be undertaken by SocialCops to help the Government of Maharashtra's Health Department to identify the root causes of the extreme Health trends in Nagpur?"

The tasks to be completed were –

  1. Clean, structure, and prepare the data for the analysis so that you can compare all the files together.
  2. Prepare a methodology to analyse the data and report 5 important health trends and figure out the reasons behind at least 1 of them.
  3. Figure out the variables that will help you study the health trends, as well as the explanatory variables that will help you understand the reasons for the current health trends. Give us the list of indicators that you decide to use for the analysis.


Data Structuring

The National Rural Health Mission (NRHM), now under the National Health Mission, is an initiative undertaken by the Government of India to address the health needs of under-served rural areas. A sample of the NRHM Maharashtra data at the Primary Health Centre (PHC) level was shared along with the task.

The data was provided as over 150 individual Excel files, each containing the health data of one particular area. The first job at hand was therefore to understand this arrangement and merge all the files.

Each Excel file was arranged as in the snapshot above, with rows representing the various indicators collected from April 2015 to March 2016. As mentioned, all the files had to be merged into one data file so that the analysis could easily be performed on the whole set.
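
The merge step described above could be sketched in R roughly as follows. This is a minimal illustration rather than the code actually used in the project: the function names `read_area_files` and `stack_area_tables`, the `area` column, and the assumption that the relevant data sits on the first sheet of each workbook are all hypothetical. Reading the workbooks relies on the readxl package; the stacking itself uses only base R.

```r
# Hypothetical helper: read every workbook in `data_dir` into a named list of
# data frames, one per area. Assumes the readxl package is installed and that
# the data of interest is on the first sheet of each file.
read_area_files <- function(data_dir) {
  paths <- list.files(data_dir, pattern = "\\.xlsx?$", full.names = TRUE)
  tables <- lapply(paths, function(p) as.data.frame(readxl::read_excel(p)))
  # Use the file name (without extension) as the area label.
  names(tables) <- tools::file_path_sans_ext(basename(paths))
  tables
}

# Stack the per-area tables into one data frame, adding an `area` column so
# that every row can be traced back to its source file. Columns missing from
# a file are filled with NA, so files with slightly different layouts still
# line up.
stack_area_tables <- function(tables) {
  all_cols <- unique(unlist(lapply(tables, names)))
  padded <- Map(function(df, area) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- rep(NA, nrow(df))
    }
    data.frame(
      area = rep(area, nrow(df)),
      df[all_cols],
      stringsAsFactors = FALSE,
      check.names = FALSE
    )
  }, tables, names(tables))
  merged <- do.call(rbind, unname(padded))
  rownames(merged) <- NULL
  merged
}
```

With the files in a folder (the path here is a placeholder), usage would look like `merged <- stack_area_tables(read_area_files("path/to/excel/files"))`. Keeping the source file name as a column is a design choice that makes it possible to compare areas after the merge.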

Background Information on NRHM

The National Rural Health Mission (NRHM), now under the National Health Mission, is an initiative undertaken by the Government of India to address the health needs of under-served rural areas. Launched in April 2005 by Indian Prime Minister Manmohan Singh, the NRHM was initially tasked with addressing the health needs of 18 states that had been identified as having weak public health indicators.

The NRHM is divided into 4 components –

  1. NHM Finance
  2. Health System Strengthening
  3. RMNCH+A (Reproductive, Maternal, Newborn, Child and Adolescent Health)
  4. National Disease Control

India's decentralized healthcare system aims to provide medical and healthcare support to the most underserved regions of the country. The primary level consists of ASHA (Accredited Social Health Activist) workers, who are selected from among the village population and given basic healthcare training. The next level consists of SCs (Sub-Centres) and PHCs (Primary Health Centres), followed by CHCs (Community Health Centres). The last level is made up of District Hospitals. In addition, to motivate citizens to use the public healthcare system, various schemes such as JSY (Janani Suraksha Yojana) have been started, which also include monetary incentives.

Broadly, the major sectors of the programme can be divided into the following categories –

  1. Maternal Health
  2. Child Health – Immunisation & Disease Control
  3. Family Planning – Population Growth Stability
  4. Adult & Adolescent Health