MEASURE DHS: Quality Information to plan, monitor and improve population, health, and nutrition programs
Working with Datasets

Dataset Types

MEASURE DHS produces many different types of dataset files. The types available vary by survey and depend on the data collected and the file formats used for distribution. Dataset types are organized into three distribution categories: Survey Data, HIV Test Results, and Geographic Data. Survey Data can comprise many different types of data depending on individual survey design. HIV Test Results and Geographic Data are available for most surveys conducted in recent years. Note that requests for GPS and HIV datasets are subject to an additional requirement.

Survey Data and HIV Data from MEASURE DHS surveys may be distributed in either raw or recode formats. A raw data file includes the data as they were collected, without any structural changes. A recode data file uses a standardized data definition in order to facilitate comparisons across surveys, and is distributed in several different file formats for use with statistical software packages. Geographic data are distributed in a format designed for use with GIS software packages.


Survey Data

To facilitate data analysis, DHS developed the concept of recode files. Recode files have standard data definitions across countries and across DHS phases. Because questionnaires change between DHS phases, each phase has its own recode definition. However, variables that are common across phases keep their names, and the names of variables removed from a phase are not reused unless the variable is reinstated in a later phase. Recode definitions are available for the DHS, AIS and MIS surveys. Work is currently under way on a recode definition for SPA surveys. DHS questionnaires support different units of analysis (e.g., households, household members, women, and children), each of which is ultimately translated into a dataset. The types of datasets generated for each survey vary by survey design; however, there are seven common types of recode data files associated with the core questionnaires. The datasets are available in the standard recode file formats for SPSS, SAS, Stata and CSPro; only completed questionnaires are included in these files.

Standard Recode Files:


Household Data - Household Recode (HR)

This dataset has one record for each household. It includes the household members' roster, but it contains no information from the individual women's or men's questionnaires. The unit of analysis (case) in this file is the household.

Household Listing Data - Household Member Recode (PR)

This dataset has one record for every household member. It includes variables such as sex, age, education, orphanhood, height and weight measurements, and hemoglobin. It also includes the characteristics of the household in which the individual lives. The unit of analysis (case) in this file is the household member.

Individual Woman's Data - Individual Recode (IR)

This dataset has one record for every eligible woman as defined by the household schedule. It contains all the data collected in the woman's questionnaire plus some variables from the household. The file holds up to 20 births in the birth history and up to 6 children under age 5 for whom pregnancy and postnatal care, immunization, and health data were collected. The fertility and mortality programs distributed by DHS use this file for data input. The unit of analysis (case) in this file is the woman.

Man's Data - Male Recode (MR)

This dataset has one record for every eligible man as defined by the household schedule. It contains all the data collected in the man's questionnaire plus some variables from the household. The unit of analysis (case) in this file is the man.

Couple's Data - Couple's Recode (CR)

This dataset has one record for every couple. It contains data for men and women who are married or living together, who each declared the other as their partner, and who both have completed individual interviews (questionnaires). Essentially, the file is the result of linking the two files previously described (IR and MR) based on whom each respondent declared as a partner. The unit of analysis (case) in this file is the couple.
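The linking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`household`, `line`, `partner_line`, `completed`) are hypothetical and are not actual DHS recode variables, and each respondent is assumed to name a single partner.

```python
# Hypothetical, simplified records. Field names are illustrative only,
# not actual DHS recode variable names.
women = [
    {"household": 1, "line": 2, "partner_line": 1, "completed": True},
    {"household": 2, "line": 2, "partner_line": 1, "completed": True},
    {"household": 3, "line": 2, "partner_line": 1, "completed": False},
]
men = [
    {"household": 1, "line": 1, "partner_line": 2, "completed": True},
    {"household": 2, "line": 1, "partner_line": 3, "completed": True},  # names someone else
    {"household": 3, "line": 1, "partner_line": 2, "completed": True},
]


def build_couples(women, men):
    """Pair a woman with the man she named if he named her back
    and both completed their individual interviews."""
    men_by_key = {(m["household"], m["line"]): m for m in men}
    couples = []
    for w in women:
        m = men_by_key.get((w["household"], w["partner_line"]))
        if m is None:
            continue  # the named partner has no record in the men's data
        mutual = m["partner_line"] == w["line"]
        if mutual and w["completed"] and m["completed"]:
            couples.append({
                "household": w["household"],
                "woman_line": w["line"],
                "man_line": m["line"],
            })
    return couples


couples = build_couples(women, men)
print(couples)
```

In this toy data only household 1 yields a couple: in household 2 the declarations are not mutual, and in household 3 the woman's interview is incomplete.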

Children's Data - Children's Recode (KR)

This dataset has one record for every child born to an eligible woman in the last five years. It contains information on the pregnancy and postnatal care for each child, as well as immunization and health data. Data for the mother of each child are also included. This file is used to look at child health indicators such as immunization coverage, vitamin A supplementation, recent occurrences of diarrhea, fever, and cough in young children, and treatment of childhood diseases. The unit of analysis (case) in this file is the child born to an eligible woman in the last 5 years (0-59 months).

Births' Data - All Children's Recode (BR)

This dataset has one record for every child ever born to an eligible woman. Essentially, it is the full birth history of all women interviewed, including information on pregnancy and postnatal care as well as immunization and health for children born in the last 5 years. Data for the mother of each child are also included. This file can be used to calculate health indicators as well as fertility and mortality rates. The unit of analysis (case) in this file is the child ever born to an eligible woman.
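The relationship between the woman-level, birth-level, and child-level files can be illustrated with a small sketch. The structure and field names below (`woman_id`, `births`, `months_since_birth`) are hypothetical simplifications, not the actual recode layout; the sketch only shows how the unit of analysis changes from one record per woman to one record per birth, and how the last-five-years subset (0-59 months) is derived.

```python
# Hypothetical woman-level records with a nested birth history.
# Field names are illustrative, not actual DHS recode variables.
women = [
    {"woman_id": "A", "age": 31,
     "births": [{"sex": "F", "months_since_birth": 100},
                {"sex": "M", "months_since_birth": 30}]},
    {"woman_id": "B", "age": 24, "births": []},
    {"woman_id": "C", "age": 27,
     "births": [{"sex": "F", "months_since_birth": 12}]},
]


def births_from_women(women):
    """Return one record per birth, carrying the mother's data along."""
    rows = []
    for woman in women:
        for index, birth in enumerate(woman["births"], start=1):
            rows.append({
                "woman_id": woman["woman_id"],
                "birth_index": index,
                "mother_age": woman["age"],
                **birth,
            })
    return rows


# One record per child ever born (the unit of analysis of a births file).
all_births = births_from_women(women)

# One record per child born in the last five years (0-59 months),
# the unit of analysis of a children's file.
recent_births = [b for b in all_births if b["months_since_birth"] <= 59]

print(len(all_births), len(recent_births))
```

Woman B has no births and therefore contributes no records to either derived list, even though she has a record in the woman-level data.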

Additionally, a number of files can be linked to the files described above but, for various reasons, are distributed separately.

Wealth Index data (WI)

This dataset has one record for every household. Wealth index analysis was introduced to DHS in the late 1990s. When the decision was made to include the wealth index as part of DHS, standard variables were added to the recode definition for both the household and individual questionnaires (HV270 and HV271 for households; V190 and V191 for women; and MV190 and MV191 for men). For earlier surveys, a separate file containing the score and quintile variables was created. Wealth index files were created for all DHS surveys except those carried out as part of the first DHS phase. This file can be linked to any of the files described in the previous section.

Height and Weight data according to WHO (HW)

This dataset has one record for every child measured for height and weight. In 2007 WHO introduced new child growth standards; before that, DHS used the NCHS/CDC/WHO reference. After the decision to adopt the new WHO standards was made, standard recode variables HC70 to HC73 and HW70 to HW73 were added to the recode definition to store the standard deviations (z-scores) based on the new WHO child growth standards. All files using the DHS-5 recode structure have these variables. For earlier surveys, a separate file containing the same z-scores was created. In early DHS phases only children of eligible women were measured. From DHS-3 onwards, all children under five listed in interviewed households have been measured. When height and weight were taken for all children in the households, this file can be linked to the household member (PR), children (KR) or births (BR) files described above. When only children of eligible women were measured, as in early DHS phases, the file can be linked only to the children (KR) or births (BR) files.

HIV Test data (AR)

This dataset has one record for every individual from whom blood was drawn for HIV testing. DHS began collecting blood for HIV testing in 2004. Because of the sensitivity of the data, the HIV test results are not merged into the individual records; instead, they are distributed in a separate file. This file can be linked to the household member (PR), women's (IR) or men's (MR) files.
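Linking a separately distributed file back to the individual records amounts to a keyed join. The sketch below is a minimal illustration, assuming that records can be matched on a cluster, household, and line-number key; these key names and the `result` field are hypothetical, and the actual linking variables should be taken from the documentation supplied with each dataset.

```python
# Hypothetical household member records and test results.
# Key and field names are assumptions made for illustration only.
members = [
    {"cluster": 1, "household": 10, "line": 1, "sex": "F"},
    {"cluster": 1, "household": 10, "line": 2, "sex": "M"},
    {"cluster": 2, "household": 5, "line": 1, "sex": "F"},
]
test_results = [
    {"cluster": 1, "household": 10, "line": 1, "result": "negative"},
    {"cluster": 2, "household": 5, "line": 1, "result": "positive"},
]


def link_results(members, results, keys=("cluster", "household", "line")):
    """Left-join test results onto member records using the given keys.

    Members without a matching test record keep a result of None.
    """
    index = {tuple(r[k] for k in keys): r for r in results}
    linked = []
    for member in members:
        match = index.get(tuple(member[k] for k in keys))
        row = dict(member)  # copy so the input records are not modified
        row["hiv_result"] = match["result"] if match else None
        linked.append(row)
    return linked


linked = link_results(members, test_results)
for row in linked:
    print(row)
```

A left join is used here so that every member record is preserved; individuals from whom no blood was drawn simply have no test result.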

Other Biomarkers data (OB)

This dataset has one record for every individual from whom samples were taken for biomarker testing. This type of file includes test results for health conditions such as syphilis, tuberculosis, and hepatitis B and, in general, for any test other than HIV that requires the data to be anonymous. The same protocol used to request HIV data applies to requests for other biomarkers. This file can be linked to the household member (PR), women's (IR) or men's (MR) files.

Other standard types of datasets include:

   Births Recode (BR)

   Child Under 5 Recode (XR)

   Geographic Data (GE)

   Height and Weight Scores - WHO Child Growth Standards (HW)

   HIV Test Results Raw (HT)

   HIV Test Results Recode (AR)

   Household Member Raw (PQ)

   Household Raw (HH)

   Individual Raw (IQ)

   Individual/Household Raw (IH)

   Male Raw (ML)

   Other Biomarkers (OB)

   Parent/Guardian Raw (PG)

   Safe Motherhood (SM)

   Service Availability Raw (SQ)

   Village Recode (VR)

   Wealth Index (WI)
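For scripts that handle several dataset types, the two-letter codes listed on this page can be kept in a simple lookup table. The following sketch only restates the codes and names given above; the function name `describe_dataset_type` is a hypothetical helper, not part of any DHS tool.

```python
# Two-letter dataset type codes and names as listed on this page.
DATASET_TYPES = {
    "HR": "Household Recode",
    "PR": "Household Member Recode",
    "IR": "Individual Recode",
    "MR": "Male Recode",
    "CR": "Couple's Recode",
    "KR": "Children's Recode",
    "BR": "Births Recode",
    "XR": "Child Under 5 Recode",
    "GE": "Geographic Data",
    "HW": "Height and Weight Scores - WHO Child Growth Standards",
    "HT": "HIV Test Results Raw",
    "AR": "HIV Test Results Recode",
    "PQ": "Household Member Raw",
    "HH": "Household Raw",
    "IQ": "Individual Raw",
    "IH": "Individual/Household Raw",
    "ML": "Male Raw",
    "OB": "Other Biomarkers",
    "PG": "Parent/Guardian Raw",
    "SM": "Safe Motherhood",
    "SQ": "Service Availability Raw",
    "VR": "Village Recode",
    "WI": "Wealth Index",
}


def describe_dataset_type(code):
    """Return the dataset type name for a two-letter code (case-insensitive)."""
    return DATASET_TYPES.get(code.strip().upper(), "Unknown dataset type")


print(describe_dataset_type("kr"))
print(describe_dataset_type("ZZ"))
```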

HIV Datasets

Some surveys include national voluntary HIV testing of respondents. Datasets containing the test results, along with variables to link them to other findings from the DHS or AIS, are available for more than 20 countries. Special terms of use must be accepted before access to HIV datasets can be granted.

GPS Data

Geographic information is collected in the DHS and AIS surveys. All survey data are presented both nationally and by sub-national reporting area. These reporting areas are often, but not always, provinces or groups of provinces, and are included in all recoded datasets. It is important to note that there is an additional requirement for GPS dataset requests.

