MEASURE DHS produces many different types of dataset files. These types vary by individual survey, but are based upon the types of data collected and the file formats used for dataset distribution. Dataset types are organized into three distribution categories: Survey Data, HIV Test Results, and Geographic Data. Survey Data can comprise many different types of data depending on individual survey design. HIV Test Results and Geographic Data are available for most surveys conducted in recent years. Note that requests for GPS and HIV datasets carry an additional requirement.
Survey Data and HIV Data from MEASURE DHS surveys may be distributed in either raw or recode formats. A raw data file includes the data as they were collected, without any structural changes. A recode data file uses a standardized data definition in order to facilitate comparisons across surveys, and is distributed in several different file formats for use with statistical software packages. Geographic data are distributed in a format designed for use with GIS software packages.
To facilitate data analysis, DHS has developed the concept of recode files. Recode files have standard data definitions across countries and across DHS phases. Because questionnaires change between DHS phases, there is a different recode definition for each phase. However, variables that are common across phases keep their names, and the names of variables removed from a phase are not reused unless the variable is reinstated in another phase. Recode definitions are available for the DHS, AIS and MIS surveys, and work is currently under way on a recode definition for SPA surveys. DHS questionnaires support different units of analysis (e.g., households, household members, women, children), and each unit is ultimately translated into a dataset. The types of datasets generated for each survey vary by survey design; however, there are seven common types of recode data files associated with the core questionnaires. The datasets are available in the standard recode file formats for SPSS, SAS, Stata and CSPro; only completed questionnaires are included in these files.
Standard Recode Files:
Household Data - Household Recode (HR)
This dataset has one record for each household. It includes the household members' roster, but no information from the individual women's or men's questionnaires is present in this file. The unit of analysis (case) in this file is the household.
Household Listing Data - Household Member Recode (PR)
This dataset has one record for every household member. It includes variables such as sex, age, education, orphanhood, height and weight measurements, and hemoglobin. It also includes the characteristics of the household where the individual lives. The unit of analysis (case) in this file is the household member.
Individual Woman's Data - Individual Recode (IR)
This dataset has one record for every eligible woman as defined by the household schedule. It contains all the data collected in the woman's questionnaire plus some variables from the household. The file holds up to 20 births in the birth history, and up to 6 children under age 5 for whom pregnancy and postnatal care as well as immunization and health data were collected. The fertility and mortality programs distributed by DHS use this file for data input. The unit of analysis (case) in this file is the woman.
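Because the birth history sits on the woman's record as repeated columns, per-birth analysis usually means turning one wide record into several birth records. The following is a minimal sketch of that reshaping, not the DHS procedure itself. It assumes indexed column names in the style b3_01 to b3_20 (date of birth) and b4_01 to b4_20 (sex), and a woman identifier called caseid; these names and the sample values are assumptions for illustration and should be checked against the recode documentation for the survey in use.

```python
MAX_BIRTHS = 20  # the woman's file holds up to 20 birth-history entries


def births_from_woman(woman, fields=("b3", "b4")):
    """Turn one wide woman record into a list of per-birth records.

    `woman` is a dict such as {"caseid": ..., "b3_01": ..., "b4_01": ...}.
    Column names are assumed for illustration, not read from a real file.
    """
    births = []
    for idx in range(1, MAX_BIRTHS + 1):
        suffix = f"_{idx:02d}"
        values = {field: woman.get(field + suffix) for field in fields}
        # An unused birth slot has no values at all; skip it.
        if all(value is None for value in values.values()):
            continue
        births.append({"caseid": woman["caseid"], "bidx": idx, **values})
    return births


# Made-up record with two births in the history.
woman = {"caseid": "1 1 2", "b3_01": 1320, "b4_01": 2, "b3_02": 1290, "b4_02": 1}
birth_records = births_from_woman(woman)
```

Each resulting record carries the woman's identifier, so it can still be traced back to her record.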
Man's Data - Male Recode (MR)
This dataset has one record for every eligible man as defined by the household schedule. It contains all the data collected in the man's questionnaire plus some variables from the household. The unit of analysis (case) in this file is the man.
Couple's Data - Couple's Recode (CR)
This dataset has one record for every couple. It contains data for men and women who each declared that they are married to (or living with) the other and who both completed individual interviews (questionnaires). Essentially, the file is the result of linking the two files previously described, based on whom each respondent declared as a partner. The unit of analysis (case) in this file is the couple.
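The mutual-declaration rule can be sketched as a lookup between the women's and men's records. This is an illustrative sketch only, not the procedure DHS uses to build the file. The keys cluster, household, line (the person's line number in the household schedule) and partner_line are hypothetical names, and the sketch assumes one declared partner per person.

```python
def build_couples(women, men):
    """Pair women and men who name each other as partners in the same household.

    Records are dicts with hypothetical keys: cluster, household, line and
    partner_line. Only mutually declared partners are paired.
    """
    men_by_id = {(m["cluster"], m["household"], m["line"]): m for m in men}
    couples = []
    for w in women:
        key = (w["cluster"], w["household"], w["partner_line"])
        m = men_by_id.get(key)
        # Keep the pair only if the man also declared this woman as his partner.
        if m is not None and m["partner_line"] == w["line"]:
            couples.append({"woman": w, "man": m})
    return couples


# Made-up records: the second man names a different partner, so no pair forms.
women = [
    {"cluster": 1, "household": 4, "line": 2, "partner_line": 1},
    {"cluster": 1, "household": 7, "line": 2, "partner_line": 1},
]
men = [
    {"cluster": 1, "household": 4, "line": 1, "partner_line": 2},
    {"cluster": 1, "household": 7, "line": 1, "partner_line": 3},
]
couples = build_couples(women, men)
```

Since the recode files contain only completed questionnaires, a pair found this way already satisfies the completed-interview condition.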
Children's Data - Children's Recode (KR)
This dataset has one record for every child born to an eligible woman in the last five years. It contains the information related to the child's pregnancy and postnatal care, immunization and health. The data for the mother of each of these children are included. This file is used to look at child health indicators such as immunization coverage, vitamin A supplementation, recent occurrences of diarrhea, fever, and cough in young children, and treatment of childhood diseases. The unit of analysis (case) in this file is a child born in the last 5 years (0-59 months) to an eligible woman.
Births' Data - All Children's Recode (BR)
This dataset has one record for every child ever born to an eligible woman. Essentially, it is the full birth history of all women interviewed, including the information on pregnancy and postnatal care as well as immunization and health for children born in the last 5 years. Data for the mother of each of these children are also included. This file can be used to calculate health indicators as well as fertility and mortality rates. The unit of analysis (case) in this file is a child ever born to an eligible woman.
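The relationship between the births file and the children's file (all births versus births in the last 0-59 months) can be sketched as a filter on time since birth. The sketch below assumes that each birth record carries the interview date and the child's date of birth as month counts, under the names v008 and b3; both names and the sample values are assumptions for illustration.

```python
def births_in_last_five_years(births, interview_key="v008", birth_key="b3"):
    """Select births that occurred 0-59 months before the interview.

    Both dates are assumed to be stored as month counts on the same scale,
    so their difference is the number of months since the birth.
    """
    selected = []
    for birth in births:
        months_since_birth = birth[interview_key] - birth[birth_key]
        if 0 <= months_since_birth <= 59:
            selected.append(birth)
    return selected


# Made-up records: one birth 50 months and one 100 months before interview.
births = [
    {"bidx": 1, "v008": 1400, "b3": 1350},
    {"bidx": 2, "v008": 1400, "b3": 1300},
]
recent = births_in_last_five_years(births)
```

Only the first record falls inside the 0-59 month window, so it is the only one that would also appear in a children's-style subset.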
Additionally, a number of files can be linked to the files described above but, for several reasons, are distributed separately.
Wealth Index Data (WI)
This dataset has one record for every household. Wealth index analysis was introduced to DHS at the end of the 1990s. When the decision was made to include the wealth index as part of DHS, standard variables were added to the recode definition for both the household and individual questionnaires (HV270 and HV271 for households; V190 and V191 for women; and MV190 and MV191 for men). For earlier surveys, a separate file containing the score and quintile variables was created. Wealth index files were created for all DHS surveys except those carried out as part of the first DHS phase. This file can be linked to any of the files described in the previous section.
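Because the wealth index file has one record per household, linking it to another file amounts to a lookup on a household identifier. The following is a minimal sketch of that lookup. The keys cluster and household and the fields score and quintile are hypothetical names chosen for illustration; the actual identifier variables depend on the survey and should be taken from its documentation.

```python
def attach_wealth_index(households, wealth_records):
    """Add wealth score and quintile to each household record.

    Households without a matching wealth record get None for both fields.
    All key names are hypothetical.
    """
    wealth_by_household = {
        (w["cluster"], w["household"]): w for w in wealth_records
    }
    merged = []
    for hh in households:
        wealth = wealth_by_household.get((hh["cluster"], hh["household"]))
        merged.append({
            **hh,
            "wealth_score": wealth["score"] if wealth else None,
            "wealth_quintile": wealth["quintile"] if wealth else None,
        })
    return merged


# Made-up records: the second household has no wealth record.
households = [
    {"cluster": 1, "household": 4, "members": 5},
    {"cluster": 1, "household": 9, "members": 3},
]
wealth_records = [{"cluster": 1, "household": 4, "score": -0.42, "quintile": 2}]
households_with_wealth = attach_wealth_index(households, wealth_records)
```

The same lookup works for person-level files, since every household member, woman or man record also identifies the household it belongs to.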
Height and Weight Data According to WHO (HW)
This dataset has one record for every child measured for height and weight. In 2007 new child growth standards were introduced by WHO; before that, DHS used the NCHS/CDC/WHO reference. After the decision was made to adopt the new WHO standards, standard recode variables HC70 to HC73 and HW70 to HW73 were added to the recode definition to store the z-scores (in standard deviation units) based on the new WHO child growth standards. Essentially all files using the DHS-5 recode structure have these variables. For earlier surveys, a separate file containing the same z-scores was created. In early DHS phases only children of eligible women were measured. From DHS-3 onwards, all children under five listed in the households interviewed have been measured. This file can be linked to the household members (PR), children (KR) or births (BR) files described above if height and weight were taken for all children in the households. For early DHS phases, when only children of eligible women were measured, the file can be linked only to the children (KR) or births (BR) files.
HIV Test Data (AR)
This dataset has one record for every individual from whom blood was drawn for HIV testing. DHS began collecting blood for HIV testing in 2004. Because of the sensitivity of the data, the HIV test results are not merged into the individuals' records; instead, they are distributed in a separate file. This file can be linked to the household members (PR), women's (IR) or men's (MR) files.
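Linking the HIV file to an individual-level file requires a person-level key rather than a household-level one. The sketch below assumes that the HIV file identifies a person by cluster, household number and line number under the names hivclust, hivnumb and hivline, that the women's file uses v001, v002 and v003 for the same three values, and that the test result is stored in hiv03. These names are assumptions for illustration; check the documentation distributed with the datasets before relying on them.

```python
def link_hiv_results(individuals, hiv_records):
    """Attach HIV test results to individual records, keeping tested people only.

    Matching uses cluster, household number and line number. Variable names
    (v001/v002/v003, hivclust/hivnumb/hivline, hiv03) are assumed.
    """
    hiv_by_person = {
        (h["hivclust"], h["hivnumb"], h["hivline"]): h for h in hiv_records
    }
    linked = []
    for person in individuals:
        hiv = hiv_by_person.get((person["v001"], person["v002"], person["v003"]))
        # Individuals with no test record are left out of the linked set.
        if hiv is not None:
            linked.append({**person, "hiv03": hiv["hiv03"]})
    return linked


# Made-up records: only the first woman has a test record.
women = [
    {"v001": 1, "v002": 4, "v003": 2, "v012": 31},
    {"v001": 1, "v002": 4, "v003": 5, "v012": 19},
]
hiv_records = [{"hivclust": 1, "hivnumb": 4, "hivline": 2, "hiv03": 0}]
tested_women = link_hiv_results(women, hiv_records)
```

Dropping unmatched individuals is a design choice of this sketch; an analysis that needs testing coverage would keep them and flag the missing result instead.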
Other Biomarkers Data (OB)
This dataset has one record for every individual from whom samples were taken for other kinds of biomarkers. This type of file includes test results for health conditions such as syphilis, tuberculosis and hepatitis B and, in general, for any test other than HIV that requires the data to be anonymous. The same protocol used to request HIV data applies to requests for other biomarkers. This file can be linked to the household members (PR), women's (IR) or men's (MR) files.
Other standard types of datasets include:
Births Recode (BR)
Child Under 5 Recode (XR)
Geographic Data (GE)
Height and Weight Scores - WHO Child Growth Standards (HW)
HIV Test Results Raw (HT)
HIV Test Results Recode (AR)
Household Member Raw (PQ)
Household Raw (HH)
Individual Raw (IQ)
Individual/Household Raw (IH)
Male Raw (ML)
Other Biomarkers (OB)
Parent/Guardian Raw (PG)
Safe Motherhood (SM)
Service Availability Raw (SQ)
Village Recode (VR)
Wealth Index (WI)
HIV Datasets
Some surveys include national voluntary HIV testing of respondents. Datasets containing test results, along with variables to link them to other findings from the DHS or AIS, are available for more than 20 countries. Special terms of use must be accepted before access to HIV datasets can be granted.
GPS Data
Geographic information is collected in the DHS and AIS surveys. All survey data are presented both nationally and by sub-national reporting area. These reporting areas are often, but not always, provinces or groups of provinces, and are included in all recoded datasets. It is important to note that there is an additional requirement for GPS dataset requests.