MEASURE DHS: Quality information to plan, monitor, and improve population, health, and nutrition programs
Working with Datasets

Introduction to Datasets

MEASURE DHS believes that widespread access to survey data by responsible researchers has enormous advantages for the countries concerned and the international community in general. Therefore, MEASURE DHS policy is to release survey data to researchers after the main survey report is published, generally within 12 months after the end of fieldwork. Since 1984, MEASURE DHS has collected representative data from more than 200 surveys in over 75 countries. With few limitations these data have been made available for wide use.

Production of Datasets

MEASURE DHS uses a specialized software package named CSPro (previously ISSA) to process all survey data. Every step, from data entry to the production of the statistics (including sampling errors) and tables published in DHS final reports, is done with CSPro. CSPro also provides a mechanism to export data to the statistical packages SPSS, SAS and Stata. Data files exported from CSPro are stored as ASCII text.

Dataset files are generated in the last phase of a survey, after the final report. Recoding can take several months, and it involves consistency checks and comparisons between the standard recode and the raw datasets.

Dataset Types

MEASURE DHS distributes three types of datasets: Survey Data, HIV Data, and Geographic Data. Survey Data and HIV Data may be distributed in either raw or recode format. A raw data file contains the data as they were collected, without any structural changes. A recode data file uses a standardized data definition to facilitate comparisons across surveys, and is distributed in several file formats for use with statistical software packages. Geographic Data are distributed in a file format designed for use with GIS software packages. Note that each distribution category has its own approval process for requesting access to dataset files.

Survey Datasets
Recode datasets are available for DHS, AIS and MIS surveys; a recode definition for SPA surveys is under development. The types of survey datasets generated for each survey vary by survey design. Survey datasets are available in SPSS, SAS, Stata and CSPro formats.

HIV Datasets and Other Biomarkers
Since 2004, some DHS and AIS surveys have included HIV testing of respondents. For surveys where testing was carried out, datasets are available that contain the HIV test results together with the variables used to link those results to the survey's other units of analysis (datasets). These datasets must be requested separately from the country's DHS or AIS survey datasets. However, authorization to use an HIV dataset automatically grants access to the corresponding DHS or AIS survey. Since 2008, DHS surveys have also collected other types of biomarkers, such as syphilis and Hepatitis B. The protocol used to request HIV data also applies to requests for other biomarkers.
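As a rough illustration of linking test results back to respondent records, the sketch below matches the two on a composite key. The key fields (`cluster`, `household`, `line`), the `hiv_result` field, and all sample values are hypothetical and chosen only for illustration; the actual linking variables are defined in each dataset's documentation.

```python
# Minimal sketch: attach HIV test results to respondent records.
# All field names and values here are hypothetical, not actual DHS variables.

def link_hiv_results(respondents, hiv_results, keys=("cluster", "household", "line")):
    """Return copies of respondent records with an 'hiv_result' field added.

    Respondents with no matching test record get hiv_result = None.
    """
    # Index the test results by the composite linking key.
    index = {tuple(r[k] for k in keys): r["hiv_result"] for r in hiv_results}
    linked = []
    for person in respondents:
        record = dict(person)  # copy so the input is not modified
        record["hiv_result"] = index.get(tuple(person[k] for k in keys))
        linked.append(record)
    return linked


# Illustrative (made-up) records.
respondents = [
    {"cluster": 1, "household": 4, "line": 2, "age": 27},
    {"cluster": 1, "household": 9, "line": 1, "age": 41},
]
hiv_results = [
    {"cluster": 1, "household": 4, "line": 2, "hiv_result": 0},
]

linked_respondents = link_hiv_results(respondents, hiv_results)
```

Keeping unmatched respondents with an empty result, rather than dropping them, reflects that not every respondent in a survey is necessarily tested.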

GPS Datasets
Geographic information is collected for DHS, AIS and MIS surveys. The GPS datasets normally include the longitude, latitude and altitude of each cluster (Census Enumeration Area), as well as the cluster number. The cluster number should be used to link these geographical variables to other units of analysis in the survey. These datasets must be requested separately from the country's DHS, AIS or MIS survey datasets. However, authorization to use a GPS dataset automatically grants access to the corresponding DHS, AIS or MIS survey.
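The cluster-number link described above amounts to a many-to-one join: many survey records share one cluster's coordinates. A minimal sketch follows, assuming both files have already been loaded as lists of dictionaries. The field names (`cluster`, `latitude`, `longitude`, `altitude`) and the sample values are hypothetical; real files use their own variable names.

```python
# Minimal sketch: attach cluster coordinates to survey records by cluster number.
# Field names and values are hypothetical, for illustration only.

GPS_FIELDS = ("latitude", "longitude", "altitude")

def attach_coordinates(survey_rows, gps_rows, key="cluster"):
    """Return copies of survey_rows with GPS fields added, matched on the cluster number."""
    gps_by_cluster = {row[key]: row for row in gps_rows}
    linked = []
    for row in survey_rows:
        merged = dict(row)  # copy so the input is not modified
        gps = gps_by_cluster.get(row[key])
        for name in GPS_FIELDS:
            # Records whose cluster has no GPS entry get None for each field.
            merged[name] = gps[name] if gps is not None else None
        linked.append(merged)
    return linked


# Illustrative (made-up) records.
gps_rows = [
    {"cluster": 1, "latitude": -1.25, "longitude": 36.80, "altitude": 1660},
    {"cluster": 2, "latitude": -0.10, "longitude": 34.75, "altitude": 1130},
]
survey_rows = [
    {"cluster": 1, "household": 4, "age": 27},
    {"cluster": 1, "household": 9, "age": 41},
    {"cluster": 3, "household": 2, "age": 33},
]

linked = attach_coordinates(survey_rows, gps_rows)
```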

File Formats

MEASURE DHS uses standard file formats for distributing datasets:

Recode Data

  • Hierarchical CSPro File
  • Flat File (ASCII data with syntax file)
  • SPSS System File
  • SAS System File
  • Stata System File

Raw Data (Structure varies by file type)

  • Hierarchical CSPro File
  • Flat File (ASCII data with syntax file)
  • SPSS System File
  • SAS System File
  • Stata System File

Geographical Data

  • .DBF and .MDB formats
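
For the flat-file format listed above, the ASCII data file is read using the variable layout supplied by its accompanying syntax file. The sketch below shows the general idea of reading fixed-width ASCII records, assuming the layout gives each variable's column positions. The `LAYOUT` table, the variable names, and the sample lines are hypothetical and hand-written for illustration; they are not taken from an actual syntax file.

```python
# Minimal sketch: parse fixed-width ASCII records using a column layout.
# LAYOUT is a hypothetical stand-in for the positions a syntax file would define.

# Variable name -> (start, end) character offsets, 0-based, end exclusive.
LAYOUT = {
    "cluster": (0, 4),
    "household": (4, 8),
    "age": (8, 10),
}

def parse_line(line, layout):
    """Slice one fixed-width line into integer fields; blank fields become None."""
    record = {}
    for name, (start, end) in layout.items():
        text = line[start:end].strip()
        record[name] = int(text) if text else None
    return record

def parse_flat_file(lines, layout):
    """Parse every non-empty line of a fixed-width file."""
    return [parse_line(line.rstrip("\n"), layout) for line in lines if line.strip()]


# Illustrative (made-up) lines; the last one has a blank age field.
sample_lines = [
    "   1   427\n",
    "   1   941\n",
    "  12  10  \n",
]

records = parse_flat_file(sample_lines, LAYOUT)
```

In practice the system files (SPSS, SAS, Stata) avoid this step, since they already carry variable names and types.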

Distribution of Datasets

MEASURE DHS is authorized to distribute unrestricted survey data files at no cost for legitimate academic research, on the condition that we receive a description of any research project that will use the data.

Datasets are available for download to all registered users, free of charge. To download datasets, you must first register online and request the country(ies) and datasets that you are interested in. When submitting a dataset request, users must include a brief description of how the data will be used.

Dataset Terms of Use

Datasets are made available with the following conditions:

  • Survey data files are distributed by MEASURE DHS for academic research/statistical analysis. Researchers need to provide a description of any research/analysis that will be using the data, before access is granted to the datasets.
  • Once downloaded, the datasets must not be passed on to other researchers without the written consent of MEASURE DHS.
  • All reports and publications based on the requested data must be sent to the MEASURE DHS Data Archive as a Portable Document Format (PDF) file or a hard copy, for us to forward to the country(ies) whose data have been used.