Gianluca Boo [1], Roland Hosner [2], Pierre Z Akilimali [3], Edith Darin [1], Heather R Chamberlain [1], Warren C Jochem [1], Patricia Jones [1], Roger Shulungu Runika [4], Henri Marie Kazadi Mutombo [4,5], Attila N Lazar [1] and Andrew J Tatem [1]

[1] WorldPop Research Group, University of Southampton, Southampton, United Kingdom
[2] Flowminder Foundation, Stockholm, Sweden
[3] École de Santé Publique de Kinshasa, Kinshasa, Democratic Republic of the Congo
[4] Institut National de la Statistique, Kinshasa, Democratic Republic of the Congo
[5] Bureau Central du Recensement, Kinshasa, Democratic Republic of the Congo

Introduction

This report is a supplement to the modelled gridded population estimates for the Haut-Katanga, Haut-Lomami, Ituri, Kasaï, Kasaï-Oriental, Lomami and Sud-Kivu provinces in the Democratic Republic of the Congo (DRC) (2021). The report describes the processing of the microcensus data collected in these provinces between March and May 2021, which serve as input for a Bayesian statistical model used to produce the gridded population estimates, following an approach described by Wardrop et al. (2018). The data processing consists of four main steps: 1) attribute selection and pre-processing, 2) processing of listed persons, 3) processing of listed households and 4) processing of listed clusters.

1) Attribute selection and pre-processing

We accessed the most recent version of the microcensus data, which was pre-checked, pre-processed and consolidated by the École de Santé Publique de Kinshasa and the Flowminder Foundation, and selected the following attributes.

Attribute | Description | Format
--- | --- | ---
x_cluster_ezid | Unique identifier of the enumeration zone (EZ) | Integer [1442 unique values]
x11_building_id | Unique identifier of the building | String [free text]
a2_gpslong | GPS coordinate of the building [longitude] | Numeric
a2_gpslat | GPS coordinate of the building [latitude] | Numeric
a2_gpsprec | GPS accuracy | Numeric [meters]
a3_buildingtype | Building type | Integer [1 to 2 (residential building), 3 to 4 (collective residential building), 5 (non-residential building), 6 (mixed use), or 7 (non-functional)]
b1_hhid | Unique identifier of the household | String [free text]
b4_hhocc | Is there at least one person who lives in this housing unit? | Logic [TRUE or FALSE]
c4_consent | Agree to take part in the study | Logic [TRUE or FALSE]
c5_hhsize | Number of individuals in the household, including visitors that stayed last night | Integer [1 to 30]
c10_lastnight | Did the person stay here last night? | Logic [TRUE or FALSE]
c16_nrmonths | During the past 12 months, how many months of the year has the person lived in this household? | Numeric [0 to 12 months, -98 (Don’t know) or -99 (Prefer not to say)]
c17_reason | What is the main reason the person does not live in the household all year round? | Integer [1 (moved during the last year), 6 (look for/take up temporary work), 8 (be close to school/university/other educational institute), 9 (seek or receive medical care), 10 (spend time with family members or friends), or 11 (other)]
c17_other | What is the other reason the person does not live in the household all year round? | String [free text]
c11_gender | Is the person male or female? | String [F (female) or M (male)]
c12_age | How old was the person at their last birthday? | Integer [0 to 99 years]

We pre-processed and recoded the selected attributes into more actionable formats (e.g., logical for binary attributes) to facilitate the processing steps presented below.

2) Processing of listed persons

We retrieved individual microcensus records for the 367,831 listed persons and subsequently selected the de jure population (United Nations 1991) — hereafter named residents — according to the following criteria.

Applying these criteria removed 4,373 listed persons. Each of the remaining 363,458 residents was subsequently allocated to a single age group (i.e. 0, 1-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79 and 80+) and sex group (i.e. F and M).
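The allocation to age groups can be illustrated with a minimal Python sketch. The function name `age_group` and the handling of unreported ages are assumptions made for illustration only; the report does not describe the code actually used.

```python
from typing import Optional


def age_group(age: Optional[int]) -> Optional[str]:
    """Return the age-group label for an age in completed years.

    Hypothetical helper: groups are 0, 1-4, five-year bands from 5-9
    to 75-79, and an open-ended 80+ group.
    """
    if age is None or age < 0:
        # Assumption: residents without a reported age stay ungrouped.
        return None
    if age == 0:
        return "0"
    if age <= 4:
        return "1-4"
    if age >= 80:
        return "80+"
    lower = (age // 5) * 5  # lower bound of the five-year band
    return f"{lower}-{lower + 4}"
```

For example, `age_group(57)` falls in the `55-59` band, and any age of 80 or above is assigned to `80+`.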

To tackle age-reporting issues for under-ones (i.e. months were sometimes reported as years), the 0 age group received individuals with the attribute c12_age==0 as well as individuals whose attribute c17_other contained one of the following keywords: nouvelle, nourrisson, nourrison, nourisson, bebe, naissance, nouveau, beb3, mois, semaine de vie or moin.+an (the last being a regular expression pattern).
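The keyword rule can be sketched in Python as a single regular expression. This is an illustrative reading of the rule, not the authors' code: the function name `is_under_one`, the case-insensitive matching and the treatment of the two conditions as alternatives (either one is sufficient) are assumptions.

```python
import re
from typing import Optional

# Keywords listed in the report; "moin.+an" is already a regex pattern.
KEYWORDS = [
    "nouvelle", "nourrisson", "nourrison", "nourisson", "bebe",
    "naissance", "nouveau", "beb3", "mois", "semaine de vie", "moin.+an",
]

# Assumption: matching ignores letter case.
UNDER_ONE_PATTERN = re.compile("|".join(KEYWORDS), flags=re.IGNORECASE)


def is_under_one(c12_age: Optional[int], c17_other: Optional[str]) -> bool:
    """Flag a person for the 0 age group (hypothetical helper)."""
    if c12_age == 0:
        return True
    # Free-text reason may be missing; only search when it is present.
    return bool(c17_other) and UNDER_ONE_PATTERN.search(c17_other) is not None
```

With this sketch, a record with a reported age of 3 and the free text `bebe de 3 mois` would be flagged, whereas a record with an unrelated reason would not.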

A total of 827 residents had no age or no sex reported; these residents were retained in the data. The count of residents belonging to each age and sex group is presented in the interactive plot below.

3) Processing of listed households

We retrieved individual microcensus records for the 85,954 listed households and selected the 80,970 households with at least one resident. We subsequently considered the following three scenarios to determine whether a household was eligible for imputation in the case of non-response.

Of the 80,970 households with at least one resident, 3,098 refused to be interviewed and were considered eligible for imputation (see the processing of listed clusters section).

4) Processing of listed clusters

We accessed the most recent version of the microcensus cluster monitoring data, which was pre-checked, pre-processed and consolidated by the École de Santé Publique de Kinshasa and the Flowminder Foundation, and selected the following attributes.

Attribute | Description
--- | ---
ez_id | Unique identifier of the enumeration zone (EZ)
cluster_accessible | Was the cluster accessible to the surveyor?
cluster_reason | What is the reason for inaccessibility?
cluster_security_issue | Was there any security issue reported?
cluster_surveyed | Was the cluster surveyed?
cluster_empty_people | Was the cluster empty (people)?
cluster_partial_coverage | Was the cluster only partly surveyed?
cluster_building_count | Count of listed buildings
cluster_people_count | Count of listed persons
cluster_comments | Comments

The map below shows the location of the microcensus clusters and whether these clusters were successfully surveyed and considered for population modelling.

We linked the microcensus cluster monitoring data with the individual records pre-processed for listed persons and listed households to obtain summaries at the cluster level. After linking and aggregating the records at the cluster level, we imputed 13,259 residents in 3,098 eligible households based on the mean household size for each cluster.
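The imputation based on the mean household size per cluster can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `impute_cluster_totals` and the input layout are hypothetical, the mean is taken over responding households in the same cluster, no rounding is applied, and clusters without any responding household are skipped.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def impute_cluster_totals(
    households: List[Tuple[int, Optional[int]]]
) -> Dict[int, float]:
    """Return resident totals per cluster after mean-size imputation.

    Each household is (cluster id, resident count), with None as the
    count for an eligible non-responding household.
    """
    observed = defaultdict(list)  # resident counts of responding households
    missing = defaultdict(int)    # number of households to impute
    for ez_id, residents in households:
        if residents is None:
            missing[ez_id] += 1
        else:
            observed[ez_id].append(residents)

    totals: Dict[int, float] = {}
    for ez_id in set(observed) | set(missing):
        sizes = observed.get(ez_id, [])
        if not sizes:
            continue  # assumption: no observed household, no imputation
        mean_size = sum(sizes) / len(sizes)
        totals[ez_id] = sum(sizes) + mean_size * missing.get(ez_id, 0)
    return totals
```

For a cluster with two responding households of 4 and 6 residents and one eligible non-responding household, the sketch imputes 5 residents and returns a cluster total of 15.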

Conclusions

The microcensus data processing described in this report enabled us to produce summaries of population counts and age and sex breakdowns at the cluster level within the seven provinces included in the GRID3 Mapping for Health Project. Of the 1,596 sampled clusters, 99 were not surveyed, 3 were only partially listed, 94 were surveyed but empty of population and 3 were dropped as outliers or anomalies. The remaining 1,397 clusters were considered for population modelling.
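The cluster accounting can be checked with a few lines of Python. The variable names are chosen for illustration; the figures are those reported in this section.

```python
# Figures reported in the conclusions; names are illustrative only.
sampled_clusters = 1596
excluded_clusters = {
    "not surveyed": 99,
    "only partially listed": 3,
    "surveyed but empty of population": 94,
    "dropped as outliers or anomalies": 3,
}

# Clusters retained for population modelling.
retained_clusters = sampled_clusters - sum(excluded_clusters.values())
print(retained_clusters)  # 1397
```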

Acknowledgements

These data were produced by the WorldPop Research Group at the University of Southampton as part of the GRID3 Mapping for Health Project. This project was delivered under the leadership of the Ministry of Public Health, Hygiene and Prevention of the DRC and funded by Gavi, the Vaccine Alliance (RM 86720420A2). The project was led by the Flowminder Foundation and the Center for International Earth Science Information Network (CIESIN) at Columbia University, in collaboration with the WorldPop Research Group at the University of Southampton and national partners including, but not limited to, the École de Santé Publique de Kinshasa, the Bureau Central du Recensement and the Institut National de la Statistique. This work was a continuation of the GRID3 (Geo-Referenced Infrastructure and Demographic Data for Development) programme funded by the Bill and Melinda Gates Foundation (BMGF) and the United Kingdom’s Foreign, Commonwealth & Development Office (INV 009579, formerly OPP 1182425). The study was approved by the Faculty Ethics Committee of the University of Southampton (ERGO II 62716).

Suggested citation

G Boo, R Hosner, PZ Akilimali, E Darin, HR Chamberlain, WC Jochem, P Jones, R Shulungu Runika, HM Kazadi Mutombo, AN Lazar and AJ Tatem. 2021. Modelled gridded population estimates for the Haut-Katanga, Haut-Lomami, Ituri, Kasaï, Kasaï-Oriental, Lomami and Sud-Kivu provinces in the Democratic Republic of the Congo (2021), version 3.0. WorldPop, University of Southampton, Flowminder Foundation, École de Santé Publique de Kinshasa, Bureau Central du Recensement and Institut National de la Statistique. DOI: 10.5258/SOTON/WP00720

License

This report may be redistributed following the terms of a Creative Commons Attribution 4.0 International (CC BY 4.0) license.