Chapter 5 Calculating global and regional estimates
Few countries have survey estimates of poverty available every year. To estimate poverty at the regional and global level, the survey estimates need to be aligned to a reference year and aggregated. Such alignment and aggregation require assumptions about how to interpolate and extrapolate data as well as how to treat countries without any household survey data at all. This section explains how these calculations are made. Source code can be found here and here.
5.1 Extrapolations
For countries that do not have welfare aggregates at a specific reference year, but which do have earlier welfare aggregates available, their most recent aggregate is extrapolated forward using growth rates from national accounts, either real GDP per capita or real Household Final Consumption Expenditure (HFCE) per capita. Only a fraction of growth is passed through to the survey welfare aggregate based on evidence suggesting a misalignment between growth in survey means and national accounts (Ravallion 2003; Prydz, Jolliffe, and Serajuddin 2022). PIP uses a passthrough rate of 0.7 for consumption aggregates and 1 (meaning full passthrough) for income aggregates. The selection of a 0.7 passthrough rate, and the decision to only apply this to countries that use a consumption aggregate is based on empirical evidence from Mahler, Castañeda Aguilar, and Newhouse (2022). The extrapolation is implemented by first finding the growth in national accounts, \(g^{NA}_{s,s+1}\), between the survey year, \(s\), and the following year, \(s+1\), and scaling the survey welfare distribution, \(\mathbf{y_{s}}\), by (a fraction of) this factor. This is continued year-by-year until the reference year of interest, \(r\), is reached. Concretely, this means that the extrapolated welfare distribution, \(\mathbf{y_{r}}\), is given by:
\[\mathbf{y_r} = \mathbf{y_{s}}*\prod_{t=s}^{r-1}{(1+passthrough*g^{NA}_{t,t+1})}\]
where \(g^{NA}_{t,t+1}\) is growth real GDP per capita or real HFCE per capita as explained in the national accounts data section, and \(passthrough\) equals \(0.7\) or \(1\). Poverty for the reference year is then estimated using this extrapolated distribution. A similar approach is used to extrapolate backwards, when the earliest survey estimate available is more recent than the desired reference year. The extrapolation method assumes distribution-neutral growth, i.e. that everyone’s welfare grows at the same rate. This implies that inequality is assumed to stay constant.
The example below illustrates the consumption distribution from the 2014 Sudanese survey expressed in 2017 PPP-adjusted dollars, and the 2019 Sudanese distribution when extrapolating this welfare vector to 2019 using the method described above. The 2014 daily consumption mean is \(\$4.56\) and the growth rate in real GDP per capita between 2014 and 2019 is \(-13.5\%\). The negative growth rate pushes the extrapolated distribution to the left, causing the share of extreme poor – those living below the \(\$2.15\) poverty line – to increase from \(15.3\%\) in 2014 to \(20.8\%\) in 2019.
When household surveys span two calendar years, the national accounts data used for extrapolation thes survey forward is the weighted average of the two years. The weights of the national accounts data is determined by the share of the fieldwork that took place in each year (see the end of the section 1.1 for details).
5.2 Interpolations
In cases where the reference year falls between two surveys, poverty is interpolated for the reference year using the nearest survey on each side of the reference year. One of two approaches are used depending on the correspondence in growth between national accounts and survey data.
Denoting \(s_1\) the survey predating the reference year and \(s_2\) the survey after the reference year, “same direction” interpolation is used when growth in the survey mean, \(g^\mu_{s_1,s_2}\), between the two surveys is of the same sign as (1) the growth in national accounts from the first survey to the reference year, \(g^{NA}_{s_1,r}\) and (2) from the reference year to the second survey, \(g^{NA}_{r,s_2}\). This means that same direction interpolation is used when
\[sign (g^{\mu}_{s_1,s_2}) = sign (g^{NA}_{s_1,r}) = sign (g^{NA}_{r,s_2})\]
“Diverging directions” interpolation is used when the equation above does not hold. The main difference between the two is that “same direction” interpolation estimates how much of the growth in national accounts that had accrued at the reference year, and uses this to back out a predicted survey mean at the reference year. When the survey mean and national accounts grow in opposite directions this is not meaningful, and an alternative approach is used.
Interpolations are never done between consumption and income aggregates. Whenever both are available, specific rules are used to determine which aggregate to use.
5.2.1 “Same direction” interpolation
“Same direction” interpolation works in three steps. First, the survey mean at the reference year, \(μ_{r}\), is estimated using the following interpolation formula:
\[μ_{r} = (μ_{s_2} - μ_{s_1}) \times \dfrac{\prod_{t=s_1}^{r-1}{(1+passthrough*g^{NA}_{t,t+1})}-1}{\prod_{t=s_1}^{s_2-1}{(1+passthrough*g^{NA}_{t,t+1})}-1} + μ_{s_1}\]
Second, the welfare distributions at the two surveys are scaled to reflect this mean:
\[\mathbf{y_{r_1}} = \dfrac{μ_{r}}{μ_{s_1}} \times \mathbf{y_{s_1}}, \hspace{5mm} \mathbf{y_{r_2}} = \dfrac{μ_{r}}{μ_{s_2}} \times \mathbf{y_{s_2}}\]After this alignment, there will be two distributions with the same mean for the reference year but with different distributional shapes.
At the third step, poverty is estimated from each of these distributions, and the final poverty estimate at the reference year is the weighted average poverty rate from both distributions where each poverty estimate is weighted by the inverse of the relative distance between the survey year and the reference year:
\[P_{0,r} = \dfrac{P_{0,r_1}*(t_{s_2}-t_{r})+P_{0,r_2}*(t_{r}-t_{s_1})}{t_{s_2}-t_{s_1}} \]For example, if a reference year falls two years after the first survey and one year before the second survey, the poverty estimate from the first survey is given a weight of \(1/3\) and the estimate from the second survey a weight of \(2/3\).
5.2.2 “Diverging directions” interpolation
If the growth rates in surveys and national accounts diverge, an approach similar to the extrapolation is applied to the two closest surveys; poverty is extrapolated forward by (a fraction of) the national account growth rate using the early survey and backwards using the later survey. Poverty for the reference year is estimated using both distributions and the estimates are averaged using the formula above.
The mechanics of the extrapolation and interpolation (without accounting for passthrough rates) are described in Ravallion (2003), Chen and Ravallion (2004), box 6.4 in Jolliffe and Prydz (2015), and in Appendix A of World Bank (2018).
5.3 National accounts data
All of the interpolation and extrapolation methods described above rely on national accounts data. Two national accounts variables are used: Household Final Consumption Expenditure (HFCE) and Gross Domestic Product (GDP) – both in real per capita terms. Due to its close conceptual relation with the welfare aggregate, when available, HFCE is used as the preferred national accounts data for extrapolation and interpolation methods. In low- and lower-middle-income countries, however, growth in HFCE is a worse predictor of growth in welfare than growth in GDP, likely because HFCE is calculated as the residual component of GDP in countries with low statistical capacity. Hence, GDP is preferred in these cases (Mahler and Newhouse 2024). The World Development Indicators (WDI) database is the primary source for the HFCE data (we use the Households and NPISHs Final Consumption Expenditure per capita (constant 2015 US$) series with the series code NE.CON.PRVT.PC.KD) and the GDP data (where we use the GDP per capita (constant 2015 US$) series with the series code NY.GDP.PCAP.KD). The only exception to this is the interpolation done for Syria from 2009-2022, which relies on nominal GDP per capita deflated with the CPI. This exception is carried out due to quality concerns with the GDP deflator (Castaneda Aguilar et al. 2024).
For a handful of countries, the data in WDI do not reflect calendar-year estimates but country-specific fiscal-year estimates. In those cases, the WDI data are converted to calendar-year estimates (see Castaneda Aguilar et al. (2022)). When WDI data are not available, we use data from the World Economic Outlook and the Maddison Project Database. For a detailed discussion on the sources of national accounts data see (Prydz et al. 2019).
The extrapolations and interpolations use only the growth rate from national accounts. In many countries, important differences in income levels between surveys and national accounts also exist (Prydz, Jolliffe, and Serajuddin 2022). Since the growth rate is the same whether PPPs or USD are used, it is immaterial that the constant USD series is used in the extrapolations and interpolations, while the survey welfare aggregate is expressed in PPPs.
5.4 Choosing between consumption and income estimates
The discussion on interpolation and extrapolation for simplicity assumed that only one welfare aggregate was available for a given country at a given year. Yet, a number of countries have poverty estimates available from both consumption and income aggregates. Due to its closer connection to welfare, whenever both income and consumption estimates are available for a given reference year, consumption estimates are preferred. Likewise, when both kinds of poverty estimates are available for the same years (but not for a particular reference year), interpolations and extrapolation are made using the consumption estimates. When consumption and income estimates are available at different years, the choice is a bit more complicated. The precise rules are outlined in the following decision tree:
5.5 Regional classification
In total, 218 economies are included in the global poverty estimates. The entire universe of economies considered corresponds to the universe of economies in the World Bank’s Country and Lending Groups. The regions employed in PIP, however, differ from the regional classifications used by the World Bank. Some economies, mostly high-income economies, are excluded from the geographical regions and are included as a separate group referred to as “other high income” (or “industrialized economies” or “rest of the world” in earlier publications). The list of economies included in each region is as follows:
East Asia and Pacific: American Samoa; Cambodia; China; Fiji; Indonesia; Kiribati; Korea, Democratic People’s Republic of; Lao People’s Democratic Republic; Malaysia; Marshall Islands; Micronesia, Federated States of; Mongolia; Myanmar; Northern Mariana Islands; Palau; Papua New Guinea; Philippines; Samoa; Solomon Islands; Thailand; Timor-Leste; Tonga; Tuvalu; Vanuatu; Vietnam.
Europe and Central Asia: Albania; Armenia; Azerbaijan; Belarus; Bosnia and Herzegovina; Bulgaria; Croatia; Czech Republic; Estonia; Georgia; Hungary; Kazakhstan; Kosovo; Kyrgyz Republic; Latvia; Lithuania; Moldova; Montenegro; North Macedonia; Poland; Romania; Russian Federation; Serbia; Slovak Republic; Slovenia; Tajikistan; Ukraine; Uzbekistan.
Latin America and the Caribbean: Argentina; Barbados; Belize; Bolivia; Brazil; Chile; Colombia; Costa Rica; Cuba; Dominica; Dominican Republic; Ecuador; El Salvador; Grenada; Guatemala; Guyana; Haiti; Honduras; Jamaica; Mexico; Nicaragua; Panama; Paraguay; Peru; St. Kitts and Nevis; St. Lucia; St. Vincent and the Grenadines; Suriname; Trinidad and Tobago; Uruguay; Venezuela, Republica Bolivariana de.
Middle East and North Africa: Algeria; Djibouti; Egypt, Arab Republic of; Iran, Islamic Republic of; Iraq; Jordan; Lebanon; Libya; Morocco; Oman; Syrian Arab Republic; Tunisia; West Bank and Gaza; Yemen, Republic of.
Other High Income (at times referred to as “Rest of the World”): Andorra; Antigua and Barbuda; Aruba; Australia; Austria; Bahamas, the; Bahrain; Belgium; Bermuda; British Virgin Islands; Brunei Darussalam; Canada; Cayman Islands; Channel Islands; Curacao; Cyprus; Denmark; Faroe Islands; Finland; France; French Polynesia; Germany; Gibraltar; Greece; Greenland; Guam; Hong Kong SAR, China; Iceland; Ireland; Isle of Man; Israel; Italy; Japan; Korea, Republic of; Kuwait; Liechtenstein; Macao SAR, China; Malta; Monaco; Nauru; Netherlands; New Caledonia; New Zealand; Norway; Portugal; Puerto Rico; Qatar; San Marino; Saudi Arabia; Singapore; Sint Maarten; Spain; St. Martin; Sweden; Switzerland; Taiwan, China; Turks and Caicos Islands; United Arab Emirates; United Kingdom; United States; Virgin Islands, US.
South Asia: Afghanistan; Bangladesh; Bhutan; India; Maldives; Nepal; Pakistan; Sri Lanka
Sub-Saharan Africa: Angola; Benin; Botswana; Burkina Faso; Burundi; Cabo Verde; Cameroon; Central African Republic; Chad; Comoros; Congo, Democratic Republic of; Congo, Republic of; Cote d’Ivoire; Equatorial Guinea; Eritrea; Eswatini; Ethiopia; Gabon; Gambia, the; Ghana; Guinea; Guinea-Bissau; Kenya; Lesotho; Liberia; Madagascar; Malawi; Mali; Mauritania; Mauritius; Mozambique; Namibia; Niger; Nigeria; Rwanda; Sao Tome and Principe; Senegal; Seychelles; Sierra Leone; Somalia; South Africa; South Sudan; Sudan; Tanzania; Togo; Uganda; Zambia; Zimbabwe.
5.6 Population
To derive regional and global poverty rates, each country’s poverty rate is weighted using its population share. Population data are taken from the World Bank’s World Development Indicators (WDI). For Kuwait and West Bank and Gaza, population data is missing in the WDI for some years. In those cases, alternative sources are used as outlined in Arayavechkit et al. (2021).
The regional and global population may differ from the aggregates given in the World Development Indicators, because PIP uses different regional compositions of economies (See Regional classification section).
5.7 Treatment of countries without any poverty data
With the interpolation and extrapolation rules, all economies with at least one welfare vector in PIP will have poverty estimates available for all years as long as national accounts data are available. Yet, if an economy falls under any of the following three conditions, it is not possible to calculate interpolated and extrapolated poverty estimates:
- The economy entirely lacks household data that can be used to monitor poverty.
- The economy does not have credible PPPs necessary to convert welfare vectors to cross-country comparable currencies.
- The economy is missing GDP data for a subset of years necessary to extrapolate and interpolate survey estimates.
If either of the first two conditions apply, the economy will be missing for all years, while if only the last condition applies, the economy will be missing in the years where GDP data is missing. In these country-year combinations, for the purpose of calculating regional and global poverty estimates, it is assumed that the country has a poverty headcount equal to its region’s population-weighted average poverty rate.
5.8 Coverage rule
The interpolation rules, extrapolation rule, and treatment of countries without any data ensure that some poverty estimate is available for all regions and the world as a whole since 1981. Yet, the confidence one can have in these estimates depends on the how often and how long one has to extrapolate or interpolate the data, as well as the share of the population without any data at all.
To avoid presenting regional and global numbers based on outdated or no data, coverage rules are used to determine whether a particular reference year has sufficient data coverage. A country is considered to have sufficiently recent data if it has a survey-based poverty estimate at most three years from the reference year. Regional estimates are displayed for a given year if data cover 50 percent of the population in the region. For regions in which the surveys within three years either side of the reference year account for less than half of the regional population, the regional poverty estimate is not reported.
An additional coverage requirement is applied to govern when global poverty estimates are reported for a given reference year. This requirement addresses the goal of focusing the measurement of global poverty on economies where most of the poor live. Specifically, it tries to avoid a situation in which the global population threshold is met by having recent data in high-income countries, East Asia & Pacific, and Latin America & the Caribbean, which together account for a small share of the global extreme poor. Under this requirement, global poverty estimates are reported only if data are representative of at least 50 percent of the population in low-income and lower-middle-income countries, because most of the poor live in these groups of countries. The World Bank classification of economies according to income groups in the reference year is used. This requirement is only applied to the global poverty estimate, not to regional estimates (for more details see Castaneda Aguilar et al. (2020)).
During the COVID-19 pandemic, countries experienced economic shocks and volatility at an unprecedented scale, which means that extrapolating from a pre-COVID survey may work less well. For that reason, for calculating coverage in 2020, 2021, and 2022, surveys before 2020 are not counted. Analogously, surveys conducted in 2020 and after are not counted for coverage calculations for 2017, 2018, and 2019. That is, a censored three-year window is used for coverage calculations, such that only regions with sufficient data pre or post-COVID have aggregates reported (Castaneda Aguilar et al. 2024). This is a conservative approach that is warranted by the exceptional volatility in economic conditions.