Index of Public Integrity Methodology

Why Measure Corruption Risk and Public Integrity?

Corruption risk is the probability that a corrupt act will happen in transactions between citizens (firms) and the state. In some countries corrupt acts are exceptions (they happen rarely), while in many others they occur on a regular basis: citizens resort to them to gain access to public services, public servants use them to increase their income, and politicians use them to get and preserve power as well as enrich themselves and their families.

The extent of corruption depends on the capability of a society to constrain those entrusted with power from abusing it in their own interest, and to enforce public integrity. Where this capability is low, corruption risk is high. Measuring such concepts makes sense if it helps indicate how to change them, that is, which “buttons to press” to improve the quality of governance.

Academic research by economists and political scientists has identified many reasons why country contexts differ in terms of governance quality. While additional factors may exist, the bottom line is that corruption risk results from an equilibrium between opportunities for corruption (such as power discretion and material resources, e.g. oil or non-transparent public money) and constraints that autonomous organizations (e.g. the judiciary, media), groups (civil society), and individuals (voters, whistleblowers) can use to prevent power holders from abusing office in their own interest. Public integrity is the public good that results when most power holders and citizens do not engage in corrupt acts or in abuses of power that benefit private parties. Corruption risk is low where public integrity is high.

It is not easy to measure either corruption or public integrity directly in ways comparable across countries except by very expensive surveys, and even so the measurement tells us little about which factors are enabling corruption and discouraging public integrity. Thus, we resort to measuring the different elements of context which interact to create a society’s capability (or lack thereof) to control corruption.

The Index of Public Integrity (IPI) identifies proximate measures for factors identified in research as impacting corruption risk for 114 countries for which data is available and not controversial (after removing China, Saudi Arabia, and Azerbaijan). It is a composite index consisting of six components. They are:

Opportunities:

● Administrative burden, trade openness, and budget transparency (2015, 2017, and 2019 editions)

Constraints:

● Judicial independence, e-citizenship, and freedom of the press

Starting from the 2021 edition, administrative burden and trade openness have been replaced by administrative transparency and online services, because the data underlying the original components came from the World Bank Doing Business project, which has closed, and no alternative source was available.

A more extensive explanation of the methodology and the original composition of the IPI can be found in the following peer-reviewed publication:

Mungiu-Pippidi, A., Dadašov, R. Measuring Control of Corruption by a New Index of Public Integrity. European Journal on Criminal Policy Research 22, 415–438 (2016). https://doi.org/10.1007/s10610-016-9324-z

Below is an outline of the methodology behind the updated IPI.

How Were the IPI Components Selected?

National context either enables or disables public integrity through several factors that come together to determine either corruption risk or its opposite, public integrity. In this model, governance quality is a latent variable. In other words, by identifying and measuring these factors we can indirectly measure governance quality and get a clear picture of its enablers/disablers.

Not all factors identified by research can be modified by human action: for instance, multiethnicity or an abundance of mineral resources both multiply opportunities for corruption but can hardly be changed by policy. The presence of an informal economy is also a resource for corruption, as vulnerable workers frequently need to pay bribes to access public services in which they do not regularly participate, but informality is very difficult to measure. The IPI thus focuses on factors that are regularly measured and can be changed by human action. They are selected by how they cluster together (principal component analysis), and validity tests are then run associating them with a variety of direct measures of corruption, both objective (like non-competitive tenders) and subjective (experiences and perceptions of corruption). Tests also take into account differences in the levels of socio-economic development across countries by controlling for the Human Development Index (HDI). See the results here. The six components of the IPI are based on years of theoretical and empirical research on the control of corruption, not only by the team of Alina Mungiu-Pippidi at the European Research Centre for Anti-Corruption and State-Building (ERCAS), but also by several other researchers who have published their work in peer-reviewed journals.
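One common way to control for a third variable such as the HDI is a partial correlation. The sketch below is illustrative only: the source does not state which estimation technique the validity tests use, and the function names and country values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(x, y, control):
    """Correlation between x and y after removing the linear effect of control."""
    r_xy = pearson(x, y)
    r_xc = pearson(x, control)
    r_yc = pearson(y, control)
    return (r_xy - r_xc * r_yc) / math.sqrt((1 - r_xc ** 2) * (1 - r_yc ** 2))


# Hypothetical values for six countries (not taken from the IPI dataset).
component_score = [3.1, 4.5, 5.0, 6.2, 7.8, 8.4]     # e.g. one IPI component
corruption_measure = [2.9, 4.0, 5.5, 5.9, 7.5, 8.8]  # e.g. a direct measure
hdi = [0.55, 0.62, 0.70, 0.74, 0.85, 0.91]           # development level

raw = pearson(component_score, corruption_measure)
controlled = partial_correlation(component_score, corruption_measure, hdi)
print(f"raw r = {raw:.2f}, controlling for HDI r = {controlled:.2f}")
```

If the association survives once the control is applied, it is not merely a reflection of richer countries scoring better on both variables.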

Starting with the 2021 edition of the IPI, two original components had to be replaced due to issues related to their underlying data source, the Doing Business Survey conducted by the World Bank. This survey was fully discontinued and direct replacements for those components could not be identified from other sources. Thus, two alternative proximate measures for opportunities were devised instead of administrative burden and trade openness: administrative transparency, based on a fraction of ERCAS’s own Transparency Index, and public online services, based on the Online Services Index computed as part of the UN E-Government Development Index. Statistical validation tests indicate that the internal consistency of the index was preserved in the new composition.

How Were the Components Constructed?

In the current version, five of the components (budget transparency, administrative transparency, online services, judicial independence, and freedom of the press) each rely on a single data source. These components’ scales are standardized by constructing the “z-score” of the variable (subtracting the mean and dividing by the standard deviation) in order to equalize their mean values and standard deviations. For budget transparency, the mean score for the individual items considered was computed and then standardized; administrative transparency in turn consists of the sum of four individual components from the Transparency Index, which was then similarly standardized into z-scores. The final component, e-citizenship, is the only one based on multiple data sources. Its individual sub-components were standardized separately and then averaged.
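The standardization step can be sketched as follows. This is a minimal illustration, assuming the population standard deviation (the source does not say whether the population or sample formula was used); the function names and input values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]


def mean_of_standardized(sub_components):
    """Standardize each sub-component separately, then average per country."""
    standardized = [z_scores(column) for column in sub_components]
    return [mean(country_values) for country_values in zip(*standardized)]


# Hypothetical single-source component: survey scores on a 1-7 scale.
judicial_independence = z_scores([2.0, 3.5, 4.0, 5.5, 6.0])

# Hypothetical sub-components (% of population) for the same five countries.
e_citizenship = mean_of_standardized([
    [5.0, 12.0, 20.0, 31.0, 40.0],   # fixed broadband subscriptions
    [30.0, 55.0, 62.0, 80.0, 95.0],  # internet users
    [20.0, 35.0, 50.0, 45.0, 70.0],  # Facebook users
])
```

Standardizing before averaging matters most for a multi-source component: without it, the sub-component with the widest spread would dominate the mean.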

Every final component score was then normalized to range between 1 and 10 using a min-max transformation, with higher values representing better performance in each area. The overall IPI was then computed as the equally weighted average of all components. The decision to assign equal weights resulted from a replication of the original methodology in Mungiu-Pippidi and Dadašov 2016: the index was first built by principal component analysis, and then the impact (loading) of every component was measured. The new components, as well as the original ones, contributed in very close (although not identical) proportions to the latent variable captured by the principal component. This determined the decision to assign equal weights and use a simpler average to build the index. The resulting aggregate correlates at 90% with the principal component.
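A min-max transformation onto the 1 to 10 range, followed by an equally weighted average, can be sketched as below. The function names and the standardized values are hypothetical, and only three components and three countries are shown for brevity (the IPI uses six components).

```python
def rescale_1_to_10(values):
    """Min-max transformation: the lowest value maps to 1, the highest to 10."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("cannot rescale a constant variable")
    return [1 + 9 * (v - lowest) / (highest - lowest) for v in values]


def equally_weighted_index(components):
    """Average the rescaled components per country with equal weights."""
    columns = list(components.values())
    return [sum(country) / len(country) for country in zip(*columns)]


# Hypothetical standardized (z-score) values for three countries.
standardized = {
    "judicial_independence": [-1.0, 0.0, 1.0],
    "freedom_of_the_press": [-1.0, 1.0, 0.0],
    "budget_transparency": [0.0, -1.0, 1.0],
}

rescaled = {name: rescale_1_to_10(vals) for name, vals in standardized.items()}
overall = equally_weighted_index(rescaled)
print(overall)
```

Because every component is mapped onto the same 1 to 10 range before averaging, a country at the bottom of one component (a score of 1) can still reach a middling or high overall score if it does well elsewhere.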

The standardization procedure described here ensures that the IPI does not depend strongly on the component with the greatest dispersion. A country can score badly for one component but still do well on the overall IPI. Similarly, progress on just one component is insufficient for significant positive evolution of overall public integrity. The components interact to determine a certain quality of governance.

How is the IPI Validated?

The components of the IPI correlate strongly despite measuring apparently different things. This indicates that they in fact all measure a latent variable: the capacity of a society to control corruption and enable public integrity. The internal consistency of the index deriving from principal component analysis was and continues to be very high, with a KMO index of 0.80. The IPI also correlates at values between 60 and 80% with a variety of corruption measurements, both subjective (like the Global Corruption Barometer’s “Most officials are corrupt”, the Corruption Perceptions Index, Government Favoritism, or Control of Corruption) and, most importantly, objective (like the Public Administration Corruption Index (PACI) or procurement red flags). For a correlation between subjective and objective indicators, see Mungiu-Pippidi and Martinez Kukutschka 2018.
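To illustrate how a first principal component and its loadings can be obtained from a set of correlated components, the sketch below applies power iteration to the correlation matrix. It is an assumption-laden illustration, not the authors' procedure: the function names and scores are hypothetical, and a statistics package would normally be used instead.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise correlations between all component columns."""
    return [[pearson(a, b) for b in columns] for a in columns]


def first_principal_component(matrix, iterations=1000):
    """Largest eigenvalue and its unit eigenvector, found by power iteration.

    Starting from a uniform vector works when the components are positively
    correlated, as the text reports for the IPI components.
    """
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(value * value for value in product))
        vector = [value / norm for value in product]
    product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vector, product))
    return eigenvalue, vector


# Hypothetical component scores (1-10) for five countries.
components = [
    [2.0, 4.0, 5.5, 7.0, 9.0],
    [3.0, 3.5, 6.0, 6.5, 8.5],
    [1.5, 5.0, 5.0, 8.0, 9.5],
]
eigenvalue, weights = first_principal_component(correlation_matrix(components))
loadings = [w * math.sqrt(eigenvalue) for w in weights]  # component loadings
explained_share = eigenvalue / len(components)           # share of total variance
```

Loadings of similar size across components are what would justify replacing the principal component with a simple equally weighted average.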

Due to the nature of its components, the IPI explains what exactly prevents a country from reaching control of corruption. The components are actionable so they can serve as an evidence basis for reform strategies.

What changed between the latest and previous editions?

Due to accusations of data manipulation, China, Azerbaijan, and Saudi Arabia were excluded from previous editions and were not included in the 2021 edition. Yemen had incomplete data for previous editions, but with the replacement of trade openness, it could now be added to the pool of countries, which totals 114.

As already described above, the original components of administrative burden and trade openness were replaced by new components: administrative transparency and online services, respectively. Due to the discontinuation of Freedom House’s Freedom of the Press indicator, the source for the freedom of the press component was also changed to Reporters Without Borders’ Press Freedom Index.

Components: Variables and Measurement

Administrative Transparency

Consists of the standardized sum of individual scores for the following items in the de facto Transparency Index:

  • T-Index de facto 3 - Public procurement
  • T-Index de facto 4 - Land cadaster
  • T-Index de facto 5 - Register of commerce
  • T-Index de facto 6 - Auditor General's annual report

The T-Index is based on independent data collection and review by the ERCAS team. The value has been standardized and transformed to be in the range between 1 and 10, with 10 implying the highest administrative transparency.

The data by country can be found here.

Online Services

The score is based on the Online Services Index, which is part of the UN E-Government Development Index. The data used come from the report released in 2020.

The values have been transformed to be in the range between 1 and 10, with 10 implying the highest level of online services.

The data by country can be found here.

Budget Transparency

Simple mean value of the scores resulting from 14 specific questions from the Open Budget Survey that cover transparency of the Executive’s Budget Proposal. More information on questions and respective scores is presented in the full dataset. The data are largely provided by the International Budget Partnership; in some cases our own data are used (these cases are marked with an asterisk in the spreadsheet provided below). For these countries, the same data is used for all IPI editions as no new data is available. For data extracted from the Open Budget Survey results, the same values are used for the 2019 and 2021 editions, as a new version has not yet been released.

The value has been standardized and transformed to be in the range between 1 and 10, with 10 implying the highest budget transparency.

The data by country can be found here.

Judicial Independence

Based on the “judicial independence” indicator from the Executive Opinion Survey of the World Economic Forum Global Competitiveness Dataset. This indicator asks the question “To what extent is the judiciary in your country independent from influences of members of government, citizens, or firms?” [1 = heavily influenced; 7 = entirely independent]. The same data is used for the 2019 and 2021 editions, as a new version has not yet been released.

The indicator has been standardized and transformed to be in the range between 1 and 10, with 10 implying the highest judicial independence.

The data by country can be found here.

E-Citizenship

Simple mean of standardized values of:

  • Fixed broadband subscriptions (% population)
  • Internet users (% population)
  • Facebook users (% population)

The first two variables were taken from the International Telecommunication Union’s ICT Dataset; the third is from Internet World Stats.

The value has been transformed to be in the range between 1 and 10, with 10 implying the highest score for E-Citizenship.

The data by country can be found here.

Freedom of the Press

The score stems from Reporters Without Borders’ Press Freedom Index. Until 2019, the source used was Freedom House’s Freedom of the Press Report.

The values are standardized and transformed to be in the range between 1 and 10, with 10 implying the highest freedom of the press.

The data by country can be found here.

For the older editions (2015 - 2019), these components were considered instead of Administrative Transparency and Online Services:

Administrative Burden

Consists of the simple mean of standardized values of:

  • number of procedures required to start up a business for both men and women (averaged)
  • time needed to start up a business for both women and men (averaged)
  • number of tax payments per year
  • time to pay taxes

The indicators are taken from the World Bank Doing Business Data. This mean value has been transformed to be in the range between 1 and 10, with 10 implying the lowest administrative burden.

The data by country can be found here.
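For a component such as administrative burden, where higher raw values mean worse performance, the scale has to be reversed so that 10 stands for the lowest burden. The sketch below assumes this is done by flipping the sign of the averaged z-scores before the min-max transformation; the source does not describe the exact mechanics, and all names and values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)  # assumption: population SD
    return [(v - mu) / sigma for v in values]


def rescale_1_to_10(values):
    """Min-max transformation: the lowest value maps to 1, the highest to 10."""
    lowest, highest = min(values), max(values)
    return [1 + 9 * (v - lowest) / (highest - lowest) for v in values]


def administrative_burden_score(indicators):
    """Average standardized indicators, then reverse so 10 = lowest burden."""
    standardized = [z_scores(column) for column in indicators]
    burden = [mean(country) for country in zip(*standardized)]
    # Assumption: negate so that a lighter burden yields a higher score.
    return rescale_1_to_10([-b for b in burden])


# Hypothetical indicator values for three countries.
scores = administrative_burden_score([
    [3.0, 6.0, 9.0],       # procedures to start a business
    [4.0, 10.0, 16.0],     # days to start a business
    [6.0, 12.0, 30.0],     # tax payments per year
    [50.0, 150.0, 400.0],  # hours needed to pay taxes
])
print(scores)
```

Negating before the min-max step is equivalent to computing the ordinary 1 to 10 score and subtracting it from 11.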

Trade Openness

Consists of the simple mean of standardized values of:

  • time required for border compliance for export and import procedures
  • cost required for border compliance for export and import procedures

The indicators stem from the World Bank Doing Business Data. This mean value has been transformed to be in the range between 1 and 10, with 10 implying the highest trade openness.

The data by country can be found here.