The Challenge of Evaluating Policy-Based Operations

PBOs aim to support implementation of a government’s overall growth and poverty reduction strategy, entailing multidimensional and complex aspects of public policy and action. Evaluating them is far more challenging than evaluating conventional “bricks and mortar” investment projects, which benefit from more clarity of measurement metrics and greater availability of data tied to specific investments and fiduciary requirements. By contrast, PBOs involve building state capacity, creating legal and regulatory frameworks, improving the quality of public institutions and policies, or deploying new technologies and approaches. Metrics for these are not well established, are more difficult to standardize across sectors and countries, and involve long-term behavioral changes that may be difficult to observe.

The fungibility of budget support (a feature also true of investment project financing) compounds the difficulty of evaluating PBOs because it creates fiscal space for any public expenditure, including expenditures that development partners may not support (such as defense outlays or untargeted transfers). Notably, the amount of financing is rarely related to the estimated costs of reform objectives, except in cases such as recapitalization of public sector banks.

Policy dialogue is assumed to provide a sharper, shared focus on development outcomes and to act as a source of external discipline exercised through agreed reforms, while capacity development and related measures are assumed to fill the capacity gaps prioritized by both the MDB and the country. Policies and institutions are contextual, differing across regions and countries, as is the interpretation of practices that build on different policy frameworks or have a cultural dimension. Funding, policy dialogue, and capacity development normally associated with PBOs are assumed to reinforce one another alongside contextual factors. Development partners cofinancing PBOs often need to agree on what policy and institutional functionality looks like and on what are considered acceptable budget management practices, even though they may disagree on priority reforms, specific policy objectives, or the appropriate measurement of results.

The theory of change underpinning PBL is based on an intervention logic that describes how budget support helps to enhance implementation of the supported development strategies to achieve established targets. Detailed requirements on how the inputs provided should be deployed are rare, although there are often rigid requirements on reporting and accounting procedures. This flexibility is particularly important during economic crises when quick-disbursing and untied financing can be critical.

Attributing Outcomes to Policy-Based Lending

A common criticism of PBL evaluations concerns the difficulty of attributing country outcomes to the use of PBOs, including the policy actions they support and the fast-disbursing financial support they provide. Most evaluations (those for the ADB, IDB, and World Bank) discuss the attribution problem, which is particularly acute with multiple donors (joint funding) or simultaneous budget support and in the absence of a well-defined counterfactual. High-level outcomes, such as GDP growth, levels of private sector investment, employment creation, and poverty reduction, are influenced by many exogenous factors, so claims on attribution call for modesty. Recognizing that outcomes can never be directly attributable to budget support, the World Bank evaluation notes that “it is an important finding in today’s world that the World Bank can contribute to development by recognizing and supporting committed and effective leaders without having to prove that its actions led to that commitment.”26

While rigorous attribution is difficult, the CDB argues it is often possible to establish “plausible likelihood of contribution” provided PBOs have results frameworks of reasonable quality. It proposes a “reverse causal chain” analysis using the PBO’s results framework that involves several sequential steps, starting from outcomes and the associated indicators and working back through the logic of the results chain, drawing on core elements of the PBO. The EU uses a different method (described below) to assess impact. Briefly, change can be traced to outputs generated by the interplay between funding (together with policy and institutional effects emerging from dialogue, technical assistance, and capacity building) and the domestic processes of policy making, budget formulation, and budget execution. Analysis that combines these data streams lets the evaluator assess the contribution of budget support to the success or failure of the recipient governments’ policies and strategies.

Principal Evaluation Methodologies

The evaluations of the MDBs and the EU follow two main methodological approaches: an objectives-based method and the three-step approach of the OECD-DAC.27 Details are in Annex A. Most MDBs use an objectives-based method, which evaluates against specific program objectives of individual PBOs as stated in legal documents. The evaluation examines the relevance of policy actions and measures proposed and implemented under the program and weighs the robustness of the evidence linking them to the specific country outcomes they are meant to achieve. This approach is built on qualitative and quantitative evidence on the inputs, outputs, and outcomes expected to be delivered through specified actions or policy interventions at sectoral or regional levels that underpin the program supported by the PBO. Evaluators ultimately rank performance, judging the rigor and quality of evidence, at the intervention, program, and instrumental levels.

The EU uses a three-step approach that works at an aggregated level. It examines total budget support from all development partners, usually over a decade. The first step assesses the effects of combined budget support on policies, services, and induced outputs. The second step assesses social and economic outcomes targeted by these public policies and induced outputs. The third step relates the results of the causal analysis of the first two steps, through the links established between budget support inputs and the related policy changes, to infer the contribution of budget support.

The two evaluation approaches differ in several ways. The approach used by MDBs is objectives-based and includes performance ratings (on a six-point scale from highly unsatisfactory to highly satisfactory).28 It follows standard, pre-set evaluation criteria, which include relevance, efficacy, effectiveness, impact, risks, and government and MDB performance.29 The ratings cover the dimensions of assessment, including the overall rating for the budget support series. The approach is suited to assessing accountability of individual operations, extracting lessons, and recommending improvements in future operations. It also supports quantitative comparison of different operations.

The OECD-DAC’s three-step approach does not include pre-set evaluative questions, although it follows a clear analytical framework (see Annex A). Evaluative questions vary depending on the operational focus and donor interests. The evaluations focus on learning and provide no ratings, although they include an element of accountability assessment with regular reporting. Moreover, comparisons across operations focus on total country budget support and provide qualitative and country-oriented lessons rather than quantitative comparisons. Both methodologies require evaluator judgment of the quality and weight of the evidence. More robust, quantitative methods and development of the counterfactual to permit attribution by donor or instrument are more difficult to implement in PBOs than in investment projects. By devising appropriate triangulation of evaluative data from various sources, however, carefully designed evaluations may establish whether the evaluative results obtained can be attributed to the intervention being evaluated.

Limitations of Evaluations

All the reports identify limitations of the evaluation work, as examples from four agencies show:

ADB. No attempt was made to assess the impact of PBL on macroeconomic conditions or on growth and poverty reduction. Rather, the evaluators used the plausible contribution of ADB’s PBL to outputs and outcomes in the areas indicated in the theory of change. The evaluation was also constrained by limited evidence in the validated program completion reports, which were informed by ADB self-assessments.
AfDB. The evaluation team limited the extent to which the overarching question on results would be addressed, constraining how far up the results chain the evaluation could go. Focus was placed on collection of primary performance data in only a few sectors and data was limited. The quality of analytic work varies across countries, but use of other sources helps to mitigate the effects of the variability.
IDB. Findings are not based on comprehensively evaluating PBOs as an instrument or the achievement of the outcomes, which was considered beyond the scope of the exercise. However, the chapter is an important steppingstone to a more in-depth future evaluation of PBOs. Findings of the Office of Evaluation and Oversight’s work invite questions, such as the extent to which PBL financing complements or substitutes for funding from financial markets and whether IDB-supported policy measures are complementary to or overlap those of other institutions.
World Bank. IEG has not done a recent comprehensive evaluation of PBOs, but it maintains a large repository of DPO performance ratings since 2005 that is an important evaluative database. IEG’s assessment of DPOs instead draws on thematic evaluations and learning products, as well as on the World Bank’s DPO retrospectives.30

This overview of the evaluation and conference findings has reviewed the context for PBF against an evolving global financial landscape and presented the main findings of the six agency evaluations and expert commentaries. Performance of PBF was generally strong, usually meeting its principal objectives. The overview also sounds a cautionary note, recognizing the challenges of evaluating budget support and attributing country-level outcomes to the augmented fiscal resources that PBF provides. A major contribution of the instrument is provision of countercyclical financing during crises when access to private financing is sharply limited. Another contribution central to PBF success is providing technical assistance and expert policy dialogue to support reforms aimed at improving PFM, service delivery, and the business environment. The overview also looks toward future challenges and the need for aid to developing countries, not only to improve their own welfare and economic performance but also to support vital global public goods related to addressing climate change, arresting the spread of pandemics, or preventing financial crises. These all point to the expectation that developing countries’ need and demand for PBF will increase and that PBF could play a central role in ramping up financial commitments in support of the SDGs.

Part II of the overview summarizes each agency’s evaluation report. Each report draws on evidence and evaluations of PBF from the five MDBs and the European Union. Part II also summarizes comments from development experts on the agency reports.

26 Cheryl Gray. 2022. Comments on “Policy-Based Financing at the World Bank: Evolution, Performance, and Reform” in this volume.
27 Methodological details of the three-step approach are presented in OECD (2012).
28 ADB uses a four-point scale in its ratings of project outcomes: highly successful, successful, less than successful, and unsuccessful.
29 To improve the operational relevance of its work, the World Bank’s Independent Evaluation Group (IEG) has modified the structure and content of its evaluations and validations of PBOs, partly in response to changes in the self-evaluation the World Bank adopted (see chapter 6 of this volume). The new IEG framework better reflects the characteristics of Development Policy Financing (DPF). The main changes relate to the assessment of relevance, results indicators, and the World Bank’s performance. Instead of rating the relevance of objectives, IEG now rates the relevance of the prior actions supported by the operation (although the relevance of objectives is still discussed).
30 World Bank Group. 2022. 2021 Development Policy Financing Retrospective: Facing Crisis, Fostering Recovery. Operations Policy and Country Services, March 16, Washington, DC. These are prepared by and for World Bank management and not by the independent evaluation group.