Sunday, January 16, 2011

Some Thoughts about Evaluation Studies


I wonder whether we bring to evaluation all of the discipline that we have learned to use in project planning and management.

What are my credentials for writing this post? I am the "owner" of the Monitoring and Evaluation group on Zunia and a member of the Monitoring and Evaluation Professionals group on LinkedIn. I have served on an evaluation team for a World Bank project and have done two Project Completion Reports for World Bank projects; I have led evaluations of a couple of projects for USAID; and I have done an evaluation for infoDev. I also worked with a team on an evaluation of a large project supported by a public-private partnership. I have published on impact evaluation, and I have managed evaluations of programs for which I had oversight. Thus my experience is with a specific kind of evaluation: that of donor-funded, international development projects.

Donor agencies spend considerable effort on the design of projects. Those with which I have worked utilize the logical framework approach, which links project inputs, outputs, purpose, and the larger goal. The approach requires the designers to specify objectively verifiable indicators for inputs, outputs, purpose achievement, and goal achievement. It also requires an explicit statement of the key assumptions made in the project design.
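
To make that structure concrete, here is a minimal sketch of how a logframe's elements might be represented; the classes, field names, and the sample project are my own illustrative inventions, not any agency's actual template.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LogframeLevel:
    """One level of a logical framework: goal, purpose, output, or input."""
    description: str                                          # narrative summary of the level
    indicators: List[str] = field(default_factory=list)       # objectively verifiable indicators
    means_of_verification: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)      # key assumptions for reaching the next level up

@dataclass
class LogicalFramework:
    goal: LogframeLevel
    purpose: LogframeLevel
    outputs: List[LogframeLevel]
    inputs: List[LogframeLevel]

# A hypothetical agricultural project, with invented indicators and assumptions.
framework = LogicalFramework(
    goal=LogframeLevel(
        description="Improved rural household incomes",
        indicators=["Median household income in target districts"],
        assumptions=["Macroeconomic conditions remain stable"],
    ),
    purpose=LogframeLevel(
        description="Farmers adopt improved seed varieties",
        indicators=["Share of farmers planting improved varieties"],
        assumptions=["Seed supply chains continue to function"],
    ),
    outputs=[LogframeLevel(
        description="Extension agents trained",
        indicators=["Number of agents completing the training course"],
    )],
    inputs=[LogframeLevel(
        description="Training budget and staff time",
        indicators=["Funds disbursed on schedule"],
    )],
)
```

The point of writing the design down this way is that the indicators and assumptions become explicit objects that an evaluation can later check against.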

Evaluations, of course, are carefully designed when done well. Teams of evaluators are carefully selected. There is a plan and a budget. If peer review is involved, peers are carefully chosen. If, as is usually the case, interviews are involved, care is taken in the design of the interviews and the development of interview forms. There is planning as to how many project sites are to be visited and when those visits are to take place. Data analysis is planned in advance, and there is a clear understanding of how the information gathered is to be related to the questions that the evaluation is to address.

Still, I wonder whether evaluation teams assume too much about the purposes of the evaluations they undertake. Do they draw on the theory behind the logical framework to organize their own evaluation work?

Let me give an example about examining the assumptions made prior to an evaluation. I well recall a major evaluation I led of a program working in Africa, Asia, and Latin America. The program staff was organized into three relatively independent divisions: Africa, Asia, and Latin America. We in turn divided the evaluation team into three site-visit teams of regional experts -- one to make site visits in Africa, another in Asia, and the third in Latin America. Several months later we discovered that the reported quality of the program in Africa differed from that reported in Asia, and both differed from the reported quality in Latin America. We were never able to determine how much of the difference was due to the difficulty of working in each of the three regions, how much to random variation in the small baskets of project activities chosen in each region (either in the program itself or in the evaluation teams' sampling), how much to differences in the personnel of the project teams working on the different continents, and how much to variation among the three evaluation site-visit teams. Clearly we made the (unjustified) assumption that the evaluators we chose to work in Africa, Asia, and Latin America would all apply the same standards and would not themselves introduce a source of variation in the results they reported. Had we documented that assumption up front, we might well have chosen a different evaluation method. The reporting of the evaluation's results turned out to be especially difficult and perhaps not as useful as it might have been.
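
Here is a minimal simulation sketch of that confounding (all effect sizes are invented): because each site-visit team worked in exactly one region, the team effect and the region effect enter every reported score as a single sum, so no analysis of the reported scores alone can apportion the difference between them. A design that crossed teams with regions, or calibrated the teams against common sites, would have broken the confound.

```python
import random

random.seed(0)

# Hypothetical effects, purely for illustration.
REGION_EFFECT = {"Africa": -0.5, "Asia": 0.0, "Latin America": 0.4}   # difficulty of working in each region
TEAM_EFFECT = {"Team A": 0.3, "Team B": -0.2, "Team C": 0.1}          # scoring tendency of each site-visit team
TEAM_FOR_REGION = {"Africa": "Team A", "Asia": "Team B", "Latin America": "Team C"}

def reported_score(region: str) -> float:
    """Score for one site visit: underlying quality plus region effect plus rater effect plus noise."""
    underlying_quality = 3.0
    team = TEAM_FOR_REGION[region]
    return underlying_quality + REGION_EFFECT[region] + TEAM_EFFECT[team] + random.gauss(0, 0.3)

for region in REGION_EFFECT:
    scores = [reported_score(region) for _ in range(10)]
    mean = sum(scores) / len(scores)
    # Each regional mean reflects REGION_EFFECT + TEAM_EFFECT jointly; the two
    # sources of variation cannot be separated from these data after the fact.
    print(f"{region}: mean reported score {mean:.2f}")
```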

I wonder whether evaluation teams really ask what the purposes of an evaluation are. Evaluations sometimes seem to be devised simply to address whether the original project indicator benchmarks have been met. It seems to me that more can often be accomplished by setting very ambitious benchmarks that are unlikely to be met but which motivate project personnel than by setting modest benchmarks that are likely to be exceeded. On the other hand, a project manager who will be held responsible for achieving the original benchmarks is likely to opt for modest benchmarks rather than to maximize project achievement. It seems to me that evaluations should focus on what was accomplished versus what might have been accomplished.

I worry about the limits of rationality. Project planners have limited rationality, and I strongly believe that any project evaluation should carefully consider the unplanned effects of the project, whether they be positive or negative. Evaluations therefore should begin with a review of what actually happened before assessing whether the project was sufficiently useful. Think about all the projects in Haiti that simply stopped when the earthquake killed a couple of hundred thousand people and halted most activity in the country. While seismologists may have predicted the quake, one could not have expected those in other fields to understand the risk, nor could anyone have predicted just how severe the quake's impact would be.

Evaluators too have limited rationality, and will not adequately predict all the events and conditions that might influence their work. Similarly, they will probably not fully understand how the social, political, economic, cultural and physical environmental conditions influenced the project that they evaluate. It is important to try to make assessments of these conditions, but it is also important to recognize that the assessments will be limited in their success.

More fundamentally, what is the "real purpose" of the evaluation, as seen by its various stakeholders? Is it merely a bureaucratic requirement, to be accomplished at as low a cost in key resources as possible? There certainly are such evaluations.

I recall an occasion on which word came down that senior officials of my organization were searching for "reliable people" to conduct a series of impact evaluations. Were the selected people to be relied upon to provide information that would make the organization look good, or to make the senior officials look good? It was clear that the senior officials were not looking for the people most gifted at figuring out what was really happening and making it clear to others.

If the real purpose of the evaluation is to obtain actionable information for project and program managers, what kind of information do they want and need? What kind of information is likely to really be actionable? I wonder how many evaluation teams really address these questions.

The previous paragraphs have suggested that evaluations serve the interests of stakeholders, and indeed it is useful for evaluators to know about those stakeholders. But what about the entire community of people supporting and implementing donor assistance programs, the collaborators with those programs, and the people who should be the beneficiaries of those programs? Should not all evaluations add to the growing body of credible evidence as to what works and what does not?


In an international development program, evaluations are likely to cross cultural boundaries. In those circumstances, the cultural assumptions the evaluation team brings to its work might best be made explicit so that the evaluation can be fully understood. I don't recall ever reading an evaluation report that documented the assumptions of the evaluation team.

By what means is the evaluation itself to be evaluated? What are the verifiable indices of its quality? Of its accuracy? Of its precision? Of its value to decision makers?


How should the donor community allocate resources among evaluation studies? Should each development project have a specific portion of its budget allocated to monitoring and evaluation? Should there be an algorithm for setting the M&E portion of the project budget, one that recognizes economies of scale and the differences in the difficulty of evaluating different types of projects in different environments? Should agencies simply allocate a fixed amount of money to each project? The set of possible allocations of resources for M&E in a large donor organization such as the World Bank or USAID is extremely large; the probability that any one of these allocations, chosen at random, is optimal is very small. Leaving the designers of each project to decide independently how much to spend on that project's M&E is almost surely suboptimal.
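
To make the question concrete, here is a purely illustrative allocation rule; every parameter is an assumption of mine, not any donor's policy. It combines a floor for every project, a share of the project budget that grows less than proportionally to capture economies of scale, and a multiplier for how difficult the project is to evaluate.

```python
def me_budget(project_budget: float,
              difficulty: float = 1.0,   # 1.0 = typical; >1 for settings that are harder to evaluate (assumed)
              floor: float = 50_000,     # minimum M&E spend per project (assumed)
              rate: float = 0.08,        # M&E share of budget at the reference scale (assumed)
              elasticity: float = 0.85   # <1 captures economies of scale (assumed)
              ) -> float:
    """Return a suggested M&E allocation for one project under this illustrative rule."""
    reference_scale = 1_000_000  # normalize so that `rate` is the share for a $1M project
    scaled = rate * reference_scale * (project_budget / reference_scale) ** elasticity
    return max(floor, scaled * difficulty)

# A $1M project in a typical setting, and a $20M project in a harder-to-evaluate one.
print(round(me_budget(1_000_000)))                    # about 80,000, i.e. 8% of the budget
print(round(me_budget(20_000_000, difficulty=1.5)))   # larger in absolute terms, a smaller share of the budget
```

Whether the floor, the share, or the elasticity are the right levers is exactly the kind of question an organizational M&E strategy would have to settle.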

Perhaps there should be at least a minimum investment in monitoring and evaluation for every project, if only to provide information for mid-course corrections and to keep people honest. Indeed, the intellectual rigor involved in defining goals and objectives, their objectively verifiable indicators, and the key assumptions is itself valuable.

On the other hand, it seems clear that proportionately more resources should be devoted to evaluating projects that explore new technologies, new institutional approaches, or other elements that may serve as prototypes for future projects than to "me too" projects that replicate many others using well-established approaches.

The essential point is that an organizational evaluation strategy should allocate resources among evaluations according to careful thinking about what information the organization needs in order to best manage its future activities. In allocating resources, money is certainly of concern, but so too are expertise and management attention.
