Sunday, April 05, 2015

On getting good information from peer review meetings


I read an interesting column in The Economist on getting good results from meetings. I quote extensively from that Free Exchange column:
In 1785 the Marquis de Condorcet, a French mathematician and philosopher, noted that if every voter in a group has a better-than-even chance of choosing the preferable of two options, and if voters do not influence each other, then large groups of voters are very likely to make the right choice. The bigger and more diverse the group the better: more people bring more information to the table which, if properly harnessed, leads to improved decisions. But ever bigger meetings imply more time spent in them: few workers would welcome that. And even with more people in the room, all manner of behavioural flaws stand in the way.
One problem that obstructs sensible decision-making is the “halo effect”—“owning the room” in the parlance of Silicon Valley ... A second problem is called “anchoring”. In a classic study Amos Tversky and Daniel Kahneman secretly fixed a roulette wheel to land on either 10 or 65. The researchers span the wheel before their subjects, who were then asked to guess the percentage of members of the United Nations that were in Africa. Participants were influenced by irrelevant information: the average guess after a spin of 10 was 25%; for a spin of 65, it was 45%. In meetings, anchoring leads to a first-mover advantage. Discussions will focus on the first suggestions (especially if early speakers benefit from a halo effect, too). Mr Kahneman recommends that to overcome this, every participant should write a brief summary of their position and circulate it prior to the discussion ...
Tom Gole of the Boston Consulting Group and Simon Quinn of Oxford University studied the votes of judges at international school debating tournaments. In these tournaments, three judges are randomly assigned to each debate (with some controls to ensure each panel has a mixture of experience, gender and so on). Judges watch the debate, then immediately vote on the winner before conferring. Crucially for the experiment, the participating teams are seeded. Using some whizzy statistical modelling, the authors find that if a judge disagrees with her fellow panellists in a given round, she is more likely to vote for the pre-tournament favourite—the higher-seeded team—in later debates. That suggests, say the authors, an unspoken desire to avoid “too much” disagreement. 
Career concerns may distort incentives even if votes are secret. In a 2007 paper Gilat Levy of the London School of Economics noted that observers can work out how likely it is that committee members have voted one way or another from ballot rules. If unanimity is required for a measure to pass, and it does, then outsiders will know with certainty how every member has voted. A simple majority rule means that observers can assign at least a 50% probability to any one committee member having given their assent; if a majority of two-thirds is required, the probability that any given member has supported the proposal goes up. The incentive to vote against controversial measures rises the greater the likelihood that each member will be blamed for its passage.
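
Condorcet's observation in the first quoted paragraph is easy to check numerically. Here is a minimal sketch (my own illustration in Python, not from the column), assuming each voter is independently right with the same probability p greater than one half and that a simple majority decides:

    # Probability that a simple majority of n independent voters is correct,
    # when each voter is correct with probability p (n odd, so there are no ties).
    from math import comb

    def majority_correct(n, p):
        return sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(n // 2 + 1, n + 1))

    for n in (1, 3, 11, 51, 101):
        print(n, round(majority_correct(n, p=0.6), 3))
    # With p = 0.6 the majority is right about 60% of the time with 1 voter,
    # roughly 75% of the time with 11 voters, and about 98% with 101 voters.

The catch, as the column notes, is that the result depends on voters being better than chance and not influencing one another, which is exactly what halo effects and anchoring undermine.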
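
Levy's point about ballot rules can also be made concrete with a little arithmetic. The sketch below (my own, with a hypothetical nine-member committee, not taken from the paper) computes the lower bound an outsider can place on the probability that any given member voted yes, knowing only that the measure passed and which rule was in force:

    # Lower bound on Pr(a given member voted yes | the measure passed),
    # for an outsider who treats the members symmetrically.
    from math import ceil

    def min_prob_voted_yes(committee_size, votes_required):
        return votes_required / committee_size

    n = 9  # a hypothetical nine-member committee
    rules = {
        "simple majority": n // 2 + 1,  # 5 of 9
        "two-thirds": ceil(2 * n / 3),  # 6 of 9
        "unanimity": n,                 # 9 of 9
    }
    for name, required in rules.items():
        print(f"{name}: at least {min_prob_voted_yes(n, required):.0%}")
    # simple majority: at least 56%; two-thirds: at least 67%; unanimity: 100%

The stricter the rule, the more each member's assent can be inferred from the outcome, and so the greater each member's personal exposure when a controversial measure passes.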
I have posted a number of times on peer review (check the "peer review" tag to find them all). What we did to get good results in the peer review of research proposals was:

  • To try to screen at a preproposal level to be sure we were asking reviewers to look at decent proposals (and to save proponents from writing detailed proposals with little chance of being funded).
  • To define carefully and describe the criteria for approval of proposals. (Note that the purpose of the program determines the criteria; our program was intended to build capacity to do research in developing countries, and sometimes a proposal strong on this criterion would be funded while a scientifically stronger proposal with less capacity building involved would not be.)
  • To sort proposals into small groups on related topics, and to send each proposal to a primary and a secondary reviewer, who would be asked to write a review and rate the proposal prior to the meeting.
  • The primary and secondary reviewers for all the proposals in a single group would form a panel. All the proposals would be sent to each member, and all members of the panel would be expected to read all of the proposals before the face-to-face panel meeting. (Normally something like 8 to 10 proposals would be reviewed by a single panel, and a single reviewer would not be the primary reviewer on more than two proposals, nor the secondary reviewer on more than two.)
  • In the meeting, the proposals would be discussed one at a time. The primary reviewer would speak first on a given proposal, and then the secondary reviewer would speak. There would then be a general discussion, after which the proposal would be rated by all of the reviewers individually and the average rating would be calculated. Proposals at this stage would also be judged by consensus to be either potentially worthy of funding or so flawed in some way as not to be worth funding.
  • When all the proposals had been reviewed, discussed and rated, they would be ranked from best to least useful by consensus of the panel. The average rating would be a guide to the ranking, but we discovered that sometimes, after comparing two proposals in discussion, a panel would judge the one with the lower average rating to be in fact more worthy of funding and thus rank it higher. (A rough sketch of this rating and ranking bookkeeping appears after this list.)
  • At all panel meetings we would have at least two people to judge the quality of the panel discussion -- one from our office and one from the National Academy of Sciences staff. If these observers determined that the review of one or more of the proposals was not adequate for some reason, additional reviews would be sought and a final decision would be taken by our staff.
  • We would then allocate our total budget among the top-rated projects (all, of course, judged worthy of funding by their respective panels), using the ratings as a guide along with the observations from the panel reviews.
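
To make the rating-and-ranking step concrete, here is a toy sketch of the bookkeeping (my own illustration, not a tool we actually used; the proposal ids, scale and scores are hypothetical): each panelist rates each proposal, the averages are computed, and the averages give a provisional ranking that the panel can then revise by consensus.

    # Toy bookkeeping for the panel step: individual ratings, their averages,
    # a consensus worthy/not-worthy call, and a provisional ranking.
    from statistics import mean

    ratings = {            # ratings on a 1-5 scale, one entry per panelist
        "P-01": [4, 5, 4, 4],
        "P-02": [3, 4, 3, 3],
        "P-03": [5, 4, 5, 4],
    }
    fundable = {"P-01": True, "P-02": False, "P-03": True}  # panel consensus

    averages = {pid: mean(scores) for pid, scores in ratings.items()}

    # Provisional ranking: fundable proposals first, then by average rating.
    # The panel's final consensus ranking may reorder close calls after discussion.
    provisional = sorted(averages, key=lambda pid: (not fundable[pid], -averages[pid]))
    for pid in provisional:
        status = "fundable" if fundable[pid] else "not fundable"
        print(pid, f"average={averages[pid]:.2f}", status)
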
The program was part of the foreign assistance effort of the U.S. Government, and the reviews were held in the Washington, D.C. area, which has a very large and strong scientific community. We found that local scientists were willing to do the reviews and come to peer review meetings without fees or reimbursement of costs if we drew them from local universities, laboratories and government agencies. Ours was a relatively small program, and reviewers seemed to enjoy the experience.

