RE: How to evaluate science, technology and innovation in a development context? | Eval Forward

Dear Svetlana and the CAS Team,

We appreciate the Technical Note “Bibliometric Analysis to Evaluate Quality of Science in the Context of One CGIAR” (CAS Technical Note). We at the Fund International Agricultural Research (FIA) of GIZ have recently commissioned a bibliometric study by Science Metrix (see Evans, 2021) and welcome the increasing value that the CGIAR System is placing on a rigorous evaluation of the science quality of its research. We view science quality as a necessary prerequisite on the way towards development impacts, as innovations are first developed and tested by CGIAR scientists. In the following paragraphs we would like to respond to your questions from a donor point of view.

1. What do you think are the challenges in evaluating quality of science and research?

As funders we expect our commissioned research projects to perform their work in line with the four key elements of Quality of Research for Development (QoR4D): 1) Relevance, 2) Scientific Credibility, 3) Legitimacy, 4) Effectiveness. These criteria apply mostly to the process of “doing” research and science. However, we are also interested in results (outputs, outcomes, impacts). When it comes to quality of science, we find bibliometric analyses very useful to determine the scientific impact of our funded work in relevant scientific fields. The impact within a scientific field is still measured best by the number of citations that an article or book chapter receives. Within the field of science, peer-reviewed publications (and the citations these articles receive) are considered the “gold standard” in terms of impacts. Yet there are some challenges associated with measuring scientific impact:

Long time periods. A big challenge, which the 2020 SOAR Report (Alston et al. 2020) pointed out, is that “agricultural research is slow magic” – it can take decades for results from investments in agricultural research to become visible, but decision-makers need to demonstrate results quickly and assess a project's expected return on investment to justify an increase in funding. As we also learned from the CAS Technical Note, a robust measurement of science impact in terms of bibliometrics is only possible about 5 years after a research project or portfolio has been completed. The peer-review cycle, individual journal requirements and citation status, which often reflects the size of the readership, mean it takes time until the impact becomes evident, especially if the work is novel or highly innovative. The long time horizon poses challenges as we can’t use this information directly for programming.

Altmetrics. We fully understand that bibliometrics are an imperfect measure of true scientific impact. Some research can be highly influential and reach a large audience via alternative channels including Twitter, blogs, or grey literature. This is captured in altmetrics, but it is difficult to combine bibliometrics and altmetrics to get a full picture of scientific impact.

Cost effectiveness. As Science Metrix points out, particularly for papers with external co-authors, the fraction of support attributable to each funding source is not easily determined. Donors are often interested in “following the money,” but in measuring science quality, the direct investment is not so easily attributable to outputs, and it is the longer-term impacts of applying and scaling the published scientific contributions that end up bringing the real “bang for the buck”. In our own science quality study (see Evans 2021), we also assessed efficiency in terms of cost-benefit. Specifically, we assessed the number of publications per million Euro invested. We then compared the cost-effectiveness of our funded research projects to that of comparable EU-financed projects and discovered that our research projects were more efficient. Computing these cost-effectiveness measures comes with a host of limitations. Nevertheless, we were happy to see that such an indicator is also proposed as a “Level 2 Priority Indicator” (E10) in the CAS Technical Note.
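To make the "publications per million Euro invested" indicator concrete, here is a minimal sketch in Python. The function name and all figures are hypothetical and chosen only for illustration; they are not taken from our study, and the sketch ignores the attribution problem for co-funded papers described above.

```python
def publications_per_million_eur(publications: int, investment_eur: float) -> float:
    """Return the number of publications per one million Euro invested."""
    if investment_eur <= 0:
        raise ValueError("investment_eur must be positive")
    return publications / (investment_eur / 1_000_000)


# Hypothetical portfolios (illustrative numbers only, not study results).
portfolio_a = publications_per_million_eur(publications=120, investment_eur=8_000_000)
portfolio_b = publications_per_million_eur(publications=90, investment_eur=9_000_000)

print(f"Portfolio A: {portfolio_a:.1f} publications per million EUR")
print(f"Portfolio B: {portfolio_b:.1f} publications per million EUR")
```

A comparison of this kind is only meaningful if both portfolios count publications and investments over the same period and in the same way.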

1a. What evaluation criteria have you used or are best to evaluate interventions at the nexus of science, research, innovation and development? Why?

Impact Focus. In our evaluation of science quality, we focused particularly on the OECD DAC evaluation criterion “impact” (what does the research achieve? What contribution does it make?) and to a lesser extent on “efficiency” (how well were resources used? How much was achieved for the amount spent?). Both evaluation criteria, impact and efficiency, are of particular relevance for us as funders in terms of accountability and transparency, to demonstrate that tax-payer money is used wisely.

In our evaluation (see Evans 2021), Science Metrix focused mostly on bibliometric indicators, comparing publications of our funded projects with those of other international agricultural research (outside the CGIAR system). Contributions to the SDGs, based on key-word searches and content analysis, were also part of the analysis, in order to capture the extent to which cross-cutting issues such as gender, human rights, sustainability, resilience as well as climate change mitigation and adaptation were addressed in the peer-reviewed publications. Most bibliometric indicators sought to assess the impact that publications have made, via indicators such as average of relative citations (ARC), highly cited publications (HCP), and citation distribution index (CDI).
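As a rough illustration of how two of these indicators are commonly computed, the Python sketch below normalises each paper's citations by a world average for its field and publication year (for ARC) and counts the share of papers at or above a world top-citation threshold (for HCP). The definitions are simplified assumptions, not necessarily the exact method used in our study, and every name, baseline and citation count is hypothetical. CDI is left out because its construction is more involved.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Paper:
    citations: int
    field: str
    year: int


def relative_citation(paper: Paper, baselines: dict) -> float:
    """Citations divided by the world average for the same field and year."""
    return paper.citations / baselines[(paper.field, paper.year)]


def average_relative_citations(papers: list, baselines: dict) -> float:
    """ARC: mean of the relative citation scores; 1.0 equals the world average."""
    return mean(relative_citation(p, baselines) for p in papers)


def highly_cited_share(papers: list, thresholds: dict) -> float:
    """HCP: share of papers at or above the world threshold for their field and year."""
    hits = sum(1 for p in papers if p.citations >= thresholds[(p.field, p.year)])
    return hits / len(papers)


# Hypothetical data, for illustration only.
papers = [
    Paper(12, "agronomy", 2015),
    Paper(3, "agronomy", 2015),
    Paper(20, "soil science", 2016),
    Paper(5, "soil science", 2016),
]
# Assumed world average citations per (field, year).
baselines = {("agronomy", 2015): 6.0, ("soil science", 2016): 10.0}
# Assumed citation counts needed to enter the world top 10% per (field, year).
thresholds = {("agronomy", 2015): 10, ("soil science", 2016): 25}

arc = average_relative_citations(papers, baselines)
hcp = highly_cited_share(papers, thresholds)
print(f"ARC = {arc:.2f}, HCP share = {hcp:.0%}")
```

Normalising by field and year is what makes portfolios comparable across disciplines with very different citation habits, which is why a raw citation count is usually not reported on its own.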

1b. Could a designated quality of science (QoS) evaluation criterion help capture the scientific aspects used in research and development?

Yes, indeed, a designated quality of science (QoS) evaluation criterion as outlined in the ISDC technical note “Quality of Research for Development in the CGIAR Context” may be highly appropriate to evaluate research within the CGIAR framework. Reflecting CGIAR’s comparative advantage and primary focus, evaluations should cover science quality and not only development indicators, which are often outside the sphere of influence and mandate of research institutes. The sub-components of the QoS evaluation criterion (Relevance, Scientific credibility, Legitimacy, Effectiveness) are important to measure the quality of “doing science”. Nevertheless, we would highlight that such a criterion should always be accompanied by an evaluation of the OECD DAC criteria of impact and efficiency, to capture not just the “doing” but also the “results” of the research for development enterprise.

2. What are the methods and indicators that could work in the evaluation of science and research?

QoS. Evaluating the quality of science (QoS) criterion and its sub-components (Relevance, Scientific credibility, Legitimacy, Effectiveness) requires a mixed-methods approach. As was done in the CGIAR Research Programme (CRP) evaluations, the focus will be on inputs (composition of research staff, collaborations, money, management infrastructure, policies, etc.). Qualitative research methods will be most appropriate when it comes to evaluating how relevant a certain research question is or whether the research process is perceived as fair and ethical. This may require conducting interviews with key stakeholders and/or carrying out surveys to gather feedback and insights into enabling conditions and barriers to effective and sustainable impact.

SI. In contrast, the evaluation of science impact (SI) will require quantitative analysis using sophisticated bibliometric methods and measures as outlined by Science Metrix in the CAS Technical Note. We consider all “Level 1 Priority Indicators” (CAS Technical Note, Table 6) to be highly relevant science impact indicators that we hope will be computed when evaluating the scientific impact of the current round of OneCGIAR Initiatives.

3. Have you seen monitoring, evaluation and learning (MEL) practices that could facilitate evaluations of science, technology and innovation?

Ongoing monitoring of Quality of Science (QoS) as well as Science Impact (SI) will be difficult. From our perspective, both criteria need to be assessed using an evaluation format (a study at a certain point in time). Our own science quality study (see Evans 2021) is an example of how SI could be assessed using rigorous bibliometric methods and measures. However, the purpose of our science quality study was to investigate the reach and impact of our funded research work on the scientific field of developmental agriculture. The study served the purpose of accountability and transparency. We did not use the findings for the “L” (learning) dimension of MEL. A true qualitative or mixed-methods QoS study would be a more natural fit when the goal is to derive lessons that can be used for adaptive management and steering purposes. The CGIAR Research Programme (CRP) evaluations provide a good example of how results from an evaluation could be used to improve “doing science”.



Raphael Nawrotzki and Hanna Ewell (M&E Unit, FIA, GIZ)



Alston, J., Pardey, P. G., & Rao, X. (2020). The payoff to investing in CGIAR research. SOAR Foundation.

Evans, I. (2021). Helping you know – and show – the ROI of the research you fund. Elsevier Connect.