Svetlana I Negroustoueva

Lead, Evaluation Function

Svetlana Negroustoueva, the CGIAR CAS Evaluation Senior Manager, leads technical and operational support to develop and execute the CGIAR’s multi-year independent evaluation plan. Svetlana is a PMP Certified professional with over 15 years of experience designing and conducting evaluations, assessments, monitoring and research activities, including quantitative and qualitative data collection and analyses. She works at the intersection of sustainable landscapes, energy, health, food security and social inclusion, for projects across funders and implementing entities. She has served at the African Development Bank, the World Bank, the GEF and the Climate Investment Funds in positions concerned with independent and demand-driven evaluations. Furthermore, Svetlana has been an independent evaluator herself, leading and participating in evaluation teams for a variety of clients during consultancy assignments. Svetlana has made her mark in many ways, notably in the domain of gender and evaluation; she is the co-chair of EvalGender+, the global partnership to promote the demand, supply and use of Equity Focused and Gender Responsive Evaluations. Svetlana is a Russian and US national, and holds a Master’s degree in Public Affairs from the University of Texas at Austin in the US, and an Advanced BA in Public Administration and Social Research from Lomonosov Moscow State University (MSU).

My contributions

    • Thank you to all the contributors, both those new to the Guidelines document and those already familiar with it, or at least with its various supporting knowledge products. Here is a summary of the discussion, structured by core themes.

      Reflections on the Guidelines: content

      Generally, the majority of participants agreed that the new Guidelines offer some solutions to evaluate quality of science in an R4D context. In particular, contributors used terms such as well-researched, useful, clear, adaptable and flexible. A couple of contributors emphasized the importance of flexibility, of seeking a middle ground, and of applying the Guidelines to other organizations. Another contributor praised the Guidelines for providing an interesting conceptual framework, a flexible guide, and a compendium of methods and questions that would also be useful in other evaluation contexts.

      The value of a designated evaluation criterion of Quality of Science 

      There was also consensus that the four QoS evaluation dimensions (design, input, process and output) were clear and useful with well-defined indicators, especially when using a mixed methods approach. One contributor noted that the dimensions capture a more exploratory, less standardized way of doing evaluations at the R4D nexus, enriching the depth of evaluative inquiry. Another contributor emphasised the building and leveraging of partnerships under the ‘processes’ dimension. A further contributor was excited about using the framework to design a bespoke evaluation system for her department. In addition, the three key evaluation questions recommended to evaluate the QoS were considered appropriate for R4D projects.

      In the context of the ongoing GENDER Platform (of CGIAR) evaluation, a contributor noted the usefulness of the Guidelines as a toolbox in an Agricultural Research For Development (AR4D) context to situate QoS while assessing the key questions following five DAC evaluation criteria: relevance, effectiveness, efficiency, coherence, and sustainability. A key lesson from the evaluation team in applying the Guidelines was that the team straddled both the evaluator’s lens and the researcher’s lens, with subject matter experts helping to unpack the central evaluation questions mapped along the four QoS evaluation dimensions.

      Several contributors requested clarity on whether the Guidelines were useful for evaluating development projects. They were developed for evaluating R4D, on the premise that co-designed research would be implemented in partnership with development stakeholders who would then be in a position to scale innovations for development impact. While framed around R4D interventions, we consider that the Guidelines are flexible enough to be adapted for evaluating development projects with science or research elements: the four dimensions for evaluating QoS would allow scope to bring them out. A recent CGIAR workshop discussed the retroactive application of the Guidelines in the evaluation of development interventions by means of two specific case studies: AVACLIM, a project implemented by FAO, and the Feed-the-Future AVCD-Kenya project led by ILRI. Both case studies showcased the wide applicability of the Guidelines.

      Several contributors also raised the importance of evaluation of impact. While the scope of work by the CGIAR’s Independent Evaluation Function would not evaluate impact, the Guidelines consider the possibility (see Figure 6) of assessing impact towards the SDGs and beyond. Likewise, in other contexts and organizations there may be wider windows for integrating a focus on impacts. The new Guidelines could be deployed 3-5 years after the finalization of an intervention to assess the progress made in uptake of technologies.

      Echoing the 2022 discussion, some contributions highlighted an inclusive or beneficiary focus in evaluations, namely an emphasis on communities who might also be important stakeholders in research and innovation. In a development or R4D intervention, a stakeholder analysis permits identifying beneficiaries as key stakeholders, and the use of the ‘process’ and ‘outputs’ dimensions would allow nuancing their participation in, and benefits from, successful research and development activities.

      Facilitating Learning from Implementation and Uptake of the Guidelines

      Contributors raised issues related to the roll-out or use of the Guidelines, including:

      • Whether the single evaluation criterion of quality of science sufficiently captures the essence of research and development;
      • The usefulness of further clarifying the differences between process and performance evaluations;
      • The need to include assumptions, specifically those that have to hold for the outputs to be taken up by the client;
      • The importance of internal and external coherence;
      • The need to define appropriate inclusion and exclusion criteria when designing research evaluations;
      • The importance of defining the research context, which is given priority in the revised IDRC RQ+.


      Several suggestions were made on how CGIAR can support the roll-out of the Guidelines with the evaluation community and like-minded organizations. Useful suggestions were also made about the need to build capacity to use the new Guidelines, including training sessions and workshops, online resources (webinars, collaborative platforms), mentoring partners, and piloting the Guidelines in case studies and upcoming evaluations. In particular, capacity development of relevant stakeholders to understand and use the Guidelines would be appropriate to support wider use and further engagement with the evaluation community.

      One contributor suggested conducting a meta-evaluation (perhaps a synthesis) of the usefulness of the Guidelines once CGIAR has used them to evaluate the portfolio of current projects. Remarkably, this is currently being done retrospectively with the previous portfolio of 12 major programs (implemented from 2012 to 2021), with notable improvements in clarity and definition of the outcomes. Further application of the Guidelines in process and performance evaluations across different contexts and portfolios will reveal more insights to further strengthen and refine this tool.

    • Dear colleagues,

      We were very excited to hear from 23 participants from a range of backgrounds. The richness of discussion by a diverse range of experts, including non-evaluators, highlights an overall agreement on the importance of framing and using context-specific evaluation criteria for contextualizing evaluations of science, technology, and innovation. Below please find a summary of the points made, with an invitation to read the authors’ actual contributions if you missed them.

      Frame of reference

      The following frameworks were introduced: the Quality of Research for Development (QoR4D) frame of reference, the Research Excellence Framework (REF) and the RQ+ Assessment Framework.

      The main discussion touched upon the Quality of Research for Development (QoR4D) frame of reference elements, and directly or indirectly linked to the evaluation criteria: relevance, legitimacy, effectiveness and scientific credibility.

      For Serdar Bayryyev and Lennart Raetzell, the context in which products are used was key when determining their relevance. When assessing effectiveness (or quality), Serdar suggested (1) assessing the influence of the activities and the extent to which the science, innovation and research products have influenced policies, approaches, or processes and (2) assessing the degree of “networking”, i.e., the degree to which the researchers and scientific institutions have interacted with all relevant stakeholders. Lennart Raetzell shared her recent experience with a thematic impact evaluation for VLIR-UOS on pathways to research uptake, mainly in the field of agriculture.

      Nanae Yabuki and Serdar Bayryyev agreed on the importance of assessing the transformational nature of research, to determine whether research activities cause truly transformational change, or at least trigger important policy discourse on moving towards such transformational change. The relevance of science, technology, and innovations (STI) is context specific, as is how STI triggers a transformational change.

      Sonal D Zaveri asserted that Southern researchers agree about the need for research to be relevant to topical concerns, to the users of research and to the communities where change is sought. Uptake and influence are as critically important as the quality of research in a development context, hence evaluations must measure what is important for communities and people. Many evaluations are designed at a distance and evaluation choices are privileged, due to power of expertise, position, or resources. It would be difficult to accept the quality of science bereft of any values related to human rights, inclusion and equity. Unless the findings are used and owned by people, and especially those with little voice in the program, it cannot be claimed that evaluations have led to public good or been used for the benefit of all peoples. On that topic, Richard Tinsley highlighted the importance of considering the final beneficiaries of the scientific results. He stressed a need to have clarity about CGIAR’s primary clients (the host-country National Agriculture Research Systems, or NARS) and the final beneficiaries (a multitude of normally unnamed smallholder farmers), who are still one step removed from CGIAR’s clients (NARS).

      From a donor point of view, Raphael Nawrotzki finds that the sub-components of the QoR4D frame of reference are important to measure the quality of “doing science” rather than the results (output, outcome, impacts). Hence the need to use the OECD DAC criteria of impact and efficiency to capture not only the “doing” (process) but also the “results” of the research for development enterprise. He asserted that the focus for a funder is on impact (what does the research achieve? What contribution does it make?) and, to a lesser extent, efficiency (how well were resources used? How much was achieved for the amount spent?).

      For Nanae Yabuki, using scientific evidence to enhance the impact of interventions reflected FAO’s mandate. For assessing these aspects, the ‘utility’ of the research findings is more relevant than their ‘significance’. Hence the need to come up with appropriate criteria for each evaluation.

      Norbert Tchouaffe brought in Theory of Change as a tool to evaluate the impact of a science-policy interface network on a particular society, based on five determinants (scorecards on awareness, know-how, attitude, participation, and self-evaluation).


      The discussants agreed about the importance of using a mixed-method approach to combine both qualitative and quantitative indicators. According to Raphael Nawrotzki, a mixed-method approach is needed especially when evaluating the relevance of the research questions and the fairness of the process.

      Quantitative methods: strengths and limitations

      Among quantitative methods, the use of bibliometric analysis was mentioned for:

      • evaluating science impact, i.e., impact within a scientific field, still measured best by the number of citations that an article or book chapter receives (Raphael Nawrotzki);
      • assessing the legitimacy of the research findings and the credibility of the knowledge products (Nanae Yabuki);
      • providing a good indication of the quality of science (QoS), since published papers have already passed a high-quality threshold as they have been peer-reviewed by experienced scientists (Jillian Lenne);
      • providing an important overview of the efforts made, and the scientific outreach achieved (Paul Engel).

      Thinking about QoS and innovation evaluation, Rachid Serraj illustrated the use of bibliometric and citation indices from the Web of Science (WoS).

      Etienne Vignola-Gagné, co-author of the Technical Note, highlighted the new and broader range of dimensions of bibliometric indicators (namely cross-disciplinarity, gender equity, pre-printing as an open science practice, or the prevalence of complex multi-national collaborations) that are useful for assessing relevance and legitimacy. Some bibliometric indicators can also be used as process or even input indicators, instead of their traditional usage as output indicators of effectiveness. Bibliometrics can be used to monitor whether cross-disciplinary research programs are indeed contributing to increased disciplinary integration in daily research practice, considering that project teams and funders often underestimate the complexity of such research proposals.

      Valentina de Col used bibliometrics (e.g., indexing in the Web of Science Core Collection, percentage of articles in Open Access, ranking of journals in quartiles, Altmetrics) on published journal articles, alongside Outcome Impact Case Reports (OICRs), to describe the contribution of CGIAR research to outcomes and impact. Raphael Nawrotzki suggested other related bibliometric indicators: (a) contribution to SDGs; (b) average of relative citations; (c) highly cited publications; (d) citation distribution index.

      Keith Child and Serdar Bayryyev noted limitations of bibliometric analysis. For example, not all science, innovation and research products are included and properly recorded in bibliographic databases, and some are not published at all, hence not all products can be assessed. Furthermore, calculating the average number of citations can introduce biases: (1) exaggerated attention to a specific author; (2) some authors may deliberately exclude certain reference materials from their publications. Raphael Nawrotzki noted limitations specifically associated with measuring scientific impact through bibliometrics: (1) long time periods (it can take decades for results from investments in agricultural research to become visible; a robust measurement of science impact in terms of bibliometrics is only possible about 5 years after a research project or portfolio has been completed); (2) altmetrics (it is difficult to combine bibliometrics and altmetrics to get a full picture of scientific impact); (3) cost-effectiveness (the fraction of support attributable to each funding source is not easily determined; computing cost-effectiveness measures comes with a host of limitations). Paul Engel extended the list of limitations of bibliometrics: they provide very little information on policy outreach, contextual relevance, sustainability, innovation and scaling of the contributions generated through research partnerships. Ola Ogunyinka asserted that the ultimate beneficiaries of the CGIAR (smallholder farmers and the national systems) are far removed (access, funds, weak structures, etc.) from the journals considered in the bibliometric analyses.

      Overall, Jillian Lenne and Raphael Nawrotzki agreed on the value of using altmetrics.

      Graham Thiele suggested the use of social network analysis (SNA) of publications, to explore who collaborates on them and what their social and organizational context is, as a complement to bibliometric analysis, particularly for the legitimacy dimension. Valentina de Col used SNA and impact network analysis (INA) to investigate the research collaboration networks of two CGIAR research programs.

      Finally, Graham Thiele warned against the risk of relying on state-of-the-art methods and the increased precision of bibliometric analysis (currently available and produced on a continuous basis) at the expense of missing the rounded picture that other studies, such as outcome case studies and impact studies, provide. This point is supported by Paul Engel, based on his experience evaluating Quality of Science in CGIAR research programs.

      Guy Poppy introduced the Research Excellence Framework (REF), which, alongside assessing research outputs, also evaluates impact case studies and the research environment, producing a blended score in which outputs have the biggest weighting but the weighting of impact is growing.

      Qualitative methods: strengths and limitations

      Using qualitative methods, along with bibliometrics and altmetrics, is essential for a broader picture when assessing quality of science.

      Qualitative assessments can be done through interviews and/or surveys. With regard to measuring impact, Valeria Pesce highlighted that qualitative indicators are often based on either interviews or reports, and that making sense of the narrative is not easy. She echoed a post from Claudio Proietti, who introduced ImpresS.

      Ibtissem Jouini highlighted the trustworthiness of evidence synthesis, which is nonetheless challenged by the variety of evidence that can be found, and of evaluation criteria, approaches, focus, contexts, etc.

      Limitations of qualitative methods were also noted by Jillian Lenne and Keith Child: qualitative assessments require the evaluator to make subjective judgments.

      Considerations for participatory approaches and methods were highlighted by Sonal D Zaveri: the difference between "access" and "participation", passing through the concept of power (hidden or explicit). Women, as traditional bearers of local and indigenous knowledge, find themselves cut off from the networked society, where information, communication, and knowledge are ‘tradeable goods’.

      Valeria Pesce, Etienne Vignola-Gagné and Valentina de Col discussed actual tools and ways to address the challenges of both qualitative and quantitative indicators: IT tools that make it possible to classify content (sometimes automatically) against selected concepts and to identify patterns, word/concept frequency, clusters of concepts, etc., using text mining and machine learning techniques, sometimes even starting directly from video and audio files.

      For narrative analysis: ATLAS.ti, MAXQDA and NVivo (powerful narrative analysis); Cynefin Sensemaker and Sprockler (design and collection functionalities); NarraFirma (strong conceptual backbone, helping with the design of the narrative inquiry and supporting a participatory analysis process).

      Conclusion and next steps:

      Even without standardization of methods, efforts should be made to design the STI evaluations so that the evaluation results can be used, to the extent possible, at the institutional level, for instance for higher strategic and programmatic planning (Nanae Yabuki, FAO), but also at the level of those who are affected and impacted (Sonal D Zaveri, and others).

      Valentina de Col highlighted the value of consolidating and adopting a standardised approach to measure quality of science (QoS) within an organization like CGIAR, to help measure better the outcomes, assess effectiveness, improve data quality, identify gaps, and aggregate data across CGIAR centres.

      The worthiness of this discussion for learning is undeniable. In CGIAR, at CAS/Evaluation, we have started developing the guidelines to operationalize the quality of science evaluation criterion in the revised CGIAR Evaluation Policy. Let us know if you are interested in further engagement.

      Referenced and Suggested readings

      Alston, J., Pardey, P. G., & Rao, X. (2020) The payoff to investing in CGIAR research. SOAR Foundation.

      Belcher, B. M., Rasmussen, K. E., Kemshaw, M. R., & Zornes, D. A. (2016). Defining and assessing research quality in a transdisciplinary context. Research Evaluation, 25(1), 1-17.
      DOI: 10.1093/reseval/rvv025

      Norbert F. Tchiadjé, Michel Tchotsoua, Mathias Fonteh, Martin Tchamba (2021). Ecological engineering to mitigate eutrophication in the flooding zone of River Nyong, Cameroon, Pages 613-633:

      Chambers, R. (1997). Whose reality counts (Vol. 25). London: Intermediate technology publications.

      Evans, I. (2021). Helping you know – and show – the ROI of the research you fund. Elsevier Connect.

      Holderness, M., Howard, J., Jouini, I., Templeton, D., Iglesias, C., Molden, D., & Maxted, N. (2021). Synthesis of Learning from a Decade of CGIAR Research Programs.

      IDRC (2017) Towards Research Excellence for Development: The Research Quality Plus Assessment Instrument. Ottawa, Canada.

      Lebel, J., & McLean, R. (2018). A better measure of research from the global south. Nature, 559, July 2018.

      McLean R. K. D. and Sen K. (2019) Making a difference in the real world? A meta-analysis of the quality of use-oriented research using the Research Quality Plus approach. Research Evaluation 28: 123-135.

      Ofir, Z., T. Schwandt, C. Duggan, and R. McLean (2016). RQ+ Research Quality Plus. A Holistic Approach to Evaluating Research. Ottawa: International Development Research Centre (IDRC).

      Runzel M., Sarfatti P. and Negroustoueva S. (2021) Evaluating quality of science in CGIAR research programs: Use of bibliometrics. Outlook on Agriculture 50: 130-140.

      Schneider, F., Buser, T., Keller, R., Tribaldos, T., & Rist, S. (2019). Research funding programmes aiming for societal transformations: Ten key stages. Science and Public Policy, 46(3), pp. 463–478. doi:10.1093/scipol/scy074.

      Singh, S., Dubey, P., Rastogi, A. and Vail, D. (2013) Excellence in the context of use-inspired research: Perspectives of the global South.

      Slafer G. and Savin R. (2020) Should the impact factor of the year of publication or the last available one be used when evaluating scientists? Spanish Journal of Agricultural Research 18: 10pgs.

      VLIR-UOS (2019). Thematic Evaluation of Departmental Projects: Creating the Conditions for Impact.

      Zaveri, Sonal (2019). “Making evaluation matter: Capturing multiple realities and voices for sustainable development” contributor to the journal World Development - Symposium on RCTs in Development and Poverty Alleviation.

      Zaveri, Sonal (2021) with Silvia Mulder and P Bilella, “To Be or Not to Be an Evaluator for Transformational Change: Perspectives from the Global South” in Transformational Evaluation: For the Global Crisis of our Times edited by Rob Van De Berg, Cristina Magro and Marie Helene Adrian.

      Zaveri, Sonal. 2020. ‘Gender and Equity in Openness: Forgotten Spaces’. In Making Open Development Inclusive: Lessons from IDRC Research, edited by Matthew L. Smith, Ruhiya Kristine Seward, and Robin Mansell. Cambridge, Massachusetts: The MIT Press.


    • Dear all, 

      It is great to see such rich and insightful contributions. There seems to be a wide consensus on the importance of using mixed methods for evaluating science and of relying on the QoR4D frame of reference. I really enjoyed reading your opinions and what you found challenging in your experiences.

      With an additional day for discussion (through tomorrow, April 13th), we still hope for additional contributions. The following may offer further insight and guide those who have not yet shared their views and experiences, especially outside of the CGIAR context.

      There is an interesting possible debate between what funders may find important to evaluate and the priorities of Southern researchers. My understanding is that funders are widely interested in results (outputs, outcomes, impacts) and that the OECD DAC evaluation criteria “impact” and “efficiency” are of particular relevance in terms of accountability and transparency, to demonstrate that taxpayer money is used wisely. However, Southern researchers prioritize the need for research to be relevant to topical concerns, to the users of research and to the communities where change is sought. The importance of the relevance dimension was highlighted in several contributions and is mostly related to the importance, significance, and usefulness of the research objectives, processes, and findings to the problem context and to society.

      How could relevance be measured in a way that the needs of Southern researchers and funders converge? When talking about impacts, Raphael Nawrotzki mentions that the impact within a scientific field is still measured best by the number of citations that an article or book chapter receives.

      The question for the audience is: “How can the impact within a scientific field reflect the importance of a certain research project and its impact/contribution to the society where change is sought?” There seems to be a consensus on the importance of the ‘relevance’ component, on the relation between the research output and the original Theory of Change, and on the process that was followed in its development. Can this be aligned to measuring ‘relevance’ in a way that can also be considered solid and credible to funders?

      And last but not least, what about practice: “Have you seen monitoring, evaluation and learning (MEL) practices that could facilitate evaluations of science, technology and innovation?”

      We look forward to more sharing to close off this discussion, and to identifying opportunities for further engagement.


    • Dear colleagues,

      This discussion has already been insightful and reassuring about the standards in the evaluation profession.

      It would be great to unpack the issue of 'independence', and to contextualize it based on an example not previously highlighted. For those members who have consulted for independent evaluation offices, e.g. of the Rome-based agencies, AfDB, IEG, GEF, or any others, and those members who work in those evaluation offices: what does 'independence' mean between an evaluation office and a consultant/team hired to implement an evaluation?

      - How 'independent' is a consultant/consultant team from a commissioner (aka independent evaluation office)?

      - Is there a point at which technical guidance and quality assurance by the commissioner (aka independent evaluation office) threatens the independence of the evaluation consultant/team?

      - What about collected evidence (interview notes): is the commissioner entitled to obtain them, to be able to 'fall back' on them once the evaluation consultant/team is no longer under contract?

      In pondering the issue, let us all be reminded that independent evaluation arrangements do not report to management, by design of the governance and assurance structures.

      I am looking forward to hearing from all of you, on both sides.


      Svetlana Negroustoueva

      Lead, Evaluation Function

      CGIAR Advisory Services Shared Secretariat (CAS) Rome, Italy


    • Dear colleagues,

      Thank you for your interesting insights; it is great to see an overall consensus on the important role that Theories of Change (ToC) play in evaluation practice. Throughout my evaluation career, and especially more recently, I have come to appreciate the added value of using a ToC, especially if it has been co-created and/or deconstructed in a participatory manner.

      Comparatively speaking, I have found the ToC approach particularly valuable when evaluating cross-cutting themes, such as local stakeholder and civil society engagement, governance and gender. Even in the presence of documents that guide related interventions (similar to sectors), their effective implementation should take into account and mainstream civil society engagement, gender, accountability and transparency in the work of other sectors, teams, etc. Thus ToCs help inform the evaluations and facilitate exploration against the envisioned process and outcomes, and against the existing framework and operational modalities, within and external to the organization.

      In these cross-cutting domains, and sometimes beyond, I have also found that teams being evaluated more often appreciate and welcome discussions of the ToC, sometimes with a particular appreciation of its assumptions. Once questions are posed about feasibility within an enabling (or not) environment, the realization of why desired outcomes may not have been achieved becomes real. Consequently, having gone through ToC reconstruction, ambitions and targets are likely to become sharper and more streamlined next time around.

      Svetlana Negroustoueva