ORGAPET Section C4:
Evaluating Policy Outcomes Using Stakeholder Feedback and Expert Judgement

Ian Jeffreys and Nic Lampkin
Aberystwyth University, UK

Raffaele Zanoli and Daniela Vairo
Polytechnic University of Marche, Ancona, IT

Version 6, April 2008

C4-1 Introduction

Action plans will be evaluated in part by a group of indicators that reflect the goals and objectives of stakeholders (see Section C3). The objectives and indicators reflect the complex nature of the action plans and the effects of the action plans on the organic sector, and the rural and natural environment. However, many of the identified objectives are not easy to link in a causal relationship to specific policy measures or action points, and many indicators will prove difficult to quantify in terms of their direct impact (such as public health). In other cases, suitable data may simply be unavailable due to a lack of (or high cost of) monitoring. In these situations, it may be necessary to rely on a qualitative assessment or expert judgement techniques to determine the contribution of an action plan or specific policy. The MEANS (EC, 1999) and Evalsed frameworks identify a range of stakeholder feedback and expert judgement techniques that can be used in this context, including survey, interview and interactive group approaches, as well as more formalised techniques such as Delphi surveys and Nominal Group Technique (NGT). These techniques do not replace the more quantitative approaches outlined in Section C3, but can be seen as supplementary and may also be part of the synthesis approaches required to reach overall conclusions on the evaluation (see Section D1). Some of the techniques described are appropriate to be used with stakeholder steering groups that might be guiding the development, implementation and evaluation of action plans.

In this section, three of these tools are discussed in more detail: focus groups, Delphi surveys and Nominal Group Technique. The background and theoretical basis to each of the techniques are presented, along with examples of their application in recent organic food and farming policy and marketing contexts.

While the team conducting the evaluation will need to be familiar with a particular technique if it wishes to apply it, it is not necessary for the stakeholders or experts being consulted to be familiar with the techniques. Stakeholders/experts should be selected on the basis of their engagement with and/or knowledge of the subject matter, not the method being applied.

C4-2 Stakeholder feedback

Evalsed identifies a range of approaches that can be used to integrate stakeholder and other feedback as part of the evaluation process:

Social surveys involve putting a series of standard questions in a structured format to a sample of individuals who are usually selected as being representative of the population under observation. They are normally conducted where the population to be observed is large and homogeneous, the investigator has a precise and clear idea of what he or she wants to observe (in which case the simplest survey consists of closed questions to which a series of replies are given from a number of predetermined responses), and/or the evaluators want to test out an hypothesis or to collect objective facts.
Beneficiary surveys are a particular application of the use of questionnaire surveys to elicit information from those directly affected by an intervention and presumed to benefit from its consequences (whether individuals, organisations or communities). Unlike other methods of observation of beneficiaries (e.g. case studies or ethnographic observation) surveys aim to produce results which can be generalised across the target group.
Individual stakeholder interviews consist of an in-depth conversation with an individual, conducted by trained staff. The purpose is usually to collect specific qualitative information and opinions of those persons affected by a particular programme or project, its context, implementation, results and impact. Several forms of interview can be distinguished, each of which fulfils a different purpose: the informal conversation interview; the semi-structured, guide-based interview; and the structured interview (the most rigid approach). In-depth interviews can help develop preliminary ideas for actions to be undertaken as well as provide feedback on all aspects of programme inputs and outputs; provide a history of behaviour; highlight individual versus group concerns and reveal divergent views or outlier attitudes.
Case studies involve in-depth study of a phenomenon in a natural setting, drawing on a multitude of perspectives. They aim to build up very detailed in-depth understanding of complex real-life interactions and processes. The defining feature of the case study is that it is holistic, paying special attention to context and setting. The case study may be a single case, or it may include multiple cases. Provided resources are adequate, multi-site case studies provide rich opportunities for theoretically-informed qualitative evaluation. Case studies raise a number of issues at the design stage. What will count as a 'case'? What is the basis for selecting cases, and how many? What units of analysis will be included within the case, and how must the data be organised to allow meaningful comparisons to be made? What kind of generalisation is possible? Typically, generalisation is much more difficult than with surveys, for example, but the benefits of greater in-depth information and insights can offset this.
Observational techniques, including participant observation and ethnographic approaches, are a form of naturalistic inquiry that allows investigation of phenomena in their naturally occurring settings. Participant observation is where the researcher joins the population or its organisation or community setting to record behaviours, interactions or events that occur. He or she engages in the activities that s/he is studying, but the first priority is the observation. Participation is a way to get close to the action and to get a feel for what things mean to the actors. As a participant, the evaluator is in a position to gain additional insights through experiencing the phenomena for themselves. Participant observation can be used as a long- or short-term technique. The evaluator/researcher has to stay long enough, however, to immerse him/herself in the local environment and culture and to earn acceptance and trust from the regular actors. By contrast, pure observation consists of observing behaviour and interactions as they occur, but seen through the eyes of the researcher. There is no attempt to participate as a member of the group or setting, although usually the evaluator has to negotiate access to the setting and the terms of research activity. The intention is to 'melt into the background' so that an outsider presence has no direct effect on the phenomena under study. He or she tries to observe and understand the situation 'from the inside'. Aspects of the ethnographic approach are sometimes incorporated into observational methods, as for example where interest is not just in behaviours and interactions but also in features and artefacts of the physical, social and cultural setting. These are taken to embed the norms, values, procedures and rituals of the organisation and reflect the 'taken for granted' background of the setting which influences the behaviours, understandings, beliefs and attitudes of the different actors.
Another form of naturalistic inquiry that complements observational methods is conversation and discourse analysis. This qualitative method studies naturally occurring talk and conversation in institutional and non-institutional settings, and offers insights into systems of social meaning and the methods used for producing orderly social interaction. It can be a useful technique for evaluating the conversational interaction between public service agents and clients in service delivery settings.
Participatory monitoring and evaluation is an umbrella term for a set of new approaches that stress the importance of taking local people's perspectives into account and giving them a greater say in planning and managing the evaluation process. Local people, community organisations, NGOs and other stakeholder agencies decide together how to measure results and what actions should follow once this information has been collected and analysed. The emphasis on participation goes beyond the choice of particular methods and techniques to wider consideration of who initiates and undertakes the evaluation process and who learns or benefits from the findings. Recent literature stresses the significance of attitudes and behaviours (on the part of the evaluator) as integral to a participatory approach. Although the focus has tended to be on community-based approaches where local people are the primary focus, other forms of participatory monitoring and evaluation are geared to engaging lower level staff in assessing the effectiveness of their organisation, and working out how it can be improved. Participatory processes are considered further in ORGAPET Sections A4 and B3.
Expert panels are specially constituted work groups that meet for evaluations. They are usually made up of independent specialists recognised in the fields covered by the evaluated programme. They are used in the evaluation process, usually as a mechanism for synthesising information from a range of sources, drawing on a range of viewpoints, in order to arrive at overall conclusions. Expert panels are a means of arriving at a value judgement on the programme and its effects, which incorporates the main information available on the programme, as well as numerous previous and external experiences. Their role is considered in more detail in ORGAPET Section D1.
Other techniques that can be applied in the context of working with stakeholders and experts, and which may also supplement the techniques described here, are considered in ORGAPET Section A4.

Further information on the application of these methods in the context of policy evaluations can be found by following the links provided.

C4-3 Focus groups

Evalsed describes the focus group as a well-established method of social inquiry, taking the form of structured discussion that involves the progressive sharing and refinement of participants' views and ideas. The technique is particularly valuable for analysing themes or fields which give rise to divergent opinions or which involve complex issues that need to be explored in depth. The focus group is one of a family of group-based discussion methods. The typical format involves a relatively homogeneous group of around six to eight people who meet once, for a period of around an hour and a half to two hours. The group interaction is facilitated by the evaluator or researcher who supplies the topics or questions for discussion. A variation is the workshop, implying a larger group, meeting in a larger session, with a more structured agenda. Other innovative approaches involve the application of discussion group methods to decision-making. These include, for example, citizens' juries which bring together groups of between 12 and 30 people over the course of several days. They hear from 'witnesses', deliberate, and make recommendations about courses of action. Variations of this consultative technique include deliberative polls and consultative panels. The common features of these methods are that they combine opportunities for accessing information with discussion and deliberation. Although focus groups and other kinds of group-based discussions usually involve a physical coming together of participants, there is a growing interest in virtual groups that exploit advances in new information and communication technologies. The conduct of telephone groups using teleconferencing technology has, in recent times, been supplemented by online focus groups, involving web-mediated synchronous and asynchronous discussion.

C4-3.1 Focus group principles

According to Stewart and Shamdasani (1990), the focused group interview had its origins in the evaluation of audience response to radio programs in 1941 by Robert Merton, a prominent social scientist. Merton applied this technique to the analysis of army training and morale films during World War II. The focus group process evolved from the focused interview (Merton et al., 1956) and group therapy methods of psychiatrists (Linda, 1982). A moderator facilitates the discussion of a particular topic with a homogeneous group of participants.

Stewart and Shamdasani (1990) have summarised the more common uses of focus groups to include: obtaining general information/feedback about a topic of interest, identifying problem areas, gathering consumers’ impressions about a product/service, stimulating new ideas and creative concepts. Specific subject areas in which focus groups have been used include: market research (where it was first used) to gather consumer perceptions and opinions on new product characteristics (Stewart and Shamdasani, 1990); public relations and graduate programme assessment (Sink, 1991); advertising (Linda, 1982); healthcare and family-planning projects (Bertrand et al., 1992); political campaigns and club member services (Lydecker, 1986); training evaluation (O'Donnell, 1988); and research efforts (questionnaire development and hypothesis formulation) (Morgan, 1988).

Focus groups are one of the most frequently used techniques in market research. A focus group can be defined as a loosely-structured interactive discussion conducted by a trained moderator with a small group of respondents. Focus groups are normally composed of 8 to 12 individuals (Byers and Wilcox, 1991; Stewart and Shamdasani, 1990). Smith defined group interviewing to be "...limited to those situations where the assembled group is small enough to permit genuine discussion among all its members" (Smith, 1954: 59, cited in Stewart and Shamdasani, 1990: 10). However, the number of participants will depend on the objectives of the research (Stewart and Shamdasani, 1990). A session typically lasts between one and two hours. Participants’ comments are usually recorded on audio or video tape, and the recordings eventually become the basis for a report summarising the contents of the discussion. The number of sessions conducted on a topic varies. Calder (1977) suggests that sessions should be conducted until the moderator can anticipate what the participants are going to say, and indicates that this usually happens after three or four sessions on the same topic. The value of the technique lies in discovering the unexpected, which emerges naturally from a free-flowing group discussion.

Focus groups are not designed to help a group reach consensus or to make decisions, but rather to elicit the full range of ideas, attitudes, experiences and opinions held by a selected sample of respondents on a defined topic. It is useful to distinguish focus groups from procedures that utilise multiple participants but do not allow interactive discussions, such as Delphi surveys.

Following Frey and Fontana's (1993) approach, there are five relevant dimensions to define group discussions: the role of moderator (directive or non-directive), the degree of group interactions (low, medium, high), the structure of questions (low, medium, high), the location of the interview (natural or artificial) and the nature of setting (formal, informal, etc.). In this classification, focus groups differ from brainstorming due to the directive role of the moderator and the highly structured nature of the interview. Compared with Nominal Group Technique, the focus group technique is different because it starts from the guidelines and not from participants' opinions. Finally, compared with the Delphi technique, focus group technique adds the element of group interaction.

In other words, the moderator creates a ‘permissive’ environment, whereby participants are encouraged to put forward their views and opinions. It is the interaction of the group that itself provides the key to the production of the data. Moderators need not necessarily be hired external professionals. Morgan and Krueger (1998) report many cases of ‘collaborative focus groups’ that place volunteers, staff members and non-researchers within an organisation at the centre of the focus group sessions. These individuals are often carefully recruited and possess certain talents. Results from collaborative focus groups are often of higher credibility for the communities, and the resulting study is also credible for the researcher. Volunteers can often gather and present results more effectively than professionals. A critical element, however, is how volunteers are trained and the manner in which they work together.

C4-3.2 Focus groups in practice

This section presents two recent examples of the use of focus groups.

Zanoli (2004) and colleagues conducted a comprehensive consumer study as part of the EU-funded project 'Organic Market Initiatives and Rural Development' (OMIARD). The aim was to explore attitudes, motives, expectations and barriers towards organic products and organic farming, with a particular focus on ethical, social and environmental dimensions. Seventy-two focus group discussions were conducted in eight European countries. Participants were split into two sub-groups: regular and occasional consumers of organic food. For each sub-group, three sessions were conducted in each country. A copy of the guidelines for this study and the summary results for Italy are presented in Annex C4-5 and Annex C4-6 respectively.

RAND and the Delft Hydrological Laboratory (Kahan, 2001) conducted a policy analysis on river dike strengthening in the Netherlands. The study group conducted five focus groups to explore the opinions of different stakeholder groups. The focus groups, each with between 10 and 16 participants, were drawn from: environmental activists; environmental advocates (but not activists); the elected Waterschappen [water boards] or local governmental agencies charged with flood protection; people living along the dikes and therefore most affected by risks of flooding and damage to dike construction; and people living in cities, who were only indirectly affected but who paid taxes for the dikes. Members of each of these constituencies were recruited for separate focus groups that examined their ecological values, their concerns about flooding and their views on the appropriate procedures for making decisions regarding the dikes. The groups revealed, to the surprise of the actors in the political debate, a remarkable similarity of perspective. The focus groups provided a forum for this participation and permitted the spectrum of different constituencies to have a voice in the process. This greatly aided the acceptance of the research findings.

C4-4 Formalised expert judgement methods

C4-4.1 Delphi method

The Delphi method was developed by the RAND Corporation in the 1950s, to elicit expert opinion on the impacts of possible military attacks on the USA. Linstone and Turoff (1975: 3) described the Delphi process as follows:

“Delphi may be characterised as a method for structuring a group communication process so that the process is effective in allowing a group of individuals, as a whole, to deal with a complex problem”

Dalkey and Helmer (1963) published the first paper on the Delphi method, characterising the process as the repeated questioning of experts, using questionnaires and interviews, without direct contact between the experts. The questioning was focused on a central problem and the information that experts would require in order to make a more informed appraisal of this problem. Feedback to experts was in the form of the experts' requests for information about factors or considerations considered potentially relevant by an expert. Turoff (1975) described the Delphi process as one in which a small group of researchers design a questionnaire and send this out to a larger group of experts. The questionnaire is returned to the researcher group; the responses are summarised and, based on the results, a new questionnaire is produced and sent to the expert group. The expert group is given a number of opportunities to review and revise their responses to the questionnaires. Turoff goes on to define two variations of study: the 'Delphi' process is concerned with eliciting estimations and valuations, whereas the 'policy Delphi' includes the generation of ideas as well as valuations.

Linstone and Turoff (1975: 5-6) define four phases of the Delphi process:

1. Exploration of the subject, in which each individual contributes information relevant to the issue as well as ideas for the 'policy Delphi'.
2. Ascertaining the respondent group’s views on the issue, identifying areas of agreement and disagreement.
3. Where there is significant disagreement, the underlying reasons for the disagreement are sought and analysed. If, as in Dalkey and Helmer’s (1963) original Delphi study, the question relates to an estimate of quantity, respondents offering extreme estimates are asked to submit a rationale for these.
4. The final phase is where the information gathered has been analysed. Feedback to respondents takes the form of the evaluation and findings drawn from all the gathered information. This is described by Rowe and Wright (1999: 354) as a statistical aggregation of group response.
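As an illustration only, the feedback prepared between rounds for a question that asks for a quantity might be sketched as follows in Python. The function name summarise_round, the expert labels and the rule that an 'extreme' estimate is one lying outside the inter-quartile range are assumptions made for this sketch; they are not prescribed by Dalkey and Helmer (1963) or Linstone and Turoff (1975).

```python
from statistics import quantiles


def summarise_round(estimates):
    """Summarise one Delphi round of numeric estimates.

    `estimates` maps an anonymous expert label to that expert's estimate.
    Returns the median and quartiles (the statistical summary fed back to
    the panel) and the experts who would be asked for a rationale.
    ASSUMPTION: 'extreme' means outside the inter-quartile range.
    """
    values = list(estimates.values())
    if len(values) < 2:
        raise ValueError("at least two estimates are needed")
    # 'inclusive' treats the responses as the whole panel, not a sample.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    extremes = sorted(
        expert for expert, value in estimates.items()
        if value < q1 or value > q3
    )
    return {"median": q2, "q1": q1, "q3": q3, "ask_for_rationale": extremes}


if __name__ == "__main__":
    # Hypothetical first-round estimates from five experts.
    round_one = {"A": 10, "B": 12, "C": 13, "D": 14, "E": 30}
    print(summarise_round(round_one))
```

Only the summary and the anonymised rationales would be returned to the panel, so that individual respondents cannot be identified.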

In the original Delphi of Dalkey and Helmer, all the rounds were highly structured with definitive questions to be answered. This has evolved somewhat and Rowe et al. (1991) suggest that the first round should be unstructured, allowing the respondents free scope to explore and comment on the issue. Subsequent rounds should be structured to reflect the initial ideas of the respondents, and this iterative process is repeated until consensus is reached. Turoff asserts that this should be achieved in three to five iterations but states that the researcher is likely to reach a point of diminishing returns after three rounds (Turoff, 1975: 88; 229).

It is unclear in Dalkey and Helmer’s paper whether the forecasts of other respondents were communicated to all the respondents as part of the feedback process. Armstrong (1999) suggests that this information would bias the procedure and should be withheld in the interest of a robust process. Other features of most Delphi studies include anonymity of the respondents or the experts. This fulfils a number of functions: it allows the participants to have input without making a public statement on the issue (Meyrick, 2001: 5) and allows experts to respond without undue social pressures from dominant or dogmatic individuals or from the majority (Rowe and Wright, 1999: 354).

C4-4.2 Nominal Group Technique

The Nominal Group Technique (NGT) is a specific example of expert panel approaches and was developed in 1968 by Andre Delbecq and Andrew Van de Ven from studies of decision conferences, aggregation of group judgement and problems involving citizens in planning (Delbecq et al., 1975: 7). They espouse the use of NGT (and Delphi) for “situations where individual judgements must be tapped and combined to arrive at a decision which cannot be calculated by one person. They are problem-solving or idea-generating strategies, not techniques for routine meetings, co-ordinating, bargaining or negotiations” (p.4). These processes are concerned with the generation of ideas and knowledge required for a successful solution.

NGT is also known as ‘estimate-talk-estimate’ and uses the same basic structure as the Delphi method in a group situation. Estimates are taken anonymously and presented to the group for discussion, and estimates are re-taken and re-presented. The process involves the following steps (Delbecq et al., 1975: 8):

1. Silent and individual (nominal) generation of ideas in writing.
2. Presentation of a brief summary of all ideas, and round-robin feedback on ideas.
3. Discussion of each recorded idea for clarification and evaluation.
4. Individual voting on the relative priority of the ideas by rank-order or rating judgements; the group’s final decision is based on the aggregation of the evaluations.

Delbecq et al. (1975: 9) suggest that, prior to step 1, there is a step that includes an introduction to the process and its objectives. The purpose of this is to introduce different approaches appropriate to the different phases of the decision-making, allowing balanced and equal input from all participants, incorporating mathematical aggregation of the group’s judgement. The strength of NGT lies in the separation of idea generation and idea evaluation from the group situation. This separation ensures that ideas from all group members are given equal consideration within the process, and addresses some of the issues with regard to domineering or high-status group members dominating and excluding others from the decision-making process, thus limiting the quality and acceptability of the outcomes.
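The mathematical aggregation of individual votes in the final NGT step could, for example, be a simple sum of rank positions. The Python sketch below is one possible reading, not the procedure specified by Delbecq et al. (1975): the function name aggregate_rankings, the idea labels and the assumption that every participant ranks every idea are all hypothetical.

```python
from collections import defaultdict


def aggregate_rankings(ballots):
    """Combine individual rank-order votes into a group priority list.

    Each ballot is a list of ideas ordered from highest to lowest priority.
    ASSUMPTION: every ballot ranks the same set of ideas.
    An idea's score is the sum of its rank positions, so a lower score
    means a higher group priority; ties are broken alphabetically.
    """
    if not ballots:
        raise ValueError("no ballots supplied")
    expected = set(ballots[0])
    totals = defaultdict(int)
    for ballot in ballots:
        if set(ballot) != expected or len(ballot) != len(expected):
            raise ValueError("each ballot must rank every idea exactly once")
        for position, idea in enumerate(ballot, start=1):
            totals[idea] += position
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Three hypothetical participants ranking three ideas.
    votes = [
        ["training", "labelling", "research"],
        ["labelling", "training", "research"],
        ["training", "research", "labelling"],
    ]
    for idea, score in aggregate_rankings(votes):
        print(idea, score)
```

Because each ballot carries the same weight, no single participant's vote counts for more than another's, which is the property the separation of idea evaluation from group discussion is intended to protect.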

Liou (1998: 2-7) suggests that the use of Group Decision Support Systems (GDSS) will aid the NGT process, while Delbecq uses a flip chart for presenting individual ideas. The GDSS allows individuals to input data and ideas simultaneously into a shared software environment, thus allowing rapid input and feedback on ideas.

C4-4.3 Issues and discussions on the application of Delphi and NGT

There are a number of issues that need to be considered in designing and applying Delphi and NGT.

C4-4.3.1 Consensus and stability

When using Delphi and NGT for policy evaluation, the task is to draw out estimations and valuations and, therefore, is more like the original Delphi of Dalkey and Helmer (1963), rather than the policy Delphi of Turoff (1975). In each round, a refined estimation is sought in the light of discussions. To start the discussion, a statistical summary of the results of the previous round is presented to the domain experts. This continues for a set number of rounds or until consensus or stability is reached. Consensus is said to be reached when the scope of responses falls within an arbitrary range (e.g. from 5% to 20%) or within a measure of statistical significance. Turoff (1975: 277) suggests that consensus can be said to be reached when the inter-quartile range is no greater than two units on a ten-point scale. The other measure used to end the Delphi iteration is a measure of stability, that is, when the response from the experts does not change between rounds. If the experts have considered all the feedback and undertaken at least two rounds of discussion, and their evaluations do not change, these responses are considered to present a final and unchanging opinion.
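The two stopping rules described here can be expressed compactly. The following Python sketch is indicative only: the function names has_consensus and is_stable, the quartile convention and the tolerance parameter are assumptions of the sketch, and only the threshold of two units on a ten-point scale comes from Turoff (1975: 277).

```python
from statistics import quantiles


def has_consensus(ratings, max_iqr=2.0):
    """True if the inter-quartile range of the ratings is at most `max_iqr`.

    The default of 2.0 follows the two-unit rule for a ten-point scale.
    ASSUMPTION: quartiles use the 'inclusive' method (panel = population).
    """
    if len(ratings) < 2:
        raise ValueError("at least two ratings are needed")
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return (q3 - q1) <= max_iqr


def is_stable(previous, current, tolerance=0.0):
    """True if no expert changed their response between two rounds.

    `previous` and `current` map expert labels to ratings. Only experts
    present in both rounds are compared (ASSUMPTION: drop-outs are ignored).
    """
    shared = previous.keys() & current.keys()
    if not shared:
        raise ValueError("no expert took part in both rounds")
    return all(abs(current[e] - previous[e]) <= tolerance for e in shared)


if __name__ == "__main__":
    # Hypothetical ratings on a ten-point scale from two successive rounds.
    round_two = {"A": 6, "B": 7, "C": 7, "D": 8, "E": 8, "F": 9}
    round_three = {"A": 7, "B": 7, "C": 7, "D": 8, "E": 8, "F": 9}
    print(has_consensus(list(round_three.values())))
    print(is_stable(round_two, round_three))
```

A study might stop when either function returns True, or after a set number of rounds, whichever comes first.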

Rowe and Wright (1999: 363) assert that in the Delphi studies reviewed in their paper, conformity to a group view, or 'group think', often replaces true consensus, and that experts with divergent opinions will either conform with the group view or abandon the process. They suggest that further studies are required to evaluate the extent of these phenomena.

C4-4.3.2 Simplification of issues

Linstone and Turoff (1975: 579) identify a tendency to use a reductionist approach, applied in classical science to simplify complex, and especially socio-economic, systems. However, Meyrick (2001: 9) asserts that in a highly complex model, some simplification is required to produce a number of manageable solutions. A balance must be reached between producing a managed system and maintaining an overview and understanding of the complexity of the issue.

C4-4.3.3 Expertise

Issues of simplification are compounded by the problem of illusory expertise. Experts in many fields have a history of underestimating the costs of projects or the capacity of a resource. Linstone and Turoff (1975: 581) give the example of underestimating the high cost of developing new technologies. They also state that with greater familiarity and specialisation as regards one aspect of an issue, comes a greater risk of introducing (possibly unconscious) bias. One example from land-use management is the possibility of environmental engineers only seeing the engineering solutions to a given problem, whereas others may identify other management options. In addressing these points, Meyrick states that it is the responsibility of the researcher to ensure that the expert panel represents a comprehensive mix of perspectives and disciplines which retains an understanding of the inter-relationships and complexity of an issue and possible management options.

Reid (1988) suggests that a major weakness of the Delphi technique is the selection of members for the expert panel. Mullen (2000), citing Sackman (1975), asks “What is an ‘expert’ in the target field” and “how are such experts operationally defined?” They also question whether the responses from 'experts' are significantly better than the input of informed 'non-experts'. Loveridge (2001) addresses the question of defining expertise by the use of self-assessed measures (Box C4-1).

Mullen (2000) citing Pill (1971) suggests that anyone with relevant input should be considered an 'expert'. An additional criterion has been added to Loveridge’s definition of expertise to capture the practical experience and knowledge of farmers and land managers. The further text: “if you understand this topic and use this knowledge in land-use management” has been added to the knowledgeable rating presented in Annex C4-3. Mullen presents an example of a Delphi study related to health care in which patients are included as experts. Garrod (2003) suggests that panellists provide a brief personal profile through which their suitability to participate would be assessed. In terms of the composition of the panel, Garrod suggests that no more than one third of panellists should share the same profession or academic interest, and duplication in panellists’ interests should be avoided. He argues that the validity of panellists’ opinion comes from a careful selection procedure rather than large sample size, and suggests 15 members as an appropriate number of panellists (see also Box C4-2).
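Garrod's composition rule lends itself to a simple check when a panel is being assembled. The Python sketch below is a hypothetical illustration: the function name overrepresented_professions, the panellist labels and the profession categories are invented for the example, and only the one-third limit is taken from Garrod (2003).

```python
from collections import Counter
from fractions import Fraction


def overrepresented_professions(panel, max_share=Fraction(1, 3)):
    """List professions held by more than `max_share` of the panel.

    `panel` maps a panellist label to a profession or academic interest.
    A Fraction is used so that the one-third limit is compared exactly,
    without floating-point rounding.
    """
    if not panel:
        raise ValueError("the panel is empty")
    counts = Counter(panel.values())
    limit = max_share * len(panel)
    return sorted(p for p, n in counts.items() if n > limit)


if __name__ == "__main__":
    # Hypothetical six-member panel; a real panel might have about 15 members.
    candidates = {
        "P1": "farmer", "P2": "farmer", "P3": "farmer",
        "P4": "adviser", "P5": "researcher", "P6": "retailer",
    }
    print(overrepresented_professions(candidates))
```

An empty result indicates that no single profession exceeds the limit; it says nothing about whether the mix of perspectives is otherwise adequate, which remains a matter of judgement.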

Box C4-1: Loveridge’s self-evaluation criteria: guidance to self-ranking of expertise

1. You are unfamiliar with the subject if the mention of it encounters a veritable blank in your memory or if you have heard of the subject yet are unable to say anything meaningful about it.

2. You are casually acquainted with the subject matter if you at least know what the issue is about, have read something on the subject, and/or have heard a debate about it on a major TV or radio network or on an educational channel such as the UK’s Open University.

3. You are familiar with the subject matter if you know most of the arguments advanced for and against some of the controversial issues surrounding the subject, have read a substantial amount about it, and have formed some opinions about it. However, if someone tried to pin you down and have you explain the subject in more depth, you would soon have to admit that your knowledge was inadequate.

4. You are knowledgeable with the subject matter if you were an expert some time ago but feel somewhat rusty now because other assignments have intervened (even though because of previous interest, you have kept reasonably abreast of current developments in the field); if you are in the process of becoming an expert but still have some way to go to achieve mastery of the subject; or if your concern is with integrating detailed developments in the area, thus trading breadth of understanding for depth of specialisation.

5. You should consider yourself an expert if you belong to that small community of people who currently study, work on and dedicate themselves to the subject matter. Typically, you know the literature of your country and probably the foreign literature; you attend conferences and seminars on the subject, sometimes reading a paper and sometimes chairing the sessions; you most likely have written up and/or published the results of your work. If any of your country's major scientific or technical institutions or any similar organisation were to convene a seminar on this subject, you would expect to be invited or, in your opinion, you should be invited. Other experts in this field may disagree with your views but invariably respect your judgement; comments such as ‘this is an excellent person on this subject’ would be typical when enquiring about you.

C4-4.3.4 Workload and attrition

Delphi is a time-consuming process and experts must be fully briefed before undertaking it. Respondents may not have sufficient time available to complete the task, so experts should be honestly informed of the time requirements of a Delphi or NGT analysis. Pressure may cause an expert to present a response that gains consensus rather than one that represents their honest opinion. The same can be said of any method of eliciting opinion from groups, such as focus groups. Linstone and Turoff (1975) also assert that participants under pressure will respond in haste, without adequate thought.

C4-4.3.5 Efficacy

Rowe and Wright (1999: 355) note that the majority of papers on the Delphi method concern its application and that there has been little investigation into its usefulness. Linstone and Turoff (1975: 277) comment that, apart from the original Delphi work (completed by Dalkey and the RAND Corporation), subsequent analyses of the process have been "secondary efforts associated with some application which has been the primary interest". Rowe and Wright (1999: 355) observed that this was still the case at the time of their review.

It is often the stated aim of Delphi and NGT studies to gain consensus around a particular issue. Rowe and Wright (1999) suggest that a reduction in variance does not necessarily reflect true consensus, but may instead be caused by group pressure to conform rather than by a convergence in understanding and an acceptance of others’ arguments. They suggest an alternative measure, 'post-group consensus', which concerns the extent to which participants agree with the 'in-group consensus' value. In the three studies cited by Rowe and Wright, the post-group consensus differed significantly from the final-round assessment. To test whether feedback improves accuracy, Rowe and Wright (1999: 364) reviewed 14 studies in which the results of a Delphi process were compared with those of a staticised group (a simple approach in which a number of domain experts are canvassed for an estimated value for the issue in question, and the value used in subsequent evaluations is the average or a weighted average of the estimates); that is, the results of the final round of the Delphi analysis were compared with those of the first round, before the respondents had received any feedback. Five studies reported a statistically significant increase in accuracy over the rounds, seven reported an absolute increase in accuracy but no statistically significant difference, and two reported a statistically significant drop in accuracy. In comparisons of Delphi with interacting groups (a workshop situation), five studies found in favour of Delphi, two found no difference and one found in favour of the interacting group. In comparisons with NGT, Rowe and Wright found little difference between the two techniques.
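
The staticised-group estimate and the variance-based view of consensus discussed above can be illustrated with a short sketch. The expert estimates, the weights and the function names are hypothetical and are not taken from Rowe and Wright (1999); the sketch shows only the arithmetic involved. As noted above, a fall in variance between rounds should not by itself be read as true consensus.

```python
from statistics import mean, pvariance


def staticised_estimate(estimates, weights=None):
    """Average (or weighted average) of independently canvassed expert estimates."""
    if weights is None:
        return mean(estimates)
    if len(weights) != len(estimates):
        raise ValueError("one weight per estimate is required")
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


def variance_reduction(first_round, final_round):
    """Fraction by which the spread of responses fell between two rounds."""
    v_first = pvariance(first_round)
    if v_first == 0:
        raise ValueError("first-round responses are identical; nothing to reduce")
    return (v_first - pvariance(final_round)) / v_first


# Hypothetical estimates from five experts (illustrative values only).
round_1 = [10, 20, 30, 40, 50]   # before any feedback
round_3 = [25, 28, 30, 32, 35]   # after two rounds of feedback

print(staticised_estimate(round_1))                   # simple average
print(staticised_estimate(round_1, [1, 1, 2, 1, 1]))  # weighted average (assumed weights)
print(variance_reduction(round_1, round_3))           # share of variance removed
```

The weights might, for example, reflect the self-assessed expertise of each respondent, although that choice is an assumption of this sketch rather than a recommendation from the literature cited.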

Box C4-2: Delphi best practice guidelines proposed by Garrod (pers. comm., 2005)

1. The Delphi technique should not be seen as a main tool of investigation but a means of supporting/extending studies with better established and more reliable methods of investigation.

2. The topic must be appropriate, for example there must be no widely-perceived ‘correct answers’ to the questions posed.

3. Questions must be pilot-tested to avoid ambiguity.

4. Panellists should be recognised experts in their field (a self-assessment selection procedure may be useful in this respect).

5. The panel should comprise a good balance of different disciplines and areas of expertise.

6. Adequate time must be given to experts to think deeply about the questions at hand.

7. Once a subsequent round has commenced, those completing the previous round late should nevertheless be excluded from continuing.

8. Criteria for panel balance should be set in advance. Should these no longer be met, the study should be terminated.

9. Attrition of the panel may be minimised by selecting experts who already have a strong interest in the outcome of the project.

10. Selecting experts with such an interest is preferable to using either monetary payment or moral persuasion as a means of ensuring that experts remain committed to the project.

11. Experts must also believe that the Delphi technique is a valid way of going about the task at hand.

12. Full anonymity must be preserved at all times between the panellists (but not necessarily between the panellists and the co-ordinating researchers).

13. The co-ordination group should make themselves available as a resource for locating further information on specific subjects or clarifying the questions.

14. The co-ordination group should intervene in the process as little as possible.

15. The panellists must do the initial scoping themselves; the co-ordination group should not set the agenda for discussion (although they will have to determine the research questions that will need to be answered through this process).

16. Where consensus is being sought, the co-ordination group should determine the criteria for bringing the consensus rounds to a close before the project begins.

C4-4.4 Recent examples

This section presents two recent examples of the use of Delphi and NGT. A Delphi study was undertaken by Padel and Midmore (2005) as part of the EU-funded Organic Market Initiatives and Rural Development (OMIaRD) project. A recent example of the use of NGT is the assessment of the Welsh Organic Farming Scheme and the Tir Gofal Agri-environment Scheme undertaken as part of the EU-funded Further Development of European Organic Farming Policies (EU-CEE-OFP) project.

The Padel and Midmore (2005) study concentrated on emerging issues concerning the development of organic markets and rural development in Europe. They used the 'policy Delphi' variation as described by Turoff (1975), which aims to provide for idea generation as well as evaluation and forecasts. Outputs of this study were a list of actors, events and influences ranked in order of importance related to the development of organic markets in Europe. This Delphi study was completed in three rounds of questions. The first round sought opinion on the organic market using six open questions; the second and third rounds drew on the issues developed in the first round. The respondents were requested to assess the importance of the issues at various scales. A copy of the guidelines for this study and the published paper are provided at Annex C4-1 and Annex C4-2 respectively.

The second example, from the EU-CEE-OFP project, was applied in three European regions: Wales and North East England in the UK and Canton Aargau in Switzerland. The aim of these studies was to produce a set of valuations, an aim similar to that of Dalkey and Helmer’s original Delphi study (1963). Each study was conducted in the same manner; the Welsh study is presented briefly here. In this study, experts were asked to evaluate the Welsh Organic Farming and Tir Gofal Agri-environment Schemes according to a range of criteria (similar to the Impact Indicator list in ORGAPET Section C3). The criteria were defined to reflect a broad range of agri-environmental and rural development policy objectives. The experts were asked to evaluate the schemes on a seven-point scale using their expert opinion. A computer-based group decision support system was used to collate and present individual evaluations. The system highlighted the points on which opinion diverged, and discussions were then focused on these divergent points. Points of agreement were not discussed. The outcome of this process was an evaluation of each scheme against the set of criteria, and a transcript of the discussions in which the rationales for each evaluation were discussed. A copy of the guidelines for this study and an evaluation of the Welsh workshops are provided at Annex C4-3 and Annex C4-4 respectively.
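
How a system of this kind might flag divergent points can be sketched as follows. This is a minimal illustration and not the software used in the EU-CEE-OFP workshops: the criterion names, the scores, the assumption that the seven-point scale runs from 1 to 7, and the rule of flagging a criterion when scores differ by more than two scale points are all hypothetical.

```python
def divergent_criteria(scores, max_range=2):
    """Return the criteria whose expert scores span more than max_range points.

    scores maps each criterion to the list of scores (assumed 1 to 7)
    given by the individual experts.
    """
    flagged = []
    for criterion, values in scores.items():
        if any(not 1 <= v <= 7 for v in values):
            raise ValueError(f"scores for {criterion!r} must lie on the 1-7 scale")
        if max(values) - min(values) > max_range:
            flagged.append(criterion)
    return flagged


# Hypothetical evaluations from four experts (illustrative values only).
example_scores = {
    "biodiversity": [5, 6, 5, 6],
    "farm incomes": [2, 6, 4, 7],
    "rural employment": [3, 4, 4, 5],
}

# Only the flagged criteria would be put forward for discussion.
print(divergent_criteria(example_scores))
```

A range-based rule is only one option; a workshop facilitator might equally use the standard deviation of the scores or simply inspect the distribution of responses.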

C4-5 Conclusions

The various methods outlined in this section all have strengths and weaknesses. Interactions with stakeholders and other experts can be both time-consuming and expensive, but there may be significant benefits in terms of the quality of information obtainable and better decision-making. Key considerations for selecting the appropriate method include:

the evaluation questions to be addressed (fundamentally, this will influence who should be approached and the type of information that will be requested);
the resources available (postal surveys can be significantly less expensive to operate than other forms of surveys, interviews or case studies);
the time scale within which answers are required (for example, an NGT study will be completed in a substantially shorter time than a Delphi survey: typically, the data collection for NGT is completed in a single workshop lasting from two hours up to two days, while the Delphi process may take three weeks for each round and so nine weeks for a three-round study);
logistics (can the stakeholders/experts gather in one place or do the evaluators need to go to them?);
anonymity of respondents and interpersonal pressures (with focus groups, workshops and NGT, it can become clear who provided which comment or assessment; interpersonal issues and group dynamics can be an influence at this point and can bias the output of the process. In the Delphi approach, as well as surveys and individual interviews, the identity of individual respondents is not disclosed and there is less pressure to conform to 'group think').
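
The time-scale comparison in the list above rests on simple arithmetic, sketched here. The three weeks per round is the indicative figure quoted above; the function name and the assumption that rounds run back to back without gaps are illustrative only, and actual studies may run faster or slower.

```python
def delphi_duration_weeks(rounds, weeks_per_round=3):
    """Indicative Delphi data-collection time, assuming rounds run back to back."""
    if rounds < 1:
        raise ValueError("a Delphi study needs at least one round")
    return rounds * weeks_per_round


print(delphi_duration_weeks(3))  # 9 weeks for a three-round study
```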

In applying the techniques, there may also be considerations about the need to work separately with stakeholders and independent experts, and there may be difficulty in finding experts who are not already part of the action plan process or who are not in some way already interested stakeholders. The importance of this issue depends on the nature of the questions being asked and the ability of the evaluators to take account of the possible biases that might be present.

C4-6 Checklist

Is there a need to include stakeholder feedback in the evaluation? Identify the specific questions that need to be addressed.
Are there evaluation questions/indicators which (due to their nature or lack of data) are unquantifiable or too complex to be assessed using a statistical/evidence-based approach? Identify the specific questions that need to be addressed.

Which of the stakeholder feedback, focus group and expert judgement methods described above would be appropriate and why? Take account of the questions to be addressed, resource requirements, timescales and any other relevant considerations.
What are the outcomes and limitations of any assessments conducted?

C4-7 References

Armstrong, J. (1999) Introduction to paper and commentaries on the Delphi technique. International Journal of Forecasting, 15(4): 351-352.

Bertrand, J. T., J. E. Brown and V. M. Ward (1992) Techniques for analyzing focus group data. Evaluation Review, 16(2): 198-209.

Byers, P. Y. and J.R. Wilcox (1991) Focus Groups: A qualitative opportunity for researchers. Journal of Business Communication, 28: 63-78.

Calder, B. J. (1977) Focus Groups and the nature of qualitative marketing research. Journal of Marketing Research, 14: 353-364.

Dalkey, N. and O. Helmer (1963) An experimental application of the Delphi method to the use of experts. Management Science, 9(3): 458-467.

Delbecq, A., A. van de Ven and D. Gustafson (1975) Group Techniques for Program Planning: a Guide to the Delphi and Nominal Group Processes. Scott, Foresman and Company, Glenview.

Donnell, J. M. (1988) Focus groups: a habit-forming evaluation technique. Training and Development Journal, 42(7): 71-73.

EC (1999) The MEANS Collection: “Evaluating Socio-Economic Programmes”. Office for Official Publications of the European Communities, Luxembourg.

Frey, J. H. and Fontana, A. (1993) The Group Interview in Social Research. In: D. L. Morgan (editor) Successful Focus Groups: Advancing the State of the Art. Sage, Newbury Park.

Garrod, B. (2003) Defining marine ecotourism: a Delphi study. In Garrod, B. and Wilson, J.C. (eds.), Marine Ecotourism: Issues and Experiences, Channel View, Cleveland.

Jeffreys, I. (2006) The use of the Nominal Group Technique for eliciting opinion for policy evaluation. In: Proceedings of the Joint Organic Congress, Odense, Denmark.

Kahan, J. P. (2001) Focus Groups as a Tool for Policy Analysis. Analyses of Social Issues and Public Policy, 1(1): 129-146.

Linda, G. (1982) Focus groups: a new look at an old friend. Marketing and Media Decisions, 17(9): 96-97.

Linstone, H. and M. Turoff (1975) The Delphi Method: Techniques and Applications, Addison-Wesley, Reading.

Liou, Y. (1998) Expert system technology: knowledge acquisition. In Liebowitz, J. (ed.), The Handbook of Applied Expert Systems, CRC Press, Boca Raton.

Loveridge, D. (2001) Who is an expert? Ideas in Progress – Paper number 22, University of Manchester.

Lydecker, T. H. (1986) Focus group dynamics. Association Management, 38(3): 73-78.

Merton, R. K., M. Fiske and P. Kendall (1956) The Focused Interview. Free Press, Glencoe.

Meyrick, J. (2001) The Delphi Method and Health Research, Macquarie Business Research Papers Number 11/2001, Macquarie University, Sydney.

Morgan, D. L. (1988) Focus Groups as Qualitative Research. Sage Publications, Beverly Hills.

Morgan, D. L. and R. A. Krueger (1998) The Focus Group Kit. Sage, Thousand Oaks.

Mullen, P. (2000) When is Delphi not Delphi? Discussion paper 37, Health Services Management Centre, University of Birmingham.

Padel, S. and P. Midmore (2005) The development of the European market for organic products: insights from a Delphi study. The British Food Journal, 107(8): 626-647.

Pill, J. (1971) The Delphi method: substance, context, a critique and an annotated bibliography. Socio-Economic Planning, 5(1): 57-71.

Reid, N. (1988) The Delphi technique: its contribution to the evaluation of professional practice. In Ellis, R. (ed.) Professional Competence and Quality Assurance in the Caring Professions, Chapman Hall, London.

Rowe, G. and G. Wright (1999) The Delphi technique as a forecasting tool: issues and analysis. International Journal of Forecasting, 15(4): 353-375.

Rowe, G., G. Wright and F. Bolger (1991) Delphi: a re-evaluation of research and theory. Technological Forecasting and Social Change, 39(3): 235-251.

Sackman, H. (1975) Delphi Critique: Expert Opinion, Forecasting, and Group Process, Lexington Books, Lexington.

Sink, D. W. (1991) Focus groups as an approach to outcomes assessment. American Review of Public Administration, 21(3): 197-204.

Stewart, D. W. and P. N. Shamdasani (1990) Focus Groups: Theory and Practice. Sage, London.

Turoff, M. (1975) The policy Delphi. In Linstone, H. and Turoff, M. (eds.) The Delphi Method: Techniques and Applications, Addison-Wesley, Reading.

Zanoli, R. (ed.) (2004) The European Consumer and Organic Food. Organic Marketing Initiatives and Rural Development series: Volume 4, University of Wales, Aberystwyth.

C4-8 Annexes

Annex C4-1: Guidelines for applying Delphi from the OMIaRD project

Annex C4-2: Padel, S. and P. Midmore (2005) The development of the European market for organic products: insights from a Delphi study. The British Food Journal, 107(8): 626-647

Annex C4-3: Guidelines for applying NGT from the EU-CEE-OFP project

Annex C4-4: Jeffreys, I. (2006) The use of the Nominal Group Technique for eliciting opinion for policy evaluation. In: Proceedings of the Joint Organic Congress, Odense, Denmark

Annex C4-5: Guidelines for focus group discussions from the OMIaRD project

Annex C4-6: OMIaRD project Italy focus group report
