ORGAPET Section A5:
Planning an Evaluation

Nic Lampkin, Ian Jeffreys
Aberystwyth University, UK

Johannes Michelsen
University of Southern Denmark

Matthias Stolze, Hanna Stolz, Otto Schmid
Research Institute of Organic Agriculture (FiBL), CH

Christian Eichert, Stephan Dabbert
University of Hohenheim, DE

Version 6, April 2008

A5-1    Introduction

This section addresses the practical steps that need to be taken in setting up and managing an evaluation.  It builds upon Section A2, which addressed the principles and specific characteristics of organic policy evaluations that have influenced the design of ORGAPET and that are important considerations in the planning of evaluations.  This section of ORGAPET is closely related to Volume 1 of the MEANS collection (Evaluation Design and Management, EC, 1999) as well as the Designing and Implementing Evaluation section of Evalsed. The section contains three main parts, focusing on a) the planning stages of the evaluation process, b) the specific steps that should be undertaken in the evaluation and c) quality assurance.

A5-2    Issues to consider in planning an evaluation

Various approaches to planning evaluations can be found in the literature.

Vedung (1997) poses eight key questions to be considered when planning an evaluation:

  1. What is the aim of the evaluation? (e.g. control/accountability, improved knowledge, programme modification/change etc.)

  2. How is the evaluation organised? (Who commissions and who carries out the evaluation, and how are they interrelated?)

  3. What is the programme to be evaluated? (There needs to be a clear description to set the boundaries for what is to be evaluated)

  4. What is the public management process between input and output? (How has the decision been implemented and which agency was involved; what was the content; have target groups responded the way they promised; is there evidence of implementation failure?)

  5. What are the results (outputs/outcomes) of the programme? (The focus should not only be on intended outcomes – are there other outputs, e.g. national action plans resulting from the EU action plan? What further outcomes could be included?)

  6. What are the factors explaining the results (programme/others)? (What are the effects/impacts of the policy; what are the causalities involved; how are these influenced by the general context of society?)

  7. What are the evaluation criteria and standards used for assessment? (What value statements are involved; what are the goals; what did stakeholders expect?)

  8. How and by whom is the evaluation to be used? (Intention and reality – possible purposes include instrumental, political, internal reflection).

Patton and Sawicki (1993) suggest a problem-solving approach to planning evaluations which includes six steps:

  1. Verifying, defining and detailing the problem (determine the magnitude and extent of the problem; continually re-define the problem in the light of what is possible; eliminate irrelevant material; question the accepted thinking about the problem; question initial formulations of the problem; say it with data; locate similar policy analyses; locate relevant sources of data; eliminate ambiguity; clarify objectives; resolve conflicting goals; focus on the central, critical factors: is it important? is it unusual? can it be solved?; identify who is involved, and why? what power do involved parties have?; list resources required to deal with the problem)

  2. Establishment of evaluation criteria (what are the important policy goals and how will they be measured?; identify criteria relevant to the stakeholders and to the problem; clarify goals, values and objectives; identify desirable and undesirable outcomes; is there a rank order of importance among the criteria?; what will be the rules for comparing alternatives?; administrative ease; costs and benefits; effectiveness; equity, legality and political acceptability)

  3. Identification of alternative policies (consider a wide range of options; consider the status quo, or no-action alternative; consult with experts; brainstorming, Delphi, scenario writing; redefine the problem if necessary)

  4. Assessment of alternative policies (select appropriate methods and apply them correctly; estimate expected outcomes, effects and impacts of each policy alternative; do the predicted outcomes meet the desired goals?; can some alternatives be quickly discarded?; continue in-depth analysis of alternatives that make the first cut)

  5. Displaying and distinguishing among alternatives (choose a format for display; show strengths and weaknesses of each alternative; describe the best and worst case scenario for each alternative; use matrices, reports, lists, charts, scenarios, arguments)

  6. Implementation, monitoring and evaluation of the policy (draw up a plan for implementation; design monitoring system; suggest design for policy evaluation; was the policy properly implemented?; did the policy have the intended effect(s)?)

Weiss (1998) proposes a key-question approach to planning an evaluation, which involves seven main tasks:

  1. Deciding whether to use quantitative or qualitative methods, or a combination of the two.

  2. Developing measures and techniques to answer the key questions.

  3. Figuring out how to collect the necessary data to carry out the measures.

  4. Planning an appropriate research design (with attention to the kinds of comparisons that will be drawn and the timing of data collection, for example).

  5. Collecting and analysing the data.

  6. Writing and disseminating the report(s) of study results.

  7. Promoting appropriate use of results.

For the planning of an evaluation, further questions have to be answered (Weiss 1998):

  1. Programme Process: Questions about programme process essentially ask what is going on in the programme. The issue here is the fidelity of the programme with respect to the intention of its designer.

  2. Programme Outcomes: An emphasis on programme outcomes directs the evaluation to the consequences of the intervention for its clients. The focus is on the change in clients' situations, although goals sometimes change during the course of programme operation. Outcome questions can also draw on the hunches of programme staff, clients, observers or others about what the real results are, regardless of the rhetoric of intent.

  3. Attributing Outcomes to the Programme: These questions are aimed at finding out whether any changes that are observed over time are due to the programme. The evaluation may want to devote some questions to understanding the extent to which the programme was responsible for the changes.

  4. Links between Process and Outcomes: With information about process, the question of whether particular features of the programme are related to better or poorer outcomes can be analysed.

  5. Explanations: Evaluation is not only aimed at finding out what happened but also how and why it happened. This entails a search for explanations. For the improvement of a programme, the reasons for achievements and shortfalls have to be indicated.

In developing the set of questions to be answered, the following should also be considered (Weiss 1998):

In the MEANS collection Volume 1 (EC, 1999), three factors that affect the form of an evaluation are identified:

  1. the stage of the policy cycle;

  2. the level of decision making;

  3. the scope of the evaluation.

Evaluations can be undertaken at three stages within the policy cycle: ex-ante, mid-term (intermediate) and/or ex-post.

This ongoing process of evaluation runs parallel to that of policy development, reflecting on and assessing the impact of the policy in terms of the observed direct and indirect policy results.

An evaluation must be undertaken for each level of instrument (policy, programme or project). It cannot be assumed that the performance of a programme will simply be the sum, or the synergy, of its component projects; there will be other effects that must be considered. When evaluating projects and programmes, the objectives of the higher-level programme or policy should also be taken into account. In evaluating policy, wider economic, social and environmental goals must be taken into consideration.

Some of the questions identified above are addressed in the remainder of this section of ORGAPET; the rest are addressed in ORGAPET Part B and Part C.

A5-3    Steps in preparing the evaluation

A5-3.1        Defining the purpose of the evaluation

Vedung (1997) poses a critical question: Why is the evaluation being conducted? This is clearly linked to Vedung’s second question: Who is organising the evaluation? As discussed in ORGAPET Section A2, Stockmann (2004) asks whether evaluations are primarily oriented towards information, monitoring, learning or legitimisation. Stockmann identifies some key differences in possible answers (Table A5-1) and concludes that evaluations can be categorised as either formative (process-orientated, constructive or communication-promoting designs) or summative (result-oriented, concluding, accounting) in character.

Table A5-1:   Dimensions of evaluation research

Stage of the programme process | Analysis perspective | Perception interests | Evaluation concept
Formulating the programme/planning stage (ex-ante) | "analysis for policy" | "science for action" | preformative/formative: proactive design, process-orientated, constructive
Implementation stage (ongoing) | both possible | both possible | formative/summative: both possible
Impact stage (ex-post) | "analysis for policy" | "science for knowledge" | summative: summarising, making up the balance, result-orientated

Source: Stockmann (2004)

This is consistent with the MEANS approach outlined in the preceding section and with the Evalsed guidance on defining the object of an evaluation. Both the scope and purpose of the evaluation should be defined.  While the purpose may be summative or formative in nature, the scope defines the programme and the evaluation within set institutional, temporal, sectoral and geographical limits.  Institutional limits may be set at an EU, national or local government scale; temporal limits to a given period of time; sectoral limits to an industry, society or environment (e.g. rural or urban environment); and geographical limits to a country, region or town. However, it might also be worth considering:

  • Is the intention to limit evaluation to the funding of the programme or to include other national, regional or local funding that is, to a greater or lesser degree, directly related to the programme?

  • Is the intention to limit the evaluation to interventions in the eligible area or to extend observations to certain neighbouring areas that encounter similar development problems?

  • Is the intention to limit the evaluation to funding allocated within the programming cycle under consideration or to a certain extent to include funding of preceding cycles?

Evalsed advises keeping the focus of evaluations narrow, particularly for ex-ante evaluations, by precisely defining the scope so that the really central issues are addressed.

In ORGAPET Section A2, we concluded that:

Taking these perspectives into account, it is possible to categorise four different types of evaluations which will place different weights on the different elements of ORGAPET (Table A5-2).

From the perspective of organic action plans, the two types of formative evaluation should focus on:

The summative evaluations should focus on:

Table A5-2: Relevance of ORGAPET Sections to different types of evaluation

ORGAPET section | A: formative, before (ex-ante) | B: formative, mid-term (purpose: refining) | C: summative, mid-term (purpose: controlling) | D: summative, after (ex-post)
B1 Programme development/implementation | No | Yes | Yes | Yes
B1 Status quo analysis/feasibility analysis | Yes | No | No | No
B2 Programme content and failure risk | Yes | Yes | Yes | Yes
B3 Evaluating stakeholder involvement | Yes | Yes | Yes | Yes
C1 Identifying objectives | Yes | Yes | Yes | Yes
C2 Identifying indicators | Yes | Yes | Yes | Yes
C3 Generic indicators/measuring results | Yes (baseline) | Yes (preliminary) | Yes (preliminary) | Yes
C4 Expert judgements | Yes | No | No | Yes
D1 Synthesis | Yes | Yes | Yes | Yes
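Where an evaluation team wants to work through this relevance matrix systematically, Table A5-2 can also be expressed as a simple lookup. The following is a minimal, illustrative sketch in Python; the names SECTION_RELEVANCE, TYPE_COLUMNS and relevant_sections are hypothetical helpers introduced here and are not part of ORGAPET.

# Illustrative sketch only (assumed helper names, not part of ORGAPET): Table A5-2
# expressed as a small lookup so that, once an evaluation has been classified as
# type A, B, C or D, the relevant ORGAPET sections can be listed automatically.

SECTION_RELEVANCE = {
    # section: (relevant for type A, type B, type C, type D), as in Table A5-2
    "B1 Programme development/implementation":     (False, True,  True,  True),
    "B1 Status quo analysis/feasibility analysis": (True,  False, False, False),
    "B2 Programme content and failure risk":       (True,  True,  True,  True),
    "B3 Evaluating stakeholder involvement":       (True,  True,  True,  True),
    "C1 Identifying objectives":                   (True,  True,  True,  True),
    "C2 Identifying indicators":                   (True,  True,  True,  True),
    "C3 Generic indicators/measuring results":     (True,  True,  True,  True),
    "C4 Expert judgements":                        (True,  False, False, True),
    "D1 Synthesis":                                (True,  True,  True,  True),
}

TYPE_COLUMNS = {"A": 0, "B": 1, "C": 2, "D": 3}

def relevant_sections(evaluation_type: str) -> list[str]:
    """Return the ORGAPET sections marked 'Yes' in Table A5-2 for one evaluation type."""
    column = TYPE_COLUMNS[evaluation_type.upper()]
    return [section for section, flags in SECTION_RELEVANCE.items() if flags[column]]

# Example: a formative, ex-ante evaluation (type A)
for section in relevant_sections("A"):
    print(section)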

A5-3.2        Structuring and scheduling the evaluation

An evaluation can be undertaken in three distinct phases:

  1. Deciding on the evaluation;

  2. Drawing up terms of reference;

  3. Launching the evaluation.

The tasks that need to be completed at each stage depend on the type and timing (ex-ante, mid-term, ex-post) of the evaluation. Table A5-3 displays examples of the tasks that should be completed at these different stages.

The timing and frequency of evaluations needs to be considered in terms of the stage in the policy cycle, as discussed above, but also in administrative terms. If carried out too frequently (more than once every three to five years?), the resources required could be substantially greater than the benefits derived, and the evaluation could detract from policy implementation activities. If the evaluation is carried out too late, or the evaluation process takes too long, decisions on the next stage of policy development could already have been taken, and the results are then only of historical or academic research interest.

It is also important that the action plan has enough content, and sufficiently clear targets, to evaluate. Indicators (see Sections C2 and C3) need to be determined and monitored from the outset, and evaluation should therefore form part of the planning of the action plan: if there is no data, the plan cannot be evaluated.

Table A5-3: Steps in preparing an evaluation (adapted from MEANS)

Evaluation types: A = formative, before (ex-ante); B = formative, mid-term; C = summative, mid-term; D = summative, after (ex-post)

Who commissions the evaluation?
  A, B, C: action plan groups, administrations
  D: administrations, researchers, auditors

Deciding on the evaluation

  Defining the scope (all types): what will be evaluated? Define geographical, temporal and funding limits and interactions with the ongoing policy cycle.

  Specifying the motives:
    A: e.g. identifying relevant policy goals and/or measures; improving programme relevance and coherence; identifying baseline/status quo
    B: e.g. proposing reallocation of resources, modifications to (fine tuning of) measures
    C: e.g. preliminary evaluation of outputs, results, impacts; trend analysis
    D: e.g. validating best practice; determining cost effectiveness; basis for future policy choices

  Planning the participation of the main partners in a steering group:
    A: include policy-makers, beneficiary representatives, researchers, other affected stakeholders etc.
    B: as A, plus managers of measures, implementation officials and others working with beneficiaries (e.g. consultants)
    C: as B
    D: including spokespersons of concerned groups (stakeholders – those affected and affecting)

Drawing up terms of reference

  Asking partners to express their expectations; selecting evaluative questions and judgement criteria:
    A: rationale, relevance and coherence
    B: coherence, effectiveness and efficiency
    C: coherence, effectiveness and efficiency
    D: effectiveness and efficiency of results and impacts

  Recalling the regulatory framework and describing the programme:
    A: programme proposal
    B: review and amend programme
    C: as D
    D: describe the programme as it was applied

  Listing available knowledge:
    A: including evaluations of previous programmes
    B: including ex-ante evaluations
    C: including ex-ante evaluations
    D: including mid-term evaluations

  Checking the feasibility of evaluation methods and questions (all types): checking the relevance, effectiveness, efficiency and utility of the evaluation.

  Defining rules of conduct, schedule and budget (all types): including constraints on the scheduling of the evaluation, especially regarding the decision-making schedule.

Launching the evaluation

  Defining skills requirements for the evaluation team and selecting the team (all types): often a mixed team with specific knowledge of the programme area and of evaluations, independent of the commissioner.

  Planning evaluation work, particularly quality control measures (all types): define and implement a quality assurance process.

Source: EC (1999) modified

Evalsed provides further guidance on drawing up terms of reference, which should serve as the contractual basis between the commissioner and the team conducting the evaluation. Typically the terms of reference will cover the regulatory framework, the scope of the evaluation, the main users and stakeholders of the study, the evaluative and research questions, the available knowledge, the main methods and techniques to be used, the schedule, the indicative budget, the expertise required, and administrative requirements for a proposal.
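As an illustration only (not an official ORGAPET or Evalsed template), the elements of the terms of reference listed above could be tracked as a simple completeness checklist; all field and function names in the following Python sketch are assumptions introduced here.

# Minimal sketch only (assumed field and function names): tracking whether the
# elements of the terms of reference listed above have been drafted.

from dataclasses import dataclass, fields

@dataclass
class TermsOfReference:
    regulatory_framework: str = ""
    scope: str = ""
    users_and_stakeholders: str = ""
    evaluative_and_research_questions: str = ""
    available_knowledge: str = ""
    methods_and_techniques: str = ""
    schedule: str = ""
    indicative_budget: str = ""
    expertise_required: str = ""
    proposal_requirements: str = ""

def missing_elements(tor: TermsOfReference) -> list[str]:
    """List the terms-of-reference elements that are still empty."""
    return [f.name for f in fields(tor) if not getattr(tor, f.name).strip()]

# Example: only the scope has been drafted so far
tor = TermsOfReference(scope="Mid-term evaluation of the national organic action plan")
print(missing_elements(tor))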

A5-4    Performing an evaluation

The key elements of performing an evaluation are covered in detail in ORGAPET Parts B, C and D. Table A5-4 summarises the key steps in performing an evaluation and indicates in each case where further information can be found in ORGAPET.

Table A5-4: Steps in performing an evaluation (adapted from MEANS)

Evaluation types: A = formative, before (ex-ante); B = formative, mid-term; C = summative, mid-term; D = summative, after (ex-post). A dash (-) indicates no separate entry for that type in the source table.

Examining the logic of the programme (see ORGAPET Section B1 and Section B2)

  Analysing the strategy and assessing its relevance, including clarity and coherence of objectives:
    A: highly important
    B: is the strategy still relevant in the light of the changing context? are the objectives understood by managers and operators?
    C: is implementation consistent with the original strategy?
    D: what objectives were actually followed and how do they differ from the planned strategy?

  Examining coherence between objectives, resources and action points (measures):
    A: assessment necessary for forward planning
    B: need to ensure continued compatibility to avoid implementation failure
    C: -
    D: does coherence explain the success/failure of the programme?

  Identifying results and expected impacts:
    A: projections, target-setting, cross-impacts matrix
    B: are projections and targets still appropriate?
    C: how does actual uptake compare with targets?
    D: how well have results and impacts been achieved?

  Examining the quality of the monitoring system:
    A: are the proposed indicators appropriate? does baseline data exist?
    B: is the monitoring system capturing useable data?
    C: -
    D: is the data capable of assessing effects?

Examining programme effects (see ORGAPET Part C)

  Selecting and using existing information:
    A: to define the baseline situation (status quo analysis)
    B: to review progress and redirect resources, including monitoring data
    C: -
    D: to provide a basic assessment of uptake, outputs, results and context

  Carrying out additional surveys:
    A: to define the status quo situation
    B: may be needed where data are not available from the monitoring system
    C: -
    D: provides more in-depth knowledge of specific results and impacts

  Estimating results and impacts:
    A: extrapolation from the impacts of similar interventions
    B: more in-depth analysis of specific result and impact mechanisms
    C: -
    D: integrating the full range of data sources, including research and expert judgement

Formulating, validating and utilising the conclusions (see ORGAPET Part D)

  Interpreting the results of surveys and analyses; preparing an impartial judgement:
    A: judgement on the ambition of objectives and the probability of achieving them
    B: judgement on the progress of different measures and their contribution to the success of the programme
    C: -
    D: judgement of the overall success of the programme and its cost effectiveness

  Writing up the evaluation (all types): formulating real conclusions by clearly answering the evaluative questions

  Reflecting and acting on results, in the appropriate stakeholder context:
    A: adjusting objectives, monitoring system etc.
    B: improving measures and retargeting resources
    C: -
    D: highlighting best practice and general lessons learned

  Disseminating results:
    A: e.g. seminar for partners involved in the design of the next programme
    B: e.g. publication of the interim evaluation
    C: -
    D: e.g. seminar for authorities responsible for the programme; publication of the final evaluation

  Monitoring actions taken, including defining who is responsible:
    A: integrating the status quo analysis in the action plan document
    B: integrating conclusions in programme management and resource allocation
    C: -
    D: integrating conclusions in the determination of future policy directions

Source: EC (1999) modified

Evalsed includes further guidance on the use of evaluations, identifying three different ways in which evaluation work is used:

The extent to which the results of an evaluation can be used effectively will depend on timeliness of the evaluation, dissemination and follow-up activities, the institutional arrangements and engagement of senior staff, as well as the engagement of stakeholders and the quality of the evaluation.

A5-5    Ensuring the quality of an evaluation

A key part of effective policy evaluation is to ensure that the quality of the evaluation is sound, otherwise the conclusions drawn may not be robust. Evaluation standards have been developed in many countries and institutions in order to contribute to evaluation quality. Here, the evaluation standards of SEVAL, the Swiss Evaluation Society (Widmer et al., 2000), are presented briefly, as well as the MEANS and Evalsed guidelines.

A5-5.1        SEVAL evaluation standards

The SEVAL standards define the demands of evaluations but do not specify the instruments to be used within the evaluation (Widmer et al., 2000). The standards were formulated in a way that is suitable for all kinds of evaluations, except those of personnel. They are based on the Programme Evaluation Standards of the Joint Committee on Standards for Educational Evaluation. The SEVAL standards, as is the case with most of the evaluation standards, distinguish four sub-groups (Widmer et al., 2000):

A5-5.1.1       Utility standards

The utility standards guarantee that an evaluation is oriented towards the information needs of the intended users of the evaluation. The utility standards include:

A5-5.1.2       Feasibility standards

The feasibility standards call for evaluation systems that are as easy to implement as possible, efficient in their use of time and resources, adequately funded, and viable from a number of other standpoints. The feasibility standards include:

A5-5.1.3       Propriety standards

The propriety standards require that evaluations be conducted legally, ethically and with due regard for the welfare of evaluatees and clients of the evaluations. The propriety standards include:

A5-5.1.4       Accuracy standards

The accuracy standards require that the obtained information be technically accurate and that conclusions be linked logically to the data. The accuracy standards include:

A5-5.2        Assessing quality of an evaluation in the MEANS and Evalsed frameworks

The MEANS framework (EC, 1999, Vol. 1:169 ff.) and the Evalsed section on quality assurance and quality control identify eight quality assessment criteria that should be addressed to identify whether an evaluation report is unacceptable, acceptable, good or excellent:

In reaching an overall assessment of the evaluation, account should be taken of constraints weighing on the evaluation and the team which performed it, in particular whether the terms of reference and the time and resources allocated to the evaluation were realistic and whether the necessary data could be obtained.  In the same section, Evalsed provides further guidance on quality assurance with respect to the evaluation process.
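As a purely illustrative aid (not part of the MEANS or Evalsed guidance), the four-level judgement scale described above can be recorded per criterion so that the weakest aspects of a report are easy to spot. In the Python sketch below, the criterion names and the weakest_criteria helper are hypothetical.

# Illustrative sketch only (assumed names): recording the four-level MEANS/Evalsed
# judgement for each quality criterion and flagging the weakest ones.

RATING_SCALE = ["unacceptable", "acceptable", "good", "excellent"]  # lowest to highest

def weakest_criteria(ratings: dict[str, str]) -> list[str]:
    """Return the criteria given the lowest rating that occurs in the assessment."""
    for level in RATING_SCALE:
        hits = [criterion for criterion, rating in ratings.items() if rating == level]
        if hits:
            return hits
    return []

# Example with hypothetical criterion names and ratings
report_ratings = {
    "relevant scope": "good",
    "reliable data": "acceptable",
    "clear report": "excellent",
}
print(weakest_criteria(report_ratings))  # -> ['reliable data']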

Quality assessments can be undertaken by various groups with various aims. In Table A5-5, possible tasks for each group are presented.

The MEANS framework also defines how to assess the quality of individual indicators or systems of indicators (see Section C2).

Table A5-5: Who assesses the quality of an evaluation?

Assessor | Aim of assessment
Steering group (with specialist help) | Validating final report and verifying robustness
Co-decision-makers | Assessing reliability of conclusions and recommendations
National and European authorities | Assessing reliability of conclusions and recommendations; improving evaluation process
Independent experts | Assessing quality of publicly disseminated output; meta-evaluation/developing professional standards

Source: EC (1999) modified

A5-6    Conclusions

When preparing an evaluation, it is necessary to be clear about the purpose and scope of the evaluation, the type of evaluation (formative or summative) and the stage in the policy cycle (ex-ante, mid-term, ex-post) at which it is to be carried out. Clear terms of reference and quality assurance procedures are required to ensure optimal results. The detailed steps to be followed in the preparation and conduct of evaluations are set out in Tables A5-3 and A5-4. The evaluation team and programme managers should be supported by a steering group, including relevant stakeholders, to assist with the planning, interpretation and final evaluative judgement of the results. A clear dissemination/communication strategy is also required to inform relevant stakeholders about the process and the outcomes of the evaluation, and to ensure that follow-up actions take place.

A5-7    Checklist

The checklists in each section are intended to provide a structured listing of the key issues covered in the text that should be addressed as part of an action plan evaluation. The following questions relate to the material covered in this section of ORGAPET and should be answered at the planning stage of an evaluation. Normally this will be from the perspective of the person/institution organising the evaluation, but it can also be undertaken by others with an interest in the evaluation process.

1. What is the purpose (aims, objectives, desired outcomes) of the evaluation (in your own words)?

2. What is to be evaluated (define the scope, e.g. the national action plan from 2000-2005; be as specific as possible)?

3. Was an evaluation planned for from the outset, with an appropriate monitoring programme and baseline data in place?

4. When (at what stage of the policy cycle) is the evaluation to be carried out (ex-ante, mid-term, ex-post)?

5. What type of evaluation is needed (formative = to assist future planning; summative = to evaluate past actions; or both)?

6. With reference to Tables A5-1 and A5-2, classify the evaluation type as A, B, C or D.

7. Who (which agency/organisation) will commission the evaluation?

8. Who will conduct the evaluation (consultants, stakeholders, others)?

9. What is the timescale (schedule) over which the evaluation should be conducted?

10. How and by whom can the results of the evaluation be used (dissemination and decision making)?

11. Have any relevant evaluations or reviews been conducted previously (the results will be relevant in Part B and Part C)?

12. Will the evaluation meet (or has it met) the quality assurance guidelines (specify whether SEVAL or MEANS or other)?

13. Have clear terms of reference for the evaluation been defined?

A5-8    References

EC (1999) Evaluating Socio-economic Programmes. MEANS Collection Vols. 1-6. Office for Official Publications of the European Communities, Luxembourg.

Patton, C. and D. Sawicki (1993) Basic Methods of Policy Analysis and Planning. 2nd Edition. Prentice-Hall, Englewood Cliffs, New Jersey, USA.

Stockmann R. (2004) Was ist eine gute Evaluation? CEVAL, Arbeitspapier Nr. 9. Centrum für Evaluation, Saarbrücken. 

Vedung, E. (1997) Public Policy and Program Evaluation. Transaction Publishers, New Brunswick, New Jersey, USA.

Weiss, C. (1998) Evaluation Methods for Studying Programs and Policies. 2nd Edition. Prentice-Hall, Upper Saddle River, New Jersey, USA.

Widmer, T., C. Landert and N. Bachmann (2000) Evaluation Standards of SEVAL, the Swiss Evaluation Society.

A5-9    Annexes

No annexes are currently included in this section.