Evaluating Cluster Policies: A Unique Model? Lessons to be Drawn from a Comparison between French and European Experiences.

Although there is a consensus concerning the need for public policy evaluation, there is no stable doctrine regarding the way such assessments should be carried out. Different models coexist or succeed one another; it is, for example, possible to schematically oppose a ballistic model of evaluation “of the action” to an emergent model of evaluation “in the action”. The aim of this article is to analyse the evolution in public policy evaluations and the difficulties inherent in them by studying the French cluster evaluation undertaken in 2008. This evaluation was planned from the beginning as a component of the cluster policy, with the aim of modifying the policy in the light of its initial results. We first put into perspective the doctrines and methodologies underpinning public policy evaluation in general and cluster evaluation in particular. We then study the procedures used in the French cluster evaluation, comparing them to four international cases (Germany, Belgium, Finland and Austria). The analysis is based on a detailed examination of documents relevant to the evaluation, on our empirical knowledge of the French clusters, and on discussions with territorial and national actors involved in the cluster policy. The article reveals the inherent difficulties in cluster evaluation processes. These difficulties are mostly related to the systemic, multi-actor and heterogeneous characteristics of the object “cluster”. Analysing the usage and the effects of the evaluation on the various actors allows us to conclude that cluster evaluation in France is a learning source for the progressive construction of a cluster doctrine and a doctrine of its management. The evaluation, grounded in an interactive approach, becomes part of a larger process, a knowledge process benefiting both the government and the local actors concerned. 
Integrated from the outset into the cluster management system, the evaluation becomes a tool amongst others; it is therefore less consistent with a model of objective, incontestable and independent knowledge production than with an instrument to help decision-makers forge their choices.

1. INTRODUCTION

Since Marshall (1920) it has been apparent that geographical concentrations of firms and economic actors, known as districts or clusters, can generate positive effects on economic growth in specific territories. Marshall's results are confirmed by the success of clusters like Silicon Valley and the Italian districts. Numerous studies (Becattini, 1989; Piore and Sabel, 1984; Porter, 1998; Rosenberg, 2002; Saxenian, 1994) have highlighted the characteristics and advantages of such organisations.
Such examples of localised organisations have since become paragons of the genre. Most of them developed organically with no initial shared strategic vision. In the 1990s, their proven track records prompted federal and regional governments in most countries to introduce cluster development policies designed to encourage the creation of the kind of synergies observed in such spontaneously evolving clusters and to generate sources of competitiveness in their territories (Saublens, 2007). Even if the concept of clusters remains fuzzy (Markusen, 2003; Martin and Sunley, 2007), the public resources devoted to such policies are generally far from negligible.
The desire to account for the use of such funds, to measure the effects of such policies and to improve or reorient them means that public authorities are keen to examine the impacts of their policies and to assess them to ensure that objectives are being met. Over the course of the last few years, two approaches to the evaluation of cluster policies have emerged. On the one hand, studies have underlined the importance of evaluations in terms of learning and of the transmission of knowledge (Potter, 2005). On the other, academics have attempted to develop frameworks for evaluating clusters (Colgan and Baker, 2003). Diez (2001) highlighted that approaches to assessing clusters are changing and that "more creative ways" are needed.
Participatory evaluation, he suggests, would be an appropriate way in which to assess the specificities and characteristics of clusters. Nevertheless, a consensus emerges from such studies: evaluating cluster policies is a difficult task and precise measurements and frameworks have yet to be developed.
This consensus derives, notably, from the fact that clusters, and government-supported clusters in particular, only emerged relatively recently. This raises the following questions: How should clusters and cluster policies be assessed? Is there one particular evaluation method or does each individual policy require a specific assessment approach? Is it possible to propose an evaluation model for public policy concerning clusters?
In order to contribute to current debates on the subject and to propose a series of tentative answers to these complex questions, we have based our approach on a comparison between various evaluation techniques used in a number of European countries. More precisely, we focus on the evaluation recently conducted in France. In 2005, the French government introduced an ambitious approach to the development of clusters, referred to as the "pôles de compétitivité" 2 ("competitiveness clusters") policy. The policy is designed to stimulate innovation and competitiveness amongst firms and within territories, principally by means of providing financial support to joint R&D projects involving enterprises and research centres.

Seventy-one competitiveness clusters received official accreditation between 2005 and 2007.
Accreditation provides the right to apply for a certain number of public subsidies both in terms of running the cluster and supporting R&D projects. However, the government has no direct influence on the way in which clusters are organised or on their employees. All seventy-one French clusters were evaluated in 2008 within the framework of a major operation requested by the government agencies responsible for overseeing their development.
Working on competitiveness clusters in a multi-disciplinary research team since 2007, 3 we have carefully monitored the development, results and effects of the national evaluation carried out in 2008. In addition to the overall results, we were able to access some of the raw material used by the assessors. Furthermore, we repeatedly interacted with the bodies responsible for requesting the evaluation and with the assessors themselves. This allowed us to gain a greater understanding of the issues at play in the evaluation, the methodology applied and the way in which the results were presented. Lastly, we interviewed the directors of the clusters to ask them about their impressions of the evaluation process. Other European governments that introduced cluster policies have also produced evaluations. After reviewing the literature, we chose four cases (Germany, Finland, Belgium, Austria) which appeared to us to be significant and which were sufficiently well documented to provide elements of comparison with the French evaluation. In Austria, in addition to the literature review, we were able to interview the directors of clusters about the way evaluations were carried out there.
2 Throughout the article, "pôles de compétitivité" or "competitiveness clusters" will be referred to using the generic term "clusters".
3 The authors belong to a team of researchers in management and economics funded by the Agence Nationale de Recherche (French National Research Agency) and ESCP Europe (a business school). They use a variety of approaches including empirical work (both qualitative and statistical) on certain clusters or certain themes linked to clusters.
After outlining the issues and problems inherent in all evaluations, particularly those concerning cluster policies (Section 2), we provide a detailed presentation of the evaluation of cluster policy carried out in France in 2008 in order to identify the approaches used and the difficulties encountered therein (Section 3). We then present four examples of foreign cluster evaluations (Section 4) analysed using the same methodology as that applied to the French case. We then compare those evaluations with the French situation in order to assess whether they are based on a shared methodology and whether they present similarities in terms of the questions asked, the definitions of the parameters of the object evaluated, the difficulties encountered, and the use made of the evaluations (Section 5). Lastly, we conclude that, even if the various evaluation approaches do have points in common, there is no "unique model", and we examine the determinants of the differences observed.
2. EVALUATING CLUSTERS: A CONTEMPORARY NEED

The emergence of cluster policies in a large number of countries can be explained fairly easily. Schematically, in a context in which international competition is exacerbated by globalisation and in which developed countries have witnessed a growing trend for first their production and then their R&D capacities to delocalise towards emerging countries, the success of a number of spontaneously developing clusters, the primary example of which is Silicon Valley (Weil, 2010), has prompted governments to employ a voluntaristic approach to supporting the emergence and development of clusters. A range of theoretical studies (for a literature review see Fen Chong, 2009) have demonstrated that all such policies involve substantial public investment (public funding, tax breaks, etc.). Factors explaining the increase in the number of evaluations requested by governments include a desire to account for the use of such funds and to measure the effects of such policies, to improve and reorient the policy, as well as to dispel persistent doubts about the impact of clusters on economic growth (Martin and Sunley, 2003; J. Potter and Miranda, 2009; Torre, 2008).

The current debate on the evaluation of public policies
It is not our intention either to provide a detailed overview of the history of the evaluation of public policy in France (Theonig, 2002) or to discuss the various theoretical tenets underpinning different conceptions of evaluation (Chanut, 2009; Stame, 2009). We will content ourselves with noting that in France, after an influx of American research approaches in the 1970s, evaluation developed in a timid manner before being institutionalised by a decree promulgated in 1990 which, however, did not entirely guarantee its primacy. Thus, even if a certain number of evaluations were carried out in France after the decree, observers were nevertheless critical about the breadth and scope of such practices.
Evaluating public policy consists in "examining the efficiency of the policy by comparing results to pre-defined objectives and the resources used to achieve them" (Decree of 18/11/1998). Beyond this apparently clear definition lurk a number of relatively well-known difficulties or dilemmas. We will content ourselves with listing them briefly below:
• Formulating objectives is rarely simple, and the work of assessors often consists in elaborating questions pertinent to specific policies.
• Evaluations are carried out at different times in the lifespan of a policy (ex ante, ex post, ex itinere), a fact which considerably modifies their status, their relevance, and the use that can potentially be made of them. This range of approaches implies various choices and constraints, many of them political in nature.
• The methods used, however rigorous they may be, cannot be purely quantitative. Narrowly positivist approaches are eschewed, and the methods are thus informed by techniques derived from the social sciences, even though the results generated by such techniques are, in a certain regard, relatively subjective. What kind of balance should be struck between quantitative and qualitative methods?
• Although the participation of various stakeholders, notably end users or beneficiaries, constitutes a source of knowledge, evaluations cannot be transformed into negotiating tools; in other words, the conclusions reached by evaluators cannot be reduced to the status of expressions of compromise. How can the two objectives be reconciled?
• The interaction between an evaluation and its ramifications is a complex phenomenon; advocates of the evaluative approach eventually had to admit that it could by no means be considered of central importance to the political or administrative decision-making process (Lacasee, 1995). In fact, few changes are made to policies as a result of evaluations, the effects of which tend to be diffuse and indirect.
These questions could, depending on the answers elicited by them, play a role in the renunciation of a model termed by some "ballistic" (Pardioleau, 1982) or "epidemiological" (Stame, 2009) in which evaluation is seen as the last link in the chain of a process of public action designed to be sequential and linear where the series "objectives-means-results" is followed by a corrective phase made possible by a rigorous process of objective evaluation.
For some authors, the French paradigm is, to use the terminology developed by Chanut (2009), characterised by assessment "in" rather than "of" action, which, amongst other things, renders evaluation a quasi-continuous, rather than specifically ex post, process, thus blurring the boundaries between evaluation and management or management control and shifting responsibility for assessment from the "experts" to the managers themselves. Nevertheless, as we will see in terms of the evaluation of cluster policy in a sample of four different countries as well as in France itself, the distinction between the two paradigms is not always quite so clear. Indeed, the doctrines governing evaluation tend to vary.

How should a cluster policy be evaluated?
The need for evaluation having been established, the question remains what should be assessed and how? 4 In order to outline the principal approaches to evaluation, we will refer to the report published by the BIPE 5 on behalf of the DIACT 6 in 2007 (BIPE, 2007). The report, designed to help the DIACT prepare the evaluation of French clusters, contains a summary of the various approaches to assessment applied abroad.

What should be evaluated?
One of the traditional difficulties involved in evaluating public policy is the variety of potential angles of attack hidden by the term "evaluation". Specialists habitually distinguish at least five notions characterising public policy: relevance, coherence, efficacy, efficiency and systemic impact. The question of "performance" refers directly to the notion of efficiency in terms of the degree to which the objectives outlined in the policy have been attained. But it also refers to the notion of relevance (in that any examination of performance inevitably poses questions about "action theory", the theory underlying the policy) and to the notion of systemic impacts (the effects could be a good deal more wide-ranging than those identified a priori).
4 We have made use of a comparative bibliographical study on the subject conducted by Lefebvre & Pallez (2009) for the Chair of Entrepreneurship of the Paris Chamber of Commerce and Industry.
5 The BIPE is an economic and strategic consultancy firm (www.bipe.com; consulted on 15/07/2010).
6 « Délégation interministérielle à l'aménagement et la compétitivité des territoires » (DIACT) (« Inter-Ministerial Delegation for Regional Development and the Development of Competition ») functioned from 2005 to 2009. Before 2005, the delegation was known as the DATAR (« Délégation à l'aménagement du territoire et à l'action régionale », or « Delegation for Territorial and Regional Development »).
In so far as cluster policies are concerned, it is possible to distinguish a number of different levels of evaluation partially linked to these general notions as well as to actors with different interests:
• Firstly, the efficiency of cluster policies at the national level as opposed to the impacts of a given cluster.
• Secondly, those interested in evaluating clusters can distinguish: results in relation to the aspired objectives, organisational efficiency (projects, governance, piloting, etc.), the impact of individual clusters on the territories in which they are located and on the economic dynamic of those territories, the impact of clusters on their actors (enterprises, research, local authorities) or the results of certain projects and initiatives carried out within clusters.
As is illustrated by the graph below, these various elements are not independent of each other.
The graph succinctly outlines the difference between evaluating a cluster policy and assessing the cluster (from the perspective of the way in which they are organised, the results they produce, and their impacts). In spite of its simplistic character, the graph also reveals the potential determinants of the results achieved by clusters, which can be linked to the public policy in question, to internal organisation, and to external factors. In addition to these explanatory factors, clusters also possess "inherited" characteristics which are not apparent in the BIPE schema. By the term "inherited" we mean the configuration of the actors concerned, the kind of resources available, and the links between the various actors and the territory prior to the creation of a cluster (Fen Chong, 2009).

Graph 1: Levels of evaluation of an innovative cluster
Source: BIPE (p.6, 2007)

These remarks provide an insight into the variety of approaches to the evaluation of clusters around the world. The graph also highlights the fact that the evaluation (or self-assessment) 7 of the organisational efficiency of clusters, on which emphasis is often placed, is merely one of the many factors enabling us to assess cluster policies.

Who requests the evaluation and when?
Evaluations are undertaken on the request of particular actors at specific times. These two variables (the origin and time of the request) influence the nature of evaluations to the degree that the nature of the questions posed differs according to the type of actor and the phase of development of the cluster. Most evaluations of cluster policies are carried out at the initiative of the public actors responsible for developing and funding those policies. Moreover, in France (and this is also true for other countries), the evaluation was carried out at a relatively early stage in the development of the policy.

2.2.3. What methodology, which indicators?
One of the difficulties inherent in approaches to evaluation examined in this study is described below:
• In theory, evaluations should be carried out in view of objectives defined ex ante, on the basis of a comparison with a point of reference, which is itself defined and characterised prior to the introduction of the policy.
• In practice it can be observed that, generally speaking, the networks of actors in a cluster were already in place, that the development dynamic already existed in an embryonic form, and that the objectives of the cluster change over time. These factors render all evaluations delicate. This is particularly true in terms of introducing additional elements to the policy.
7 Sometimes referred to as an "audit".
Consequently, many evaluations use ex post methodologies (in which results are not assessed on the basis of initial objectives). Such evaluations are rarely normative, and they are as prospective as they are retrospective.
The criteria used to evaluate clusters generally focus on organisation and results. The operational efficiency of clusters is judged on the basis of indicators such as the "number and cost of projects supported", and scientific and technological performance by the "number of patents and licences generated." Meanwhile, economic performance is gauged by highly traditional indicators covering the growth and health of firms (turnover, added value, exports), as well as job creation, enterprise creation, and direct investment within the territory. More "intermediary" results-based indicators are often applied: funds dedicated to R&D projects, total investment.
Having provided a general outline of the approaches characterising cluster policy evaluations, we will now present a specific case that we were able to examine in detail: the evaluation carried out in France in 2008. We will then analyse the way in which the results of the evaluation were used and demonstrate that it was designed as a piloting tool.

Evaluating clusters: a presentation
In France, cluster policy, unlike a number of previous policies, integrates the issue of evaluation from the very outset. The State had decided to evaluate its ambitious and costly policy, 8 introduced in 2005, after three years, with a view to using the results, if necessary, to modify the initial doctrine. Although not entirely original, 9 this particularity is nevertheless worth pointing out. Which approach was then selected by the French State, the sponsor of the evaluation?
All evaluation approaches presuppose that objectives should be defined and appropriate methods and indicators selected. After establishing a list of specifications based on the framework provided by the BIPE (see above), the DIACT launched a call for tender in 2007.
The tender was won by two consultancy firms, CM International (CMI) and the Boston Consulting Group (BCG).
The mission of the two consultancy firms consisted in providing an analysis of the strategic orientation of national policy and gauging the efficiency of the approach, cluster by cluster, in each of the country's seventy-one clusters. The ambitious nature of the mission should be noted. The evaluation focused on three major themes: the policy's relevance/coherence, the way in which it was implemented, and its initial effects (it should be observed that, in view of the fact that the policy had been introduced so recently, the assessors regarded its economic impact as of only secondary importance). Meanwhile, the evaluation of the seventy-one clusters was based on the analysis of three factors: their dynamic, the way in which they were structured, and their R&D projects. Satisfactory, mediocre and insufficient results were defined for each field.
The methodology encompassed the analysis of documentary sources, interviews, and meetings with the actors and organisations concerned, as well as a qualitative and quantitative survey carried out by means of questionnaires sent to the clusters prior to the interviews. The interviews followed a formalised procedure in order to ensure that they were comparable and to guarantee that the evaluation was balanced. In total, over one thousand people 10 were interviewed. The questionnaire was established on the basis of a test carried out on four "pilot" organisations designed to take into account the diversity of the country's seventy-one competitiveness clusters. The operation was closely monitored by the bodies requesting the evaluation, which made it possible to modify the methodology and the questions asked on an ongoing basis. The process included a weekly meeting with the DIACT, frequent contacts with the DGE, 11 a monthly inter-ministerial committee, a steering committee which met every two or three months, etc. The results of the evaluation were presented to the steering committee in June 2008, after which a government press release was immediately published. The summary documents were posted on the Internet and feedback was given to each cluster in the form of "contradictory interviews".
The principal conclusion of CM International and the BCG was that the "organisation of competitiveness clusters seems to be sufficiently promising to warrant a continuation of the general outlines of the policy" (p. 2, CM International and BCG, 2008). Nevertheless, it was recommended that (a) the actors of the clusters should assume more responsibility; (b) project funding mechanisms should be optimised to ensure a greater degree of coherence; (c) cluster policy should be integrated more closely with overall research policy; and (d) the strategic piloting of the approach should be further developed. In terms of the evaluation of the individual clusters, the evaluation recommended a three-tier classification based on three key areas (strategy, governance, and the capacity to develop R&D projects):
• Clusters (39) which had "attained the objectives of the cluster policy"
• Clusters (19) which "had partially attained the objectives of the cluster policy and which must focus on making improvements in certain areas"
• Clusters (13) which "could benefit from making thoroughgoing changes."
In total, over 80% of the French clusters either totally or partially attained their objectives.
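The "over 80%" figure can be checked directly from the three counts reported above. The short sketch below is only an illustrative arithmetic check, not part of the evaluators' method; the dictionary name and the use of the labels A, B and C for the three tiers are assumptions made for the example.

```python
# Illustrative check of the shares implied by the three-tier classification.
# The counts (39, 19, 13) come from the evaluation summary quoted above;
# the labels "A", "B", "C" and the variable names are assumptions.
category_counts = {"A": 39, "B": 19, "C": 13}

# Total number of clusters evaluated.
total = sum(category_counts.values())

# Share of clusters that totally (A) or partially (B) attained the objectives.
share_attained = (category_counts["A"] + category_counts["B"]) / total

# Percentage share of each category, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in category_counts.items()}

print(f"Clusters evaluated: {total}")
print(f"Totally or partially attained: {100 * share_attained:.1f}%")
print(f"Share by category: {shares}")
```

The counts sum to the seventy-one accredited clusters, and the combined share of the first two tiers comes to roughly 82%, which is consistent with the statement in the text.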
After these conclusions were drawn, the government took a certain number of decisions 12 which constituted what has come to be known as Cluster Policy 2.0. We will discuss those decisions in Chapter V. For the moment, the close links between the results of the evaluation and the introduction of the second phase of the policy should be noted.

How was the evaluation used?
Evaluations are only of any real value if they raise questions about the way forward. Yet the French State is often criticised for inadequately evaluating its policies and, in any case, for not following up those evaluations it does carry out (Duranton, 2007). How have evaluations been used by the various stakeholders and what effects have they had? We will initially examine the State's appropriation of the results of evaluations before analysing the potential effects of those evaluations on clusters and then, finally, considering the lessons learned for the following evaluation.

An evaluation whose function is to reorient policy
We have already noted that this evaluative approach is a part of a generalised trend which has made the act of "giving an account of oneself" an integral part both of economic life (Dumez, 2008) and, in terms of public policy, of democratic life. It therefore comes as little surprise that the CMI-BCG evaluation was followed by other "evaluations" requested, successively, by the two chambers of the French parliament (the Assemblée Nationale and the Senate), as well as by the Cour des Comptes, the French national auditing body. These various initiatives led, amongst other things, to the conclusion that more precise indicators were required in the relatively traditional perspective of "end" results (job creation, for example).
The CMI-BCG evaluation, carried out at the request of administrative bodies responsible for overseeing the clusters, seems to us to be of a different nature. Even though the political communication concerning the results generated by the cluster policy was of course an important objective, the process of the evaluation was also explicitly developed as an element for piloting the cluster. We will outline how that was translated in practice.
The French President had announced the principle of renewing cluster policy a year earlier, in June 2007. Nevertheless, the evaluation demonstrated the relevance of the approach in terms of providing the actors involved with an adequate structure and developing a territorial strategy, and prompted the State to confirm that its policy would be extended for a further three years, with the initial budget of 1.5 billion euros renewed.
Although the decision to extend the national competitiveness cluster policy cannot be imputed to the evaluation, certain modifications can be. Based on the results of the evaluation, it was decided to introduce changes to the original policy, thereby commencing a phase known as "Version 2.0". Without attributing the modifications to the State's role in piloting the programme solely to the evaluation's recommendations (there were numerous interactions between the clusters and representatives of the State), it is undeniable that some of the newly introduced approaches directly exploit the results of the evaluation. We will comment on two of these approaches which, in our view, are significant. 13

- The formalisation of the strategy

The evaluation highlights the clusters' weakness in terms of elaborating and formalising their strategies. In consequence, the State requested the clusters to draw up a series of road maps and implement a contract between the State, the local authorities and the individual clusters.
The aim of the road maps was to present the strategy of individual clusters and provide a basis for the development of the multi-annual performance contracts in which the cluster undertook to implement strategic objectives and action programmes, as well as target agendas and indicators designed to monitor results. The bodies overseeing the programme thus demonstrated a desire to provide a more precise framework for the development of clusters compared to the initial phase in which the objectives of individual clusters were relatively vague.

- Implementing indicators
The insistence on developing indicators is particularly interesting in that it reveals the State's concern with future evaluations. These indicators are to be implemented by the clusters themselves and delivered to the State on a yearly basis, thus providing an annual report and facilitating the evaluation of all seventy-one clusters. An initial series of around thirty indicators are applicable to all clusters (enterprise creation, the number of R&D projects receiving public funding, number of patents lodged, etc.). These indicators should make it easier to compare individual clusters. But the State has also taken the diversity of clusters into account by requesting that each one of them provide specific indicators in order to assess their development relative to their own characteristics.
This twin-pronged approach to evaluation (inter-cluster comparison and the evaluation of the clusters' internal dynamics) reveals an important evolution in the doctrine: the model of a "good cluster" towards which all clusters should strive has been abandoned; on the contrary, in order to make it possible to analyse the dynamics of individual clusters according to their specificities, the management structures of the clusters themselves are invited to provide relevant indicators. It could even be said that, by means of these indicators, new schemas of causality could be suggested by the results generated by the clusters.
In any case, these modifications can be interpreted as the result of a learning process on the part of the bodies responsible for oversight, obtained notably through the evaluation process, enabling them to access knowledge which was unavailable to them at the time at which the policy was originally launched. This knowledge can be summarised as follows: a recognition of the existence of the clusters' diversified development approaches, together with a range of shared indicators adjudged to be relevant; and the affirmation of the link between the development dynamic and the detailed formulation of a strategy, which the clusters are then encouraged to establish through formal contracts.

An evaluation designed to mobilise the clusters
The evaluation can also directly affect the clusters. In effect, through the knowledge of individual clusters that it provides, and the comparisons made possible by it, the evaluation can sanction, pilot and mobilise the clusters. It is worthwhile analysing the position taken by the State in this regard.
The most visible effect of the evaluation is the division of the clusters into three different categories. The classification, which separates the clusters which have fulfilled the objectives from those which have not, was to a certain degree perceived as a sanction by the thirteen clusters classified in Category C. On the other hand, the classification symbolically rewards clusters regarded as "good students." As is the case with all classifications, the clusters receiving most criticism questioned the relevance of the criteria used. For example, the competitiveness of certain clusters was said to be less dependent on radical technological innovations than on use-based innovations (e.g. the Child cluster) or on the availability of a labour force equipped with specialist skills and on the organisation of the industrial landscape (e.g. the Burgundy Nuclear Cluster). These clusters considered that the assessment to which they had been subjected did not adequately take into account their specificities.
For the bodies responsible for oversight, various attitudes to the results of the evaluation could be envisaged. A decision to sanction under-performers could have been taken immediately; "bad" clusters could have been disaccredited, and an emphasis placed on more efficient ones. Conversely, the authorities could have chosen to do nothing, hoping instead that poorly performing clusters would be scared into action by the results of the evaluation. It seems that the State opted for a compromise solution by offering clusters with poor evaluations a year in which to reorganise and restructure. In fact, this transitional period was eventually extended to almost two years. However, in May 2010, it was announced that six of the thirteen Category C clusters had had their accreditation revoked.
Initially, this position had a number of interesting effects. [14] Once the initial period of misunderstanding had passed, Category C clusters began to modify their strategy and operational approaches. Clearly, this attitude was informed by the credible threat of having their accreditation revoked, a serious symbolic sanction with potentially grave consequences within the territory in which the cluster was located (increased risks of delocalisation, negative effects on the morale of private actors, researchers deciding to focus on more academic subjects, etc.). It should be added (and this may appear strange to foreign observers) that the importance accorded to accreditation is probably a deep-seated characteristic of French national culture, where marks of excellence handed down by the State are held in extremely high regard. Nevertheless, in France, the consequences of this type of accreditation are more than merely symbolic.
By taking such an approach, the authorities played on the "sunshine regulation" effect, simply publishing the results (including an implicit comparison with other clusters), albeit strongly advising Category C clusters to "put their house in order" within a certain timeframe or risk serious consequences. More generally, even in the case of A and B Category clusters, the observations of assessors and the publicity generated by the results encouraged a more critical attitude which, in some cases, led to certain operational approaches being called into question. This is hardly surprising in view of the fact that a further evaluation, based on the results of the 2008 assessment, will be carried out in 2012.
[14] Nevertheless, it is clear that this waiting phase lasted too long.

Are there lessons to be drawn in view of future evaluations?
It seems that the evaluation recently carried out in France could have far-reaching consequences for future evaluations in two areas, namely an objectivisation of the judgments made by evaluators, and a recognition of the diversity of the country's clusters.

-The objectivisation of judgments
In terms of the objectivisation of data and the inherent rigour of the method applied, the development of route maps and performance contracts should make the next evaluation easier in that objectives will be more clearly formulated and normalised indicators will have been defined. Furthermore, the first evaluation will provide a reference point making it possible to objectivise the notion of trajectory. The process is increasingly characterised by norms and routines, which should improve the quality of future evaluations.
The reliability of data is set to become an even more central issue in that, in future, the clusters will provide a certain amount of the information on which they will be evaluated.
Even if there is some risk in this regard, we are of the opinion that it is offset by the advantages accruing from economies made in the collection of data and the virtuous circles created within clusters by this obligation. Furthermore, a comparison of quantitative data from the point of view of the actors, which should be rendered systematic by the oversight bodies with a view to ensuring reliability, could in itself be a source of knowledge, notably in terms of reinventing schemas of causality explaining performance. But such an approach implies a less positivist conception of truth, focusing as it does on "not unlikely" truths.
-More effectively taking the diversity of clusters into account
Furthermore, as has been pointed out above, one of the issues of future evaluations will be to take into account the diversity of clusters. Comparisons should be made within homogeneous categories in which comparability between clusters in terms of development trajectories has been confirmed. The data collected by CMI and BCG could be used to create such categories.
On what basis should these categories be constructed? Traditionally, differences are primarily described in terms of the sectors to which various clusters belong. But the wealth of data collected during the initial evaluation should make it possible to go beyond such simple and intuitive typologies. For example, Colgan and Baker (2003) suggest dividing the clusters of the state of Maine in the United States into three groups. According to the authors, this classification, based on the nature of the resources used by clusters (technological, natural, others) makes it possible to differentiate between them and improve the way in which they are piloted by the public authorities.

Using the databases built up by the CMI-BCG assessors and the Observatoire des Sciences et des Techniques, we have begun to develop our own typologies. For example, our initial analyses revealed substantial differences in terms of the "heritage" of competitiveness clusters, notably in terms of the R&D potential of the territories in which they are based, which could explain the diversity of projects undertaken and results obtained. In effect, projects set up by clusters with different R&D potentials cannot a priori be considered identical. By applying the hypothesis that, in certain development schemas, R&D projects are an essential source of innovation, these data could be used to compare the development dynamics of different clusters.
It thus seems that evaluations led to the formulation of new questions of great importance in terms of the development and piloting of clusters.
We will briefly describe the emergence of cluster policies in the four countries before moving on to characterise the various approaches applied to carrying out evaluations. A comparison with the French case will be presented in Chapter V.

THE EMERGENCE OF CLUSTER POLICIES IN BELGIUM, GERMANY, FINLAND AND AUSTRIA
In Finland, the debate concerning the need for a cluster policy was triggered by the publication of the article "The Competitive Advantage of Nations" (1990) by Michael Porter (Pentikäinen 2000). Eight cluster programmes were eventually introduced in 1997 (Rouvinen and Ylä-Anttila 1999). They were provided with subsidies to support R&D programmes based on general criteria such as the importance of joint-projects and strengthening links between the public and private sectors. The first evaluation of those programmes was carried out in 2000 (Pentikäinen 2000).
In the mid-1990s, Germany made the transition from a traditional industrial policy to a cluster-based industrial policy (Dohse, 2007). An initial selection was made in 1996 on the basis of a national "competition", BioRegio, in the field of biotechnology. Other competitions of the same type (for example, BioProfile) followed. Three regions were [...]
In Austria, responsibility for cluster policy falls to the regional governments rather than to the federal government. [18] Since the 1990s, [19] each of the country's regions has developed its own clusters. The majority of regional financial support is targeted at encouraging the structural development of clusters and the services they provide rather than at major research projects, which continue to depend on traditional sources of support. Lower Austria, [20] which has created six clusters since 2001, is the region we studied in most detail. The management structure of each cluster operates under the umbrella of the regional economic agency, EcoPlus. Two of the major objectives of cluster management in Lower Austria are to boost the competitiveness of local SMEs and to help them break into new markets (by means of joint national and international projects). In 2004, EcoPlus requested an initial independent evaluation of its cluster policy with a view to improving its original approach.

FOUR EVALUATION APPROACHES
We will now briefly describe the four approaches to evaluation using the same type of questions posed in Chapter II:
• The bodies requesting the evaluations: The four evaluations were each carried out for the governmental authorities responsible for instigating the policy.
• The time at which the evaluation is carried out: The German evaluation differs considerably from the Finnish, Belgian and Austrian evaluations in terms of when it was carried out. In fact, German clusters were evaluated ten years after they had been set up, a fact which made it possible to assess the programme's economic results (added value, for example). In Finland, Belgium and Austria, evaluations were carried out within three years of the programme having been set up, with a resultant emphasis on an assessment of the programme itself.
• The object of the evaluation: Our analysis of clusters outside France reveals the diversity of the subjects evaluated. In effect, in the four cases studied, assessors focused on cluster policy. In Germany and Wallonia, economic performance was also measured. Lastly, in Wallonia and Austria, organisational aspects and the services offered by the clusters' management were also evaluated.
• Methodological choices: German assessors opted for a comparison with a control group (regions which were not being subsidised), while the three other evaluations featured qualitative and quantitative data and were more descriptive in nature.
[18] … even if the national government has recently begun to place greater emphasis on coordinating regional initiatives.
[19] In 1995, Styria became the first Austrian region to develop cluster initiatives.
[20] In this paper, it is to this region that we refer whenever Austria is mentioned.

MAKING PROCESS
German evaluators found that the BioRegio and BioProfile competitions encountered a good deal of success in terms of enterprise creation, job creation and international visibility.
Placing an emphasis on the "end" results, they recommended a continuation of the "competition"-based policy designed to distribute funds to the most promising projects. In this regard, assessors are in line with the tested doctrine of such programmes based on selection and incentives.
Evaluators of the Finnish programme focused on describing its shortcomings, highlighting the fact that private companies were loath to become involved, probably because it was run by a public body. Furthermore, the evaluators noted that most of the budget was allocated to short-term projects. Consequently, no long-term collaborative projects were developed. Hertog and Remoe (2001) pointed out that approximately one in three joint-projects had been set up with the sole aim of receiving public funding.
In terms of the selection of cluster projects, the evaluators of the Belgian programme recommended focusing on the most promising programmes. Nevertheless, rather than scoring the clusters hierarchically, they presented a detailed evaluation of each individual cluster. The Wallonian government eventually took the decision to stop funding two of the region's four clusters.
The evaluators of the Austrian programme focused above all on the services offered by clusters. For example, they recommended that initiatives should not only be more wide-ranging the older the cluster was, but also that they should reflect client needs. They also recommended that knowledge accumulated by clusters should be developed and rendered more accessible, both for EcoPlus and for member companies (for example, what problems are faced by individual groups of enterprises?). Furthermore, they highlighted the need for more inter-firm cooperation and for closer links with national research funds, especially the TIP (the research and consulting programme for Austria's research, technology and innovation policy).
Evaluators' recommendations varied in tone and had differing direct effects on the decision taken by the bodies which had requested the evaluations.
By means of this analysis we will attempt to identify similarities and differences in approaches to evaluating clusters.
Divergences have already become apparent. In France and Belgium as well as, to a certain degree, in Germany, the evaluation of the policy is based on the evaluation of the cluster, which provides information about how the policy has been implemented. In Finland, only the policy is evaluated, while in Austria emphasis is placed on the services provided by clusters.
We will therefore examine the various facets of evaluation by analysing the following points:
-Difficulties inherent in objectivation.
-Questions concerning the interpretation of results.
-The question of the diversity of clusters.
-The link between evaluations and political decisions.

The question of objectivation
The various examples of cluster policy evaluation reveal the existence of ongoing problems concerning both the reliability of data [21] and measurement: numerous phenomena are either not measured, badly measured, insufficiently measured or difficult to measure.
Firstly, some of the phenomena observed can only be analysed qualitatively, which implies problems in terms of objectivation and comparability. How, for example, should the operational aspects of a cluster be measured?
But, even when a phenomenon can be quantified, indicators are always reductive. In France, for example, the success of a cluster in building up a network is measured by counting the number of "members". But this indicator does not take into account the number of "sleeping" partners and can have the perverse effect of encouraging directors to "recruit" members without examining the degree to which they are committed to the activities of the cluster. It is for this reason that, in Austria, the level of "participation" of member firms in administrative bodies and collective events is measured. But have we really measured the phenomenon, namely the construction of a network of actors?
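The contrast between a raw membership count and a participation-based indicator can be illustrated with a minimal sketch. It is purely illustrative: the record fields (`events_attended`, `sits_on_board`), the activity threshold and the sample data are assumptions made for the example, not the indicators actually used by the French or Austrian assessors.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical record for one cluster member (fields are assumptions)."""
    name: str
    events_attended: int   # collective events attended during the period
    sits_on_board: bool    # takes part in an administrative body


def member_count(members):
    """Raw indicator: counts every member, including 'sleeping' partners."""
    return len(members)


def participation_rate(members, min_events=1):
    """Share of members that sit on a body or attended at least min_events."""
    if not members:
        return 0.0
    active = [m for m in members
              if m.sits_on_board or m.events_attended >= min_events]
    return len(active) / len(members)


# Invented data: two active members, two "sleeping" partners.
cluster = [
    Member("Firm A", events_attended=4, sits_on_board=True),
    Member("Firm B", events_attended=0, sits_on_board=False),
    Member("Lab C", events_attended=2, sits_on_board=False),
    Member("Firm D", events_attended=0, sits_on_board=False),
]

print(member_count(cluster))        # 4
print(participation_rate(cluster))  # 0.5
```

Two clusters with the same number of members can thus display very different participation rates, although, as noted above, neither figure fully captures the construction of a network of actors.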
In terms of measurement, specific categories and instruments are often elaborated a posteriori, especially in that objectives defined by clusters evolve over time. This explains the highly "traditional" nature of recurrent indicators in national statistical systems and the difficulties inherent in longitudinal and comparative studies, already noted by the BIPE (2007), using more accurate indicators. It should be noted that it was possible to carry out a more objective impact analysis in Germany, based on an economic indicator (added value), for two reasons: firstly, a sufficient period of time had elapsed between the implementation of the policy and the evaluation, and, secondly, the programme itself (the competitions) made it possible to compare the performances of subsidised and non-subsidised regions in the same sector (biotechnology). On the other hand, the Belgian evaluation encountered the same problems as France.
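The logic of comparing subsidised regions with a control group of non-subsidised regions can be sketched as follows. This is a simplified illustration under stated assumptions, not a reconstruction of the method used by the German evaluators: the function name, the before/after design and all figures are invented for the example.

```python
from statistics import mean


def mean_change(regions):
    """Average change in the indicator (e.g. added value) across regions.

    `regions` is a list of (before, after) pairs, one pair per region.
    """
    return mean(after - before for before, after in regions)


def control_group_effect(subsidised, non_subsidised):
    """Difference between the average change in subsidised regions and the
    average change in the control group over the same period."""
    return mean_change(subsidised) - mean_change(non_subsidised)


# Invented figures for one sector, measured before and after the programme.
subsidised = [(100, 150), (80, 120)]     # changes: +50, +40
non_subsidised = [(90, 110), (70, 80)]   # changes: +20, +10

print(control_group_effect(subsidised, non_subsidised))  # 30
```

Such a comparison only becomes meaningful once enough time has elapsed for the indicator to move, and it requires regions in the same sector that did not receive support; both conditions were met in the German case and not in the more recent programmes.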
Lastly, the process of elaborating indicators is influenced by the question of the parameters of cluster activities. Should the activities of all member companies and research centres be taken into account when measuring the economic activity of an individual cluster? Is there an inherent risk in such an approach of overestimating the contribution of major companies?

The question of measuring and imputing results
Cluster evaluation raises the crucial question of what a result is and for which actors. In effect, relatively contrasting points of view are possible depending on whether clusters' "clients" are thought to consist of enterprises whose productivity is to be increased, or the "collectivity" (the region or the nation, for example). France, Belgium, Finland and to a certain extent Germany focus to some degree on the impact on the collectivity, while Austria places more emphasis on the satisfaction of enterprises by producing surveys on their clients. This difference in perspective can considerably modify views concerning performance.
Furthermore, the notion of performance is highly polysemic and never precise. All the evaluations we examined classify performance into different categories (scientific, HR, economic, etc.) and, within each category, attempt to objectivise it in different ways. There are few common elements: turnover generated by cluster projects, the number of jobs or enterprises (created, safeguarded, attracted), the number of patents lodged by members of the cluster. These are some of the indicators used by the French assessors, CMI-BCG, as a foundation for future cluster evaluations in France.
The variety of approaches to performance raises the question (a traditional one in terms of evaluations) of the distinction between "end" results, or impacts, and "intermediate" results.
Thus, in the French case, evaluators prudently assessed intermediate results (for example, joint R&D projects and the way in which governance was structured) but did not examine the end results habitually analysed (in terms of enterprises and/or jobs created), which could not be observed over such a short space of time (Pentikäinen 2000).
Naturally enough, an analysis of various evaluation approaches shows that the more recent the programmes, the more the evaluation concentrates on intermediate results (the number of projects, the quality and cost of projects, the number of joint-projects, etc.) and on analyses of processes and resources, as we have seen in the case of Belgium and France. On the other hand, evaluations carried out later in the process, as in Germany, tend to focus essentially on end results, with assessors taking the view that these are the only results that count (Staehler et al. 2007).
As well as measuring cluster performance, we are faced with the question of the schemas of the causality underlying the choice of criteria and phenomena observed. These schemas are not always explicit and, at any event, are often open to question. This is not surprising in terms of an analysis of such systemic dynamics. But it clearly makes interpretation, based as it is on indicators, a more delicate affair.
For example, the supposed causal link R&D → innovation → competitiveness is not a deterministic one. Joint research projects can have significantly different impacts, sometimes immediately visible (when they lead to commercially successful innovations), sometimes less so if they merely contribute to the development of networks encompassing new actors.
Moreover, a number of joint projects are in effect opportunistic coalitions between actors designed to attract extra funding for projects that they intend to pursue anyway (see the Finnish case). In France, assessors have observed that newly established clusters "destock" existing R&D projects with a view to attracting funding.
An essential question in terms of the imputability of results revolves around the "inherited" characteristics of clusters (the nature of partners and the ways in which they were previously linked, the ways in which they cooperated and innovated, etc.), as well as the way in which they influence the results of the cluster compared to the voluntarist actions of governance, its mode of organisation and the suitability of support mechanisms (notably financial mechanisms) linked to national policy. It is clear that these schemas of causality, while convincing, have not been verified and that, in our opinion, one of the main functions of the evaluation of cluster policies is to test, enrich and even contest them.
Nevertheless, the importance of the question of the imputability of results varies from evaluation to evaluation. The German approach focuses on end results, most of them economic in nature. It emphasises a conception of the policy and of clusters in general based on results and does not really examine the black box of organisation and the mechanisms making it possible to develop clusters. This approach could be summed up in the phrase "the ends are more important than the means." In this case, potential for analysing imputability is reduced.
On the other hand, the evaluation carried out in Austria focused on the assessment of cluster management without attempting to measure economic impact. A central role is accorded to relational indicators and participation in cluster projects. Such a focus on management suggests that the question of causality does not arise. The approach presupposes that relevant factors explaining the development of a cluster are known. If those factors are correctly aligned, the cluster will be successful. In other words, it seems that the bodies responsible for overseeing the evaluation believe that action theory is relevant to cluster management and that it should only be applied to evaluating management.
The French solution, which consists in assessing intermediate results (research projects, etc.) and the way in which clusters are run, does not have the capacity to interact directly with cluster management and seems to represent an intermediary approach between the two cases presented above. It reveals that even if the policy is based on a number of hypotheses concerning the imputability of results, such as the notion that the development of joint research projects will help to create a local dynamic, its intention is, by means of the evaluation, to verify the link between management and results.
In our view, taking into account the knowledge of the dynamic of the clusters, which, as we know, is deployed over relatively long timescales, is one of the most important issues in terms of future evaluations, in that it should make it possible to examine "end" results.

The question of the diversity of clusters
Evaluating the performance of clusters implies taking into account a wide variety of different situations. As the evaluation confirmed, competitiveness clusters in France do not follow the same operational model. Nor do they have the same resources, the same configuration in terms of actors, or the same level of maturity. For example, some clusters, with a long tradition of working in tandem with research centres or working on the basis of subensembles have had little difficulty in adopting the kind of research-industry collaboration model proposed.
Nevertheless, not only do the public authorities consider that they are obliged to evaluate the policy on the national level, in that substantial public funds are poured into it, they also believe that the evaluation should be organised in such a way as to make possible a comparison between individual clusters, a source of potential emulation and, above all, of learning for the actors piloting the policy and for the clusters themselves. Hence a unique evaluation framework which, in the view of certain clusters, has failed to take into account a number of specificities. Can the lack of emphasis on diversity in evaluations carried out in the four European countries mentioned in this study be explained by a lack of desire to assess all the clusters, with each case considered specifically? Although the point is an important one, it should nevertheless be pointed out that diversity is less of an issue in the evaluation of sector-based policies, as in Germany and Finland.

Links between evaluations and political decisions
As we have already pointed out, evaluations not only focus on policy but also on development issues. In this section, we examine the political decisions taken in these two fields following the evaluation.

The impact of the cluster policy evaluation
The various evaluations considered in this paper all led to the renewal of cluster policies.
However, a number of improvements were suggested. In France, for example, potential improvements included a clearer definition of development strategy and shorter payment terms once funding has been granted; in Germany, an amelioration in the financial situation of biotechnology companies; or in Finland, more involvement in joint-projects on the part of enterprises.
Most evaluations also emphasised the need for continuous and systematic data in order to make future assessments easier. In this regard it should be noted that, in Austria, cluster managers provide a highly accurate report on their activities. In France, the performance contract between clusters and the public authorities (State, Region) includes the implementation of systematic annual indicators. Nevertheless, as we have already pointed out, this raises the issue of the independence of evaluations in which stakeholders furnish a number of their own performance indicators.
It should also be noted that evaluations should be thought of as instruments suitable for examining potential policy evolutions rather than as a kind of sword of Damocles threatening the very existence of the policy itself. Thus, for example, in France, the continuation of the policy was announced a year before the publication of the results of the evaluation. Although evaluations increase knowledge about clusters and the way in which they function, decisions concerning public policy are a matter for the political establishment.

The impact of evaluations on clusters
Approaches to cluster evaluation differed in each of the countries studied. As we have already mentioned, the evaluation carried out in France eventually led to six clusters having their accreditation revoked. It should be noted that, according to French policy, clusters must be accredited (in other words, recognised as competitiveness clusters) in order to acquire public funding for both operational and research purposes. Without such accreditation, the ability of clusters to apply for public funding is substantially curtailed. Furthermore, accreditation provides a degree of legitimacy and visibility to local actors vis-à-vis national actors such as the CNRS, which consequently take an interest in their research activities. On the other hand, the government has no direct influence on the organisation of clusters or on their employees. For example, teams of employees have no hierarchical links with the public authorities. Revoking accreditation is therefore the only way in which the public authorities can exert an influence on clusters regarded as inefficient. The Wallonia region has adopted a similar procedure and evaluation there led to the disaccreditation of two clusters.
A major difference between the two cases can, however, be observed in the decision-making process. The French evaluation identified thirteen clusters "which could benefit from a thoroughgoing overhaul." They were given a year to propose a new mode of organisation to the public authorities. Eventually, two years after the evaluation, six of the thirteen clusters had their accreditation revoked. While the poorly rated clusters appreciated the fact that they had been given a "second chance", they considered that the time limits concerning the final decision were far too long, a situation which wrought havoc in terms of motivation. In Belgium, on the other hand, the decision to revoke accreditation was immediate, which, while making any form of restructuring impossible, seems to have rendered the policy a good deal more effective. The situation in Austria is somewhat different. Two kinds of sanction are applied: on the one hand, clusters can either have their accreditation revoked or be merged with another cluster (there are similarities with the situation described above) when there is no longer any "demand" for such a cluster; on the other hand, the sanction can also focus on the cluster manager, since the cluster umbrella body has the power to fire him or her if it considers that results are not up to scratch. This course of action cannot be taken in France, Germany or Belgium, where there are no hierarchical links between the national government and the associations which manage the clusters.
The impact of the two mechanisms is thus different in nature. In the first case, the sanction falls on the entire region and on all the actors involved. There can thus be major consequences not only in terms of the visibility and image of the region concerned but also of the dynamism and motivations of local actors (enterprises, research centres and local authorities). The psychological and economic impacts of the decision could constitute a double sanction consisting in a loss of access to public funding and a loss of visibility and legitimacy. In Austria, the operations team or its manager can be sanctioned by being fired, an approach which presupposes that poor results are the result of a lack of professional competence. Such a procedure does not call into question the dynamism of the actors or any potential future joint projects.
In conclusion, the French case also differs from the evaluation of the BioRegios policy. In effect, the BioRegios were assessed in an ex post manner, some ten years after having been set up. There are thus two different conceptions of the role of evaluations. In France, Austria, Finland and Belgium, evaluation appears to be a tool amongst others for steering clusters, even if the methods applied to that goal differ. Evaluation models of this type tend to focus on assessing ongoing activities. In Germany, evaluation is characterised by a more traditional approach in the sense that it is basically an ex post operation independent of the policy sphere. It would thus seem that the German approach is based to a substantial degree on what has been termed the "ballistic" (Padioleau, 1982) or "epidemiological" (Stame, 2009) model, according to which evaluation is the last link in the chain of a process of public action designed to be sequential and linear.
CONCLUSION
The comparison of different evaluation approaches presented above demonstrates that rather than a unique model there exist a range of different conceptions. Our empirical comparative work does not offer sufficient material to propose different evaluation models. Nevertheless, we have been able to isolate a number of discriminating factors.
The desire, or otherwise, to break into the cluster's "black box" constitutes an initial element of differentiation. France has made this choice; the country's approach is based on a centralised mechanism for piloting clusters and evaluation makes it possible to examine clusters' internal workings. Austria and Germany take a different approach; the evaluations carried out in those two countries do not break into the black box. In Germany, the purpose of evaluation is to assess economic results ex post. In Austria, evaluation is designed to measure management performance largely by means of client satisfaction surveys, an approach which could be characterised as a delegation model. This difference in approach may signal the first steps in the appropriation of the question of evaluation by management researchers focusing on the piloting of clusters rather than, like economists, on their economic performance.
Furthermore, it seems that the kind of one-off evaluations carried out in France and Germany can be contrasted with the type of continuous evaluation applied in Austria. In one-off evaluations, unique evaluations (as in Germany) can be distinguished from the periodical approach characteristic of France, where a second evaluation will be undertaken in 2012. The objectives and levers of these evaluations differ from one another. In Austria, evaluation is designed to contribute to piloting cluster projects. Nevertheless, the Austrian approach is reminiscent of an operation designed to monitor the activity of employees.
Moreover, evaluations can be carried out with a view to measuring performance from the point of view of the collectivity or from that of member companies. Methods, instruments and indicators vary accordingly, and the results of such evaluations will be put to different uses. The choice reveals two conceptions of the purpose of clusters: either to contribute to developing the territory in which they are based or to encourage competitiveness amongst member companies.
Lastly, it is useful to recall that assessments can focus on end results, as in Germany, where the purpose of the evaluation is purely to accumulate knowledge, since it can have no direct effect on policy. This form of evaluation is based on the canonical ex post, independent model, the purpose of which is to ensure that public funds are being put to good use. This traditional conception of evaluation can be contrasted with another approach which seems to be developing around clusters and which has been described as the "chemin faisant" (or gradualist) approach (Fen Chong, 2009). This conception, adopted in France and Belgium, and to a certain degree in Austria, uses evaluation as a tool (amongst others) with which to pilot clusters. The approach is, generally speaking, relatively experimental, and should be useful in terms of evaluating the mechanics of individual clusters.
The summary of the differences between the five evaluations examined in this paper highlights the complexity inherent in assessing cluster policy. The five differentiation criteria identified here may or may not be independent or compatible. Thus, the decision concerning whether or not to enter the black box is correlated with the nature of the results measured.
But, in regard to this question, it can be observed that France, Germany and Austria opted for different evaluation approaches. It is thus evident that different criteria can lead to the development of different models. It would nevertheless seem that a major issue in evaluation is whether to assess end results or intermediate results. Conceptions of the role of evaluation and the methodologies applied to it vary substantially. In order to meet the demands of the bodies requesting evaluations, assessors must take this variety of criteria into account.
This paper has shown that there is no unique evaluation model for clusters. Indeed, our empirical analysis demonstrates the existence of a variety of differentiation criteria. The paper is nevertheless characterised by a number of limitations. Firstly, due to a relative lack of documentation concerning evaluations of clusters outside France, it was not possible to carry out an exhaustive comparison of the various approaches. Moreover, this study confirms that the term "cluster" covers a multiplicity of notions, a fact which makes comparison more difficult.
However, even taking such issues into account, it is probable that a systematic and wide-ranging analysis of cluster evaluations could lead to the emergence of new models of evaluation.
The cluster policy and the current situation of the 71 clusters "Pilot Programme" and the current situation of four "pilot clusters"

Policy objective
The Cluster Management Teams:
- interface with the public sector
- help enterprises to develop their projects and ideas
- arrange joint projects
- provide aid and advice in terms of project development
- provide a qualifications offer
- help members to break into international markets
The setting up, by means of the integration of biotechnology capacities and programmes, of a dynamic process of innovation designed to initiate the commercialisation of modern biotechnology in Germany (cluster creation) (see p. 4)
1. The setting up of a new, permanent cooperative structure
2. Improving the cooperative aspect of the research system
3. Increasing the relevance and flexibility of projects
Ultimate objective: "to generate growth, improve industries' competitiveness and productivity, increase employment, generate new innovations and improve social welfare" (p. 60)
"The programme is intended to boost the competitiveness of the French economy and to create growth and jobs in promising markets:
- by developing innovation;
- by boosting essentially industrial activities with a high degree of technological content and by setting up enterprises in the French territories;
- by making France more attractive by increasing its international visibility." 27
The programme is intended to provide "stimulus in terms of encouraging companies to exploit this potential and facilitate the implementation of initiatives creating the conditions required for promising interactions between firms." (p. 85)