Majority Judgment vs. Approval Voting

Majority judgment (MJ) and approval voting (AV) are compared in theory and practice. Criticisms of MJ and claims that AV is superior are refuted. The two primary criticisms have been that MJ is not “Condorcet-consistent” and that it admits the “no-show” paradox. That MJ is not Condorcet-consistent is a good property shared with AV: the domination paradox shows majority rule may well err in an election between two. Whereas the no-show paradox is in theory possible with MJ it is as a practical matter impossible. For those who believe this extremely rare phenomenon is important it is proven that MJ with three grades cannot admit the no-show paradox. In contrast, AV suffers from serious drawbacks because voters can only “tick” or “approve” candidates—at best only Approve or Disapprove each candidate. With AV voters cannot express their opinions adequately; experiments show that Approve is not the opposite of Disapprove; and although AV does not admit the no-show paradox it admits the very closely allied “no-show syndrome” and “insensitivity.” Two is too few. Substantive debate must concern three or more grades.


Introduction
In 1925 Walter Lippmann, claimed by many to be the most influential journalist of the twentieth century, pointed to an obvious yet fundamental limitation of the majority rule (MR) as it is practiced in the United States, the United Kingdom, France and most nations throughout the world: "[W]hat in fact is an election? We call it an expression of the popular will. But is it? We go into a polling booth and mark a cross on a piece of paper for one of two, or perhaps three or four names. Have we expressed our thoughts . . .? Presumably we have a number of thoughts on this and that with many buts and ifs and ors. Surely the cross on a piece of paper does not express them. . . . [C]alling a vote the expression of our mind is an empty fiction." [25]
Approval Voting (AV), first proposed in 1977 [41] and soon thereafter rashly proclaimed to become "the election reform of the twentieth century" [10], has been advanced as the method that for a variety of reasons will overcome the limitations of MR [14,11,24]. It may be described as a method that asks voters to assign one of two grades, Approve or Disapprove, to each candidate. The electorate's rank-order of the candidates is determined by the number of Approves. While marking several crosses or ticks rather than the at most one of MR (as AV-voters are typically asked to do) is slightly better than MR, calling an AV-vote an expression of a voter's mind is an empty fiction as well.
Majority Judgment (MJ) [1], a twenty-first century idea proposed thirty years after AV [41], begins by confronting voters with a specific charge, such as (in an election): "Having taken into account all relevant considerations, I judge, in conscience, that as President of the European Union each of the following candidates would be:" and then asks them to evaluate each candidate on a scale that contains any fixed, finite number of ordinal grades such as Outstanding, Excellent, Very Good, Good, Fair, Poor, To Reject.
Majorities determine the electorate's evaluation of each candidate and the ranking between every pair of candidates, yielding a transitive rank-ordering of all candidates, with the first-placed among them the winner. MJ with more than two grades (six or seven have proven to be good choices) certainly gives voters a much greater ability to express their minds than does AV.
While some AV supporters admit that MJ allows "more nuanced judgments" [11] they steadfastly maintain that AV is superior, advancing a variety of reasons.
The first intent of this article is to present practical and theoretical evidence to show that every one of those "reasons" is wrong, insignificant, or not realized in practice (and some, amusingly, are shared with AV). Among them, in addition to the lack of Condorcet-consistency [20] and the possibility of the no-show paradox [8,12,16,20], are that MJ can elect a candidate preferred by only one voter [11], that it is a "tall order" because too difficult for voters [12], and that it induces strategic behavior because "I'm afraid, voters would . . . [give] their favorites the maximum grade and their most serious competitors the minimum grade" [12].
AV has been viewed, described, used, or analyzed in three different guises.
(1) It was originally conceived and analyzed in terms of the traditional paradigm of social choice theory, namely, that voters compare candidates: implicitly voters rank candidates and draw a line between those who receive a tick or cross and those who do not.¹ With this view changes in the slate of candidatures could induce voters to change their rankings (resulting, for example, from a change in the number of ticks), so with it AV does not escape from Arrow's impossibility theorem (its advocates made no such claim until it became apparent that AV could be viewed differently).
(2) It has been used as though it were a point-summing method with a tick or cross worth 1 point. Thus, for example, the Social Choice and Welfare Society's ballot for electing its president had small boxes next to candidates' names with the instructions: "You can vote for any number of candidates by ticking the appropriate boxes," the number of ticks determining the candidates' order of finish. No meaning is ascribed to Approve other than that it gives 1 point, and a candidate's point total determines the order of finish (giving no point is in no way identified with Disapprove). It is entirely up to voters to decide how to try to express their opinions.
(3) After MJ became known a new view emerged: "the idea of judging each and every candidate as acceptable or not is fundamentally different" from either voting for a candidate or ranking them ([11], pp. vii-viii). This suggests a belief that voters are able to judge candidates in an ordinal scale of merit with two grades, so AV becomes MJ with a language of two grades (called Approval Judgment in [2]), escapes Arrow's impossibility theorem, and inherits the other good properties of MJ (except for the limitations due to too few grades). With this point of view MJ vs. AV turns into an argument over how many grades MJ should use.
The second intent of the article is to show that two grades are too few (however AV is viewed). Two obviously permits no nuance in the expression of voters' opinions; in and of itself this is sufficiently damning. AV's behavior also shows that two is too few. It is proven that with MJ the no-show paradox (extremely unlikely in practice with three or more grades) cannot occur when there are three grades, so MJ with three grades overcomes all the AV-enthusiasts' criticisms.
In general, it is argued, majority judgment is a significantly better replacement for majority rule than approval voting has pretended to be.²

Methods
A method of ranking candidates (or more generally competitors) based on measures rather than comparisons begins by formulating a question appropriate to its use (in elections to political offices, or in competitions among figure skaters, wines, pianists, or movies) and defining an ordinal scale of merit or evaluation, a language of grades $\Lambda$, of possible answers. In elections, scales of five to seven ordinal grades have been used; wines have used 21 numerical grades (in Australia) or seven ordinal grades for each of 14 attributes (by the U.I.Œ.); the Chopin International Piano Competition has used 100 at an elimination stage and 12 for six finalists (for these and other examples see [2]).
Given a language of grades $\Lambda$, a method of voting asks voters (or judges of a jury) to evaluate every candidate (or competitor) by assigning each a grade in that scale. The input, called the opinion profile,³ may be represented as an $m \times n$ matrix $\gamma = (\gamma_{ij})$, where m is the number of candidates, n the number of voters, and $\gamma_{ij} \in \Lambda$ the grade assigned to candidate i by voter j.
A method of ranking $R$ is a non-symmetric binary relation $\succeq_R$ on candidates that associates to each opinion profile a comparison between them, $A \succeq_R B$ meaning that the electorate prefers candidate A to candidate B or is indifferent between them, $A \approx_R B$ that it is indifferent between them, and $A \succ_R B$ that A is strictly preferred.

Majority judgment (MJ)
The motivation for MJ was to find a meaningful method of ranking that induces judges and voters to express their evaluations fully and honestly (because an electorate's or a jury's collective rank-order of competitors should depend on true opinions and not on strategically calculated expressions) and that avoids the bugbears of the traditional theory. It has been described and characterized in several different ways [2,5,6]. MJ is based on one single concept that generalizes the idea of median and is not, as has repeatedly been claimed, the median together with "an elaborate set of rules for breaking ties. These are plausible, but there are other tie-breaking rules that would probably work just as well" [12].
² [...] Nicholas G. Hall, urged all to vote in the election of INFORMS officers. In it he states that historically only some 20% vote. AV is used although whenever one officer is to be elected there are but two official candidates. Given the arguments of this article the time has perhaps come for INFORMS to reconsider the method it uses to elect its officers.

Succinctly put, MJ is majority rule applied to grades, as the following description shows. What is the electorate's majority opinion of one candidate with grades cast by n voters? Voters must be treated equally so who gave which grade is of no importance. Define a candidate's merit profile to be the set of her grades listed from highest on the left to lowest on the right:
$$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n), \qquad \alpha_1 \succeq \alpha_2 \succeq \cdots \succeq \alpha_n.$$
For any k less than a majority, $0 \le k < n/2$, partition the candidate's merit profile into a middlemost block, and left and right blocks of an equal number of grades:
$$(\alpha_1, \ldots, \alpha_k) \quad (\alpha_{k+1}, \ldots, \alpha_{n-k}) \quad (\alpha_{n-k+1}, \ldots, \alpha_n).$$
The right and middlemost blocks show there is a majority of $n - k$ for giving the candidate at most the grade $\alpha_{k+1}$, and the left and middlemost blocks that there is a majority of $n - k$ for giving the candidate at least the grade $\alpha_{n-k}$: call this a $100(n - k)/n$%-majority for $[\alpha_{k+1}, \alpha_{n-k}]$. When $k = 0$ this is the unanimous decision for evaluating the candidate at best $\alpha_1$ and at worst $\alpha_n$. The larger k, the closer to equal are its two grades $\alpha_{k+1}$ and $\alpha_{n-k}$, so the more precise is the majority evaluation. For n odd, taking $k = (n - 1)/2$ means the middlemost block is reduced to a single grade called the majority-grade, the most precise majority evaluation possible.
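The blocks just described are easy to compute. The following Python sketch is illustrative only: the function names and the sample grades are hypothetical, the five-grade scale is the one chosen at the Université Paris-Diderot, and the majority-grade is computed only for an odd number of voters, the case treated above.

```python
# Illustrative sketch; function names and sample grades are hypothetical.
SCALE = ["Good", "Rather Good", "Not Bad Not Good", "Rather Bad", "Bad"]  # best to worst
RANK = {grade: i for i, grade in enumerate(SCALE)}  # 0 = highest grade


def merit_profile(grades):
    """List one candidate's grades from highest (left) to lowest (right)."""
    return sorted(grades, key=RANK.__getitem__)


def majority_interval(grades, k):
    """Return (alpha_{k+1}, alpha_{n-k}) for 0 <= k < n/2.

    A majority of n - k voters gives the candidate at most the first grade,
    and a majority of n - k gives at least the second.
    """
    profile = merit_profile(grades)
    n = len(profile)
    if not 0 <= k < n / 2:
        raise ValueError("k must satisfy 0 <= k < n/2")
    return profile[k], profile[n - k - 1]


def majority_grade(grades):
    """Middlemost grade for an odd number of voters, k = (n - 1)/2."""
    n = len(grades)
    if n % 2 == 0:
        raise ValueError("this sketch covers only an odd number of voters")
    best, _worst = majority_interval(grades, (n - 1) // 2)
    return best  # the middlemost block is a single grade


sample = ["Good", "Bad", "Rather Good", "Rather Good", "Rather Bad"]
print(majority_interval(sample, 0))  # unanimous interval
print(majority_interval(sample, 1))  # 80%-majority interval
print(majority_grade(sample))
```

With these five hypothetical grades the intervals shrink from [Good, Bad] at k = 0 to the single grade Rather Good at k = 2.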
Consider a real example. The name for a newly formed computer sciences laboratory (the fusion of two existing groups in the spring of 2015) at the Université Paris-Diderot (Paris 7) was to be chosen. Ten names were proposed (here called A through J), 95 persons voted, and they chose a language of five grades:⁴ Good, Rather Good, Not Bad Not Good, Rather Bad, Bad.
How does a majority of an electorate rank candidates having sets of grades? Clearly when one candidate's grades are uniformly better than another's (s)he must be ranked above the other.

A candidate A's merit profile $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ dominates B's merit profile $\beta = (\beta_1, \beta_2, \ldots, \beta_n)$ (both written from highest to lowest) when $\alpha_i \succeq \beta_i$ for all i and $\alpha_k \succ \beta_k$ for at least one k; equivalently, when A has at least as many of the highest grade as B, at least as many of the two highest grades, . . ., at least as many of the k highest grades for all k, and at least one "at least" is "more."⁶
Every method of ranking should respect domination on every pair of candidates: namely, evaluate one above another when the first's grades dominate the other's.
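Domination can be checked position by position once both merit profiles are sorted. The sketch below is a minimal illustration, not the authors' code: the function name and the sample profiles are hypothetical, and the scale is the five-grade one of the Paris-Diderot vote.

```python
# Illustrative sketch; names and sample profiles are hypothetical.
SCALE = ["Good", "Rather Good", "Not Bad Not Good", "Rather Bad", "Bad"]  # best to worst
RANK = {grade: i for i, grade in enumerate(SCALE)}  # 0 = highest grade


def dominates(grades_a, grades_b):
    """True when A's merit profile dominates B's.

    After sorting from highest to lowest, every grade of A must be at least
    as high as B's grade in the same position, and at least one strictly higher.
    """
    a = sorted(RANK[g] for g in grades_a)
    b = sorted(RANK[g] for g in grades_b)
    if len(a) != len(b):
        raise ValueError("merit profiles must contain the same number of grades")
    at_least = all(x <= y for x, y in zip(a, b))
    strictly = any(x < y for x, y in zip(a, b))
    return at_least and strictly


profile_a = ["Good", "Good", "Bad"]
profile_b = ["Good", "Rather Good", "Bad"]
print(dominates(profile_a, profile_b))
print(dominates(profile_b, profile_a))
```

Two identical profiles do not dominate each other, since no position is strictly better.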
When one candidate's majority-grade is above another's (s)he must be ranked above the other. When both have the same majority-grade (e.g., A and B in the choice of a new name for the laboratory are both rated Rather Good), the most precise majority, that is, the smallest x for which there are x%-majorities for both that differ, decides. An equivalent description leads to an intuitive rule. Let $p_A$% and $q_A$% be A's percentages of grades strictly above and strictly below her majority-grade, and likewise for B. Here $p_A = 50 − 14.21 = 35.79$%, $q_A = 50 − 40.00 = 10.00$%, $p_B = 17.89$%, and $q_B = 48.43$%. The (51.57+)%-majorities are determined by the biggest of these four, and the rule is this: if the biggest is for a higher grade that candidate leads the other, if it is for a lower grade that candidate lags the other. Here the biggest is $q_B = 48.43$% (and 100 − 48.43 = 51.57), which is for a lower grade, so B lags A.

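A simple diagram pictures the situation in general.
[Diagram: A's and B's merit profiles, with the middlemost grades bracketed around the center.]
Candidate A is ahead of B in the MJ-ranking $\succ_{MJ}$ when: A's and B's middlemost grades (bracketed around the center) are all equal except for the left- and right-most, $\overline{\alpha} \neq \overline{\beta}$ or/and $\underline{\alpha} \neq \underline{\beta}$, and either A's middlemost grades dominate B's or A's are more consensual, meaning $\overline{\beta} \succ \overline{\alpha} \succeq \underline{\alpha} \succ \underline{\beta}$.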
Thus a succinct complete description of how the majority principle ranks is this:
The MJ-ranking $\succ_{MJ}$: For each pair of competitors ignore the maximum equal number of highest and lowest grades of their merit profiles so that when the remaining middlemost grades are compared domination or consensus decides.
This rule, the logical outcome of applying the majority principle to grades, is what many seem to have been groping for. For example, the Fédération Internationale de Natation (FINA) traditionally used a point-summing method for diving, with each of five or seven judges assigning a number grade between 0 and 10 in multiples of 1/2 (the meanings of each carefully defined [18]) to every dive, the sums of grades determining the rankings. Recently, it changed: when there are five judges the highest and lowest grades are eliminated, when there are seven the two highest and two lowest are eliminated, and then the sums of the remaining grades decide the rankings. Had FINA (or the International Skating Union, the International Gymnastics Federation, or others) gone a bit further, eliminating what must be to distinguish a difference and letting domination or consensus decide rather than sums, they would have used MJ.
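The pairwise rule can be sketched in Python by working outward from the center of the two sorted merit profiles until the middlemost blocks first differ. This is one reading of the rule stated above, not the authors' implementation: the function name, return values, and sample grades are hypothetical, and "consensus" is taken to mean that one candidate's remaining block lies strictly inside the other's.

```python
# Illustrative sketch; names, return values and sample grades are hypothetical.
SCALE = ["Good", "Rather Good", "Not Bad Not Good", "Rather Bad", "Bad"]  # best to worst
RANK = {grade: i for i, grade in enumerate(SCALE)}  # 0 = highest grade


def mj_compare(grades_a, grades_b):
    """Return "A", "B", or None (tie) for two candidates graded by the same voters."""
    a = sorted(RANK[g] for g in grades_a)
    b = sorted(RANK[g] for g in grades_b)
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("both candidates need the same positive number of grades")
    # k = number of highest and of lowest grades ignored; start from the largest k.
    for k in range((n - 1) // 2, -1, -1):
        high_a, low_a = a[k], a[n - k - 1]
        high_b, low_b = b[k], b[n - k - 1]
        if (high_a, low_a) == (high_b, low_b):
            continue  # middlemost blocks still identical: ignore fewer grades
        if high_a <= high_b and low_a <= low_b:
            return "A"  # domination by A
        if high_a >= high_b and low_a >= low_b:
            return "B"  # domination by B
        # Consensus: the more tightly grouped block wins.
        return "A" if high_a > high_b and low_a < low_b else "B"
    return None  # identical merit profiles


cand_a = ["Good", "Rather Good", "Rather Good", "Rather Bad", "Bad"]
cand_b = ["Good", "Good", "Rather Good", "Bad", "Bad"]
print(mj_compare(cand_a, cand_b))
```

In the hypothetical example both candidates share the middle grade Rather Good; ignoring one grade at each end leaves A's block inside B's, so consensus places A ahead.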
It is immediately evident that in practice when there are many voters, (1) every candidate will have a majority-grade, and (2) in comparing candidates consensus is almost surely never invoked since that would require ties in the determination of the maximum number of the rule. Accordingly, assume this is the case and define candidate A's majority-gauge (MG) to be $(p_A\%, \alpha_A, q_A\%)$, and likewise for any candidate, with $\alpha_A$ her majority-grade, $p_A$% the percentage of her grades strictly above $\alpha_A$, and $q_A$% the percentage of her grades strictly below $\alpha_A$.
The MG-ranking $\succ_{MG}$: A is ranked above B when $\alpha_A \succ \alpha_B$, or when $\alpha_A = \alpha_B$ and either $p_A > \max\{q_A, p_B, q_B\}$ or $q_B > \max\{p_A, q_A, p_B\}$. Consistent with the usual interpretation, "+" may be attached to a majority-grade $\alpha_C$ when $p_C > q_C$, and "−" otherwise.
With many voters the MG-ranking is for all practical purposes identical to the MJ-ranking.
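The majority-gauge comparison reduces to finding the biggest of four percentages. The Python sketch below applies the rule stated earlier (the biggest percentage decides: a candidate leads if it is her share above the majority-grade, lags if it is her share below) to the gauges of names A and B quoted in the text; the function names are hypothetical and ties among the four percentages are simply rejected, in line with the assumption that they do not occur with many voters.

```python
# Illustrative sketch; function names are hypothetical.
SCALE = ["Good", "Rather Good", "Not Bad Not Good", "Rather Bad", "Bad"]  # best to worst
RANK = {grade: i for i, grade in enumerate(SCALE)}  # 0 = highest grade


def signed_grade(gauge):
    """Attach "+" to the majority-grade when p > q, "-" otherwise."""
    p, grade, q = gauge
    return grade + ("+" if p > q else "-")


def mg_ahead(gauge_a, gauge_b):
    """True when the first majority-gauge (p%, grade, q%) ranks above the second."""
    p_a, grade_a, q_a = gauge_a
    p_b, grade_b, q_b = gauge_b
    if grade_a != grade_b:
        return RANK[grade_a] < RANK[grade_b]
    values = [p_a, q_a, p_b, q_b]
    biggest = max(values)
    if values.count(biggest) > 1:
        raise ValueError("tie among the four percentages: the gauge does not decide")
    # A leads if the biggest share is A's grades above, or B's grades below.
    return biggest == p_a or biggest == q_b


# Percentages quoted in the text for names A and B (both rated Rather Good).
gauge_a = (35.79, "Rather Good", 10.00)
gauge_b = (17.89, "Rather Good", 48.43)
print(signed_grade(gauge_a), signed_grade(gauge_b))
print(mg_ahead(gauge_a, gauge_b))
```

The biggest of the four percentages is B's 48.43% below the majority-grade, so the sketch places A ahead of B, as the rule in the text requires.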

The full merit profile of the vote at the Université Paris-Diderot is given in Table 2a. Of the 45 pair-by-pair comparisons of grades, 40 are dominations, so all reasonable methods should rank them identically. The MJ-ranking (= the MG-ranking) is given in Table 2b.

Majority rule and the domination paradox
Majority rule (MR), also called plurality rule and first-past-the-post, is the most used method of voting: voters tick the name of one candidate at most, and the ranking of the candidates is determined by their numbers of ticks (used, e.g., in Great Britain and the USA); if the top-ranked candidate does not obtain an absolute majority of ticks sometimes (e.g., France) a run-off is held between the two top finishers (call this MR+). That MR and MR+ measure the relative support of candidates badly was recognized long ago by both Condorcet [15] and Borda [9]. Recent elections in the USA and France prove the point. George W. Bush defeated Albert Gore in 2000 due to the presence of Ralph Nader's candidacy: Bush's margin of 537 votes in Florida gave him the state's 25 Electoral College votes; had Nader not been on the ballot most of his voters would have preferred Gore, making him the victor in Florida and the nation. Jacques Chirac was elected France's president in 2002 because the candidate who would in all likelihood have defeated him face-to-face, Lionel Jospin, was eliminated in a first-round of 16 candidates, resulting in a second-round that pitted Chirac against a candidate that almost any candidate would have defeated, Jean-Marie Le Pen: the presence of candidates with absolutely no hope of being elected eliminated Jospin. Nicolas Sarkozy won France's 2007 presidential election for much the same reason: François Bayrou would have defeated any candidate face-to-face (he was the Condorcet-winner) but was eliminated in the first-round. A recent study of the 2017 French legislative elections [32] shows that in 19.2% of the 577 elections the presence of three candidates (in run-offs) denied the election of the Condorcet-winner.
What has not heretofore been appreciated is that MR may go badly wrong in face-to-face encounters, i.e., when there are but two candidates.
A method is subject to the domination paradox if a candidate A's grades dominate B's but the method ranks B above A.
MR is subject to the domination paradox. A national French presidential poll of 2012 (described more fully below) proves the point. The leading contenders were F. Hollande and N. Sarkozy. Hollande's grades dominate Sarkozy's (Table 3a) yet their merit profiles could have come from the opinion profile of Table 3b. Using MR a voter would tick the candidate (s)he evaluates higher and abstain or vote blank when both candidates are evaluated To Reject. This makes Sarkozy the overwhelming MR-winner with 59.57% of the votes to Hollande's 26.19%, 14.24% rejecting both.
In fact Hollande defeated Sarkozy in the run-off obtaining 51.6% of the votes, whereas the opinion profile of the national poll showed Hollande winning with 53.9% of the votes. These relatively narrow victories, when Hollande's grades completely dominated those of Sarkozy and Hollande's majority-grade was Good, Sarkozy's Fair, show that MR can easily go wrong in practice. This is not an isolated event.
Another example is the US presidential election of 2016. The Pew Research Center conducted four in-depth national polls during the course of the election. Among the many questions posed to the participants was the following: "What kind of president do you think each of the following would be-a great, good, average, poor, or terrible president?" Pew Research did not realize that this provides the inputs to a method of election; however, the fact that they used it is testimony to their belief that the question is natural and the answers revelatory of the electorate's opinion. The last of the four polls was conducted shortly before the election in the period October 20-25 with the result given in Table 4a (except for a 1% change to make a point).

Table 4a. Merit profiles, 2016 US presidential poll, Pew Research, October 20-25 [31]. (Clinton's 9% Great was in fact 8% and her 26% Good was 27%.)
Clinton's grades very comfortably dominate Trump's so MJ (together with any reasonable method) makes Clinton the winner. Moreover, Clinton's majority-grade is Average and Trump's merely Poor (with or without the 1% change). Yet their merit profiles could well have come from the opinion profile of Table 4b. With it MR elects Trump with 49% of the votes to Clinton's 28%, the remaining 23% of the voters judging both to be Terrible. The lesson is clear: MR can easily go wrong even in an election with two candidates. This has far-reaching consequences because AV (as practiced), Borda's, Condorcet's, and all methods based on voters' preferences rather than evaluations become MR when only two candidates compete.

A case in point may well be the 2016 U.S. election. Clinton's and Trump's merit profiles were remarkably similar throughout the year (in January, March, August and October) despite the many dramatic ups and downs of the campaign, as may be seen in Table 4c. Clinton's grades dominated Trump's in January, March and August, and nearly did in October; her majority-grade was Average and his Poor all four times. So why did Clinton lose? U.S. voters were in open revolt, determined to show their exasperation with politicians. But how, with MR, could they express this disgust other than by voting for Trump? Had they been able to express their opinions more fully (rating Clinton Poor or Terrible but Trump as Poor or Terrible as well), Trump's very narrow MR victories in Florida, Michigan, Wisconsin and Pennsylvania might well have become Clinton's, giving her 307 Electoral College votes to Trump's 231.

Approval voting (AV)
AV is a very slight improvement over MR (when there are more than two candidates): voters tick the name of any number of candidates they wish, and the ranking of the candidates is determined by their numbers of ticks. That is its traditional point-summing explanation when used.
If, however, AV voters are specifically asked to evaluate candidates in a scale of merit that contains Approve and Disapprove (contrary to when it is used with voters asked to tick candidates), AV is MJ with a language of two grades. It has a simple description (already evident from the SCW Society's instructions): taking $\gamma_{Ij} = 1$ when voter j assigns Approve to candidate I and $\gamma_{Ij} = 0$ when voter j assigns Disapprove to candidate I, AV ranks candidate A above candidate B when
$$\sum_{j} \gamma_{Aj} > \sum_{j} \gamma_{Bj}.$$
If Approve meant Good or better to the Université Paris-Diderot voters, the AV-ranking would have been very different from the MJ-ranking. In this case AV and MJ give the identical winner (A's grades largely dominate any other name's grades). However, AV does not respect domination because F and G are tied and yet F's grades dominate G's (this happens because two grades are too few). AV does respect weak domination: if one candidate's grades dominate another's the first either leads the second or they are tied.
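How two grades can hide a domination is easy to see in a few lines of Python. The sketch below is hypothetical throughout (function name, threshold parameter, and the two three-voter profiles are invented for illustration): it counts as Approve every grade at or above a chosen threshold, as in the "Approve means Good or better" reading.

```python
# Illustrative sketch; names and profiles are hypothetical, not the Paris-Diderot data.
SCALE = ["Good", "Rather Good", "Not Bad Not Good", "Rather Bad", "Bad"]  # best to worst
RANK = {grade: i for i, grade in enumerate(SCALE)}  # 0 = highest grade


def av_score(grades, approve_from="Good"):
    """Number of Approves when a grade of approve_from or better counts as Approve."""
    return sum(1 for g in grades if RANK[g] <= RANK[approve_from])


# Position by position, the first profile is at least as good and once strictly better.
profile_f = ["Good", "Rather Good", "Bad"]
profile_g = ["Good", "Rather Bad", "Bad"]

print(av_score(profile_f), av_score(profile_g))  # equal scores: a tie under AV
print(av_score(profile_f, "Rather Good"), av_score(profile_g, "Rather Good"))
```

With the threshold at Good both hypothetical candidates score 1 and are tied although the first dominates the second; moving the threshold to Rather Good separates them, which shows how much an AV-score depends on what Approve is taken to mean.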

MJ vs. AV: practice
How and why AV fails in practice, in particular in comparison with MJ, is shown via several different uses and experiments.
The inputs to voting are ballots filled out by electors. In a majority vote a ballot poses no question. The voter knows at most one candidate may be given a tick, and the candidates' total ticks decide. Similarly, a typical AV-ballot poses no question but specifies that the voter may give as many ticks as he wishes and the total ticks decide (see Table 5a).
Vote for one or more candidates by ticking in the appropriate circles.
The candidate with the most votes wins.
Accordingly, ticks are given for completely different evaluations of the candidates and there is no sense that giving no tick to a candidate means he is Disapproved. Moreover, instead of eliciting an evaluation of the candidates the instructions suggest comparing them.
Having considered all relevant information, I believe in conscience, that as President of the United States of America each of the following candidates would be:
You must check one single grade for each candidate; no check is taken to mean Terrible.

Table 5b. Typical MJ-ballot.
A typical MJ-ballot poses a question and elicits an answer in a scale of grades easily understood by voters or judges (see Table 5b).¹⁰ The question emphasizes the distinct meanings of the answers that are to be given by the voters and that are by and large shared among them. In contrast, the meanings of the ticks or the Approves given by a voter in AV-voting vary widely.
¹⁰ The ballot of Table 5b implicitly forces every voter to evaluate every candidate. There are alternatives to handling "missing grades" (see section 13.4 of [2]). Among them one "compensates fairly": calculate the majority-grade $\alpha_C$ of each candidate C with the grades that are given; then adjoin $\alpha_C$ to C's grades as many times as necessary to obtain the full complement of grades.
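The "fair compensation" for missing grades can be sketched as follows. The code is illustrative only: the function name and sample grades are hypothetical, the scale is the one of the Pew question, and for an even number of given grades the sketch takes the lower of the two middle grades as the majority-grade, which is an assumption not spelled out in the passage above.

```python
# Illustrative sketch; names and sample grades are hypothetical.
SCALE = ["Great", "Good", "Average", "Poor", "Terrible"]  # best to worst
RANK = {grade: i for i, grade in enumerate(SCALE)}  # 0 = highest grade


def compensate_fairly(given_grades, n_voters):
    """Pad a candidate's grades with her majority-grade up to n_voters grades."""
    if not given_grades:
        raise ValueError("at least one grade is needed to compute a majority-grade")
    if len(given_grades) > n_voters:
        raise ValueError("more grades than voters")
    profile = sorted(given_grades, key=RANK.__getitem__)  # highest to lowest
    # Middle grade; with an even count this picks the lower of the two middle
    # grades (assumption of this sketch).
    majority_grade = profile[len(profile) // 2]
    missing = n_voters - len(profile)
    return sorted(profile + [majority_grade] * missing, key=RANK.__getitem__)


print(compensate_fairly(["Great", "Poor", "Good"], 5))
```

Padding with the majority-grade leaves the majority-grade itself unchanged, which is the sense in which the missing grades are compensated.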

SCW Society 1999
AV was used in the 1999 election of the Social Choice and Welfare Society's president [13,35]. Members were also asked to indicate their preferences among the three candidates (called A, B, and C). Seventy-one members voted, giving the outcome (where numbers attached to candidates indicate their AV-scores) [...]. Two voters ticked three candidates; three ticked two; 64 ticked one; two none.
Fifty-two members gave their preferences [...]. Their AV-scores were [...], giving the same result; here one voter ticked two candidates, 49 ticked one candidate, and 2 ticked none. In this case the face-to-face majority rule votes yield a transitive order (called the Condorcet-ranking) since (where numbers attached to candidates give their votes) [...], whereas Borda's method¹¹ gives the outcome (where the numbers attached to candidates indicate their Borda-scores) $A(59/52) \succ_B C(58/52) \succ_B B(39/52)$.
AV elects the Borda-winner not the Condorcet-winner.
• It is claimed that "[t]he AV and [Borda] winners generally coincide with the Condorcet winner"¹² and that AV has "a strong propensity to elect Condorcet candidates" [13]. AV cannot guarantee the election of a Condorcet-winner, as this example shows. These claims are often violated when the result between a pair of candidates is "close" (not surprisingly most methods agree when there are strong winners). Experimental evidence shows that AV-winners are often not Condorcet-winners and often not Borda-winners (see [2], Table 19.1, p. 343).

French presidential election 2002
In the belief that AV was a reasonable method an experiment was conducted on April 21, 2002, in parallel with the official vote (for more details see [2], pp. 329-333). It was carried out in five of Orsay's twelve voting precincts and the one voting precinct of Gy-les-Nonains (Loiret), where 3,346 voters cast official votes (at most one vote for one candidate) and were then asked to participate in the experiment by casting AV-votes. As in subsequent experiments carried out in France permission was obtained to place separate voting booths and urns on the paths of voters who had already cast their official votes: 2,597 voters (78%) participated.
The AV-ballot stated: "The elector votes by placing crosses [in boxes corresponding to candidates]. He may place crosses for as many candidates as he wishes, but not more than one per candidate. The winner is the candidate with the most crosses." The results are given in Table 6. The three major candidates according to popular opinion and the majority vote all lost relative support, the top-ranked candidate Jospin dropping to 12.9%; whereas all the others (except Chevènement) gained, making the final ranking less rather than more definitive. Jospin's 40.5% to Chirac's 36.5% does not give him added legitimacy.
The results do not accurately measure the relative standings of Chirac and Le Pen, the run-off candidates. In these six voting precincts Chirac defeated Le Pen with 89.3% of the votes. There are three possible ways to estimate that outcome using the information revealed by AV. The difficulty is how to "count" a ballot that either gives a cross to both candidates or gives no cross to both. Three estimates were made; in all of them a cross for one candidate and none for the other gave a 1 to the first. Otherwise, (1) crosses to both and crosses to neither gave both 1/2 (expressing indifference); (2) crosses to both gave both 1/2 (indifference), crosses to neither gave both 0 (says nothing); and (3) crosses to both and crosses to neither gave both 0 (saying nothing). These three interpretations gave the following results, all far from the actual result: (1) Chirac 61%, Le Pen 39%, (2) Chirac 79%, Le Pen 21%, (3) Chirac 80%, Le Pen 20%. This suggests that AV measures badly.
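The three ways of counting can be written down in a few lines. The Python sketch below is illustrative: the function name is hypothetical, the ballot counts are invented (the experiment's ballot-level counts are not given here), and each estimate is expressed as the first candidate's share of all points awarded, which is an assumption about how the percentages were normalized.

```python
# Illustrative sketch; function name and ballot counts are hypothetical.
def runoff_estimates(only_a, only_b, both, neither):
    """Estimate A's share of a two-candidate run-off from AV ballots.

    only_a / only_b: ballots with a cross for exactly one of the two candidates
    both / neither:  ballots with crosses for both, or for neither
    Returns A's share under the three counting rules described in the text.
    """
    total = only_a + only_b + both + neither
    # (1) both and neither each give 1/2 to A and 1/2 to B
    est1 = (only_a + 0.5 * (both + neither)) / total
    # (2) both gives 1/2 to each, neither gives nothing
    est2 = (only_a + 0.5 * both) / (only_a + only_b + both)
    # (3) both and neither give nothing
    est3 = only_a / (only_a + only_b)
    return est1, est2, est3


# Hypothetical counts, NOT the experiment's data.
for share in runoff_estimates(only_a=60, only_b=20, both=10, neither=10):
    print(f"{100 * share:.1f}%")
```

The more ballots are treated as expressing indifference, the closer the estimate is pulled toward 50%, which is why rule (1) gives the smallest margin.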
This experiment shows how far an electorate can be from having single-peaked preferences.¹³ If the preferences were single-peaked then the crosses on each ballot would have to be in consecutive boxes (relative to the fixed left-right order). With 16 candidates there are 137 such ballots (16 with 1 cross, 15 with 2 crosses, 14 with 3 crosses, . . ., 1 with 16 crosses, which makes 136, plus presumably the blank ballot), so among the 2,597 ballots at most 137 could be different. In fact, there were many more: 813. However, these AV-ballots were used to deduce how the candidates line up along a left-right spectrum.¹⁴ This is merely statistical; single-peakedness does not hold, so the existence of a Condorcet-winner is not guaranteed. Nonetheless, Condorcet-cycles may be avoided when sufficiently many voters' preferences are single-peaked with respect to a fixed order of the candidates (see, e.g., [28,19,22]).
• After citing an example where with plurality voting a winning candidate received less than a third of the votes but would have had 54% with AV, it is claimed that it "often will . . . confer legitimacy on their victories to the extent that it shows their support to be widespread" ([14], p. 8). On the contrary, with many candidates spanning a wide and polarized political spectrum AV will often fail to clearly distinguish a winner; moreover, the winner will often fail to be conferred the supposed "legitimacy" of an AV-score close to or above 50%.
• AV measures badly because of its two grades or its limitation to ticks. This experiment convinced us that AV is a flawed method, completely changing our earlier view, and spurred the search for another method.
¹³ An electorate's preferences are single-peaked if the candidates can be listed in a fixed order from left to right so that every voter prefers some one candidate C and the more distant a candidate D is from C to the left or to the right the lower D is in the voter's preferences.
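The count of ballots compatible with single-peaked preferences can be verified directly. In the Python sketch below the function names are hypothetical; candidates are identified by their 0-based position in the fixed left-right order, and whether the blank ballot is included is left as a parameter, since the enumeration of ballots with 1 to 16 crosses covers only the non-blank ones.

```python
# Illustrative sketch; function names are hypothetical.
def consecutive_ballot_count(m, include_blank=False):
    """Number of AV-ballots whose crosses occupy consecutive boxes among m candidates."""
    # A run of `size` consecutive boxes can start in m - size + 1 positions.
    nonblank = sum(m - size + 1 for size in range(1, m + 1))
    return nonblank + (1 if include_blank else 0)


def is_consecutive(crossed_positions):
    """True when the crossed boxes (0-based positions) form one unbroken run."""
    crossed = set(crossed_positions)
    if not crossed:
        return True  # a blank ballot breaks no run
    return max(crossed) - min(crossed) + 1 == len(crossed)


print(consecutive_ballot_count(16))                      # non-blank ballots only
print(consecutive_ballot_count(16, include_blank=True))  # adding the blank ballot
print(is_consecutive({2, 3, 4}), is_consecutive({2, 4}))
```

Either count is far below the 813 distinct ballots actually observed, so the conclusion about single-peakedness does not depend on how the blank ballot is treated.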

2009 Louis Lyons Award for Conscience and Integrity in Journalism
As Louis Lyons Award designate, having taken into account all relevant considerations, I believe in conscience that this nominee is:
The Nieman Fellows at Harvard University traditionally decide to whom this award is given. Five highly regarded nominees emerged from a set of many. The Fellows decided to use MJ and the scale of seven grades given in the ballot of Table 7a. The behavior of the judges in this real use of MJ is very similar to that of voters in electoral experiments (see Table 7b): only one judge, $J_6$, gave different grades to all nominees; three judges did not use the highest grade; five gave their highest grade to at least two nominees; and contrary to the predictions of many, judges are far from limiting themselves to the highest or lowest grades.
• With few voters AV often ends in ties although the grades of one competitor dominate another's. Once again, two grades are simply insufficient.

Socialist presidential primary, Alfortville 2011
For the first time ever the French Socialist Party held an open public primary to name their candidate for 2012. It used MR+. École Polytechnique students [21] carried out experiments designed to compare MJ and AV in voting bureaus of Alfortville, a town next to Paris. Participants were asked to vote via MJ- and AV-ballots and to answer several questions.
The MJ-ballot charged the voter, "As the Socialist Party's nominee in the 2012 presidential election, after having taken into account all considerations, I judge in conscience that this candidate would be:" and offered six possible responses, either Excellent, Very Good, Good, Acceptable, Poor, or To Reject.
The AV-ballot stated "For each of the following candidates to be the Socialist Party's nominee in the presidential election of 2012, I declare:" and offered two possible responses, either Approve his/her candidacy or Disapprove his/her candidacy.A specific question was asked-not the usual practice-and it made clear that the alternative to Approve is Disapprove.The merit profile is given in Table 8a.468 persons voted officially, of them 292 (62%) participated in the experiment and 284 votes were valid.The winner is immediately clear because Hollande's grades dominate all the other candidates' grades; indeed, each candidate's grades dominate the next candidate's from top to bottom except for Royal.The results are given in Table 8b, where "AV-score" are the participants' AV-votes, "Reported votes" are those they declared having been their official FPP votes (284 voters); "Official FPP" are the actual first-round results (468 voters); and "AV Good " are the AV results if Approve means Good or better.The MJ-and AV-rankings are the same except that AV puts Royal in 4th place and Valls in 5th.The top two AV-scores are both overwhelming, depriving the winner of undisputed legitimacy.Majorities Approve all candidates (except Baylet, member of another party) as versus more conflictual elections where no candidate is Approved by a majority (e.g., the presidential election of 2002, see above).
Statistically the voters behaved as if Approve meant Good or better, as may be seen by comparing the AV-score with AV Good (Table 8b). However, the ballots reveal that the thresholds at which voters begin to Approve varied very widely (see Table 8c), only a third of them interpreting Approve to mean Good or better.
• Voters attribute very different meanings to Approve even when it is clear that its opposite is Disapprove, so two grades are too few, putting into question the meaningfulness of an AV-score. There is a striking disparity between conflictual or polarized elections and more consensual ones such as primaries: often in the former no candidate passes the threshold of an AV-score above 50%, in the latter many do. Among the 268 who answered which of MJ and AV permitted them to better express their opinions, not surprisingly MJ was cited by 66.8% to AV's 33.2%.
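The notion of a voter's threshold for Approve can be made concrete with a small sketch. It assumes that each participant's MJ-ballot and AV-ballot are available as a pair, and it takes the threshold to be the lowest grade given to any approved candidate. The candidate names X, Y, Z and the three ballots are hypothetical illustrations, not the Alfortville data, and the helper `approval_threshold` is a name introduced here.

```python
from collections import Counter

# Grades ordered from best to worst, as on the six-grade MJ-ballot.
GRADES = ["Excellent", "Very Good", "Good", "Acceptable", "Poor", "To Reject"]
RANK = {grade: i for i, grade in enumerate(GRADES)}  # 0 = best

def approval_threshold(mj_ballot, av_ballot):
    """Lowest grade the voter gave to any candidate she approved.

    mj_ballot: dict candidate -> grade; av_ballot: set of approved candidates.
    Returns None when the voter approved nobody.
    """
    approved_grades = [mj_ballot[c] for c in av_ballot]
    if not approved_grades:
        return None
    # The largest rank is the worst grade among the approved candidates.
    return max(approved_grades, key=RANK.__getitem__)

# Hypothetical paired ballots (illustration only).
ballots = [
    ({"X": "Excellent", "Y": "Good", "Z": "Poor"}, {"X"}),
    ({"X": "Very Good", "Y": "Good", "Z": "Acceptable"}, {"X", "Y"}),
    ({"X": "Good", "Y": "Acceptable", "Z": "To Reject"}, {"X", "Y"}),
]
thresholds = Counter(approval_threshold(mj, av) for mj, av in ballots)
print(thresholds)
```

In this toy data the three voters begin to Approve at three different grades (Excellent, Good, Acceptable), which is the kind of dispersion the text reports.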

French presidential poll 2012.
Terra Nova, a French think-tank, sponsored a national presidential poll carried out by OpinionWay April 12-16, 2012 (before the first round of the election held on April 22) to compare MJ with other methods. 993 participants voted with MJ and also according to usual practice: first-past-the-post (FPP) followed by an MR run-off between every pair of the five most likely leaders of the first round. Since the results of FPP varied slightly from the actual national percentages on election day (up to 5%), a set of 737 ballots was found for which those tallies are closely matched; its merit profile is in Table 9a [4].
Table 9b gives the majority-gauges of the candidates, three AV-scores (when Approve means at least Good, at least Very Good, and at least Excellent) and the FPP ranking to show the marked differences among them. Table 9c gives the ten MR face-to-face results between the five principal candidates and the deduced Condorcet- and Borda-rankings. The FPP ranking is different from all others, showing how the lack of information concerning the electorate's evaluations can lead to an outcome that departs widely from the collective consensus. The extreme rightist Marine Le Pen is 3rd though she is Rejected by 47.63% of the electorate and rated Poor or worse by 53.87%, and François Bayrou places 5th although he is much more consensual than Nicolas Sarkozy, Le Pen, or Jean-Luc Mélenchon.
The MJ-, Borda- and Condorcet-rankings are identical among the five major candidates; the three AV-rankings differ. When Approve means Good or better, Hollande places 2nd (contrary to the usual claim since he is the Condorcet-winner). When Approve means Very Good or better the winner (Hollande) receives less than a majority of the votes, curtailing his legitimacy. Two meanings of Approve place Le Pen in 5th place, one places her 4th, whereas MJ places her 8th: thus AV does better than FPP but does not sufficiently take into account the weight of Le Pen's negative evaluations. The three meanings of Approve place Bayrou 1st, 3rd, and 5th.
• Here three very different methods that use more information than AV agree, whereas AV differs. Two grades are simply not enough.
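How an AV-score is deduced from a merit profile once a meaning is fixed for Approve can be sketched as follows. The five-grade scale, the candidates P, Q, R and their percentages are hypothetical and chosen only to show that the three interpretations may give three different rankings; they are not the poll's data, and `av_score` and `av_ranking` are names introduced here.

```python
GRADES = ["Excellent", "Very Good", "Good", "Fair", "Poor"]  # best to worst

def av_score(merit, threshold):
    """Percentage of grades at `threshold` or better (merit: grade -> percent)."""
    cutoff = GRADES.index(threshold)
    return sum(merit[g] for g in GRADES[: cutoff + 1])

def av_ranking(profile, threshold):
    """Candidates ordered by decreasing AV-score for the given meaning of Approve."""
    return sorted(profile, key=lambda c: av_score(profile[c], threshold), reverse=True)

# Hypothetical merit profile (each row sums to 100%).
profile = {
    "P": {"Excellent": 30, "Very Good": 5, "Good": 5, "Fair": 10, "Poor": 50},
    "Q": {"Excellent": 10, "Very Good": 30, "Good": 25, "Fair": 25, "Poor": 10},
    "R": {"Excellent": 5, "Very Good": 20, "Good": 45, "Fair": 25, "Poor": 5},
}

for meaning in ("Excellent", "Very Good", "Good"):
    print(meaning, av_ranking(profile, meaning))
```

With these numbers the polarizing candidate P leads when Approve means Excellent and finishes last when it means Good or better, so the AV-ranking depends entirely on an interpretation the ballot never records.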

French presidential election experiment 2012, Fresnes.
AV asks voters to tick those candidates they Approve; presumably they Disapprove of the others. In the experiment the École Polytechnique students conducted in the context of the Socialist Party's primary this had been made explicit (see above). If the two grades Approve and Disapprove constituted a sufficiently rich scale then a voter who does not give a tick to a candidate (who does not Approve of the candidate) Disapproves of the candidate. Therefore, if instead of asking voters to tick candidates whom they Approve they are asked to tick candidates whom they Disapprove, the result should be the same, though here more ticks means lower in the ranking. This proposition was tested.
The experiment was conducted in Fresnes, another town close to Paris, in parallel with the first-round of the French presidential election on April 22, 2012 [21].
In Fresnes's 12th voting bureau the AV-ballot instructed "Tick the box of each candidate of whom you would APPROVE as President of France"; 771 voted officially and 421 valid MJ- and AV-votes were cast [21].
In Fresnes's 14th bureau the DV-ballot instructed "Tick the box of each candidate of whom you would DISAPPROVE as President of France"; 658 voted officially and 408 valid MJ- and DV-votes were cast.
The official FPP together with the AV and DV results of both bureaus are given in Table 10a. The order-of-finish according to FPP is identical in both bureaus (except for a tie between two candidates in Bureau 14); moreover, the actual vote percentages of the candidates in the two bureaus are very similar. By this measure the voting behavior in both bureaus is about the same.
However, the AV and DV results in the two bureaus are different: Le Pen is 6th in one, 10th in the other; Bayrou and Mélenchon swap their 2nd and 3rd places. More to the point, Approve and Disapprove are not opposites: the candidates' sums of Approves and Disapproves are well below 100%, ranging from a low of 62% to a high of 88%, their average under 75%. There are not sufficiently many grades. The distributions of the lowest grades given candidates who were Approved and the highest grades given candidates who were Disapproved are given in Table 10c. As usual voters attribute very different meanings to both Approve and Disapprove.
The disagreement between AV and DV is easily explained. When a voter is asked to assign Approves he assigns them only to those he is "sure" to approve and he evaluates those candidates to whom he gives no ticks at many different levels, a range of opinions that may vary from (say) Good down to To Reject. Symmetrically, when a voter is asked to assign Disapproves he assigns them only to those he is "sure" to disapprove and evaluates candidates to whom he gives no ticks at many different levels, a range of opinions that may vary from (say) Outstanding down to Good. In either case voters are not able to adequately express their opinions.
The Pew Research Center's in-depth national polls repeatedly pose the same questions concerning sitting presidents: "Do you approve or disapprove of the way Barack Obama is handling his job as President?" Over Obama's eight years in office 79 answers are given ([31] p. 78; references to the answers concerning George W. Bush and Bill Clinton are given there). The sums of Approves and Disapproves are always less than 100%, ranging from a high of 95% to a low of 81% (and similarly for Bush and Clinton).
• Approve is not the opposite of Disapprove. Two grades are simply too few to adequately express voters' opinions. At least three grades are necessary.

Pew Research Center political survey March 2016.
The results of the Pew Research Center's poll of 1,787 registered voters conducted March 17-27 [30] are given in Table 11a. The MJ- and AV-rankings with Approve meaning at least Good are given in Table 11b. Sanders's grades very nearly dominate Clinton's yet AV declares them tied. (When Approve means at least Average the two rankings are the same.)

• The MJ- and AV-rankings are very different: AV does not sufficiently take into account the fact that 47% of the participants evaluated Clinton as Poor or worse (a majority of 62% evaluated Trump as Poor or worse). Once again, two grades are too few.

MJ vs. AV: theory
What does theory say about the AV-supporters' criticisms of MJ?

A winner preferred by only one voter
Brams [12] gives an opinion profile of a jury of five on two candidates to show that MJ may elect a candidate preferred by only one voter; it is essentially the same as:
Excellent Excellent Good Good Good
MJ places A above B because A's majority-grade is Very Good whereas B's is Good; however, only J_3 prefers A to B. On the other hand, it is undeniable that a majority believes A is Very Good and a majority believes B is Good, which happens to be in opposition to the traditional interpretation of majority.
Yet AV can also elect a candidate preferred by only one voter as may be seen with this very same example.AV with Approve meaning Good or better places A over B with three ticks to B's two ticks.

Condorcet-consistency and manipulation
A method that is Condorcet-consistent is necessarily subject to the domination paradox. Moreover, many advocated methods are not Condorcet-consistent when voters express themselves honestly, including MJ, AV, the alternative vote, point-scoring methods, and Borda's method. So why should Condorcet-consistency be considered desirable?
One explanation may be Thomas Paine's: "[A] long habit of not thinking a thing wrong, gives it a superficial appearance of being right, and raises at first a formidable outcry in defense of custom. But the tumult soon subsides. Time makes more converts than reason." [29]
A more serious explanation is the good property that MR with two candidates enjoys. It is strategy-proof or incentive compatible: the best strategy of every voter is to vote honestly. Of course this is a good property of a method only when its results are good with honest behavior (e.g., dictatorship's strategy-proofness is not an argument in its favor). Honest grades are of very great importance: it is the true evaluations of the merits of candidates that should be amalgamated, not some exaggerated set of grades whose determination relies on expectations (that, in addition, may be erroneous).
So when is MR on two candidates sure to avoid the domination paradox? Roughly said, when the higher a voter evaluates one candidate the lower (s)he evaluates the other. Specifically, a pair of candidates A and B is polarized when their opinion profiles have the following property for any two voters v_i and v_j: if v_i evaluates A higher (respectively, lower) than v_j does, then v_i evaluates B no higher (respectively, no lower) than v_j does. When an electorate is polarized there can be no consensus, so the "strongly for or against" characteristic of MR renders the only acceptable result.

Theorem 1 ([6]) Majority rule (MR) avoids the domination paradox on pairs of candidates that are polarized.
The polarized Hollande-Sarkozy opinion profile with the same merit profile as Table 3a is given in Table 3c and the polarized Clinton-Trump opinion profile with the same merit profile as Table 4a is given in Table 4c. The polarized opinion profile makes Hollande the MR-winner with a score of 50.75% to Sarkozy's 43.28%. Similarly, the polarized profile makes Clinton the MR-winner with 55% to Trump's 43%. With these opinion profiles MR makes the right decisions. A method of ranking is consistent with MR on polarized pairs of candidates if both give the identical ranking between every such pair.
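A minimal sketch may help fix ideas about polarized opinion profiles. It assumes one natural construction, which is not claimed to be the one used for Tables 3c and 4c: pair A's grades sorted from best to worst with B's grades sorted from worst to best. Polarization is read here as: whenever one voter grades A strictly higher than another voter does, she grades B no higher. Grades are numeric (4 = best, 0 = worst), the data are hypothetical, and all function names are introduced here.

```python
from itertools import combinations

def polarized_profile(grades_a, grades_b):
    """Pair A's grades (best first) with B's grades (worst first), one pair per voter."""
    return list(zip(sorted(grades_a, reverse=True), sorted(grades_b)))

def is_polarized(opinions):
    """True if no voter grades both A and B strictly higher than some other voter."""
    for (a_i, b_i), (a_j, b_j) in combinations(opinions, 2):
        if (a_i > a_j and b_i > b_j) or (a_i < a_j and b_i < b_j):
            return False
    return True

def majority_rule(opinions):
    """Return (votes for A, votes for B); equal grades count for neither."""
    for_a = sum(a > b for a, b in opinions)
    for_b = sum(b > a for a, b in opinions)
    return for_a, for_b

# Hypothetical merit profiles in which A's grades dominate B's.
grades_a = [4, 4, 3, 2, 1]
grades_b = [4, 3, 2, 1, 0]

# A non-polarized opinion profile with exactly these merit profiles:
# MR prefers B by 3 to 2 although A's grades dominate (the domination paradox).
paradox = [(1, 2), (2, 3), (3, 4), (4, 0), (4, 1)]

# The polarized profile built from the same grades: MR prefers A by 3 to 2.
polar = polarized_profile(grades_a, grades_b)

print(majority_rule(paradox), is_polarized(paradox))
print(majority_rule(polar), is_polarized(polar))
```

In this example the same two merit profiles produce opposite MR outcomes depending on how the grades are matched to voters; on the polarized matching MR agrees with domination, in line with Theorem 1.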

Theorem 2 ([6]) A method of ranking that is consistent with MR on polarized pairs (and enjoys several standard properties) must coincide with the MG-rule.
voters. However, if the new voter strongly preferred A to B, then the no-show paradox does not occur. For example, if the new voter evaluated A Excellent and B Good, the merit profile becomes:
A's middlemost block dominates B's so MJ places A above B: no no-show paradox.
In general, take a merit profile for which A^n ≻_MJ B^n because the extremities of the middlemost blocks where they differ are [α, α] ≠ [β, β]. Suppose an (n + 1)st voter arrives who evaluates A higher than B, α_{n+1} ≻ β_{n+1}.
The no-show paradox cannot occur when (as in the example) the (n + 1)st voter sees a real difference between the candidates. It can only occur when that voter either (1) has a relatively good opinion of both candidates (α_{n+1} ≻ β_{n+1} ≻ min{α, β}) and B's evaluation increases, or (2) has a relatively bad opinion of both candidates (β_{n+1} ≺ α_{n+1} ≺ max{α, β}) and A's evaluation decreases. However, for the paradox to exist at all it must be assumed that the voter cares more about the victor than about increasing B's evaluation in the first case or decreasing A's evaluation in the second case. This assumption is not verified in practice: voters' utilities are unknown and unfathomable (as is further discussed in the conclusions below).
Proof. To begin suppose n is odd and that the first middlemost where A's and B's grades differ is the 1st-middlemost. They are singletons, the candidates' majority grades:
Proof. Take the three grades from highest to lowest to be 2, 1, and 0. Suppose (as above) that A^n ≻_MJ B^n, the first middlemost intervals of A^n and B^n where their grades differ are [α, α] ≠ [β, β], and an (n + 1)st voter arrives who evaluates A higher than B, α_{n+1} ≻ β_{n+1}.
If n is odd and the first middlemost where A's and B's grades differ is the 1st-middlemost, then as was seen they are the candidates' majority grades: α A and β B with α A α B. By Theorem 1 there are only two possibilities where the paradox could occur, namely, when (i)
In the first case three grades imply β_{n+1} = 0, α_{n+1} = 1, and α A = 2. Consider the worst possible set of evaluations of A and the best of B compatible with the assumptions:
The (n + 1)st voter's grades yield A^{n+1} ≻_MJ B^{n+1} since A's two middlemost grades dominate B's. Thus the no-show paradox does not occur in this most vulnerable of possibilities. The monotonicity of majority judgment then proves that it cannot occur at all.
In the second case α_{n+1} = 2, β_{n+1} = 1, and β B = 0. The worst possible evaluations of A and best of B compatible with the assumptions are:
The late voter's grades yield A^{n+1} ≻_MJ B^{n+1} and the monotonicity of majority judgment assures that the paradox can never occur.
Consider the first case. As was noted before (1) implies α β, so with only three grades, β_{n+1} = 0, α_{n+1} = 1, and α = 2. Therefore α = 2, and β ≺ 2 (β = 2 would contradict the hypothesis). The worst possible set of evaluations of A and the best of B compatible with the assumptions are
Adjoining the late voter's grades yields
where the l + 1 middlemost grades of A dominate the corresponding set of B, showing A^{n+1} ≻_MJ B^{n+1}. The monotonicity of MJ proves the no-show paradox cannot occur.
Consider, finally, the second case. With three grades α_{n+1} ≻ β_{n+1} ≻ min{α, β} implies α_{n+1} = 2, β_{n+1} = 1, and β = 0 (α cannot be 0 since A^n ≻_MJ B^n). The worst possible set of evaluations of A and the best of B compatible with the assumptions are
When the (n + 1)st voter adds her votes the grades become
so A^{n+1} ≻_MJ B^{n+1}. The monotonicity of MJ establishes the result once more. Thus in all cases the no-show paradox cannot occur.
There is a more general concept than the no-show paradox that encapsulates the same general idea: not only can more support hurt a candidate, but more support that could be decisive may fail to be decisive.
A method of ranking R is subject to the no-show syndrome if A n R B n, one more voter arrives who prefers A to B, α_{n+1} ≻ β_{n+1}, yet
AV is subject to the no-show syndrome (as is MJ). Suppose Approve means Very Good or better with the merit profile
A: Excellent, Very Good, Good, Fair, Poor
B: Very Good, Very Good, Fair, Poor, Poor
AV proclaims a tie vote, each candidate receiving two ticks (though A's grades dominate B's). One more voter arrives who evaluates A Good and B Fair, or A Excellent and B Very Good: the vote remains an AV tie. In this case MJ places A ahead of B before and after a sixth voter. Of course this example is in essence the same as that showing MJ admits the no-show paradox: if the late voter sees a real difference between the two candidates (in the MJ case giving A at least Good and B at most Fair (see Theorem 1), in the AV case giving A a tick and not B) then the no-show syndrome does not occur.
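The five-voter example can be checked mechanically. The sketch below counts AV ticks with Approve meaning Very Good or better and compares the candidates' majority-grades, taken here (as an assumption about the even case) to be the middle grade, or the lower of the two middle grades when the number of voters is even. It uses only the majority-grade, which settles the MJ comparison whenever the two majority-grades differ. The function names are introduced here.

```python
SCALE = ["Poor", "Fair", "Good", "Very Good", "Excellent"]  # worst to best
LEVEL = {grade: i for i, grade in enumerate(SCALE)}

def av_ticks(grades, approve_from="Very Good"):
    """Number of voters whose grade is `approve_from` or better."""
    return sum(LEVEL[g] >= LEVEL[approve_from] for g in grades)

def majority_grade(grades):
    """Middle grade; for an even number of grades, the lower of the two middle ones."""
    ordered = sorted(grades, key=LEVEL.__getitem__)
    return ordered[(len(ordered) - 1) // 2]

A = ["Excellent", "Very Good", "Good", "Fair", "Poor"]
B = ["Very Good", "Very Good", "Fair", "Poor", "Poor"]

def outcome(a, b):
    """(AV ticks for A, AV ticks for B, A's majority-grade, B's majority-grade)."""
    return av_ticks(a), av_ticks(b), majority_grade(a), majority_grade(b)

before = outcome(A, B)
# The two late voters considered in the text: (grade for A, grade for B).
after = [outcome(A + [ga], B + [gb])
         for ga, gb in [("Good", "Fair"), ("Excellent", "Very Good")]]

print(before)
print(after)
```

Before and after either late voter the AV counts are equal (2 to 2, then 2 to 2 or 3 to 3), while A's majority-grade stays Good against B's Fair.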
It may be argued that the likelihood of an AV tie in an election with many voters is infinitesimal: true enough, but a late MJ voter is even more rarely in the position of being able to provoke the no-show syndrome or paradox. In the uses and experiments reported on above AV produces ties or near ties whereas MJ does not (unless two candidates have precisely the same set of grades): with few voters AV produces two ties in the Louis Lyons jury; with many voters, AV produces a tie in the Paris-Diderot vote, a near tie in the Alfortville Socialist primary (53.2% for Valls, 53.5% for Royal).
AV is subject to another allied property. A method R is insensitive if B n R A n, an arbitrarily large set of n voters arrive who all have the same opinion (they give A the grade α and B the grade β with α ≻ β), yet B n+n R A n+n. It is evident that AV is insensitive for the simple reason that two grades do not suffice. It occurs when voters see a difference between the two candidates but cannot express it since they either Approve both or Disapprove both. Insensitivity decreases as the number of grades increases: MJ with sufficiently many grades is sensitive.
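Insensitivity can be illustrated numerically. The sketch below starts from a hypothetical profile in which B is ahead of A under both AV and the majority-grade, then adds k identical voters who grade A one step higher than B but Approve both. Grades are numeric (0 to 4), Approve is assumed to mean 3 or better, and the majority-grade is again taken to be the lower middle grade. All names and numbers are illustrative assumptions.

```python
APPROVE_FROM = 3  # hypothetical: grades run 0..4 and Approve means 3 or better

def av_ticks(grades):
    """Number of grades that count as Approve."""
    return sum(g >= APPROVE_FROM for g in grades)

def majority_grade(grades):
    """Middle grade; for an even number of grades, the lower of the two middle ones."""
    ordered = sorted(grades)
    return ordered[(len(ordered) - 1) // 2]

# Hypothetical starting profile: B leads under AV (2 ticks to 1)
# and has the higher majority-grade (2 against 1).
A = [3, 1, 1, 0, 0]
B = [3, 3, 2, 2, 0]

def with_new_voters(k, grade_a=4, grade_b=3):
    """Add k voters who all give A `grade_a` and B `grade_b`."""
    return A + [grade_a] * k, B + [grade_b] * k

results = {}
for k in (0, 6, 1000):
    a, b = with_new_voters(k)
    results[k] = (av_ticks(b) - av_ticks(a), majority_grade(a), majority_grade(b))

print(results)
```

However many such voters arrive, B's AV lead stays at exactly one tick, whereas with five grades six new voters already suffice to put A's majority-grade above B's.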
AV supporters claim, "By being better able to express their preferences, voters would probably be more likely to go to the polls . . ." ([14] p. 4). There can be no doubt that with a language of six or seven grades voters are able to express their preferences better than with two grades. But more is true. With AV a voter who Approves both candidates or Disapproves both candidates has no effect on the outcome, so she might as well stay home instead of bothering to vote. With MJ even a voter who gives the same grade to A and B can (in theory) change the outcome. For example, take the following merit profile where the MJ-winner is A:
S_e dominates S_m which in turn dominates S_c. It is worth noting that this was not an experiment: it was an academic exercise involving the usual stiff competition among disciplines. Nevertheless, despite oft-repeated predictions that when the stakes are high voters will use only the highest and lowest grades, voters did not confine themselves to A's and C's: 12 B's were given (together with 20 A's and 16 C's). Only five of the 16 voters confined themselves to A's and C's. Moreover, although there were three grades and three candidates, only six voters rank-ordered them by using all three grades.
As another example take the 2012 French presidential poll (Table 9a), and suppose that with MJ3 Approve means Very Good or better, Neither means Good or Fair, and Disapprove means Poor or worse, giving the merit profile and MJ3 scores in Table 13a. Five different methods' rankings are given in Table 13b. The rankings all differ. In 34 of the 45 pairs of candidates one's grades dominate the other's. The MJ3-ranking and all three AV-rankings place Le Pen in 5th place, ignoring the detailed information that places her in 8th place when all the nuances in opinion are taken into account. Three grades in this case do not reflect the fact that of Le Pen's 53.87% Disapprove, fully 47.63% are To Reject. In sum, those AV-enthusiasts who lend a primary importance to the no-show paradox can improve matters by using three grades rather than two, thus giving up AV and using MJ3: two grades do not suffice. It seems three grades meet their agenda, though three are too few to adequately express voters' opinions in most applications.

Conclusion
Why do voters bother to vote? Few believe that their vote in a national contest will change the outcome, so "rational" behavior would lead them to abstain. Those voters who go to the polls do so for many reasons: to express their beliefs by giving their opinions (to the extent that the electoral system allows), to participate, to feel they belong to the society in which they live, to be "counted," . . . . Much of the traditional theory of voting takes a considerably more limited view; namely, that voters care only about who wins an election. It assumes, in the jargon of economic theory, that a voter's utility function depends only on the identity of the winner. With this view the extent of the victory, the identity of the runner-up, the place of each candidate in the final standings, and the extent of the support for them, all have nothing to do with voters' wishes.
In fact nothing is known about voters' wishes and satisfactions-as many elections show-so in a more realistic context "no-show" may be no paradox in theory as well as in practice.
The 2002 AV experiment in Orsay (section 2.2) and polls and experiments with MJ [2,3] show voters' preferences cannot be explained as single-peaked on a left-to-right political spectrum (although statistically they can be).In theory single-peakedness is very unlikely [22].A recent study of voting in Switzerland uses principal component analysis to show that at least three dimensions are necessary to map voters' preferences, so the one-dimensional single-peaked condition cannot be met [17].
The 2017 French presidential election shows voters' motivations are far from being limited to the identity of the winner. A first round with 11 candidates, 77.8% of registered voters participating, and 2.6% blank or invalid ballots resulted in four leading candidates: Jean-Luc Mélenchon (far left) 19.58%, Emmanuel Macron (center) 24.01%, François Fillon (right) 20.01%, and Marine Le Pen (extreme right) 21.30%. The second round was between Macron and Le Pen, two very distinct, fundamentally opposed candidates, and yet only 74.6% of the voters turned out, and of all ballots cast fully 11.5% were blank or invalid. Thus 1,536,401 fewer voters participated in the run-off, and of the 35,467,327 who participated 4,085,724 cast blank or invalid ballots: fully 5,622,125 did not wish to choose one or the other candidate in this battle of strongly opposed ideologies, policies, and persons. Many of these voters had voted for Mélenchon or the socialist candidate Benoît Hamon in the first round: they preferred having no voice in the choice of president to giving their vote to a candidate whose ideology was very different from theirs. By preventing them from expressing themselves MR simply disenfranchised them, as would AV as well. The outcome was Macron 66.1%, Le Pen 33.9%: Macron's crushing victory in no way reflected the preference of the electorate for him; instead it expressed the electorate's negative opinion of Le Pen.
A recent study of French parliamentary (1978-2012) and local (2011, 2015) elections also concludes that voters do not care only about who wins. It makes a distinction between "expressive" voters who vote according to their preferences among candidates only and "instrumentally rational" or "strategic" voters who vote based on the likely winners. The data lead its authors to conclude "that a large fraction of voters are . . . 'expressive' and vote for their favorite candidate at the cost of causing the defeat of their second-best choice" ([32], p. 42).
In addition to showing what is no more than common sense-that voters must be given the means to express their opinions-this paper has presented evidence-experimental and theoretical-to make a number of key points.
1. MR can easily go astray because voters' opinions cannot be adequately expressed.
grades are too few).
4. The no-show paradox-a theoretical possibility with MJ-is insignificant because so unlikely to occur in practice.
5. The no-show paradox is impossible when MJ has three grades.
6. Neither MJ nor AV is Condorcet-consistent when votes are honest, which is a good property of both [6].
7. MJ is consistent with MR on polarized pairs of candidates so resists strategic manipulation on that domain when the grades are sufficiently rich, which is not true of AV (nor of point-summing methods).
8. In practice voters and judges using MJ do not limit themselves to highest and lowest grades (if they did the results would be those of AV).
9. In practice voters and judges have no difficulty in assigning grades [26]. Experience suggests it is cognitively natural to do so (as versus ranking candidates or partitioning them into two classes). This is shown by repeated experimentation; by MJ's increased use in various instances, including French and American national polls [30], French news sites on the web [34,38], and a grass-roots citizen's primary conducted in France [23]; and by plain common sense.
In short, MJ with six or seven grades (and even with three grades) has striking advantages over AV.
50% of the grades are to the left of the middle, 50% are to the right. There is a 100%-majority for [Good, Bad]; and a 50+%-majority for [Rather Good, Rather Good]. For any x, 50 < x% ≤ 100%, there is an x%-majority for at most one grade and at least another; e.g., there is a 75%-majority for [Good, Not Bad Not Good]. When the two grades are equal they are the competitor's majority-grade, so A's majority-grade is Rather Good. Naturally the smallest x% (> 50%) yields the most precise majority decision: with many voters it is essentially certain that they are equal, so the majority-grade.

Table 1. Merit profiles, laboratory names, Université Paris-Diderot, spring 2015. Both have x%-majorities for [Rather Good, Rather Good] for every x, 50+ ≤ x ≤ 51.57. The smallest x where they differ are the 51.57+%-majorities: for A it is [Rather Good, Rather Good] and for B [Rather Good, Not Bad Not Good]; A's is better than B's so A must be ahead of B in the MJ-ranking.

Table 3b. Possible opinion profile, Hollande-Sarkozy (giving the merit profile of Table 3a), national poll, 2012 French presidential election.
Check one grade in the line of each nominee. No check is interpreted to mean Neutral.

Table 7a. Ballot for the Louis Lyons Award, 2009 (one line for each nominee).

Table 7b. Opinion profile, Louis Lyons Award, 2009. The merit profile is given in Table 7c, and the MJ- and several AV-rankings in Table 7d. Five nominees imply ten comparisons: eight of them are dominations. The two presumably reasonable interpretations for Approve (at least Excellent or at least Very Strong) result in a tie between C_2 and C_3. The interpretation at least Outstanding results in a tie between C_4 and C_5, whereas C_4's grades dominate C_5's, showing that AV does not respect domination.

Table 9b. MJ-ranking, AV-scores when Approve means at least Good, at least Very Good, and at least Excellent, FPP-scores, 2012 French presidential poll (737 ballots).

Table 10b. [21]al AV and DV votes compared with AV Good and DV Poor, French 2012 presidential election experiment, Fresnes [21]. Statistically the AV results of Bureau 12 are close to interpreting Approve to mean Good or better and the DV results of Bureau 14 are close to Disapprove meaning Poor or worse (see Table 10b): Fair plays no role whatsoever, again suggesting that two grades do not suffice.

Table 3c. Polarized opinion profile, Hollande-Sarkozy, 2012 French presidential election poll (for merit profile of Table 3a).

Table 4c. Polarized opinion profile, Clinton-Trump, 2016 US presidential election poll, Pew Research Center (for merit profile of Table 4a).

(Table 9a), and suppose that with MJ3 Approve means Very Good or better, Neither means Good or Fair, and Disapprove means Poor or worse, giving the merit profile and MJ3 scores in Table 13a.