Simultaneous Elicitation of Scoring Rule and Agent Preferences for Robust Winner Determination

Social choice deals with the problem of determining a consensus choice from the preferences of different agents. In the classical setting, the voting rule is fixed beforehand and full information concerning the preferences of the agents is provided. This assumption of full preference information has recently been questioned by a number of researchers, and several methods for eliciting the preferences of the agents have been proposed. In this paper we argue that in many situations one should also consider the voting rule to be partially specified. Focusing on positional scoring rules, we assume that the chair, while not able to give a precise definition of the rule, is capable of answering simple questions that require picking a winner from a concrete profile. In addition, we assume that the agent preferences also have to be elicited. We propose a method for robust approximate winner determination and interactive elicitation based on minimax regret; we develop several strategies for choosing the questions to ask the chair and the agents in order to converge quickly to a near-optimal alternative. Finally, we analyze these strategies in experiments where the rule and the preferences are simultaneously elicited.


Introduction
Aggregation of preference information is a central task in many computer systems (recommender systems, search engines, etc.). In many situations, such as in group recommender systems, the goal is to find a consensus choice; social choice theory can provide foundations for such applications. The traditional approach to social choice assumes that 1) the full preference orderings of the agents and 2) the social choice function are expressed beforehand. These represent two strong hypotheses. Requiring agents to express full preference orderings can be prohibitively costly (in terms of cognitive and communication cost). This observation has motivated several works assuming partial preference orders: one early work is by Conitzer and Sandholm [7], who studied the complexity of communication when using different voting rules; Konczak and Lang [15] studied the computation of possible and necessary winners for various voting rules; Xia and Conitzer [34] then showed that, while the identification of a necessary co-winner in scoring rules is polynomial, the determination of possible co-winners is NP-hard; additional complexity results were given by Walsh [32] and Pini et al. [25].
Since in many practical situations there would be too many possible winners but no necessary winners, several works addressed the problem of eliciting agent preferences using a variety of approaches (minimax regret, Bayesian methods, etc.) with the goal of converging to a necessary winner [24,14,21,26,2,9]. Among those, Walsh [33] and Conitzer [6] analyzed when to stop the elicitation process.
A second concern is the ability of the chair (the person or organization supervising the voting process) to provide a precise definition of the voting rule, suggesting the relaxation of the second hypothesis. Indeed, it is often difficult for non-experts to formalize a voting rule on the basis of some generic preferences over a desired aggregation method. Here we provide two examples of such situations.
Consider, as a first example, a chair that is about to hire a new employee whose performance is evaluated by several experts. The members of the chair may not have a voting rule in mind at the start of the process, and might not wish to agree on a specific voting rule. However, they might be willing to answer a few questions that require selecting the winner of specific profiles.
Consider, as a second example, the reviewing process of a conference where the best paper must be elected. The agents express their preferences on the papers they reviewed, but they are not aware of the voting rule the Program Chair will apply when aggregating them. Nonetheless, reviewers are still willing to participate in the process. Also, the PC may not have a specific voting rule in mind, and she will find it hard to provide a precise scoring vector if asked. Maybe she strongly believes that being ranked once in the first position is "much more" valuable than being ranked twice in the second position, but does not know exactly how much more (though she can judge example cases).
In this paper, we focus on positional scoring rules with convex weights, which are a particularly common method used to aggregate rankings. We develop methods, based on the notion of minimax regret, for determining a robust "winner" under uncertainty of both the voting rule and the agent preferences. We provide incremental elicitation methods that, at each step of the elicitation, question either one of the agents or the chair, and we discuss several heuristics to choose questions that quickly reduce the regret. Answers to questions are encoded as constraints; questions to the agents are comparisons between pairs of alternatives, while questions to the chair ask to select a winner out of a synthetic profile.
While some previous works have considered partially specified aggregation methods [30,20,31], we do not know of any work considering both sources of uncertainty at the same time. Actually, very few works altogether have considered the problem of eliciting a voting rule by asking questions to the chair. We mention the work of Cailloux and Endriss [5], which assumes a different representation for the rule. Additionally, some works address the manipulability of voting rules [11,10,8,1] and strategic behaviors [12,17,27].
Our approach is evaluated on simulations with synthetic and real datasets where both the voting rule and the agent preferences are initially unknown to the system and incrementally revealed through questioning. We assume the chair to be human, thus able to answer questions about a limited number of alternatives, so we focus on small-scale social choice situations. We compare the effectiveness of several questioning strategies based on the current knowledge of the rule and preferences. To summarize our contributions: 1) we provide a novel mechanism for eliciting a voting rule by translating abstract questions about weights to a choice of an alternative given a concrete profile; 2) we show that with our elicitation method it is possible to reach low regret with a reasonable number of questions; 3) we present elicitation strategies that achieve good results within reasonable computation time; 4) we show that for the class of rules considered, asking a few questions to the chair suffices to reach low regret; 5) our experiments suggest that a low degree of similarity among preferences (as in impartial culture) is a more challenging setting than less varied profiles.

Social choice with partial information
We now introduce some basic concepts. We consider a set $A$ of $m$ alternatives (products, restaurants, public projects, job candidates, etc.) and an infinite set $\mathcal{N}$ of potential agents.
A profile $(\succ_j, j \in N)$ considers a finite subset of agents $N \subset \mathcal{N}$ and associates to each agent a preference order $\succ_j \in \mathcal{L}(A)$, a linear order over the alternatives. A profile is equivalently represented by $v = (v_j, j \in N)$, where $v_j(x) \in \{1, \ldots, m\}$ denotes the rank of alternative $x$ in the preference order $\succ_j$. A social choice function $f : \bigcup_{\emptyset \neq N \subset \mathcal{N},\, N \text{ finite}} \mathcal{L}(A)^N \to \mathcal{P}^*(A)$ associates to each profile a set of (tied) winners, where $\mathcal{P}^*(A)$ is the powerset of $A$ excluding the empty set. Among the many possible social choice functions, we consider convex positional scoring rules (PSRs). A PSR $f_w$ is parameterized by a scoring vector $w$ associating weights $w_r \in [0, 1]$ to positions, with $1 = w_1 \geq w_2 \geq \ldots \geq w_m = 0$. Let $\alpha^x_r$ be the number of times that alternative $x$ was ranked in the $r$-th position. Given $v$ and $w$, an alternative $x \in A$ obtains the score
$$s(x; v, w) = \sum_{j \in N} w_{v_j(x)} = \sum_{r=1}^{m} \alpha^x_r w_r. \tag{1}$$
The winners $f_w(v)$ are the alternatives with highest score. An important class of PSRs is the one using convex weights [30,19], meaning that the difference between the weight of the first position and the weight of the second position is at least as large as the difference between the weights of the second and third positions, etc.:
$$\forall r \in \{1, \ldots, m - 2\} : w_r - w_{r+1} \geq w_{r+1} - w_{r+2}. \tag{2}$$
The constraint above is a natural and common assumption, often used when aggregating rankings in sport competitions (such as F1 racing or the alpine skiing world cup): losing ranks at the top is more damaging than losing ranks at the bottom. Let $\mathcal{W}$ denote the set of such convex weight vectors. We consider a specific finite set of agents $N^* \subset \mathcal{N}$ and let $v^* = (\succ^*_j, j \in N^*)$ and $w^*$ denote the profile and weight vector, unknown to us, that represent the preferences of the agents in $N^*$ and of the chair.
At a given time, our knowledge of agent $j$'s preference is encoded by a partial order $\succ^p_j \subseteq \succ^*_j$ over the alternatives, a transitive and asymmetric relation (we assume that preference information is truthful). An incomplete profile $p = (\succ^p_j, j \in N^*)$ maps each agent to a partial preference. Let $C(\succ^p_j) = \{\succ \in \mathcal{L}(A) \mid \succ^p_j \subseteq \succ\}$ denote the set of possible completions of $\succ^p_j$ and $C(p) = \prod_{j \in N^*} C(\succ^p_j)$ the set of complete profiles extending $p$. Note that $v^* \in C(p)$.
The vector $w^*$ is also unknown, but we assume that the chair is able to specify additional preference information taking the form of linear constraints about $w^*$. Let $W \subseteq \mathcal{W}$ denote the set of weight vectors compatible with the preferences expressed by the chair about the scoring vector. We will show in Section 4 that the additional preferences we use can be elicited by showing a complete profile of a synthetic election and asking who should be elected in this case.
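To make the definitions above concrete, the following minimal Python sketch computes PSR scores and winners and checks the convexity constraint. The representation (rankings as tuples listed best first, weights as a list indexed from 0) and the function names are illustrative assumptions, not part of the original formalism.

```python
from collections import Counter

def is_convex(w, tol=1e-12):
    """Check 1 = w_1 >= ... >= w_m = 0 together with the convexity constraint."""
    m = len(w)
    monotone = all(w[r] >= w[r + 1] - tol for r in range(m - 1))
    convex = all(w[r] - w[r + 1] >= w[r + 1] - w[r + 2] - tol for r in range(m - 2))
    return abs(w[0] - 1) <= tol and abs(w[-1]) <= tol and monotone and convex

def scores(profile, w):
    """Score of each alternative: sum over agents of the weight of its rank."""
    total = Counter()
    for ranking in profile:  # ranking[0] is the agent's top alternative
        for position, alt in enumerate(ranking):
            total[alt] += w[position]
    return dict(total)

def winners(profile, w):
    """Set of (tied) winners: the alternatives with the highest score."""
    s = scores(profile, w)
    best = max(s.values())
    return {alt for alt, value in s.items() if value == best}

# Hypothetical three-agent profile over alternatives a, b, c.
profile = [("a", "b", "c"), ("b", "a", "c"), ("a", "c", "b")]
borda = [1, 0.5, 0]  # normalized Borda weights, convex
print(scores(profile, borda), winners(profile, borda))
```

Exact equality is used to detect ties, which is adequate for this small example; with arbitrary floating-point weights a tolerance or rational arithmetic would be safer.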

Robust winner determination
It is desirable in an elicitation protocol such as ours to be able to stop before reaching full knowledge of the agent preferences or of the preferences of the chair about the voting rule. As, often, there are no necessary winners and too many possible winners, it is useful to declare a winner given partial information. As a decision criterion to determine a winner, we propose to use minimax regret [29]. This decision criterion has been used for robust optimization under data uncertainty [16] as well as in decision-making with uncertain utility values [28,3]. In particular, Lu and Boutilier [21] have adopted minimax regret for winner determination in social choice where the preferences of agents are partially known, while the social choice function is known.
We consider the simultaneous presence of incomplete knowledge in agent preferences and in the weights of the PSR.We use maximum regret to quantify the worst-case error, and let the alternatives that minimize this quantity win, giving some robustness in face of ignorance.Intuitively, the quality of a proposed alternative a is how far a is from the optimal one in the worst case, given the current knowledge.
Given $p$ and $W$ (which represent the current knowledge about agent preferences and the PSR), the maximum regret is considered by assuming that an adversary can both 1) extend the partial profile $p$ into a complete profile, and 2) instantiate the weights, choosing among any weight vector in $W$. We formalize the notion of minimax regret in multiple steps. First of all, $\mathrm{Regret}(x, v, w)$ is the "regret" of selecting $x$ as a winner instead of the optimal alternative under $v$ and $w$:
$$\mathrm{Regret}(x, v, w) = \max_{y \in A} s(y; v, w) - s(x; v, w).$$
The pairwise maximum regret of $x$ relative to $y$ given the partial profile $p$ and the set of weights $W$ is the worst-case loss of choosing $x$ instead of $y$ under all possible realizations of the full profile and all possible instantiations of the weights:
$$\mathrm{PMR}(x, y; p, W) = \max_{w \in W} \max_{v \in C(p)} \left[ s(y; v, w) - s(x; v, w) \right].$$
The maximum regret is the worst-case loss of $x$:
$$\mathrm{MR}(x; p, W) = \max_{y \in A} \mathrm{PMR}(x, y; p, W) = \max_{w \in W} \max_{v \in C(p)} \mathrm{Regret}(x, v, w).$$
$\mathrm{MR}(x; p, W)$ is the result of an adversarial selection of the complete profile $v \in C(p)$ and of the scoring vector $w \in W$ that jointly maximize the loss between $x$ and the true winner under $v$ and $w$. Finally, $\mathrm{MMR}(p, W) = \min_{x \in A} \mathrm{MR}(x; p, W)$ is the value of minimax regret under $p$ and $W$, obtained when recommending a minimax optimal alternative $x^*_{p,W} \in A^*_{p,W} = \operatorname{argmin}_{x \in A} \mathrm{MR}(x; p, W)$. Picking as consensus choice an alternative associated with minimax regret provides a recommendation that gives worst-case guarantees. In case of ties, we can return all minimax alternatives $A^*_{p,W}$ as winners or pick one of them using some tie-breaking strategy.
Observe that if $\mathrm{MMR}(p, W) = 0$, then any $x^*_{p,W} \in A^*_{p,W}$ is a necessary winner: any valid completion of the profile and choice of $w \in W$ gives to $x^*_{p,W}$ the highest score.
We note that our notion of regret gives some cardinal meaning to the scores: instead of just being used to select winners under the corresponding PSR, their differences are considered as representing the regret of the chair.
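The definitions of max regret and minimax regret can be illustrated by brute force on a tiny instance. The sketch below is only meant to mirror the definitions, not to be efficient: it enumerates every completion of each agent's partial order and evaluates a finite list of weight vectors. Two assumptions are made for illustration. First, partial orders are given as sets of pairs `(a, b)` meaning "a is preferred to b". Second, the chair has given no information yet, so that the weight set is the whole set of convex weight vectors; its extreme points are then the vectors with $w_r = \max(0, 1 - (r-1)/k)$ for $k = 1, \ldots, m-1$ (plurality for $k = 1$, normalized Borda for $k = m - 1$), and since regret is a maximum of linear functions of $w$, it suffices to evaluate these points.

```python
from itertools import permutations, product

def convex_vertices(m):
    """Extreme points of the convex weight set when the chair has said nothing."""
    return [[max(0.0, 1 - r / k) for r in range(m)] for k in range(1, m)]

def completions(alternatives, known):
    """Linear orders (tuples, best first) extending the known pairs (a, b)."""
    for order in permutations(alternatives):
        pos = {alt: i for i, alt in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in known):
            yield order

def score(x, profile, w):
    return sum(w[ranking.index(x)] for ranking in profile)

def max_regret(x, alternatives, partial_profile, weight_vectors):
    """Worst-case loss of x over all completions and all listed weight vectors."""
    per_agent = [list(completions(alternatives, known)) for known in partial_profile]
    worst = 0.0
    for profile in product(*per_agent):
        for w in weight_vectors:
            best = max(score(y, profile, w) for y in alternatives)
            worst = max(worst, best - score(x, profile, w))
    return worst

def minimax_regret(alternatives, partial_profile, weight_vectors):
    mr = {x: max_regret(x, alternatives, partial_profile, weight_vectors)
          for x in alternatives}
    mmr = min(mr.values())
    return mmr, {x for x, value in mr.items() if value == mmr}

alts = ["a", "b", "c"]
# Agent 1 is fully known (a over b over c); nothing is known about agent 2.
partial = [{("a", "b"), ("b", "c"), ("a", "c")}, set()]
print(minimax_regret(alts, partial, convex_vertices(3)))  # (0.5, {'a'})
```

In this example the adversary hurts `a` most by completing agent 2's order as b, c, a and choosing Borda weights. If agent 2 is additionally known to prefer `a` to `c`, the minimax regret drops to zero and `a` becomes a necessary winner.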
Computation of minimax regret. Given a voting rule and a partially specified profile, Xia and Conitzer [34] determine necessary winners by showing constructions that attempt to maximize the score difference between a proposed winner and a chosen alternative. This reasoning was later adopted by Lu and Boutilier [21], who used the considerations on the worst-case completions for computing the minimax regret.
In order to compute pairwise maximum regret, and therefore minimax regret, we decompose the PMR into the contributions associated to each agent by adapting this same reasoning to our setting. The context is however more challenging due to the presence of uncertainty in the weights.
Recall that, in the computation of $s(x; v, w)$, $w_{v_j(x)}$ represents the score that $x$ obtains in the ranking $v_j$ (see Eq. (1)). Since scoring rules are additively decomposable, we can consider separately the contribution of each agent to the total score. Thus, we can write the actual regret of choosing $x$ instead of $y$ as $s(y; v, w) - s(x; v, w) = \sum_{j \in N^*} \left[ w_{v_j(y)} - w_{v_j(x)} \right]$, and we obtain
$$\mathrm{PMR}(x, y; p, W) = \max_{w \in W} \max_{v \in C(p)} \sum_{j \in N^*} \left[ w_{v_j(y)} - w_{v_j(x)} \right].$$
The following propositions show that the procedure for completing a partial profile, proposed by Lu and Boutilier [21] when considering a fixed weight vector, also applies in our setting. We write $a \succeq^p_j b$ iff $a \succ^p_j b \lor a = b$ and adopt the canonical notation when considering a relation as a function, writing $\succeq^p_j(x)$ for $\{y \mid x \succeq^p_j y\}$.

Proposition 1. There exists a completion $v \in C(p)$ of the partial profile $p$ such that $\mathrm{PMR}(x, y; p, W) = \max_{w \in W} [s(y; v, w) - s(x; v, w)]$ and such that the linear order $\succ_j$ of each agent $j$ satisfies, for all $a \in A$:
$$a \succ_j x \iff \lnot (x \succeq^p_j a), \tag{4}$$
$$y \succ_j a \iff \lnot (a \succeq^p_j y) \land \lnot \left[ (x \succeq^p_j y) \land \lnot (x \succeq^p_j a) \right]. \tag{5}$$

Proof Sketch. Consider our knowledge $\succeq^p_j$ about the preference of the agent $j$. The adversary's goal is to make the score of $y$ as high as possible and the score of $x$ as low as possible. To do this, he should complete $\succ^p_j$ to $\succ_j$ by placing above $x$ as many alternatives as possible; that is, all the alternatives except those that are known to be worse than $x$ (those $a$ such that $x \succeq^p_j a$); and similarly, he should put below $y$ all the alternatives he can. Two conditions must be excluded for $a$ to go below $y$. The alternatives such that $a \succeq^p_j y$ cannot be put below $y$. Furthermore, the first objective must take priority over the second one: when an alternative should go above $x$ according to the first objective (because $\lnot (x \succeq^p_j a)$), and $x$ is known to be better than $y$ (thus $x \succeq^p_j y$), then $a$ should be put above $x$ (irrespective of whether $a \succeq^p_j y$), which will move both $x$ and $y$ one rank lower than if $a$ had been put below $y$. This maximizes the adversary's interests: because the weight vector is convex, the score difference will be lower when both alternatives are ranked lower (Equation 2), and that difference of scores is in favor of $x$ when $x \succ^p_j y$, thus to be minimized from the adversary's point of view.

Proof. The rank of $x$ is directly obtained from Eq. (4). The rank of $y$ is obtained by complementing Eq. (5), obtaining $a \succeq_j y \iff (a \succeq^p_j y) \lor \left( (x \succeq^p_j y) \land \lnot (x \succeq^p_j a) \right)$, and, observing that $a \succ_j y \iff a \neq y \land a \succeq_j y$, obtaining that $a \succ_j y$ if and only if
$$a \neq y \land \left[ (a \succeq^p_j y) \lor \left( (x \succeq^p_j y) \land \lnot (x \succeq^p_j a) \right) \right], \tag{6}$$
or equivalently, if and only if
$$(a \succ^p_j y) \lor \left( (x \succeq^p_j y) \land \lnot (x \succeq^p_j a) \right). \tag{7}$$
Indeed, (6) $\Rightarrow$ (7), and (7) $\Rightarrow$ (6) because $(x \succeq^p_j y) \land \lnot (x \succeq^p_j a) \Rightarrow a \neq y$ (as when $a = y$, $(x \succeq^p_j y)$ and $\lnot (x \succeq^p_j a)$ are opposite claims). It now suffices to rewrite Eq. (7) to let the two disjuncts designate disjoint sets:
$$(a \succ^p_j y) \lor \left( (x \succeq^p_j y) \land \lnot (x \succeq^p_j a) \land \lnot (a \succ^p_j y) \right).$$
Note that in Proposition 2, in the case $(x \succeq^p_j y)$, $\beta$ is the number of alternatives incomparable with both $x$ and $y$.
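Proposition 3. The PMR can be written as
$$\mathrm{PMR}(x, y; p, W) = \max_{w \in W} \sum_{r=1}^{m} \left( \alpha^y_r - \alpha^x_r \right) w_r,$$
where $\alpha^y_r$ (resp. $\alpha^x_r$) is the number of times $y$ (resp. $x$) has rank $r$ in the complete profile $v$ defined in Proposition 2.
Proposition 3 shows that PMR is linear in the weights. The pairwise max regret $\mathrm{PMR}(x, y; p, W)$ can thus be obtained by solving the following linear program defined on the variables $w_1, \ldots, w_m$:
$$\max_{w} \ \sum_{r=1}^{m} \left( \alpha^y_r - \alpha^x_r \right) w_r \quad \text{subject to } w \in W,$$
where $w \in W$ stands for $w_1 = 1$, $w_m = 0$, the monotonicity and convexity constraints (Eq. (2)), and the linear constraints collected from the chair's answers. The max regret $\mathrm{MR}(x; p, W)$ is determined by computing the pairwise regret of $x$ with all other alternatives in $A$, and the recommended alternatives are the ones with least max regret. Observe that when the PMR of an alternative $x$ (against some other alternative $y$) exceeds the best MR value found so far, we do not need to further evaluate $x$. This idea can be exploited using a minimax-search tree [4].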
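The adversarial completion described in the proof of Proposition 1 can be sketched as follows. For each agent, the rank of x is one plus the number of alternatives not known to be weakly below x, and the rank of y is one plus the number of alternatives that are either known to be above y or forced above x while x is known to be weakly above y. The code is a hedged illustration: it assumes that each partial order is a transitively closed set of pairs, and, instead of solving a linear program, it takes an explicit finite list of candidate weight vectors (for instance the extreme points of the current weight polytope). With constraints from the chair, an LP solver would be needed to handle the weight set in general.

```python
def weakly_above(known, a, b):
    """True if a equals b or a is known to be preferred to b."""
    return a == b or (a, b) in known

def adversarial_ranks(alternatives, known, x, y):
    """1-based ranks of x and y in the worst-case completion for one agent."""
    above_x = [a for a in alternatives if not weakly_above(known, x, a)]
    x_over_y = weakly_above(known, x, y)
    above_y = [a for a in alternatives
               if a != y and (weakly_above(known, a, y)
                              or (x_over_y and not weakly_above(known, x, a)))]
    return 1 + len(above_x), 1 + len(above_y)

def pmr(alternatives, partial_profile, x, y, weight_vectors):
    """Pairwise max regret of x against y over the listed weight vectors."""
    ranks = [adversarial_ranks(alternatives, known, x, y) for known in partial_profile]
    return max(sum(w[ry - 1] - w[rx - 1] for rx, ry in ranks) for w in weight_vectors)

def max_regret(alternatives, partial_profile, x, weight_vectors):
    return max(pmr(alternatives, partial_profile, x, y, weight_vectors)
               for y in alternatives if y != x)

alts = ["a", "b", "c"]
partial = [{("a", "b"), ("b", "c"), ("a", "c")}, set()]  # agent 2 unknown
W = [[1.0, 0.0, 0.0], [1.0, 0.5, 0.0]]  # plurality and Borda (m = 3)
print(pmr(alts, partial, "a", "b", W))  # 0.5
```

The ranks are computed in time linear in the number of alternatives per agent, which is what makes the regret computation tractable compared with enumerating all completions.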

Interactive Elicitation
We propose an incremental elicitation method based on minimax regret. At each step, the system may ask a question either to one of the agents about her preferences or to the chair about the voting rule. The goal is to obtain relevant information to reduce minimax regret as quickly as possible. The elicitation can be terminated either after a given number of questions, or when the minimax regret is lower than a threshold (or when it drops to zero if we wish optimality).

Question types
We distinguish between questions asked to the agents and questions asked to the chair. As questions asked to the agents we consider comparison queries relating two alternatives. The effect of a response to a question asked to an agent is the increase in our knowledge about the agent rankings, thus augmenting the partial profile $p$. If agent $j$ answers a comparison query stating that alternative $a$ is preferred to $b$, then the partial order $\succ^p_j$ is augmented with $a \succ^p_j b$ and by transitive closure. A bit more discussion is needed about questions asked to the chair. Such questions aim at refining our knowledge about the scoring rule; a response gives us a constraint on the weight vector $w$. In particular, we want to obtain constraints of the type $w_r - w_{r+1} \geq \lambda (w_{r+1} - w_{r+2})$ for $r \in \{1, \ldots, m - 2\}$, relating the difference between the importance of ranks $r$ and $r + 1$ with the difference between ranks $r + 1$ and $r + 2$.
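Both kinds of answers can be encoded quite directly. The sketch below is an illustration under stated assumptions: a partial order is a transitively closed set of pairs `(a, b)` meaning "a is preferred to b", and a chair answer is stored as a coefficient row `c` such that the constraint reads `sum(c[i] * w[i]) >= 0`. The function names are hypothetical.

```python
def add_answer(known, a, b):
    """Partial order after learning 'a over b', closed under transitivity.

    Assumes `known` is already transitively closed.
    """
    if (b, a) in known:
        raise ValueError("answer contradicts the current partial order")
    above_a = {c for c, d in known if d == a} | {a}
    below_b = {d for c, d in known if c == b} | {b}
    # Everything weakly above a is now above everything weakly below b.
    return set(known) | {(c, d) for c in above_a for d in below_b}

def chair_constraint(m, r, lam):
    """Coefficients of w_r - w_{r+1} >= lam * (w_{r+1} - w_{r+2}), 1-based r.

    The returned list c encodes the constraint as sum_i c[i] * w[i] >= 0.
    """
    if not 1 <= r <= m - 2:
        raise ValueError("r must lie between 1 and m - 2")
    coeffs = [0] * m
    coeffs[r - 1] += 1
    coeffs[r] -= 1 + lam
    coeffs[r + 1] += lam
    return coeffs

known = add_answer({("a", "b")}, "b", "c")
print(sorted(known))              # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(chair_constraint(4, 2, 2))  # [0, 1, -3, 2]
```

A negative answer from the chair would correspond to the reversed inequality, that is, the same row with all signs flipped.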
Building concrete questions for the chair. Even if the chair might be considered able to answer such abstract questions directly, we want to ensure that these questions can also, in principle, be asked in a more concrete way: in terms of winners of example profiles. Such questions have clear semantics whose understanding can be assumed to be shared by the chair, contrary to abstract questions about weights. Moreover, this way of questioning the chair is independent of the voting rule that is being elicited, whereas questions about weights only make sense when considering PSRs. Asking who should win in specific profiles has been used in experimental settings investigating the feeling of justice of individuals [13], but, to the best of our knowledge, the use of such questions to systematically guide an elicitation process about voting rules is novel. This is similar to favoring, in decision theory, direct choice questions ("please choose either a or b") over, say, questioning the decision maker about the shape of her utility function. The former are considered "observable": acts of choice are translated to preference statements [22, Ch. 1].
Although questioning in terms of profiles and in terms of weights is logically equivalent in our setting, there is no a priori certainty that questioning the chair using different phrasing would yield logically equivalent answers: research in experimental psychology shows that participants' answers differ widely when changing the phrasing of preference-related questions [18]. To get out of such conundrums, we need a language considered "fundamental". Questions of the form "In this profile, who should win?" arguably provide such a natural language.
Thus, our task is to build a profile, given $\lambda$ and $r \leq m - 2$, in such a way that the set of (tied) winners picked by the chair reveals whether $w_r - w_{r+1} \geq \lambda (w_{r+1} - w_{r+2})$.
Proposition 4. Given a rational $\lambda = p/q > 1$ and a rank $r$ between $1$ and $m - 2$, there exists a profile $P$ such that, for any weight vector $w \in \mathcal{W}$, $a \in f(P)$ iff $w_r - w_{r+1} \geq \lambda (w_{r+1} - w_{r+2})$ and $b \in f(P)$ iff $w_r - w_{r+1} \leq \lambda (w_{r+1} - w_{r+2})$, where $f$ is the PSR parameterized with $w$.
Proof. Define a linear order $>_1$ over $A$ as placing $a$ at rank $r$, $b$ at rank $r + 1$, and the remaining alternatives arbitrarily. Define $>_2$ over $A$ as placing $a$ at rank $r + 2$, $b$ at rank $r + 1$, and the remaining alternatives arbitrarily. Define an arbitrary linear ordering $>$ over $A \setminus \{a, b\}$. Define a linear order $>_3$ as placing $a$ first, $b$ second, and following the order of $>$ for the remaining positions. Finally, define a linear order $>_4$ as placing $b$ first, $a$ second, and following the inverse order of $>$ for the remaining positions.
Define $P$ as the profile of $3(p+q)$ agents containing $q$ times $>_1$, $p$ times $>_2$, and $>_3$ and $>_4$ each $p + q$ times. As a result, $a$ obtains the following ranks: $q$ times $r$, $p$ times $r + 2$, $p + q$ times first, and $p + q$ times second. The alternative $b$ obtains the ranks $r + 1$, $2$ and $1$, each $p + q$ times. Consider any alternative $c \in A \setminus \{a, b\}$. Its score is maximal when it comes first in $>_1$, first in $>_2$ and first in $>$, by convexity of the weights. In that case, $c$ is positioned at the ranks $1$, $3$ and $m$, each $p + q$ times. Letting $s(x)$ denote the score of $x$ at $P$, we obtain $s(a) = q w_r + p w_{r+2} + (p+q) w_1 + (p+q) w_2$, thus $s(a) \geq (p+q) w_m + (p+q) w_1 + (p+q) w_2$; $s(b) = (p+q) w_{r+1} + (p+q) w_2 + (p+q) w_1$; and, $\forall c \in A \setminus \{a, b\}$, $s(c) \leq (p+q) w_1 + (p+q) w_3 + (p+q) w_m$. It follows that $a$ or $b$ maximizes $s$ (as $s(a) \geq s(c)$). We conclude by observing that $a \in f(P) \iff s(a) \geq s(b) \iff q w_r + p w_{r+2} \geq (p+q) w_{r+1} \iff w_r - w_{r+1} \geq (p/q)(w_{r+1} - w_{r+2})$, and similarly for $b \in f(P)$.
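The construction used in the proof can be sketched in a few lines of Python. The helper names and the representation (rankings as tuples, best first) are illustrative assumptions; the "arbitrary" placements of the proof are resolved here by simply following the input order of the remaining alternatives. Exact rational weights are used so that ties are detected reliably.

```python
from fractions import Fraction

def question_profile(alternatives, a, b, r, p, q):
    """Profile of 3(p+q) rankings encoding the question
    'w_r - w_{r+1} >= (p/q) * (w_{r+1} - w_{r+2})?' for a 1-based rank r."""
    others = [c for c in alternatives if c not in (a, b)]

    def place(fixed):
        # fixed maps a 1-based rank to an alternative; the rest fill the gaps.
        rest = iter(others)
        return tuple(fixed[k] if k in fixed else next(rest)
                     for k in range(1, len(alternatives) + 1))

    order1 = place({r: a, r + 1: b})
    order2 = place({r + 2: a, r + 1: b})
    order3 = (a, b, *others)
    order4 = (b, a, *reversed(others))
    return [order1] * q + [order2] * p + [order3] * (p + q) + [order4] * (p + q)

def psr_winners(profile, w):
    alts = profile[0]
    s = {x: sum(w[ranking.index(x)] for ranking in profile) for x in alts}
    best = max(s.values())
    return {x for x in alts if s[x] == best}

# Question "w_2 - w_3 >= 2 (w_3 - w_4)?" over four alternatives: p/q = 2/1.
P = question_profile(["a", "b", "c", "d"], "a", "b", r=2, p=2, q=1)
borda = [Fraction(1), Fraction(2, 3), Fraction(1, 3), Fraction(0)]
steep = [Fraction(1), Fraction(1, 2), Fraction(1, 10), Fraction(0)]
print(psr_winners(P, borda))  # {'b'}: the inequality fails under Borda
print(psr_winners(P, steep))  # {'a'}: the inequality holds
```

Under Borda weights the gap between ranks 2 and 3 equals the gap between ranks 3 and 4, so the inequality with factor 2 fails and `b` wins; under the steeper vector it holds and `a` wins.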
Example. Suppose we want to ask the following question to the chair: $w_2 - w_3 \geq 2(w_3 - w_4)$. We show the profile in Figure 1a to the chair and ask who should win (each column is the preference of one agent). Both $a$ and $b$ have scores higher than $c$ and $d$ for all convex weights, thus either $a$ or $b$ will be picked under our hypothesis; and $s(a) \geq s(b) \iff w_2 + 2w_4 \geq 3w_3$. Figure 1b represents the same profile using a compressed view, the numbers in bold indicating the number of agents having the preference in the corresponding column. As the proof shows, constructed profiles require only four different linear orders.

Elicitation strategies. We develop several strategies for simultaneous elicitation of agent preferences and of the PSR. While it is of course possible to first fully elicit the agent preferences and afterwards elicit weights, we want to investigate approaches that are able to recommend winning alternatives before obtaining complete knowledge of the profile or the rule. We define here various strategies; a strategy tells us, given the current partial knowledge $(p, W)$, which question to ask next.
The Random strategy is used as a baseline. It first chooses equiprobably whether to question the chair or the agents. In the first case, it draws one rank in $1 \leq r \leq m - 2$ equiprobably, takes the middle of the interval of values for $\lambda$ that are still possible considering our knowledge so far, and asks whether $w_r - w_{r+1} \geq \lambda (w_{r+1} - w_{r+2})$. In the second case, it draws equiprobably among the agents whose preference is not known entirely; it then draws an alternative $a$ among those involved in some incomparabilities in $\succ^p_j$ and an alternative $b$ among those incomparable with $a$ in $\succ^p_j$.

Let $(x^*, \bar{y}, v, w)$ be the current solution of the minimax regret, where $x^*$ is the minimax optimal alternative and $\bar{y}, v, w$ the corresponding adversarial choices. The Pessimistic strategy considers a set of $n + (m - 2)$ candidate questions: one per agent, and one per rank (excluding the first and the last one, which are known).
The candidate questions to the agents are chosen by extending the idea of Lu and Boutilier [21], who prioritize learning about the relationship of $x^*$ and $\bar{y}$ to the other alternatives if possible. Given $j \in N^*$, if $x^*$ and $\bar{y}$ are incomparable in $\succ^p_j$, the candidate question concerns the pair $(x^*, \bar{y})$; otherwise, it concerns the pair $(x^*, z)$ for some $z$ incomparable to $x^*$ (randomly chosen), or, if no such $z$ exists, the pair $(\bar{y}, z)$ for some $z$ incomparable to $\bar{y}$, or, if both $x^*$ and $\bar{y}$ are comparable to every alternative in $\succ^p_j$, any incomparable pair is picked at random.
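The choice of the candidate question for one agent, as described in the paragraph above, can be sketched as follows. This is a hedged illustration: the partial order is assumed to be a transitively closed set of pairs, the function names are hypothetical, and the random generator is passed explicitly so that the behaviour can be reproduced.

```python
import random

def incomparable(known, a, b):
    return a != b and (a, b) not in known and (b, a) not in known

def candidate_pair(alternatives, known, x_star, y_bar, rng=random):
    """Pair to ask one agent about, or None if her order is fully known."""
    # 1) Prefer learning how x* and y-bar compare.
    if incomparable(known, x_star, y_bar):
        return (x_star, y_bar)
    # 2) Otherwise relate x*, then y-bar, to some incomparable alternative.
    for pivot in (x_star, y_bar):
        options = [z for z in alternatives if incomparable(known, pivot, z)]
        if options:
            return (pivot, rng.choice(options))
    # 3) Otherwise fall back to any remaining incomparable pair.
    pairs = [(a, b) for i, a in enumerate(alternatives)
             for b in alternatives[i + 1:] if incomparable(known, a, b)]
    return rng.choice(pairs) if pairs else None

alts = ["a", "b", "c", "d"]
print(candidate_pair(alts, set(), "a", "b"))  # ('a', 'b')
```

Returning `None` for a fully known agent lets the caller skip her when building the set of candidate questions.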
The candidate questions to the chair are determined as in the Random strategy.
Once having selected n+m−2 candidate questions, the Pessimistic strategy selects the one that leads to minimal regret in the worst case.Assume that a question q 1 has type t 1 (being "chair" or "agent"), and leads to the new knowledge states (p 1 , W 1 ) if answered positively and The terms ϵ t and ϵ ′ t are real numbers associated to the type t of question; these parameters are used to fine tune the choice of the question type.Define similarly t 2 , R max and R min 2 for q 2 .Pessimistic considers question q 1 to be better than . In other words if the maximal a posteriori MMR of two questions are (approximately) equal, then it considers the (penalized) minimal MMR values.
The Extended pessimistic strategy uses the same criterion as the Pessimistic strategy, but extends it to a bigger set of candidate questions, the same as those considered by the Random strategy. These candidate questions are then evaluated using the same operator as for the Pessimistic strategy. Extended pessimistic is applicable only to very small problem instances: its complexity is in $O(n^2 m^5)$, because we consider $O(m^2)$ questions for each agent and need, for each question, to compute MMR twice, whose complexity is $O(n m^3)$.
The Two phases strategy is developed in order to investigate the effect of varying the proportion of questions of the two types, when asking all questions to the chair at the beginning or at the end. It is parameterized by $q_c$, the number of questions to be asked to the chair, out of a total of $k$ questions. The Two phases-ca variant first asks $q_c$ questions to the chair, then $k - q_c$ questions to the agents, using in both cases Pessimistic to select the specific questions; whereas the Two phases-ac variant starts with $k - q_c$ questions to the agents, then questions the chair.
Finally, the Elitist strategy aims at uncovering as quickly as possible the top alternatives of all agents. For any agent $j$, it asks to compare an alternative currently undominated in $\succ^p_j$ with one that is currently incomparable. Thus, the top alternative for $j$ will be known after having asked exactly $m - 1$ questions to $j$. After having asked $n(m - 1)$ questions to the agents, it questions the chair only, using the same approach as Pessimistic. This strategy can be expected to perform well when the chair assigns a large weight to the first rank, as compared to the other ranks. It is used to further challenge Pessimistic, which is not specifically tailored to such a situation.

Empirical Evaluation
We performed several numerical experiments using both real data and randomly generated profiles in order to validate our approach and test the performance of our elicitation strategies.
Given a problem size $(m, n)$, a number of questions $k$ and a strategy to test, we first create an "oracle", representing the true preferences of the agents (randomly generated or coming from real data) and the weights associated with the chair's scoring rule (randomly generated). We start with empty knowledge ($p = \emptyset$, $W = \mathcal{W}$) about the preference orderings of the agents and the weights of the chair. We obtain the first question to be asked using the strategy under test. We then use the oracle to answer the question and update the system's knowledge, which is then used to obtain the next question. This is repeated until $k$ answers have been obtained, computing the resulting MMR values along the way for various values of $k$. We repeat this whole experiment a variable number of times, for a given $(m, n, k)$, and report the average resulting MMR and standard deviation sd. The sizes of the considered scenarios are comparable to the ones used by Cailloux and Endriss [5].
The oracle is built as follows. For the real preferences, we used three datasets from PrefLib [23]: T Shirt (researchers voted on tee shirt designs; $m = 11$, $n = 30$), Courses (students voted on courses; $m = 9$, $n = 146$; referred to as AGH on PrefLib) and Skate (judges voted on skaters at the Euros Pairs Short Program; $m = 14$, $n = 9$). For the synthetic datasets, we follow an Impartial Culture (IC) assumption: the linear order of each agent is drawn i.i.d. uniformly. We believe IC to be a challenging situation and expect the number of questions to ask, in order to reach a certain level of regret, to decrease with less varied profiles. To generate the scoring rule weights, we first draw $m - 1$ numbers uniformly at random (in the interval $[0, 1]$, representing weight "differences"), normalize and sort them; a sequence of convex decreasing weights is then obtained by a decumulative sum. The penalty parameters for the Pessimistic and Extended pessimistic strategies are $\epsilon_{\text{chair}} = 1.1$, $\epsilon'_{\text{chair}} = 10^{-6}$, $\epsilon_{\text{agent}} = 1.0$ and $\epsilon'_{\text{agent}} = 0$.
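The weight generation procedure described above can be sketched as follows. This is one reading of the description, with two assumptions made explicit: the normalized differences are sorted in decreasing order (so that the largest gap sits between the first two ranks, which is what convexity requires), and the "decumulative sum" is taken as the sum of all differences from rank $r$ onwards. The function name is hypothetical.

```python
import random

def random_convex_weights(m, rng):
    """Draw a convex, decreasing weight vector with w_1 = 1 and w_m = 0."""
    diffs = [rng.random() for _ in range(m - 1)]   # raw weight "differences"
    total = sum(diffs)
    diffs = sorted((d / total for d in diffs), reverse=True)  # normalize, sort
    weights = [0.0]
    for d in reversed(diffs):  # decumulative sum, built from rank m upwards
        weights.append(weights[-1] + d)
    weights.reverse()
    return weights

w = random_convex_weights(5, random.Random(42))
print(w)
```

Because of floating-point rounding, the first weight equals 1 only up to a small error; an experiment that needs exactly $w_1 = 1$ could overwrite the first entry or use rational arithmetic.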
Comparison of strategies. Our first experiment concerns small-size situations. Figure 2 compares some of our strategies in the case $m = 5$, $n = 10$ (variations around this size yield similar conclusions), where the results are averaged over 200 runs. We see that asking random questions does not allow reaching a low regret level even after having asked 100 questions, whereas a low regret level ($\mathrm{MMR} = 1$) is reached by Pessimistic before having asked 60 questions. This also holds for other problem sizes. For instance, for $m = 10$, $n = 20$ and 500 questions, the Random strategy reaches an average regret (over 20 runs) of 9.3 (±0.7) and Pessimistic 0.5 (±0.5). We notice that Pessimistic performs slightly better than Extended pessimistic, showing that Pessimistic chooses candidate questions wisely; this is good news since Pessimistic is much faster: it takes on average only 16s for a complete elicitation session (for $m = 5$, $n = 10$ and 100 questions), while Extended pessimistic takes 50s. Although their performance is close, Pessimistic performs systematically better in multiple runs of the experiment. We also compared the Pessimistic strategy against Elitist in a situation specifically tailored to advantage Elitist. For that experiment specifically, instead of drawing the weights of the oracle randomly, we fix them to a "geometric" weight vector, such that $w_r - w_{r+1} = 2(w_{r+1} - w_{r+2})$ for all $r \leq m - 2$, so as to dramatically increase the importance of the weights associated with the top ranks. Even in that case, we see in Table 1 that Pessimistic performs better than Elitist.
Evaluation of Pessimistic strategy. Our next set of experiments evaluates the Pessimistic strategy in absolute terms. We first ask how many questions are needed in order to achieve low regret, fixed at $n/10$: this is equivalent to the difference of score of an alternative $x$ that results from switching from a profile $P$ to a profile $P'$ where a tenth of the agents rank $x$ last instead of first. Table 2, first five columns, contains the result: it displays, for each dataset, the number of questions asked to the chair ($q_c^{\mathrm{MMR} \leq n/10}$), and the quartiles of the number of questions asked to the agents ($q_a^{\mathrm{MMR} \leq n/10}$), averaged over 20 runs. It is interesting to note that about twenty or thirty questions per agent on average suffice to reach a low regret in those instances. We also find it noteworthy that the Pessimistic strategy chooses to ask zero questions to the chair but still achieves low regret in most of those instances.
Another interesting measure is the average number of questions asked to the chair ($q_c^{\mathrm{MMR} = 0}$) and to the agents ($q_a^{\mathrm{MMR} = 0}$) before reaching zero regret. The results for various sizes are displayed in the last two columns of Table 2. Here, we see that the Pessimistic strategy does choose to question the chair when reaching low enough regret values. The m15n30 dataset did not reach zero regret in 1000 questions. Figure 3 shows the decrease in MMR according to the number of questions asked for various problem sizes. In particular, this shows important differences between some real datasets and the problems generated using IC. In the Skate problem, the value $\mathrm{MMR} = 1$ is reached after fewer than 100 questions, while the IC case of the same size ($m = 14$, $n = 9$) requires more than 200 questions to reach that value. This reasoning also applies to the Courses dataset but not to the T Shirt dataset. This can be explained by the high degree of similarity in the preference rankings of the Skate and the Courses problems, which helps reduce the regret faster. For example, in Skate the top-2 alternatives are the same for all agents, and 8 out of 9 agents rank the same alternative at position 3. By contrast, in T Shirt, the alternatives are evenly distributed in the preference rankings.

Comparison with Two Phases
The experiments so far leave the strategy free to question either the chair or an agent at each step. One may wonder what is lost in terms of regret by asking different proportions of questions to the chair and the agents. Such restrictions may be useful because of (partial) unavailability of the chair, or because the estimated cognitive costs may differ significantly. Table 3 shows the MMR value reached in problems of size $m = 10$, $n = 20$ after 500 questions, using the Two phases strategy, in the "ca" (chair then agents) and in the "ac" (agents then chair) variants. These numbers are to be compared with the MMR value reached after 500 questions with the Pessimistic strategy (displayed in Fig. 3), which is 0.7; the Pessimistic strategy asks on average 13 (± 13) questions to the chair in this setting. The line $q_c = 0$, where no question is asked to the chair, suggests that it is possible to obtain a good-quality recommendation while knowing only that the voting rule is a scoring rule with convex weights, which is our basic hypothesis. However, we observe that asking no questions to the chair does not allow reaching $\mathrm{MMR} = 0$. The strategy, indeed, obtains full knowledge of the profile after an average of 500 questions to the agents but never reaches zero.

Conclusions
In this paper we have considered a social choice setting with partial information about the agent preferences and voting rule. We have proposed the use of minimax regret both as a means of robust winner determination and as a guide to the process of simultaneous elicitation of preferences and voting rule. Our experimental results suggest that regret-based elicitation is effective and allows worst-case regret to be reduced quickly and significantly. They also show that, in our setting, good quality (low regret) recommendations can be achieved short of having full knowledge of weights or profile.
As part of our contribution, we provide an open-source library that can be found at https://github.com/oliviercailloux/minimax, to reproduce our experiments and perform many more. Some directions for future work include developing new elicitation strategies, considering alternative heuristics; extending the elicitation to voting rules beyond scoring rules; and eliciting preferences while restricting ourselves to concrete and easy questions.

Fig. 1: Profile representing a question to the chair in extended (a) and compact (b) form.

Fig. 3: Average MMR (normalized by n) after k questions with Pessimistic strategy for different datasets.

Table 2: Questions asked by the Pessimistic strategy on several datasets to reach $n/10$ regret (columns 4 and 5) and zero regret (last two columns).