Formal monkey linguistics: The debate

Abstract We explain why general techniques from formal linguistics can and should be applied to the analysis of monkey communication – in the areas of syntax and especially semantics. An informed look at our recent proposals shows that such techniques needn’t rely excessively on categories of human language: syntax and semantics provide versatile formal tools that go beyond the specificities of human linguistics. We argue that “formal monkey linguistics” can yield new insights into monkey morphology, syntax, and semantics, as well as raise provocative new questions about the existence of a pragmatic, competition-based component in these communication systems. Finally, we argue that evolutionary questions, which are highly speculative in human language, can be addressed in an empirically satisfying fashion in primate linguistics, and we lay out problems that should be addressed at the interface between evolutionary primate linguistics and formal analyses of language evolution.


Introduction
'Formal Monkey Linguistics' (Schlenker et al. 2016b) summarizes four types of contributions to recent research on monkey calls. First, it states numerous generalizations about the form and use of monkey sequences, in a unified format that facilitates comparison. Second, it establishes general methods to study their properties, and especially their meaning, in a formally precise fashion. Third, it makes specific proposals about the division of labor between syntax, semantics and pragmatics within each species of interest. Fourth, it proposes to add a comparative and evolutionary component to the enterprise, and sketches a reconstruction of call evolution over millions of years in some simple but striking cases. The approach is both formally precise and data-driven, hence the importance of a collaboration between linguists and primatologists.
In the rest of this note, we remind the reader of the basic goals of a formal monkey syntax, semantics and pragmatics; they should be uncontroversial if certain confusions and misunderstandings are set aside. We then lay out substantive topics of discussion in the areas of morphology and syntax, semantics, pragmatics, and monkey call evolution.

The formal approach
Observations and field experiments have established two general points:
1. The species under study arrange discrete calls in constrained ways.
2. There is a systematic relation between calls and the natural or experimental situations in which they occur. Furthermore, field experiments establish that the monkeys themselves know this correlation and thus derive information from the calls they hear.
Point 1 establishes that calls are subject to syntactic rules, i.e. rules that specify how calls can be ordered. To study them precisely, one needs a formal monkey syntax, which establishes a bipartition (or a more fine-grained classification) of sequences into possible and impossible ones or, to use standard terminology, well-formed and ill-formed ones.
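To make the notion of a syntactic bipartition concrete, here is a minimal Python sketch. It assumes, purely for illustration, a toy grammar in which one or more pyows are followed by one or more hacks (the Putty-nosed pattern discussed later in this note); the function name and the string encoding are hypothetical and are not taken from 'Formal Monkey Linguistics'.

```python
import re

# Assumed toy grammar: one or more 'pyow' calls followed by one or more
# 'hack' calls. Regular expressions of this kind describe finite state languages.
PYOW_HACK = re.compile(r"(?:P )+(?:H )+")

# Hypothetical encoding of calls as symbols; unknown calls map to '?'.
SYMBOLS = {"pyow": "P ", "hack": "H "}

def is_well_formed(sequence):
    """Return True iff the call sequence matches the assumed pattern."""
    encoded = "".join(SYMBOLS.get(call, "? ") for call in sequence)
    return PYOW_HACK.fullmatch(encoded) is not None
```

On this sketch, is_well_formed(["pyow", "pyow", "hack"]) holds, whereas a sequence in which a hack precedes a pyow is classified as ill-formed.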
Point 2 establishes that calls have a semantics, i.e. that they provide information by being appropriate or inappropriate in various situations. To study precisely the information conveyed by these calls, one needs a formal monkey semantics that establishes a bipartition (or possibly a more fine-grained classification) of pairs of the form <situation, call sequence>, determining in the general case whether a call sequence is appropriate or inappropriate in a given situation or, to use standard terminology, whether it is true or false in that situation. 1 These points should be uncontroversial; to say that one needs a 'formal monkey syntax' or a 'formal monkey semantics' is just another way of stating that one wishes to develop a precise account of these properties (how these properties will be eventually derived is another matter, but the end product should be a formal syntactic and semantic theory). While one might think that the precision is overkill, a cursory look at the generalizations and analyses offered in 'Formal Monkey Linguistics' shows that this is not so. Although extant data are incomparably simpler than what is found in human language, analysts must state generalizations and theories with great precision if they want their claims to be understood, their predictions to be testable, and possible errors to be uncovered.
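The semantic bipartition can be sketched in the same spirit. The Python fragment below treats a semantics as a function that sorts <situation, call sequence> pairs into true and false ones. The situation features and the two lexical entries are invented for the example, and the assumption that a sequence is true just in case each of its calls is true is a simplification; none of this reproduces the entries we actually propose.

```python
from typing import Callable, Dict, List

# A situation is modeled as a set of features; a call denotes a proposition,
# i.e. a function from situations to truth values.
Situation = frozenset
Proposition = Callable[[Situation], bool]

# Hypothetical toy lexicon (illustrative assumption only).
LEXICON: Dict[str, Proposition] = {
    "alert": lambda s: "disturbance" in s,
    "aerial": lambda s: "disturbance" in s and "from_above" in s,
}

def is_true(sequence: List[str], situation: Situation) -> bool:
    """Assumed rule: a sequence is true iff each of its calls is true."""
    return all(LEXICON[call](situation) for call in sequence)

def bipartition(sequences, situations):
    """Split <situation, sequence> pairs into true pairs and false pairs."""
    true_pairs, false_pairs = [], []
    for s in situations:
        for seq in sequences:
            target = true_pairs if is_true(seq, s) else false_pairs
            target.append((s, tuple(seq)))
    return true_pairs, false_pairs
```

A testable theory then amounts to a specific choice of lexicon and composition rule, whose predicted bipartition can be compared with field data.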
Once a syntax and semantics have been established, a standard question arises about the division of labor between them: a sequence may fail to be produced because it is syntactically ill-formed; or because its meaning is a useless one. The category 'syntactically ill-formed' must itself be refined: there could be cases of articulatory/phonetic impossibility, and others that are syntactic in a narrow (cognitive) sense. As for the semantic side, it is bound to interact with knowledge of the environment (what is often called 'world knowledge' in linguistics): a signal that has rich behavioral consequences may encode a highly specific message, or a far less specific one that happens to interact with general knowledge on the part of the hearers.
While these distinctions are close to conceptual necessities, 'Formal Monkey Linguistics' makes the controversial claim that monkey sequences can best be analyzed if one posits rules of competition among calls, notably an 'Informativity Principle' that favors more informative calls; hence a third general point made in our contribution:
3. There are rules of call competition: notably, if a situation licenses a more specific call or call sequence, and a less specific one, the more specific one must be chosen ('Informativity Principle').
If this proposal is correct, formal monkey linguistics must in the end have a pragmatic component; and thus the enterprise will be to investigate the division of labor between syntax (broadly construed), semantics, pragmatics and world knowledge.
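A hedged sketch of how such a competition rule could be stated precisely is given below in Python. The two-call lexicon and the situation space are assumptions made for the example only: entailment is computed by enumerating all situations, and a true call is licensed only if no strictly more informative call is also true.

```python
from itertools import product

# Assumed toy model: a situation is a pair (disturbance, from_above).
SITUATIONS = list(product([False, True], repeat=2))

# Hypothetical lexicon: one general call and one more specific call.
LEXICON = {
    "general": lambda s: s[0],            # there is a disturbance
    "specific": lambda s: s[0] and s[1],  # ... and it comes from above
}

def entails(a, b):
    """Call a entails call b iff b is true wherever a is true."""
    return all(LEXICON[b](s) for s in SITUATIONS if LEXICON[a](s))

def strictly_stronger(a, b):
    """a is strictly more informative than b."""
    return entails(a, b) and not entails(b, a)

def licensed_calls(situation):
    """Informativity Principle: keep only the most specific true calls."""
    true_calls = [c for c in LEXICON if LEXICON[c](situation)]
    return [c for c in true_calls
            if not any(strictly_stronger(d, c) for d in true_calls)]
```

In this toy model the general call ends up being used only when the specific one is false, which is how a weak lexical meaning can be strengthened by competition.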
The strength of the formal approach is to turn these general issues into precise ones, and to develop the program within a framework that uses versatile formal tools, which need not be borrowed from human linguistics, but may facilitate the comparison among various communication systems, including human language (as is emphasized in 'Formal Monkey Linguistics', for the most part the properties we analyze are very different from those found in human languages 2).

Clarifications
If formal precision matters, conceptual clarity does too.
One may object to the program of a formal monkey syntax on the ground that monkey inventories and sequence types appear to be finite, unlike human sentences (Berwick 2016). This is in a certain sense true, but doesn't detract from the general program. 3 Standard criteria of empirical adequacy and theoretical parsimony apply for finite systems, as is well-known within human linguistics itself (morphological paradigms are finite but still give rise to interesting theoretical proposals and debates). Finiteness in syntax means that great care must be exercised when trying to determine whether a form is the product of a rule, or is memorized. The issue arose for instance in our discussion of Putty-nosed pyow-hack sequences, which appear to be associated with a special meaning (involving group movement), not straightforwardly derivable from the meaning of their component parts. Syntactically, pyow-hack sequences come in all sorts of forms, as long as a few pyows are followed by a few hacks; and their time course is relatively slow. This makes it implausible that all these forms are memorized as fixed sequences. On the other hand, it could be that a general pattern of the form P+H+ is memorized. More elaborate methods should of course be developed to try to decide between rules and memorized forms. (On the semantic side, finiteness doesn't arise in the same way, since a single well-formed sequence can be associated to infinitely many situations in which it may be true or false.)
2 Berwick writes that "one might go astray by analyzing monkey language as though it were a human language (or even a computer language), and this is the key cautionary note of this commentary that must be sounded in programmatic approaches like Schlenker et al." Given the numerous differences we have noted between human and monkey languages, it is hard to see in what sense our proposal analyzes monkey language as though it were a human language. Nor do we know of computer languages with similar properties. Instead, we propose to apply the same formal rigor to the study of these communication systems as to human language. The initial analogies end there (and for us it is thus trivially true that, as Berwick writes, "no human-type grammar needs to be invoked at all"; in fact, we never invoked one in the first place).
3 Schlenker et al. 2016b in effect suggested that it would be interesting to establish what is the least powerful way to characterize the syntax of monkey languages, concluding: "On a substantive level, our syntactic generalizations were modest and could be handled with very simple finite state grammars. It would be interesting to explore in future research (i) whether all monkey languages can indeed be described in such simple terms, especially when larger databases are considered, and (ii) if so, which subset of finite state grammars best characterizes the syntax of these languages (see for instance Pullum and Rogers 2006 and Rogers and Pullum 2011)."
The program of an animal syntax in general is now relatively standard, and our claims about monkey syntax are particularly modest, as we will see below. While the program of an animal semantics follows with the same strength when it is established that sequences convey information, the tools we employ may lead to various misunderstandings that should be set aside. They pertain both to what semantics in general is, and to what formal monkey semantics in particular intends to be.
(i) The linguistics of the 1970's may give the impression that formal semantics is used in logic and philosophy, rather than in linguistics. 4 The developments of the last 40 years have shown how wrongheaded this impression is. The synthesis from the 1980's treated natural languages as formal languages both from a syntactic and from a semantic perspective. In both areas the goal was to gain insights into the workings of the human mind: syntactic and semantic theories alike seek to posit mechanisms that are both descriptively adequate and cognitively plausible. The developments might have been slower to come in semantics, but they are so prevalent and dynamic that they are hard to miss (see for instance Maienborn et al. 2011a,b, 2012 for a recent overview of the field).
(ii) One may also have the impression that formal monkey semantics comes with numerous concepts borrowed from human language, such as those of predication and reference. As we defined the project above, and as is made very explicit in 'Formal Monkey Linguistics', this just isn't the case. In fact, not a single one of the analyses we offer posits predicates or referential expressions: all are propositional in nature (with the exception of the Campbell's suffix -oo, which is a propositional modifier). 5 Nor should this be particularly surprising: the formal semantics that one gives for propositional logic is equally devoid of predicates and referential expressions; and within human linguistics, weather verbs such as It's snowing or It's raining are arguably 0-place predicates, i.e. propositions. A related confusion might arise if one ignores what semantics (including human semantics) can in principle do, taking elementary introductions to be fully representative of the field, with their focus on information pertaining to the external world rather than, say, to the emotional state of the speaker. Here too, nothing in the framework prevents semantic information from being conveyed about the emotional state of the speaker, as much contemporary linguistic research makes abundantly clear. (See, for example, research on expressives (Potts 2005, among others): using frog to refer to French people or Boche to refer to German people comes with a clear emotional or subjective component.)
(iii) A different case of confusion pertains to the innate character of many primate calls. One may think that innateness gets in the way of a formal analysis. There is no reason for this. Even within human linguistics, semanticists routinely provide lexical entries for features that are often thought to be part of an innate inventory, such as the category 'plural'. There is certainly nothing in the innate or non-innate character of an expression that impinges on the existence of a lexical entry for it; in the first case, the entry won't have to be learned. 6
(iv) These considerations suggest that as soon as a formal system conveys information, there can be a semantic approach to it. In fact, formal semanticists have even extended their approach to laughter (Ginzburg et al. 2015) and to music (Schlenker 2016), sometimes with entirely different tools from the ones that are used for the semantics of human languages, or of primate languages, for that matter. The resulting proposals might be true or false, insightful or not. But the advantage of asking such questions is to obtain a much richer typology of meaning phenomena in nature, and in particular to come to a comparison between very different means of transmitting information.
4 Fitch 2016 thus writes that "several of the fundamental theoretical assumptions of formal semantics and pragmatics ... ultimately stem more from logic and philosophy than from linguistics per se". This is a surprising claim in view of the rather uncontroversial empirical achievements of contemporary semantics, which offers a detailed understanding of subtle empirical facts, based both on rich introspective judgments and on experimental data.
5 As a result, the following comment by Fitch 2016 has no relevance: "the very notion of 'reference' is heavily laden with assumptions that are questionable even for human language, inapplicable to other human communicative systems (e.g. music or laughter), and inappropriate for primate communication."
6 The following remark by Fitch 2016, while correct, has thus no bearing on the project of a formal monkey semantics: "Starting with what we know about nonhuman primate (...) vocal communication, there are several differences from human spoken language that argue against congruency. The most obvious is that primate calls have a strong innate component...".
(v) One could conclude from these remarks that formal semantics is so general as to be vacuous. But this confuses a framework with particular theories that can be stated within it. The situation in semantics is thus parallel to the one we find in syntax: formal language theory offers a general framework in which to state theories pertaining to very diverse communication systems. Similarly, formal semantics offers a versatile framework to analyze their informational content. The difference between the two cases is primarily one of familiarity, due to the fact that formal syntax took off in the 1960's, whereas formal semantics was only integrated into mainstream linguistics in the 1980's. But certainly if ethology is willing to have intimate contact with formal language theory (Fitch 2016), it should soon be sufficiently aufgeklärt to extend its horizons to formal semantics.
The upshot is that the framework, if properly understood, should be relatively uncontroversial. The discussion should thus be focused on the particular theories we develop within this framework, which requires attention to the relevant data and predictions. General objections are unlikely to be fruitful unless they come with precise alternative analyses.

Main findings
While 'Formal Monkey Linguistics' puts greater emphasis on meaning than on syntax, the following conclusions were reached:
(i) There are limited cases that argue for a kind of morphological composition within calls. Notably, in Campbell's monkeys the suffix -oo can be added to two roots, krak and hok (see Ouattara et al. 2009a, Kuhn et al. 2014 for recent discussion); and it is plausible that it modifies the meaning of the root in the same way in both cases (on one theory, R-oo indicates that one should be in the same attentional state as if R had been uttered, hence a broader meaning; on a competing theory, R-oo indicates that there is a weak threat of the type that licenses R). In Diana monkeys, the A call has root uses, but it also arguably serves to form the complex calls LA, HA, and RA, which are targeted as units by the operation of repetition, thus yielding LA LA LA LA (Veselinović et al. 2014; Candiotti et al. 2012; see also Coye et al. 2016 for field experiments with artificial playbacks, showing that the A suffix provides information about caller identity while the first part of LA and RA complex calls provides information about the social and physical context 7).
(ii) In other cases, there is no strong evidence against analyses that take individual calls to form full-fledged, independent sentences, with a propositional semantics. Two apparent exceptions pertain to pyow-hack sequences in Putty-nosed monkeys, and to snort-roar sequences in Black-and-White Colobus monkeys.
• In Putty-nosed monkeys, pyow seems to be used as a general alert call, while hacks are often (but not only) used in eagle-related environments. Sequences made of a small number of pyows followed by a small number of hacks were argued to trigger group movement, but this function was not easy to derive on the basis of the individual meaning of the calls, hence the possibility that these behave like idioms in human language (e.g. kick the bucket), which are syntactically combinatorial but not semantically compositional. We offered an alternative analysis in which each call has a constant meaning, and rules of pragmatic competition account for the 'group movement' function, a point to which we return below.
• In Black-and-White Colobus monkeys, sequences made of a single snort immediately followed by roars seem to be used as highly underspecified alert calls, unlike their component parts (notably snorts, which are indicative of ground mammals when given singly). While the data from field experiments are somewhat preliminary, this might indicate that the snort-roar sequence must be treated as a unit rather than semantically decomposed. But since (unlike the pyow-hack sequence) the snort-roar sequence forms a tightly connected acoustic unit, it might be that the complexity is phonological rather than morphological or syntactic: on this view, it is a phonological accident (without morphological or semantic consequences) that the snort-roar sequence is made of snort followed by roars, just like in English irate is phonologically made of syllables found in I and rate without thereby being composed of these words.
Overall, our findings are quite deflationary, but they raise two questions: one pertains to the basic units of our analysis, usually sentences; the other pertains to the typology of syntactic operations that one might expect to find in the animal world.

What counts as a sentence?
The semantics we favored for all calls (except possibly Colobus snort and roars found in snort-roar sequences) was propositional, each call counting as a sentence. Sauerland 2016 proposes a definition of what should count as a sentence, and notices that on this definition our calls are not used sententially.
(1) Sauerland's definition of sentences (Sauerland 2016)
a. Syntax only applies within sentence units, but within those must apply.
b. Non-coordinating semantic composition can only occur within a sentence.
As Sauerland correctly points out, we provide rules such as those in (2), which violate (1)b (see Steinert-Threlkeld 2016 for related worries). For while the meaning of the sequence wS is interpreted as the conjunction of the meaning of w and the meaning of S, the parameter of evaluation is different in the two cases.
(2) If w is any call and S is any sequence,
Our rules were motivated by two goals.
(i) In some cases (in particular for Titi calls, but also in our preferred analysis of Campbell's calls), we wished to highlight that two calls of the same sequence are still evaluated at different times, a point of some importance when sequences are slow. In this case, we assumed for simplicity discrete time, and thus that if a call w uttered at time a was followed by a call w', w' was uttered at time a+1. We gave rules such as (2)a in order to have an explicit way of computing the truth conditions of entire discourses. But nothing hinges on this (as is also emphasized in Schlenker et al., to appear): we could have said just as well that each call is interpreted as a claim about the time at which it is uttered; and that a discourse is true just in case each of its individual calls is, as evaluated at its time of utterance.
(ii) In some cases (in particular for Campbell's calls, although the point is more general), we also wished to find a simple way to state that repetitions of calls were not vacuous, and thus we took each call to raise the value of an all-purpose alarm parameter. In such cases, a in (2) was interpreted as an alarm parameter, not as a time parameter. (In our preferred analysis of Campbell's calls, we combined the two ideas, noting that a temporal interpretation of the parameter still allows for an analysis in which the longer the sequence is held, the greater the alarm level is.)
Concerning (ii), Sauerland is correct to point out that a more modular analysis can be developed, one in which on his definition of a sentence each call counts as a sentence and thus conjunctively modifies the meaning of the sequence it belongs to. One can then compute the alarm level separately, by just counting the number of calls. This would make the very same predictions as our system. To illustrate, consider the 'toy model' we used to introduce our explorations (Schlenker et al. 2016b, (10)a).
(3)
The derivation proceeds by using a version of the rule in (2). We could instead follow Sauerland and take the 'simple' conjunction of hok and hok (without an alarm parameter), to obtain truth conditions specifying that there is an eagle. One must then add a rule to the effect that a sequence containing n calls triggers the inference that the alarm level is at least n-1. Sauerland's own implementation is slightly different, as it counts duration time rather than number of calls. This, in turn, makes slightly different predictions from our system; but the point remains correct that a more modular analysis can be given than ours.
(4) Sauerland's Danger-Memory Proportionality: The greater a threat, the longer an individual is alarmed by it.
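The modular alternative just described (plain conjunction, with the alarm level computed separately by counting calls) can be sketched in Python as follows. The lexical entry for hok follows the toy model, in which hok is true just in case there is an eagle; the representation of situations, the choice of 0 as the time of the first call, and all function names are assumptions made for the illustration.

```python
# Hypothetical toy lexicon: hok is true iff an eagle is present.
TOY_LEXICON = {"hok": lambda situation: situation["eagle"]}

def sequence_is_true(sequence, situation_at, lexicon=TOY_LEXICON):
    """Plain conjunction: each call is evaluated at its own time of utterance.

    situation_at maps a discrete time (0 for the first call, an assumption)
    to the situation that holds at that time.
    """
    return all(lexicon[call](situation_at(t))
               for t, call in enumerate(sequence))

def alarm_lower_bound(sequence):
    """A sequence containing n calls conveys an alarm level of at least n - 1."""
    return max(len(sequence) - 1, 0)
```

For the sequence hok hok, the truth conditions require an eagle at both times of utterance, and the separately computed alarm level is at least 1. Sauerland's own version would replace the call count with a measure of duration, which this sketch does not model.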
Of course a key question is whether every individual call can be treated as a separate sentence. We come close to this conclusion in Schlenker et al. 2016b, but there are two recalcitrant cases: we do not treat the suffix -oo as a separate sentence, but as a sentence modifier; and we hint at the possibility that Colobus snort-roar sequences might have to be interpreted wholesale (the semantics of Diana calls is not yet known in sufficient detail to bear on the propositional vs. nonpropositional nature of atomic calls). Sauerland 2016 asks exactly the right question, namely how far one could go in taking them to be conjunctions of sentences. We come back to this important point below.

Typology of animal syntax
It is standard, and illuminating, to use categories of formal language theory to organize generalizations found in animal syntax, often with subclasses of finite state languages (e.g. Berwick et al. 2011, Pullum and Rogers 2006, Rogers and Pullum 2011). But Rizzi 2016 proposes a typology based on the kinds of 'merge' operations, i.e. operations of combination, that are found, as in (5) (see Murphy 2016 for further questions concerning the existence of labels in animal syntax).
(5) Rizzi's typology (Rizzi 2016)
1-merge systems, or "word-word merge systems": "merge can apply, forming two-word expressions, but then the system stops, i.e., it lacks recursive procedures."
2-merge systems, "permitting word-word merge, and also word-phrase merge."
3-merge systems, "permitting word-word merge, word-phrase merge, and also phrase-phrase merge."
While human syntax is a 3-merge system, Rizzi notes that subsystems of human language are less expressive: "for instance, one may think of the hierarchical structure of the syllable as arising from the operation of two such systems: a nucleus is merged with a coda to determine a rhyme, and an onset is merged with a rhyme, to determine a syllable, thus giving rise to hierarchically organized structures of three elements [Syllable Onset [Rhyme Nucleus Coda ]]. Here, two 1-merge devices combine to give rise to expressions of three elements, but the overall system is nonrecursive, as it cannot reapply to its own output."
The male Campbell's root+suffix (-oo) structure is a 1-merge system, with the constraint that -oo does not occur on its own. Diana female social calls might be another instantiation of this system (Veselinović et al. 2014, Candiotti et al. 2012, Coye et al. 2016). The jury is still out concerning the status of Colobus snort-roar sequences. If they cannot be interpreted compositionally as snort + roar, the complexity might be placed in the phonology, in the morphology or in the syntax. More work is needed before we can come to a conclusion on this matter.
Importantly, one needs to determine where simple concatenation (interpreted as conjunction) should be placed in this system. Our impression, Sauerland's suggestion, and probably Rizzi's own intuition, is that in our studies concatenation should not be taken to involve a real instance of 'merge'. The reason is that each call can be treated as a separate utterance, and thus contribute its informational content independently from the others (this was in fact our intended interpretation for (2) when a is a time parameter). If one viewed concatenation-qua-conjunction as a non-trivial operation, one would presumably take such cases to be 2-merge systems, in which a call can be merged with a sequence of calls.

Theory choice and natural classes
In Section 2.2, we tried to clear up some general confusions about semantics in general, and formal monkey semantics in particular. But this is not to say that the methodological situation is simple. One implicit constraint on the enterprise is that the lexical entries we posit should be based on categories that are reasonably natural for monkeys (a point to which we return in Section 6). As an example, we sought to explain why Campbell's krak is primarily used as a leopard alarm in the Tai forest, but as a general alert on Tiwai island. We could have, absurdly, posited the disjunctive interpretive rule in (6), which ends up saying something like: krak has Meaning (i) if uttered in Tai, and Meaning (ii) if uttered on Tiwai.
(6) krak is true in situation s iff (i) there are leopards in the general environment of s, and there is a leopard in s; or (ii) there are no leopards in the general environment of s, and there is a disturbance in s.
It is intuitively clear that there is nothing unified in this lexical entry. But the criterion by which a lexical entry is plausible or not remains entirely implicit at this point. A solid common sense is thus needed to avoid postulating impossibly complex lexical entries, and opinions as to what counts as 'impossibly complex' might of course differ. In the long term, some constraint on possible lexical entries would be very useful; this problem is not unique to primate linguistics (indeed, the same issue arises to some extent in human linguistics). As in other areas, lexical semantics should ultimately be connected to the conceptual repertoire of the species under investigation, hence the relevance of animal psychology for the study of animal communication.

A null hypothesis: concatenation as conjunction
Although monkey sequences can be quite long, we take the 'null hypothesis' to be that each call contributes its informational content independently from the others, by way of a propositional meaning. As we noted in Section 3.2, this leads one to expect that the semantic content of a sequence should be the conjunction of the meanings of its component parts, evaluated at their respective times of utterance. This is the most trivial notion of 'compositionality' that one can imagine, which is not indicative of the existence of genuine rules of combination (since each call can be interpreted independently). This point should be borne in mind with respect to claims that some animal systems display aspects of compositionality. For instance, in an extremely interesting recent article, Engesser et al. 2016 argue that pied babblers (Turdoides bicolor) display stronger reactions to an alert call followed by a series of recruitment calls (a combination they term a 'mobbing sequence') than they do to either of its component parts (although the reactions are of the same types), and conclude that this communication system displays 'rudimentary compositionality'. But a crucial question is what one means by 'rudimentary compositionality'. If this goes beyond the null hypothesis, one would need to show that these calls are combined by an operation different from conjunction. It is not clear that there is evidence for this. The authors write that they "can rule out alternative explanations related to a sequential or additive processing of calls, because responses to played back mobbing sequences exceeded those elicited by the independent calls or their sum." But nothing in a conjunctive semantics contradicts this. To take a human analogy: Little Johnny is on the pedestrian crossing might not trigger a human alarm; nor need There is a car coming be alarming when uttered on its own. But the conjunction Little Johnny is on the pedestrian crossing and there is a car coming might require immediate action: the effect of the conjunction is not additive in terms of the effects of the conjuncts.
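The point that a purely conjunctive semantics is compatible with non-additive reactions can be made explicit with a small Python sketch. The urgency scores and the modeling of a proposition as a set of conveyed facts are invented for the illustration and carry no empirical weight.

```python
def conjoin(*propositions):
    """Conjunction, modeled as the union of the facts each sentence conveys."""
    return frozenset().union(*propositions)

def urgency(facts):
    """Hypothetical reaction strength, computed from the joint information.

    The hearer reacts to what the conjoined information implies, not to a
    sum of separate reactions to each sentence.
    """
    if {"child_on_crossing", "car_coming"} <= facts:
        return 10  # immediate action required
    return 1 if facts else 0

johnny = frozenset({"child_on_crossing"})
car = frozenset({"car_coming"})
```

Here urgency(conjoin(johnny, car)) is 10, while urgency(johnny) + urgency(car) is only 2, although the sentences are combined by nothing more than conjunction.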
Thus when we argue that in some cases calls cannot plausibly be analyzed as contributing a separate proposition combined conjunctively with the other calls of the sequence, we are mindful of the fact that this is a departure from the null hypothesis.

Can all atomic calls be treated as independent propositions?
As mentioned above, the main cases in which some atomic elements were not treated as propositional involved the suffix -oo and snort-roar sequences. Sauerland 2016 correctly asks whether these too could be analyzed in propositional terms.
Let us start with -oo, comparing its behavior to that of boom. In our analysis, the nonpredation call boom can modify the meaning of other sequences that would otherwise warn of a threat; but we still treated it as conjunctively modifying the meaning of these sequences. To be concrete, consider a cross-species version of the problem (but it arises in species-internal communication as well): Zuberbühler 2002 showed that in the Tai forest Diana monkeys react with their own alarm calls to Campbell's sequences of kraks (indicative of leopards) or hoks (indicative of eagles); but they do not react with alarm calls when exposed to the same sequences prefixed by boom boom. Boom boom thus contributes the information that the situation is not one of predation. To the extent that boom boom can be followed by krak (e.g. in boom boom krak krakoo), this could be explained in our preferred theory because the lexical meaning of these calls is relatively weak, and gets strengthened by competition with other calls; in sequences with booms, the strengthening may fail to arise, in which case no contradiction would arise with the initial booms.
Could a similar logic be applied to -oo, which we treated as a propositional modifier rather than as a proposition? Sauerland 2016 puts forth the analysis in (7):
(7) I(-oo) = there is a weak disturbance
So if hok has a meaning akin to there is a non-ground disturbance, hok-oo would yield: there is a non-ground disturbance and there is a weak disturbance. On the further assumption that there is only one disturbance, one would get the inference that there is (only) a weak disturbance. Furthermore, one could even posit that hok competes with hok-oo, and thus that its meaning gets enriched to: there is a non-weak disturbance.
But it is worth examining the consequences of this move for our preferred (pragmatics-based) theory of Campbell's calls. As will be recalled, we took the entailment relations among Campbell's calls to be given by the diagram in (8), where full lines connect calls in an entailment relation (with higher = logically stronger).
(8) [Diagram: entailment relations among krak, krak-oo, hok and hok-oo]
Details matter, so we will quote the explanations we gave in Schlenker et al. 2014:
"- Given our semantics for -oo, it immediately follows that a modified root R-oo always entails the bare root R.
- In addition, hok – and thus also the stronger hok-oo – entails krak: if the caller is alert to a disturbance whose source is non-terrestrial, then certainly the caller is alert to a disturbance.
- Are there further entailments? Not if the contribution of -oo in R-oo ('... weak among the disturbances that license R') is understood in a natural, non-intersective way, for instance as: '... is a disturbance in the bottom n%, in terms of threat level, among the disturbances that license R'. In particular, without special assumptions, there is no entailment relation between krak-oo and hok-oo. If krak-oo is used, then the caller is alert to a disturbance that counts as weak among all general disturbances; but this need not imply that the caller is alert to a disturbance that counts as weak among all those whose source is non-terrestrial. For instance, if aerial disturbances usually involve eagles, an inter-group encounter might count as a weak disturbance (e.g. in the bottom 10% of aerial threats); but this need not entail that it counts as weak among all the disturbances there are, as many of the non-aerial disturbances might be considerably less threatening than eagle encounters (hence an inter-group encounter might fail to be in the bottom 10% of all threats)."
Crucially, the discontinuous line in (8) does not correspond to an entailment relation on this analysis. But it does on Sauerland's alternative: if the combination of -oo is conjunctive, and if hok is more informative than krak, it also follows that hok-oo is more informative than krak-oo. But this leads one to expect that krak-oo should give rise to the implicature that hok-oo could not be used. Hence we should get the inference that a weak threat occurred, but not a non-ground threat – hence presumably a ground threat. But our initial observation was that krak-oo is used as a completely general alert call, including use in cases of eagle sightings (Ouattara's data: Ouattara et al. 2009b; Schlenker et al. 2014).
The predicted implicature of a ground threat thus does not seem to arise, thereby providing an argument within our preferred theory against analyzing -oo as an independent sentence.
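The contrast between the two treatments of -oo can be illustrated with a small set-based sketch. Everything in it is an assumption made for illustration: the list of disturbances, their threat levels, the 50% cut-off, and the function names `oo_relative` and `oo_conjunctive`. Meanings are modeled as sets of situations, so that entailment is set inclusion; the 'conjunctive' version is only one simple way of rendering an analysis along the lines of (7).

```python
# Hypothetical disturbances: (name, is_non_ground, threat_level). Values are illustrative only.
DISTURBANCES = [
    ("rustle", False, 1), ("duiker", False, 2), ("falling_branch", True, 3),
    ("distant_human", False, 4), ("intergroup_encounter", True, 5),
    ("eagle_far", True, 8), ("leopard", False, 9), ("eagle_near", True, 10),
]
THREAT = {name: level for name, _, level in DISTURBANCES}

# Assumed root meanings: krak = any disturbance, hok = non-ground disturbance.
KRAK = {name for name, _, _ in DISTURBANCES}
HOK = {name for name, non_ground, _ in DISTURBANCES if non_ground}

def weakest(candidates, percent=50):
    """Return the bottom `percent`% of `candidates`, ranked by threat level."""
    ranked = sorted(candidates, key=THREAT.get)
    k = max(1, len(ranked) * percent // 100)
    return set(ranked[:k])

def oo_relative(root):
    """Non-intersective -oo: weak among the disturbances that license the root."""
    return weakest(root)

def oo_conjunctive(root):
    """Conjunctive -oo: the root holds AND the disturbance is weak among all disturbances."""
    return root & weakest(KRAK)

# Entailment is modeled as set inclusion (stronger meaning = subset).
print(oo_relative(HOK) <= oo_relative(KRAK))        # hok-oo entails krak-oo? (relative)
print(oo_conjunctive(HOK) <= oo_conjunctive(KRAK))  # hok-oo entails krak-oo? (conjunctive)
```

In this toy model the inter-group encounter counts as weak among non-ground disturbances but not among all disturbances, so the relative analysis yields no entailment from hok-oo to krak-oo, whereas the conjunctive analysis does.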
Let us turn to snort-roar sequences. We argued in Schlenker et al. 2016b that these sequences might have to be analyzed in a non-compositional fashion, although the complexity might be phonological rather than morphological or syntactic in nature (we don't know). Sauerland 2016 argues instead that in all cases snorts and roars might be interpreted as simple propositional elements. He notes, reasonably, that the ordering we find, with snorts preceding roars, might be due to articulatory constraints (although this too is a big unknown at this point) – and if so, no bona fide syntactic constraints would be needed. Regarding the semantics, Sauerland writes that "the initial alarm calls are of greater importance since they trigger a specific evasive behavior, while once an individual is engaged in evasive behavior already, it may ignore content about a specific behavior." He goes on to argue for the interesting propositional analysis in (9). The interest of this analysis is that each call has a general component (pertaining to an alarm state), and a specific one (pertaining to a ground- vs. aerial-related evasive behavior). But the specific component is vacuously satisfied when the addressee is already engaged in evasive behavior. As a result, the specific effect of a call will be felt when it appears at the beginning of a sequence.
Still, this proposal raises several questions. First, it is not entirely clear what is predicted when a snort-roar sequence appears at the beginning of a discourse. Since snort-roar sequences are tight acoustic units, the specific component will be nearly contradictory ('engage in evasive behavior for a ground threat and engage in evasive behavior for an aerial threat'), unless we take the initial snort to trigger a ground-related evasive behavior before the roars are analyzed. But this won't account for the (relatively few) cases in which snort-roar sequences appear at the beginning of eagle-related discourses (see the appendices of Schlenker et al. 2016b for data). Second, one would expect that discourse-medially, where the specific component is vacuous, pure roar sequences should occur in comparable ways to snort-roar sequences, including in contexts of ground threats – which should be investigated (presumably, it is on syntactic grounds that Sauerland prohibits snorts given singly from appearing in non-discourse-initial positions – which is also the position we took in Schlenker et al. 2016b 8; but pure roar sequences clearly appear in all positions). By contrast, we assumed in Schlenker et al. 2016b that snort-roar sequences have a broader distribution than either of their component parts, hence our (highly tentative) conclusion that they might have to be analyzed in a non-compositional fashion. Finally, the lexical entries in (9) involve what are intuitively very sophisticated concepts – which only highlights the importance of constraints on possible lexical entries, as mentioned in Section 4.1. Our provisional conclusions on the need for mechanisms of composition that go beyond conjunction still stand, although the situation might of course change as more data become available.

Pragmatics
One of the key innovations of Schlenker et al. 2016b is the proposal that there should be a division of labor between semantics and pragmatics, and in particular that the meaning of a weak call S competing with a more informative call S' may be enriched with the negation of S', as stated in (10).

(10) Informativity Principle
If a sentence S was uttered and if S' is (i) an alternative to S, and (ii) strictly more informative than S (i.e. asymmetrically entails S), infer that S' is false.
We were careful to state the principle in a way that need not require a theory of mind, as shown in (11).
(11) Informativity Principle without a theory of mind
Assume that the semantics yields a relation 'is strictly more informative than' on some sentences that are alternatives to each other. Underinformative sentences are prohibited by the following rules:
Speaker: Do not utter S in a situation w if a strictly more informative alternative S' is true in w.
Hearer: If you hear S in a situation w, infer that every strictly more informative alternative S' is false in w.
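The speaker and hearer rules in (11) can be made concrete with a minimal sketch in which each call denotes a set of situations. The situations, the two lexical entries, and the function names are hypothetical and chosen only for illustration; 'strictly more informative' is modeled as proper set inclusion.

```python
# Hypothetical toy model: situations and call meanings are illustrative only.
SITUATIONS = {"eagle", "leopard", "falling_branch", "neighbouring_group"}

# Each call denotes the set of situations in which it is true (assumed entries).
MEANING = {
    "general": set(SITUATIONS),  # general alert: true in every alert situation
    "aerial": {"eagle"},         # serious aerial threat
}

def strictly_more_informative(s_prime, s):
    """S' asymmetrically entails S: its truth set is a proper subset of S's."""
    return MEANING[s_prime] < MEANING[s]

def speaker_may_utter(s, w):
    """Speaker rule: S must be true in w, and no stronger alternative may be true in w."""
    if w not in MEANING[s]:
        return False
    return not any(
        strictly_more_informative(alt, s) and w in MEANING[alt]
        for alt in MEANING if alt != s
    )

def hearer_enriched_meaning(s):
    """Hearer rule: keep the situations where S holds and every stronger alternative fails."""
    enriched = set(MEANING[s])
    for alt in MEANING:
        if alt != s and strictly_more_informative(alt, s):
            enriched -= MEANING[alt]
    return enriched

print(speaker_may_utter("general", "eagle"))       # blocked by the stronger 'aerial' call
print(sorted(hearer_enriched_meaning("general")))  # general alert minus eagle situations
```

Neither rule refers to the beliefs or intentions of the other party: both are stated purely in terms of truth sets and the informativity relation, which is the point of the formulation in (11).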
But this proposal is sufficiently new in primate analyses that it requires detailed justification. Further pragmatic principles we proposed are even more controversial, and are in need of further support.

Arguing for Informativity
The initial motivation for the Informativity Principle was as follows. There are numerous cases in primate communication (and beyond, probably) in which a general call competes with a more specific one in the repertoire of a given species. For instance, in Titi monkeys the B-call is used in predatory and non-predatory situations alike, whereas the A-call seems to have a narrower distribution and to invite a particular behavior, namely of looking up. Similarly, in Putty-nosed monkeys, pyows are used in all sorts of situations, including in leopard-related ones, and also at the end of sequences triggered by eagle stimuli, whereas hacks appear to have a narrower usage, possibly related to non-ground threats, or possibly high arousal (several other cases are discussed in Macedonia 1993). Now consider in both species a field experiment involving eagle stimuli. Results are quite clear: the specific call (hack for Putty-nosed monkeys, the A-call for Titis) is used at the beginning of the resulting sequences. But if the competing call is genuinely general (as its distribution indicates), why is it not used? The question is not usually raised in primatology, probably because the implicit answer is that one would have no reason to use a less specific call when a more specific one is available. But this is just what the Informativity Principle in (10) states. Steinert-Threlkeld 2016 correctly notes that several standard tests used in human language to decide between semantic vs. pragmatic analyses of a construction (notably those based on the interaction with logical operators such as negation) are inapplicable given the limited expressive power of monkey languages. Still, several suggestive arguments can be developed. In the case of the Informativity Principle, they are of four kinds.
(i) First, the Informativity Principle makes it possible to eschew highly unnatural lexical entries. Schematically: general alert and serious aerial threat are both specifications that might plausibly correspond to natural concepts (as would their conjunction, presumably). By contrast, a negative specification such as not-a-serious-aerial-threat seems less likely to correspond to a natural concept. By making use of the Informativity Principle, we can get the general effects of a 'negative' lexical entry without actually positing one: we define entries for general alert and also for serious aerial threat, leaving it to the Informativity Principle to explain why the general alert call is not usually employed in contexts of serious aerial threat.
Note that this reasoning is often made in areas of human linguistics to motivate competition principles, for instance in morphology. For instance, the zero ending in the English present tense is used everywhere except in the 3rd person singular. While the zero ending could be given the specification 'not-3rd-person-singular-present', morphologists usually prefer to note that this is not a natural class, although it is the complement of one. A principle of competition solves the problem: the null suffix only gets a [present] specification, -s gets the [3rd person, singular, present] specification, and competition guarantees that -s gets inserted wherever it can be (see for instance Bobaljik 2015). This argument is implicitly based on what counts as a 'natural morphological class'; this is the same kind of problem we mentioned for our own analysis in Section 4.1: a semantic analysis depends on a definition of what counts as a natural semantic class, and ultimately on a natural concept.
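The morphological competition just described can be sketched as follows. The feature labels, the `SUFFIXES` table and the function `insert_suffix` are illustrative assumptions, not an implementation of any particular morphological theory: the most specific matching item wins, so the zero ending never needs a negative specification.

```python
# Toy vocabulary items for English present-tense agreement (illustrative feature labels).
SUFFIXES = [
    ("-s", {"present", "3rd", "singular"}),  # specific item
    ("(zero)", {"present"}),                 # general item, no negative features
]

def insert_suffix(context):
    """Pick the matching suffix with the most specific (largest) feature specification."""
    matching = [(form, spec) for form, spec in SUFFIXES if spec <= context]
    if not matching:
        return None
    return max(matching, key=lambda item: len(item[1]))[0]

print(insert_suffix({"present", "3rd", "singular"}))  # specific item wins
print(insert_suffix({"present", "1st", "plural"}))    # only the general item matches
```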
(ii) Second, due to its pragmatic nature, the Informativity Principle can naturally be taken to be a 'soft' constraint, and thus to yield an enrichment that is optional. This makes the prediction that general calls should have some uses in situations that would license a more specific call. We appealed to this property in our analysis of Campbell's krak calls in the Tai forest: although krak mostly has the distribution of a leopard call, we noted that it gave rise to more 'incorrect' (non-leopard-related) uses than one would expect if this were really its lexical specification (the comparison was effected by noting that non-leopard uses of krak were more frequent than non-aerial uses of hok 9 ). A related situation arises in human language: John will order a burger or some fries naturally leads to the inference that he won't order both, but this is a defeasible inference -unlike what would happen if we added but not both.
(iii) Third, because enrichment by the Informativity Principle is not obligatory, we could expect it not to apply if the enrichment leads to a contradiction or a useless meaning. We mentioned above the case of kraks preceded by boom boom, which suddenly stop yielding a leopard-related message, at least in the Diana monkeys' understanding of Campbell's calls. We made use of a more sophisticated variant of the same principle to explain why krak has its unadorned, general meaning on Tiwai island. By competition with hok and krak-oo, it could obtain a meaning of serious and ground-related threat. We argued that for lack of serious threats of a ground-related nature on Tiwai, this would be a near-contradiction, and thus that strengthening should not apply. In a way, this is close to the logic used in the analysis of a sentence such as: I'll invite John or Mary – I'll even invite them both. While the first sentence could be enriched by competition with I'll invite John and Mary, this would yield an exclusive reading of or which is contradicted by the second sentence, and for this reason strengthening is taken not to apply.
(iv) Fourth, in rare cases, compositional considerations can provide an argument for a weak meaning, which must be independently enriched by the Informativity Principle. Such was the case of the krak/krak-oo interaction in male Campbell's monkeys in the Tai forest: if krak had a leopard meaning, given that hok-oo is in some way a watered-down version of hok, one would expect that krak-oo should be a watered-down version of krak, and should have something to do with ground threats, contrary to fact. Giving krak a general meaning makes it possible to derive krak-oo from the very general meaning of krak, which in turn accounts for the highly general uses of krak-oo. The Informativity Principle is then in charge of explaining why krak on its own still has leopard-related meanings in that environment.
Seyfarth and Cheney 2016 correctly suggest that the Informativity Principle should be subjected to specific field experiments – definitely an important direction to explore. But it must be noted that extant field experiments in which an eagle call appears at the beginning of an eagle-triggered sequence despite the availability of a more general call already make the point. In addition, one could consider entirely different experiments designed to test the existence of the Informativity Principle in call acquisition. Thus Takashi Morita (p.c.) suggested that in artificial learning experiments, one could expose primates or other animals to two labels L and L' and a learning environment in which L' is true in a strict subset of the situations in which L is true (so that L' is strictly stronger than L). One could then test whether the stronger label L' blocks L when both are applicable, be it in comprehension or (if testable) in production. This would of course take us in a very different direction from field experiments.

The Urgency Principle
Besides the Informativity Principle, which we used in all of our monkey studies, we posited a (rather tentative) Urgency Principle in order to account for Putty-nosed pyow-hack sequences:
Urgency Principle: If a sentence S is triggered by a threat and contains calls that convey information about its nature or location, no call that conveys such information should be preceded by any call that doesn't.
The basic idea was that pyow has a meaning of general alert and that hack has a meaning of serious non-ground movement-related alert, with the result that the conjunction of pyows and hacks is rather underspecified. In particular, it could be used in situations of group movement (because Putty-nosed monkeys are arboreal), and also in eagle-related situations. But in the latter case, hacks would provide information about the location of a threat, and should thus come before pyows, which don't provide such information. This provides a mechanism of pragmatic enrichment of the meaning of pyow-hack sequences, using something other than the Informativity Principle.
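The ordering condition imposed by the Urgency Principle can be stated as a simple check on call sequences. The sketch below is illustrative only: the set `INFORMATIVE` encodes the assumption that hacks, but not pyows, convey information about the nature or location of a threat, and the function name and its `threat_triggered` flag are hypothetical.

```python
# Assumption for illustration: hack conveys information about the nature or
# location of a threat, whereas pyow (general alert) does not.
INFORMATIVE = {"hack"}

def violates_urgency(sequence, threat_triggered=True, informative=INFORMATIVE):
    """True if, in a threat-triggered sequence, an informative call follows an uninformative one."""
    if not threat_triggered:
        return False  # the principle only constrains threat-triggered sentences
    seen_uninformative = False
    for call in sequence:
        if call in informative:
            if seen_uninformative:
                return True
        else:
            seen_uninformative = True
    return False

print(violates_urgency(["hack", "hack", "pyow"]))  # hacks first: no violation
print(violates_urgency(["pyow", "pyow", "hack"]))  # a hack is preceded by pyows
print(violates_urgency(["pyow", "hack"], threat_triggered=False))  # e.g. group movement
```

On this rendering, a pyow-hack sequence is only compatible with the principle when it is not triggered by a threat, which is how the principle enriches the meaning of such sequences.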
Since at this point pyow-hack sequences are the only domain of application of the Urgency Principle, the latter should be seen as stipulative. But it is interesting to note that it might have applications beyond the primate realm. In recent work on the Japanese great tit (Parus minor), Suzuki et al. 2016 noted that on its own a sequence of notes ABC induces the hearer to scan for danger, while a sequence of D notes induces it to approach the caller. ABC-D combinations (in that order) have a mixed effect: hearers approach the caller and also scan (not necessarily in that order). But crucially, when the order of the notes is artificially reversed, leading to the sequence D-ABC, hearers rarely scan and approach. Although the authors speak of a 'compositional syntax', they themselves hint at a pragmatic analysis of their results: "As D notes are often produced in non-predator contexts, conspecifics hearing D notes before ABC notes may be slower to produce appropriate anti-predator behaviours, which may be of particular importance when tits are defending their nestlings." (Suzuki et al. 2016 p. 5) Such considerations might argue for a bird application of (a version of) the Urgency Principle: the D sequence doesn't provide information about the presence of a danger, but the ABC sequence does. Urgency would thus lead one to expect that when the two sequences are put together, the ABC sequence should come first. When this principle is violated, responses are clearly weakened. Similar remarks might be helpful to understand why in the data discussed by Engesser et al. 2016 (see above) alert calls seemed to always come before recruitment calls.
It remains to be seen, of course, whether independent evidence will be found for the Urgency Principle, within Putty-nosed monkeys and beyond. 10
10 Steinert-Threlkeld makes a friendly (and correct) amendment to our analysis of Putty-nosed syntax in connection to the Urgency Principle (our analysis is developed along similar lines, but in greater detail, in Schlenker et al. 2016a). As he writes, Schlenker et al. 2016b "posit $H^{+}P^{+}$ sentences as part of the syntax even though they appear never to be used. They claim that they need to do so because 'sentences of type $H^{+}P^{+}$ … serve as alternatives to $P^{+}H^{+}$ when the Urgency Principle is applied to the latter'. This, however, appears to be an unnecessary motivation. By the definition of alternatives in (14), the sentence $H^{n}$ will be an alternative to $P^{k}H^{n-k}$ since it arises by replacing k Ps with k Hs. But $P^{k}H^{n-k}$ will still be in violation of Urgency for the same reasons as given above and so will not be used in eagle contexts. A simpler theory that does not posit $H^{+}P^{+}$ in the syntax appears to be easily given." The point is correct, but it raises an issue: should it be possible for the alternatives (or the alternatives accessed by pragmatic principles) to be ill-formed? This seems to go against the spirit of a pragmatic principle, which compares what was said to what could have been said. But for lack of a provision to this effect (except in our Colobus analysis, see Schlenker et al. 2016b, (58)), Steinert-Threlkeld's point stands (see Schlenker 2008 for related discussions in human pragmatics).

Occam's Razor
Putting together our semantic and pragmatic principles leads to a relatively powerful system. Hence the question arises whether it is too powerful. Jäger 2016 takes no issue with our semantic analyses, but casts doubt on some of our pragmatic explanations on the grounds that they are insufficiently parsimonious (and as he notes (p.c.), the issue we mention above in Section 4.1 only compounds the problem). In particular, he notes that the simple lexical entries for pyow and hack in our theory of Putty-nosed pyow-hack sequences "have to be complemented by quite a few additional principles and assumptions: 1. the Urgency Principle (34), 2. the (revised) Informativity Principle (44), 3. the assumption of Alarm decay (40), and 4. the piece of world knowledge given at the end of (45): 'The most common situations in which there is a serious non-ground-movement-related alert but not one which is due to a threat involve group movement.'" Jäger's doubts are legitimate, but his argument rests on one fallacy, that of measuring parsimony against a single data set: the complexity of a theory should be assessed relative to the totality of the data it seeks to explain. And the advantage of the comparative approach we advocate is that the same principles can be applied to several data sets. The Informativity Principle played an important role in all of our analyses. Alarm Decay is just the claim that the seriousness of an alarm usually decays over time, and versions of it were used in some of our Putty-nosed, Titi and Colobus analyses, in particular when sequences start with a specific call and end with a general call. This leaves the Urgency Principle, discussed above, which we have seen is possibly applicable to Suzuki's (as well as Engesser's) birds; and the environmental assumption about the types of alerts that one could expect to find given Putty-nosed monkey environmental conditions.
This is not to say that one cannot cast doubt on this combination of principles -in our various studies we have usually remained very cautious about the analyses under consideration, with the firm belief that theories are bound to change rather radically as more data become available. But the parsimony of a theory should be assessed correctly -that is to say, relative to all of the data it is responsible for. (In addition, when the empirical database is relatively limited, it is inevitable that one should come up with principles to be tested in future research.)

Evolutionary data
Research on human language evolution is notoriously difficult and speculative. The heart of the matter is that language leaves no direct archeological traces (unlike bones and tools, for instance); and that all our closest relatives (Neanderthals, Denisovans, etc.) disappeared long ago without telling us what kind of language abilities they had, if any. In the latter respect, the situation is considerably more favorable in monkey languages: as we discussed in connection with cercopithecines, plotting the distribution of boom calls in a phylogenetic tree suggests that booms are at least several million years old in quite a few species. Similar evolutionary inferences could be drawn on several other calls, including Putty-nosed and Blue monkey pyows and hacks/kas. One key issue for the future will be to investigate these evolutionary questions in an empirically detailed fashion, and to connect them with models of meaning evolution developed in the literature.

Evolutionary scenarios
The approach we advocate in Schlenker et al. 2016b should be extended with precise formal analyses of the evolutionary scenarios that might have led to the meanings we posit (see Franke and Elliott 2014).
One call. We focused on relatively rich monkey repertoires, as these make methods from 'formal monkey linguistics' more useful than in more trivial systems. But to analyze animal language evolution, it might be good to start with a species that uses a single call, say A. In this case, a detailed analysis of the utility obtained by the speaker and hearer when they follow certain strategies should in principle make it possible to predict (using game-theoretic tools) the final meaning of A. For instance, the speaker may call whenever there is a cat, whenever there is a raptor, or whenever there is a raptor or a cat. The hearer may have a strategy of looking up, or looking down, or of scanning, whenever it hears the A call. A detailed analysis of the pay-offs should yield precise predictions.
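To give a sense of what such a pay-off analysis could look like, here is a deliberately simplified sketch of the receiver's side of the one-call game. All numbers (prior weights and pay-offs) are invented for illustration, as are the names `expected_payoff` and `best_response`; a full game-theoretic treatment would also model the sender's pay-offs and the situations in which no call is given.

```python
from fractions import Fraction

# Hypothetical prior weights and receiver pay-offs; none of these numbers come from field data.
PRIOR = {"cat": 1, "raptor": 1}
PAYOFF = {
    "cat":    {"look_up": 0,  "look_down": 10, "scan": 6},
    "raptor": {"look_up": 10, "look_down": 0,  "scan": 6},
}
ACTIONS = ["look_up", "look_down", "scan"]

def expected_payoff(call_states, action):
    """Receiver's expected pay-off for `action` on hearing A, if the sender calls exactly in `call_states`."""
    total_weight = sum(PRIOR[s] for s in call_states)
    weighted = sum(PRIOR[s] * PAYOFF[s][action] for s in call_states)
    return Fraction(weighted, total_weight)

def best_response(call_states):
    """Action that maximizes the receiver's expected pay-off given the sender strategy."""
    return max(ACTIONS, key=lambda a: expected_payoff(call_states, a))

for strategy in ({"cat"}, {"raptor"}, {"cat", "raptor"}):
    print(sorted(strategy), best_response(strategy))
```

With these invented pay-offs, a sender that calls for cats or raptors alike makes scanning the receiver's best response, whereas a sender that calls only for one predator type makes the corresponding specific action optimal.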
Several calls. When systems with a single call are understood, systems with several calls may be analyzed with similar methods. But multi-call systems raise the general issue of competition among calls, as we discussed above (see Skyrms 2010 for relevant discussion of systems in which there are fewer calls than actions to be taken). Jäger 2016 notes, reasonably, that the Informativity Principle currently lacks any support from models of language evolution. In fact, to our knowledge the question has never been asked, so we take Jäger's point to be an excellent issue for future research: Can the development of the Informativity Principle be studied within current models of meaning evolution?
Still, we agree with Jäger that several theoretical objections can be raised at the outset when the problem is viewed from an evolutionary perspective. We start with Jäger's general objection, and state more specific objections that could be leveled as well.
Objection 1: As Jäger 2016 writes, in evolutionary models "meanings are conceptualized as actions of the receiver which induce different fitness values for both sender and receiver. So they correspond to "interpretations" rather than "meanings" if we draw a distinction between semantics and pragmatics. Abstract meanings, however, being abstract, are not directly relevant for fitness. This begs the question how they – and therefore the distinction between semantics and pragmatics – could have evolved in the first place in connection with innate signaling systems."
Reply: Certainly the effects of meanings – and in particular the different fitness values they give rise to – should be assessed in terms of actions they trigger. But from this it does not follow that meanings are cognitively represented in terms of actions (this might not always be cognitively possible if the actions involved are disjunctive or complex; and constraints on communication, involving for instance context dependency and multiple receivers, might also favor a less direct relation between signals and actions). So the net fitness effect of the meanings we posit together with the associated pragmatic principles will be evaluated in terms of actions; but this does not preclude the kind of modular approach we favor – at least if some evolutionary scenarios can explain how the various modules could have developed.
Objection 2. The Informativity Principle only has some 'bite' to the extent that one signal is strictly stronger than another. But on evolutionary grounds this scenario is unlikely to be stable in the first place.
To make things concrete, consider the case of A and B calls in Titi monkeys. According to our analysis, B is a general call and A is a 'serious non-ground threat' call. So when the A-call can be used, the B-call can be used as well. But now consider the pay-offs associated with the specific vs. the general information. It is highly likely that the specific information allows the receiver to obtain a higher pay-off, namely by adopting a raptor-appropriate escape strategy. Now consider the semantic associations in (13) (for simplicity we write 'raptor' in lieu of 'serious non-ground danger', but nothing hinges on this).
(13) a. Meaning 1: A → raptor; B → danger
b. Meaning 2: A → raptor; B → danger, non-raptor
A sender adhering to Meaning 1 would use A in a proportion (1-ε) of raptor situations, but also B in the remaining proportion ε of these raptor situations. This leads to a dilemma. (i) If ε = 0, the receiver strategy can be improved by always interpreting B as 'danger, non-raptor'. This means that in the end the sender and receiver are using Meaning 2, not Meaning 1. (ii) If ε ≠ 0, a mutant sender strategy that only uses A in raptor situations will presumably produce greater utility and thus come to replace the initial strategy (on the assumption that information exchange is cooperative). This will bring us back to case (i), and Meaning 2 will replace Meaning 1.
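The two horns of the dilemma can be checked numerically with a small sketch. The prior probability of raptor situations and the two utility values are hypothetical, as are the function names; the only substantive assumptions are those of the objection itself, namely that B is used in a share ε of raptor situations and in all other danger situations, and that the specific call yields a higher pay-off than the general one in raptor situations.

```python
from fractions import Fraction

def prob_raptor_given_B(eps, p_raptor):
    """Posterior probability of a raptor on hearing B, if B is used in a share `eps`
    of raptor situations and in all non-raptor danger situations (assumes p_raptor < 1)."""
    b_and_raptor = p_raptor * eps
    b_and_other = 1 - p_raptor
    return b_and_raptor / (b_and_raptor + b_and_other)

def payoff_in_raptor_situations(eps, u_specific, u_general):
    """Expected pay-off in raptor situations when A is used with probability 1 - eps."""
    return (1 - eps) * u_specific + eps * u_general

P_RAPTOR = Fraction(1, 4)        # hypothetical share of raptor situations among dangers
U_SPECIFIC, U_GENERAL = 10, 6    # hypothetical pay-offs: specific call beats general call

# Horn (i): with eps = 0, B is never heard in raptor situations.
print(prob_raptor_given_B(Fraction(0), P_RAPTOR))
# Horn (ii): with eps > 0, the eps = 0 mutant obtains a higher pay-off.
print(payoff_in_raptor_situations(Fraction(1, 5), U_SPECIFIC, U_GENERAL))
print(payoff_in_raptor_situations(Fraction(0), U_SPECIFIC, U_GENERAL))
```

Under these assumptions the pay-off in raptor situations decreases as ε grows, which is what drives the replacement of Meaning 1 by Meaning 2 in the objection.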

Reply:
A possible rejoinder is that this analysis assumes that all meanings are equally available. But the intuition behind our initial argument for the Informativity Principle is that certain meanings do not correspond to natural classes -notably, 'non-raptor' in the case at hand just might not be a cognitively available meaning (and more strikingly, in our 'real' analysis, the meaning corresponding to the negative concept 'non-[serious non-ground threat]' might not be available in the first place). From this perspective, the Informativity Principle might be the only way to approximate the non-raptor meaning, which in turn might provide the beginning of an explanation for the emergence of the Informativity Principle.

Objection 3.
A second possible objection is that what appears to be a general, underspecified meaning corresponds in fact to a highly specific action, namely of the kind that is appropriate for cases of ground predators as well as of general uncertainty about the nature of a threat. On this analysis, then, we should re-conceptualize the underspecified nature of the B call as a highly specific action-tied meaning, something like:
A → pay attention to a high danger
B → scan the environment
Reply: This alternative should be investigated. 11 But it should be noted that its predictions might be different from those of Schlenker et al. 2016b. In particular, the alternative would seem to predict constant responses to what we took to be general calls. 12 By contrast, the theories we propose predict that, when the general meaning is not strengthened by competition with other calls, reactions might be diverse and context-dependent (because the information conveyed by the call is unspecific). More research should consider these alternatives, both on an empirical level (to determine what the facts are) and on a theoretical level (to determine under what conditions underspecified calls can emerge – for instance in case the sender cannot directly provide information about the receiver's optimal reaction, either for lack of information or because there are several receivers).
These remarks only scratch the surface of this part of the debate. Two things should be clear. First, our analyses should be embedded within precise formal analyses of the evolution of monkey meanings. Second, the latter could be greatly enriched by taking into account the three-pronged strategy advocated in our work, which has a formal, a typological and an evolutionary component.