To Click or Not to Click? Deciding to Trust or Distrust Phishing Emails

While email traffic is growing around the world, questions often arise for recipients: to click or not to click? Should I trust or should I distrust? When interacting with computers or digital artefacts, individuals try to replicate interpersonal trust and distrust mechanisms in order to calibrate their trust. Such mechanisms rely on the ways individuals interpret and understand information. Technical information systems security solutions may reduce external and technical threats; yet the academic literature as well as industry professionals warn of the risks associated with insider threats, those coming from inside the organization and induced by legitimate users. This article focuses on phishing emails as an unintentional insider threat. After a literature review on interpretation and knowledge management, insider threats and security, and trust and distrust, we present the methodology and experimental protocol used to conduct a study with 250 participants and to understand the ways they interpret phishing emails and decide to trust or distrust them. We discuss the preliminary results of this study and outline future work and directions.


Introduction
Technical and externally centred Information Systems security solutions allow the prevention of intrusions [17], the detection of denial-of-service attacks [68], and the strengthening of firewalls [46]. Nevertheless, the academic literature as well as industry professionals consider that a predominant threat is neither technical nor external, but human and inside the organization [54,29,64,66]. Such an insider threat may be intentional or unintentional, malicious or non-malicious [35,65,5].
In fact, according to audit and advisory surveys such as [28], more than 33% of reported cyber-attacks between 2016 and 2018 used phishing, just behind those using malware (36%). The proportion of insiders among threats increased from 46% in 2016 to 52% in 2018. According to Cybersecurity Ventures, employees should be trained to recognize and react to phishing emails [43]. More than 90% of successful attacks rely on phishing, i.e. emails leading their recipients to interpret them and decide to trust them [21]. Recipients are invited, encouraged or requested to click on a link, open a document or forward information to someone they should not.
For authors such as [2], security concerns and actual behaviour are disconnected due to a "lack of comprehension". It may be difficult for users to understand security warnings [14], as well as to identify ongoing attacks [20]. In this paper we argue that such understanding difficulties may be studied by focusing on the trust and distrust elements users rely on when receiving an email. A logo, a date, a number or an email address: these elements and others are used to decide whether an email may be trusted or not.
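To make the notion of such elements concrete, the kinds of surface cues a recipient might rely on could be encoded as simple heuristics. The sketch below is purely illustrative and is not part of the study's protocol; the field names, cue list and wordlist are our own assumptions:

```python
def distrust_cues(email):
    """Illustrative heuristics over surface elements a recipient might
    rely on (sender address, links, wording). The field names and the
    cue list are assumptions for illustration, not the study's coding."""
    cues = []
    # A link served over plain HTTP rather than HTTPS
    if "http://" in email["body"]:
        cues.append("insecure_link")
    # Sender domain differs from the domain the email claims to come from
    if email["sender"].split("@")[-1] != email["claimed_domain"]:
        cues.append("sender_domain_mismatch")
    # Pressure or urgency wording
    if any(w in email["body"].lower() for w in ("urgent", "immediately", "within 24 hours")):
        cues.append("pressure")
    return cues

# Hypothetical phishing email exhibiting all three cues
sample = {
    "sender": "support@paypa1-secure.example",
    "claimed_domain": "paypal.com",
    "body": "URGENT: verify your account immediately at http://paypa1-secure.example/login",
}
print(distrust_cues(sample))  # → ['insecure_link', 'sender_domain_mismatch', 'pressure']
```

Such mechanical rules are of course far cruder than the interpretative processes discussed below; they only illustrate which artefacts of an email can carry trust or distrust.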
In the first section of this paper, background theory and assumptions are presented: first, the ways individuals interpret and understand information, relying on the knowledge management literature; second, insider threats and their different categories; third, trust and distrust, with a particular focus on individual psychological processes. In the second section, a study that involved 250 participants is presented: first, the methodology and experimental protocol; second, a discussion of the preliminary results; third, a presentation of future work. The overall purpose of this paper is to share observations, preliminary results, and future expectations on ways to prevent insider threats by identifying how we decide to trust or distrust phishing emails.

Background Theory and Assumptions
In this section, we first draw from the knowledge management literature to present the ways individuals interpret and understand information. Second, we discuss the importance of considering insider threats and their different categories. Third, we expose individual psychological processes leading to trust or distrust.

On Interpretation and Understanding
To describe the complex interpretation machinery, some authors talk about a "mental model" [23] or a "neural apparatus" [25], a place of chemical reactions that can be analysed. Others believe that interpretation involves above all the socio-individual [67], resulting from our history, a place for expressing a form of intellectual creativity specific to each person. We all act as interpretative agents, information processors interacting with the world that surrounds us through a filter. In fact, this indescribable filter through which we interact with the world may be called an "interpretative framework" [63].
Information is transmitted by talking, writing or acting during a sense-giving process. We collect data from this information by listening, reading or watching during a sense-reading process. Sense-giving and sense-reading processes are defined by [50] as follows: "Both the way we endow our own utterance with meaning and our attribution of meaning to the utterances of others are acts of tacit knowing. They represent sense-giving and sense-reading within the structure of tacit knowing" [50, p. 301]. Studying the processes of sense-giving and sense-reading, [63] highlighted the idea that knowledge is the result of an individual's interpretation of information.
Information is continuously created during sense-giving processes and interpreted during sense-reading processes. Individuals, as well as computers, are "information processing systems" [19, p. 9]. Knowledge can then be: made explicit, i.e. it has been made explicit by someone within a certain context (it is sense-given and socially constructed); or tacit, i.e. it has been interpreted by someone within a certain context (it is sense-read and individually constructed). Relying on [49]: "We can know more than we can tell".
Made explicit knowledge is thus tacit knowledge that has been made explicit by someone within a certain context. It is information, a source of tacit knowledge for someone else. It is "what we know and can tell", answering [49] quoted above. Every piece of information can be seen as a piece of knowledge that has been made explicit by someone within a certain context and with their own intentions.
When a person P1 structures his/her tacit knowledge and transmits it, he/she creates made explicit knowledge, i.e. information created from his/her tacit knowledge. A person P2 perceiving this information and absorbing it potentially creates new tacit knowledge for him/herself (see Figure 1). Knowledge is the result of an individual's interpretation of information. This interpretation is done through an interpretative framework that filters the data contained in the information, with the use of pre-existing tacit knowledge [63]. This interpretation leads to the creation of meaning that can vary from one individual to another: this is meaning variance [3,4]. This question of meaning variance is central in organizations, notably for deciding whether an email may be trusted or not in order to prevent insider threats.

On Insider Threats
At the beginning of the 1990s, the literature on information systems security had already affirmed that there was "a gap between the use of modern technology and the understanding of the security implications inherent in its use" [35, p. 173]. The massive arrival of microcomputers was also accompanied by questions regarding the security of interconnected systems where computer science was previously mainframe oriented.
Indeed, the number of technological artefacts has exploded, and this increase has gone hand in hand with the evolution of their various uses [9]. Yesterday, a terminal connected the user to the computer, while today entry points into the information system are multiple, universal, interconnected and increasingly discreet. Employees' social activity can be supported by social networks and their health maintained using connected watches.
The taxonomy of threats targeting the security of information systems proposed by [35], presented in Figure 2, is disturbingly topical with regard to the four dimensions that make up its angle of analysis: (1) sources, (2) perpetrators, (3) intent, and (4) consequences. It should be recognized that, independent of the sources, perpetrators, and intent of a threat, the consequences remain the same: disclosure (of profitable information), modification or destruction (of crucial information), or denial of service (by hindering access to resources). These consequences are covered by the ISO/IEC 27001:2013 standard on information security management, which defines information security management systems as ensuring the (1) confidentiality, (2) integrity and (3) availability of information [22].
A business's firewall constitutes a protection against external threats, which appear on the left branch in Figure 2. Authors such as [66] represent a part of the literature on information systems security that tends to pay attention to insider threats, more particularly those whose perpetrators are humans with the intention to cause harm (upper right branch in Figure 2). For authors such as [5], insider threats may be categorized along two dimensions: (1) whether the character of the threat is intentional or not, and (2) whether its character is malicious or not. From the point of view of the employee, who may constitute the entry point into the system, an insider threat can thus be intentional or unintentional, and malicious or not. The study presented in this article focuses on the manipulation and social engineering techniques that exploit unintentional insider threats. Even though the attacker is outside the system and the organization, he/she makes an employee, a component of the system, unintentionally facilitate his/her infiltration: the employee has, for example, clicked on a link or even opened the door to a self-proclaimed delivery person on a self-proclaimed errand. A social engineer is an attacker who targets a legitimate user from whom he/she obtains a direct (rights of access, harmful link visited, etc.) or indirect (vital information, relationship of trust, etc.) means to get into the system [42].
As new technological solutions are developed, the exploitation of hardware or software weaknesses becomes more and more difficult. Attackers then turn toward another component of the system susceptible to attack: the human one. For authors such as [56]: "Security is a process, not a product". For others such as [42, p. 14], breaching the human firewall is "easy", requiring no investment except for occasional telephone calls, and involves minimal risk. Every legitimate user thus constitutes an unintentional insider threat to the information system's security.
Individuals are not trained to be suspicious of others. Consequently, they constitute one of the strongest threats to the security of the information system, insofar as any well-prepared individual can win their trust.

On Trust and Distrust
Even if trust is recognized as particularly important in security issues of computer networking environments [26], very few studies on information systems deal with both trust and security [51]. Some authors focus on end-users' trustworthiness [1,60] and others on the design of trustworthy information systems [48,53,62].
In the human and social sciences, authors such as [37] and [47] consider that the psychological function of trust is to reduce perceived uncertainty, i.e. the perceived risk in complex decision-making situations. Trust induces a mental reduction of the field of possibilities, allowing a decision to be taken without considering the outcome of each possible alternative [33].
Some authors consider concepts such as interpersonal trust and organizational trust, between respectively two or more people [13,24,57]. Others consider systemic trust, toward institutions or organizations [37], and trust in technologies [30,39].
An overall definition of trust seems to be lacking in the literature. Relying on the taxonomy of [6], [44,45] defined trust as expectations: expectation of persistence of the natural and moral social orders, expectation of competence, and expectation of responsibility. [51, p. 116] proposed an operational definition of trust as a "state of expectations resulting from a mental reduction of the field of possibilities". Such a definition appears to be consistent with the concept of distrust, which is a "confident negative expectation regarding another's conduct" [32, p. 439]. Distrust is often presented as relying on [36] and his suggestion that those who choose not to trust "must adopt another negative strategy to reduce complexity" [27, p. 24]. In short, you trust when you have positive expectations, and you distrust when you have negative expectations. Distrust should not be confused with mistrust, which is "either a former trust destroyed, or former trust healed" [61, p. 27] and is not considered in the study presented in this article.
[10, p. 7] went deeper when they stated that "the quantitative dimensions of trust are based on the quantitative dimensions of its cognitive constituents". These constituents are the beliefs on which we rely to trust, and they may explain the contents of our expectations. Examples related to trusted humans are: benevolence, integrity, morality, credibility, motives, abilities, expertise [38,40]. Examples related to trusted technologies are: dependability, reliability, predictability, failure rates, false alarms, transparency, safety, performance [16,18,34,55].
An appropriate distrust fosters protective attitudes [32] and reduces insider threats. Nevertheless, authors such as [51, p. 118] consider that "trust and distrust are alive, they increase or decrease depending on how expectations are met (or unmet [...])". Initial trust is notably based on information from third parties, reputation, first impressions, and personal characteristics such as the disposition to trust [41]. Then, from facts and from an understanding of the trustee's characteristics, predictability and limits notably [31,47], the trustor calibrates his/her trust [45,11]. Trust is adjusted, meaning that expectations are adjusted.

Research Proposal and Experimental Protocol
In this section, we first present the methodology and experimental protocol we used to conduct a study with 250 participants and to understand the ways they interpret phishing emails and decide to trust or distrust them. Second, we discuss the preliminary results. Third, we outline future work and directions following this work-in-progress.

Description of the Study
The study was conducted with 250 students of Paris-Dauphine University. Half of them were Computer Science students and the other half Management Science students. Half of them were Bachelor's students and the other half Master's students. Participants were given course credits for participating in the study. In the following we refer to the students involved in the study as the "participants". The average age of participants was 20.2 years.
A research engineer scheduled the presence of participants in a room with 10 computers and copy-protection walls. For the first waves of answers, a member of the research team was there to explain the purpose of the study and answer questions. The research engineer then managed the response room for one month in order to collect data, and was available to answer questions participants might have.
A short video presented the study to participants, who were then given 20 emails, 8 of which were in English. They viewed each email one at a time on the computer screen and were asked to: (1) click on the areas leading them to trust it, (2) click on the areas leading them to distrust it, and (3) comment on their choices in a general remarks field. Finally, participants were asked to answer some profiling questions (age, academic level, etc.).
Participants arrived in the response room and gave informed consent to participate. Once they were installed, the researcher asked them if they had any questions. A short video gave them the instructions: "This animation will present the objective of this questionnaire, as well as the perspectives of the study. It will introduce the way you have to understand the questions in order to improve the usefulness of your answers for our investigation. Each question is composed of three parts: 1. click on the areas of the email that make you think that it is official; 2. click on the areas of the email that make you think that it is fraudulent; 3. comment in a few words. The results of the first part will allow us to identify elements of trust carried by the email, whereas the results of the second part will allow us to identify elements of distrust carried by the email. The free text box 'comments' allows you to explain your feeling each time. When you are ready, click 'start' below." Then participants completed the task, as described above.

Discussion of the Preliminary Results
Data were managed with Qualtrics, an online solution for reaction-time experiments (see [7]), notably in order to produce heatmaps [8] such as the one shown in Figure 3. General remarks fields were manually tagged by the research team and, when doubts of interpretation were encountered, participants were contacted to clarify their meaning.
In this article, we consider the subset of 8 English emails in order to restrict the amount of data to process. The collected data thus represent, for each participant and each email, trust and distrust areas, meaning (250 × 8) × 2 = 4 000 images of emails with clicked areas. These images have been aggregated by the Qualtrics solution into 8 × 2 = 16 heatmaps: 8 images of trust-leading areas and 8 images of distrust-leading areas. Figure 3 shows three examples of trust-leading and distrust-leading areas in emails. As outlined in the next section, the research team is now analysing such images, notably to find invariant elements used as trust or distrust givers by individuals, i.e. elements leading participants to decide to trust or to distrust.

We also have to consider that each participant could explain their choices for each email in a general remarks field. These data represent 250 × 8 = 2 000 open text fields explaining the choices. A preliminary analysis of the open text fields related to the 8 English emails highlights that participants first focused on elements leading them to distrust emails. Even though they were asked to select both trust and distrust areas in emails, only 7.6% of the participants mentioned both trust and distrust elements in their written explanations, the rest of them focusing only on distrust elements. This may be a bias of the study, probably induced by its material: phishing emails. The heatmaps generated by the Qualtrics solution allow us to partially bypass such a bias by analysing both trust-leading and distrust-leading areas.
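The aggregation step can be sketched independently of Qualtrics: each participant's clicks on an email image are (x, y) coordinates, and a heatmap is essentially a 2-D histogram of those coordinates. A minimal pure-Python sketch, where the grid size and the coordinates are arbitrary assumptions for illustration:

```python
def build_heatmap(clicks, width, height, cols=40, rows=60):
    """Bucket (x, y) click coordinates from all participants into a
    rows x cols density grid, the raw material of a heatmap."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in clicks:
        # Map pixel coordinates to a grid cell, clamping the edges
        c = min(int(x / width * cols), cols - 1)
        r = min(int(y / height * rows), rows - 1)
        grid[r][c] += 1
    return grid

# Three illustrative clicks on a 600 x 800 px email image,
# two of them close together (e.g. both on the sender's address).
clicks = [(12, 30), (14, 29), (300, 400)]
grid = build_heatmap(clicks, width=600, height=800)
print(sum(map(sum, grid)))  # → 3 (every click lands in exactly one cell)
```

Rendering the grid as colours then yields heatmaps of the kind shown in Figure 3, with hot cells where many participants clicked.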
Participants who mentioned trust elements in their written explanations listed the presence of privacy concerns (6%) or logos (1.6%) in the emails. 44.1% of the participants mentioned the sender's address as a distrust element in their written comments. 25% of the participants mentioned the presentation of the email, i.e. typeface, structure, and spelling, as a distrust element, and 14% of them mentioned pressure or urgency as a distrust element. Few of them (less than 2%) mentioned the presence of a link (particularly a non-HTTPS one) or an attachment to download, the occurrence of terms such as "secure", or the absence of a human contact or of personal data as distrust elements.
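Percentages of this kind follow from straightforward counting over the manually tagged comments. A hedged sketch of that computation, where the tag names are hypothetical stand-ins for the research team's actual coding scheme:

```python
from collections import Counter

def tag_shares(tagged_comments):
    """Percentage of comments mentioning each cue, given one set of
    manually assigned tags per free-text comment."""
    counts = Counter(tag for tags in tagged_comments for tag in set(tags))
    n = len(tagged_comments)
    return {tag: round(100 * count / n, 1) for tag, count in counts.items()}

# Four illustrative tagged comments (tag names are assumptions)
comments = [
    {"sender_address", "presentation"},
    {"sender_address"},
    {"pressure"},
    {"sender_address", "pressure"},
]
print(sorted(tag_shares(comments).items()))
# → [('presentation', 25.0), ('pressure', 50.0), ('sender_address', 75.0)]
```

Taking a set of tags per comment ensures a participant who mentions the same cue twice in one comment is counted once, matching the per-participant shares reported above.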

Presentation of Future Works
The results presented in this article are of course tied to the study sample: 250 participants, students in higher education whose average age is 20.2 years. The purpose of the study is not to cover all the trust and distrust mechanisms that individuals put in place when interacting with computers or digital artefacts. The study presented in this article is still in progress, and it aims to share observations, preliminary results, and future expectations on ways to prevent insider threats by identifying elements individuals rely on when deciding to trust or distrust phishing emails.
By understanding the ways individuals interpret, understand, and trust or distrust an email, this study intends to help prevent the manipulation techniques hackers can use to influence individuals' decision to trust. Authors such as [12, p. 92] stated that "in most of [the] studies no attempt was made to differentiate between the survey samples drawn from those who intentionally violate the procedures and policies and drawn from those who unintentionally violate them". The case of phishing emails is particularly interesting because existing behavioural countermeasures, such as improving awareness, installing a rule-oriented organizational culture, deterrence or neutralization mechanisms, show their limits with regard to sloppiness and ignorance [12].
The research team is currently going deeper into the analysis of the collected data, working in particular on the overall set and not only on the English emails. We aim to observe invariant elements used as trust or distrust givers by individuals. Such elements may be used by hackers, as well as by security teams in organizations, to adapt their training and future actions.
As explained in Section 2.2, this study focuses on unintentional insider threats, notably those caused by sloppiness or ignorance. In the future, we plan to tackle intentional insider threats, i.e. when individuals intentionally violate the information system's security policy.

Conclusions and Perspectives
In this article, we focus on a particular threat for information systems' security: the unintentional and insider threat represented by individuals receiving phishing emails. Such individuals may facilitate the infiltration of an attacker despite themselves, by deciding to trust a phishing email.
In the first section, we presented the ways individuals interpret information relying on the knowledge management literature, the landscape of insider threats and their specificities, and the trust and distrust mechanisms involved in complex decision-making situations. In the second section, we presented a study conducted with 250 participants in order to highlight trust-leading and distrust-leading areas in emails and to understand the trust and distrust elements used by participants when receiving phishing emails.
The decision to trust, as well as manipulation techniques, were involved in decision-making situations well before the introduction of computers. In this article we studied the ways individuals interpret information in order to understand how they decide to trust or distrust phishing emails. Ethical issues should not be neglected in such research, notably by considering the risk of dual use of research results, as stated by [52]. The reader has to be aware that the results of this research may be used maliciously to mislead recipients.
The study presented in this article is a work-in-progress and the research team is now going deeper by analysing the overall set of responses. In the near future, we plan to tackle another threat to information systems security: the intentional insider threat represented by individuals deciding to intentionally violate the information system's security policy. Such a study will be conducted in industrial settings due to its potential managerial causes and implications.