CAHIER DU LAMSADE 309

This work addresses the problem of the representation of spatial relationships between symbolic objects in images. We have studied the distribution of several categories of relationships in LabelMe 1, a public database of images where objects are annotated manually and online by users. Our objective is to build a cartography of the spatial relationships that can be encountered in a representative database of images with heterogeneous content, with the main aim of exploiting it in future applications of Content-Based Image Retrieval (CBIR), such as object recognition or retrieval. In this paper, we present the framework of the experiments conducted and give an overview of the main results obtained, as an introduction to the website dedicated to this work, whose ambition is to make all these statistics available to the CBIR community.


Introduction
We are interested in the representation of spatial relationships between symbolic objects in images. In CBIR, embedding such information into image content descriptions provides a better representation of the content as well as new interrogation scenarios. The literature on spatial relationships is very rich (several hundred papers exist on this topic) and many approaches have been proposed (see for example the survey [4]). Most of them describe different aspects of spatial relationships, e.g. directional [7] or topological [3] relationships, and have been evaluated on small synthetic or specific image datasets, e.g. medical or satellite imagery. In this work, we propose to build a cartography of the spatial relationships that can be encountered in a database of images with heterogeneous natural contents, such as audiovisual, web or family visual contents. We have chosen a public annotated database, from the platform LabelMe 1, which is described in Section 2. This cartography collects statistical information on the trends of spatial relationships involving symbolic objects effectively encountered in this database, with the aim of exploiting them in future CBIR applications, to improve tasks such as object recognition or retrieval. Here, we focus on the analysis of unary, binary, and ternary relationships. We present the results of this analysis, which are made available to the CBIR community on our website 2.
This report is organized as follows: in Section 2, we introduce the LabelMe image database used in our work and the object categories extracted from it. Sections 3, 4, and 5 are respectively dedicated to the statistical studies on unary, binary and ternary relationships. Finally, Section 6 concludes the report.
2 Annotated image database

Studied database
LabelMe [12] is a platform containing image databases and an online annotation tool that allows users to freely annotate, by drawing a polygon and giving a label, as many of the objects depicted in an image as they wish. Thus, each object, called an entity in this work, is represented by a polygon and a label. In our work, each label is considered as the name of an entity category, so all entities possessing the same label belong to the same category. We used one of the test databases of this platform, which contains 1133 images annotated in daily contexts (see examples in Fig.1 and Fig.2). The content of these images is very heterogeneous: it covers many categories and many images, and it is not specific to a particular domain. Therefore, studying this database can provide a general view of categories and their relationships, and the results should not be influenced noticeably by changing the database.
In order to guarantee the quality of the database, we carefully verified each annotated image for consistency:
- Firstly, we manually consolidated synonymous labels by correcting orthographic mistakes and merging labels having the same meaning.
- Secondly, we identified and selected 86 different categories, taking into account only those having at least 15 occurrences. This decision was taken to ensure the independence of the statistical results even if the image database is changed. These 86 categories are listed in Table 1, ordered by category label, and in Table 2, ordered by category id.
- Lastly, we added missing annotations to entities of the considered categories, except for entities of too small a size or entities belonging to a category having a high frequency of already annotated entities in the image, such as "leaf", "window", "flower", etc. In this way, the statistical results should not be biased by these missing annotations.
In the rest of the paper, we call this database DB. We can now ensure that the set of entities annotated in DB contains all the interesting entities that attract human visual attention. Thus, this new annotated image database has a higher quality than the original one. Before beginning this work, we formulated two different hypotheses:
- The set of entities annotated in DB contains all the interesting objects that the photographer wants to present.
- The entities annotated are the ones attracting most of the visual attention of LabelMe's annotators, and contain a subset of the interesting objects that the photographer wants to present.
2. Our website: http://www.lamsade.dauphine.fr/~hoang/www/cartography.
Sometimes, the viewpoint of a photographer is different from the public's one, which means that the subject annotated can differ from the photographer's intention. Consequently, the statistical results depend on the annotations of LabelMe's users. With the original database, the second hypothesis can represent a useful dataset for a study on human visual attention. After verification and consolidation, we consider that the first hypothesis is verified with DB.
(Fig.1 and Fig.2, example annotations: sky, tree, person, lake, ground; road, car, building, window; sky, tree, mountain, ground.)

Statistics on categories
Before studying the different relationships between categories, we take a look at statistics concerning each category, for example its highest and lowest numbers of entities in an image, the total number of its entities in DB, the number of images where at least one of its entities appears, etc. This statistical study is presented in Table 3. An overview of these statistics is presented in Table 4.
From Table 3, we can see that the average number of entities of each category in an image can be used to get a quick view of the possibility of having more than one of its entities in an image. For example, category 1 (window) has a high number of occurrences in DB and its average is around 19 entities per image. That means that, if we find a window in an image, we can expect to find another window in the same image. Category 82 (block) also has a considerable average, around 10.25 entities per image. Meanwhile, some categories, like lake or sun, do not have more than one entity per image. Indeed, it is not common to have two lake entities in an image, and it is evident that there is only one sun in the sky. Note that, because of its low number of occurrences in DB, the category sun was not taken into account.
Interpretation with averages can quickly provide general information on categories, but we can do better. For a more detailed study, we have computed the intra-class correlation of categories, based on the classic correlation function between two variables x and y:

corr(x, y) = Σ_{i=1..N} (x_i − x̄)(y_i − ȳ) / sqrt( Σ_{i=1..N} (x_i − x̄)² × Σ_{i=1..N} (y_i − ȳ)² )    (1)

where N is the number of images in DB and x̄ and ȳ are the average values of x and y over DB. For a category C_j, every first entity found in an image is considered as variable x, and any additional entity as variable y. In image I_i, if there is only one entity of C_j, then x_i = 1 and y_i = 0. If there are two entities or more, then x_i = 1 and y_i = 1. Otherwise, x_i = 0 and y_i = 0.

Table 2 – The 86 categories and their ids.

| Category label | ID | Category label | ID | Category label | ID |
| window | 01 | headlight | 30 | crosswalk | 59 |
| car | 02 | handrail | 31 | awning | 60 |
| fence | 03 | wind-shield | 32 | bus | 61 |
| tree | 04 | lamp | 33 | clock | 62 |
| building | 05 | column | 34 | head | 63 |
| sidewalk | 06 | pane | 35 | torso | 64 |
| license plate | 07 | person | 36 | arm | 65 |
| chimney | 08 | van | 37 | cone | 66 |
| road | 09 | truck | 38 | table | 67 |
| sky | 10 | light | 39 | umbrella | 68 |
| wall | 11 | pot | 40 | sand | 69 |
| sign | 12 | box | 41 | boat | 70 |
| plant | 13 | pipe | 42 | sea | 71 |
| roof | 14 | grass | 43 | water | 72 |
| pole | 15 | balcony | 44 | attic | 73 |
| street-light | 16 | fire escape | 45 | lake | 74 |
| ground | 17 | fire-hydrant | 46 | duck | 75 |
| mirror | 18 | flag | 47 | bird | 76 |
| tail light | 19 | bench | 48 | motorbike | 77 |
| wire | 20 | text | 49 | billboard | 78 |
| air conditioning | 21 | chair | 50 | mast | 79 |
| rock | 22 | parking-meter | 51 | cloud | 80 |
| manhole | 23 | curb | 52 | sculpture | 81 |
| stair | 24 | traffic light | 53 | block | 82 |
| door | 25 | mailbox | 54 | leaf | 83 |
| wheel | 26 | flower | 55 | mountain | 84 |
| blind | 27 | path | 56 | field | 85 |
| railing | 28 | poster | 57 | hat | 86 |
| grille | 29 | bicycle | 58 | | |
Slightly differently from the classic correlation between two categories, which represents the impact of one category's appearance on another, the intra-class correlation is never negative. Returning to the previous examples, we obtained 0.776 for the intra-class correlation of window, which is also the highest score among the intra-class correlations obtained. This score is high enough to conclude that we can mostly find at least two windows in an image where a window entity has already been detected. The lowest score in this study is 0, related to the lake category. Therefore, no image in DB contains more than one lake. In fact, it is not usual to have two or more instances of lake in the same image. Summarizing, 21 categories have an intra-class correlation higher than 0.3, while only 8 categories have a score higher than 0.5, for example car, window, building (see the histograms of intra-class correlation in Fig.3 and, for more details, Tab.16 in Annex A.1). This study can provide useful information for the category detection process if we want, for example, to detect all entities of a category C_i present in an image I. Knowing that C_i has, in general, one entity per image (based on a threshold on the correlation, for example), as soon as the first entity of C_i is detected, we could finish the detection process, thus significantly reducing the execution time of the detection. The statistics for all categories are available on our website 2.
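As an illustration, the intra-class correlation described above can be computed from the per-image entity counts of a category. The sketch below is a minimal Python version of this computation; the function name and the input format (a list of counts, one entry per image of DB) are our own choices, not part of the original implementation.

```python
import math

def intra_class_correlation(counts):
    """Intra-class correlation of a category from its per-image entity counts.

    For each image i: x_i = 1 if at least one entity is present, else 0;
    y_i = 1 if at least two entities are present, else 0.
    The classic correlation (equation 1) is then applied to x and y.
    """
    n = len(counts)
    x = [1 if c >= 1 else 0 for c in counts]
    y = [1 if c >= 2 else 0 for c in counts]
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    if vx == 0 or vy == 0:
        return 0.0  # e.g. "lake": never more than one entity per image
    return cov / math.sqrt(vx * vy)
```

With counts such as [3, 2, 0, 4] (an entity pair whenever the category is present) the score is 1.0, while counts that never exceed one entity per image yield 0, as observed for lake.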
In the next sections, we present and discuss the statistical results on three different types of relationships: unary, binary and ternary relationships.

Unary relationships

Representation
We call unary relationship the relationship between an entity and its localization in an image, where the localization is defined as a region or area of the image, represented in this work by a code. More formally, let A = {A_i}, I = {I_j}, and C = {C_k} be the set of areas, the set of images, and the set of categories, respectively. The unary relationship is a mapping R from C × I to A. R(C_k, I_j) ∈ A indicates where C_k is located in I_j.
Areas of an image can be represented in different ways, like quad-trees or quin-trees, see for example [11,13]. Since we do not have any a priori knowledge of the location of the categories in the images, we propose to split images into a fixed number of regular areas (i.e. areas of equal size). First, we divide each image with a fixed-size grid. Each cell of this grid, called an atomic area, is represented by a code. Fig.4 and 5 depict splittings into 9 and 16 different basic areas and their codes, respectively. We then combine these codes to represent more complex areas; for example, in the 9-area splitting, code 009 represents the area grouping together areas 001 and 008.
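The codes of atomic and complex areas behave like bitmasks. The following sketch assumes, as the figures suggest, a row-major numbering of the grid cells with codes that are powers of two; under that assumption, complex-area codes such as 009 are simple bitwise combinations:

```python
# Atomic areas of the 9-area splitting, numbered row by row, are coded by
# powers of two: cell k (k = 0..8) gets code 2**k (001, 002, 004, ..., 256).
# A complex area is the bitwise OR (i.e. the sum) of its atomic codes.

def atomic_code(row, col, ncols=3):
    """Code of the atomic area at (row, col) in an ncols-wide grid."""
    return 1 << (row * ncols + col)

def area_code(cells, ncols=3):
    """Code of the complex area grouping the given (row, col) atomic cells."""
    code = 0
    for r, c in cells:
        code |= atomic_code(r, c, ncols)
    return code
```

With this convention, area_code([(0, 0), (1, 0)]) reproduces the example above (001 + 008 = 009), and the center cell of the 9-area splitting gets code 016.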

Results analysis
The combination of the nine 9-area splitting codes (Fig.4) gives 511 possible atomic/complex area codes. However, some codes cannot be used, for example code 017 or code 161, because their atomic areas are not connected by an edge (i.e. they are disjoint): it is impossible for an entity to occupy a location of this kind. Consequently, there are only 218 theoretically authorized codes (see the recursive algorithm creating the theoretically authorized codes from a set of atomic codes in equation 16 of Annex B.1). Concretely, in DB, we did not find any entity in areas represented by impossible codes. Moreover, there are only 138 useful theoretically authorized codes, meaning that 80 codes are not encountered in DB. For example, DB does not contain any entity in the areas with codes 47 or 125. In the same way, the combination of the 16 codes of Fig.5 gives 65535 different codes. In theory, we can reach 11506 atomic/complex areas (based on connected areas), but in DB, only 649 codes are present. An overview of the codes present in DB for each type of splitting is given in Fig.6 and 7. For each type of splitting, we can easily retrieve the number of occurrences or the distribution of categories with respect to their codes (see Fig.6 and 8 for the 9-area splitting, Fig.7 and 9 for the 16-area splitting). Some codes having the highest or lowest numbers of occurrences are reported in Table 5. In Annex B.1, Fig.34 and 33 illustrate the distribution of categories according to two 9-area codes: 16 (the code having the highest frequency) and 128 (the code concerning the highest number of categories). Fig.36 and 35 represent such distributions in the 16-area splitting for codes 1024 and 16384. This provides interesting information for interpreting the trend of categories' locations in images.
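The count of 218 theoretically authorized codes can be reproduced by brute force: enumerate every non-empty subset of grid cells and keep those whose cells are edge-connected. A possible sketch (our own implementation, not the recursive algorithm of Annex B.1):

```python
from itertools import combinations

def connected(cells):
    """Depth-first search over the 4-neighbourhood adjacency of the cells."""
    start = next(iter(cells))
    seen, stack = {start}, [start]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (nr, nc) in cells and (nr, nc) not in seen:
                seen.add((nr, nc))
                stack.append((nr, nc))
    return len(seen) == len(cells)

def authorized_codes(nrows, ncols):
    """Codes whose atomic areas form an edge-connected region."""
    cells = [(r, c) for r in range(nrows) for c in range(ncols)]
    codes = []
    for size in range(1, len(cells) + 1):
        for subset in combinations(cells, size):
            if connected(set(subset)):
                codes.append(sum(1 << (r * ncols + c) for r, c in subset))
    return codes
```

len(authorized_codes(3, 3)) returns 218, matching the figure above; the same function applied to the 4×4 grid enumerates the authorized codes of the 16-area splitting.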

Interpretation
From Fig.6 and 7, we can observe, on the one hand, that large or complex regions have a small number of occurrences. That means that object categories are mostly represented by a simple and small area. On the other hand, the trend of the categories' presence is first on the middle line, then on the second line, and finally on a combination of the second and the third lines. In fact, it is not usual to present an interesting object only on the bottom line, and in practice this line does not attract much visual attention either. Similarly, we can observe that the trend of the categories' presence is higher on the left than on the right. These conclusions confirm well-known rules concerning photography and ergonomics (human-computer interaction):
- In photography, there is the rule of thirds 3, one of the first rules of composition taught to most photography students. An image is cut by two horizontal lines and two vertical lines, and it is recommended to place interesting objects at the intersections or along the lines of this rule (see Fig.10).
- According to [8,10], concerning ergonomic studies on human-computer interaction, the center of the computer screen is the most attractive area. Then, human visual attention is attracted more by the top and the left of the screen than by the bottom and the right, respectively, leading to slightly more annotated entities in these areas.
We have studied the distribution of categories across areas of the image, according to the 9-area and 16-area splittings. Basically, the results obtained can be encapsulated in a knowledge-based system where they will be interpreted as a probability of presence of a given category in a given area. For example, with the 9-area splitting, chimney and sky appear more frequently at the top of the image, with probabilities 0.72 and 0.81 respectively; see their respective distributions in Fig.11 and 12. In an object detection task, for example, these measures can help in determining priority search areas, and then in reducing the search space of the objects. They are available for all categories on our website 2.
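In such a knowledge-based system, the probability of presence can simply be estimated from the observed area codes of a category's entities. A minimal sketch, with hypothetical data:

```python
from collections import Counter

def area_presence_probabilities(codes):
    """Probability of presence of a category in each observed area code,
    estimated from the list of area codes of its entities in DB."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}

# Hypothetical example: a category observed 4 times in area 7 (top row)
# and once in area 16 (center cell) of the 9-area splitting.
probs = area_presence_probabilities([7, 7, 7, 7, 16])
```

The resulting dictionary (here {7: 0.8, 16: 0.2}) is exactly the per-area presence probability used above for chimney and sky.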

Spatial reasoning
A natural question also arises: "Could a category be frequently and entirely present in a given area?" Answering it could help us find an efficient method for detecting a category in an image. This idea drives us to examine the distribution of occurrences of each category C_j over the theoretically possible areas of the image, by means of a normalized histogram H_{C_j}, where H_{C_j}(c) is the frequency of the occurrences of C_j whose location is represented by code c. When a category is integrally in an area A_i of the splitting, it can also appear in a smaller theoretically authorized area A_k included in A_i. Let SC_split(A_i) = { cod(A_k) | A_k theoretically authorized and A_k ⊆ A_i }, where cod(A_k) is the code representing area A_k. A category C_j, whose instances appear entirely in A_i, has a specific histogram in which only codes c ∈ SC_split(A_i) have non-null frequencies. Then, to perform spatial reasoning on such histograms, we propose a function F_H such that:

F_H(C_j, A_i) = 1 − H_{C_j} · G_{A_i}

where · is the dot product and G_{A_i} is a 1D template mask, of size the number of theoretical codes of the splitting method, whose value is 1 for the codes belonging to SC_split(A_i) and 0 otherwise. F_H takes values in [0..1]: F_H = 0 means that all non-null frequencies correspond to codes of SC_split(A_i), and then that category C_j is always entirely in area A_i; if F_H = 1, we can say that C_j is never entirely in A_i. The higher F_H, the less often C_j appears entirely in A_i. We present the categories with the highest/lowest F_H in Tab.6 for the 9-area splitting and in Tab.7 for the 16-area splitting. From F_H, we can deduce the probability p_a(A_i) = 1 − F_H(C_j, A_i) of the entire presence of C_j in A_i. More generally, if we examine the presence of C_j in n disjoint areas A_i, the probability becomes p_a({A_i}_n) = Σ_{i=1..n} (1 − F_H(C_j, A_i)). For category person, for example, the F_H values for the three column areas of the 9-area splitting are respectively 0.704, 0.644 and 0.721, which gives p_a({A_i}_3) = 0.931. This result means that the probability for category person to be entirely in one column is high, and that its presence across at least two columns is very rare.
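The function F_H and the derived probability p_a can be implemented directly from a category's normalized code histogram and the binary mask of an area. A sketch, assuming both are arrays with one entry per theoretical code:

```python
def f_h(histogram, mask):
    """F_H(C_j, A_i) = 1 - H_Cj . G_Ai, the dot product of the normalized
    code histogram of C_j with the binary mask of the codes included in A_i."""
    return 1.0 - sum(h * g for h, g in zip(histogram, mask))

def p_a(fh_values):
    """Probability of entire presence in n disjoint areas:
    p_a({A_i}_n) = sum_i (1 - F_H(C_j, A_i))."""
    return sum(1.0 - fh for fh in fh_values)

# Values reported above for category "person" and the three columns:
# p_a([0.704, 0.644, 0.721]) gives 0.931.
```

A histogram entirely supported by the mask gives F_H = 0 (the category is always entirely in the area), and the person example reproduces the probability 0.931 quoted above.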
Consequently, we can say that in DB, person entities are present vertically most of the time, and that they rarely appear at scales larger than one column. These statistics can help in designing a person detection task for future applications. Similar spatial reasoning can be done with other categories and other areas. For each category, it is possible to conduct a study exhibiting a specific size, shape and area(s) for a search/detection process. For example, we examined the center area of the images in DB with the 16-area splitting. From Tab.8, we saw that five categories have a frequency higher than 50%: mirror, tail light (of car), hat, mailbox, head (of person). If we want to search for these categories in images, we could begin the process with the center area.

Binary relationships
A binary relationship links two entities of distinct categories together in an image. It can be a co-occurrence or a spatial relationship. From the 86 categories of the database used, there are 3655 possible binary relationships between categories. Among them, we first observed that 879 couples of categories never occur together. For more details, the reader can consult Fig.13, which presents a map of category co-occurrences, and Fig.14. Before studying spatial relationships, we examine co-occurrence relationships.

Co-occurrence relationships
To begin, we give an example. In DB, window appears in 677 images and car in 519 images. This couple of categories appears together in 480 images. We can thus conclude that their co-occurrence relationship is quite remarkable: for instance, 92% of the images containing a car also contain a window. This rate corresponds to a conditional probability, denoted P(window|car). Fig.15 gives an idea of the distribution of the number of images where each category couple appears. Couples having a high number of occurrences are listed in Tab.9. Additionally, we can compute their correlation to learn more about the co-occurrence of such couples. Hence, these measures can help to better understand which category's presence leads to the presence or absence of another category. Statistics for some couples can be found in Tab.9.
The correlation score summarizes in one value the joint presence or absence of two categories, and especially the strength of this knowledge. We can apply the formula of equation 1: variable x_i indicates the presence of at least one instance of category C_j in image I_i (x_i = 1 if this condition is satisfied, otherwise x_i = 0), variable y_i concerns another category C_k in the same way, and x̄ and ȳ are their average occurrence numbers in the database. Hence, if a couple's correlation is negative, this couple is rarely present in the same image. The highest score obtained is 0.984 for torso-arm; in fact, only 3 couples have a correlation higher than 0.8 (the distribution of these correlations is displayed in Fig.37 of Annex C.1, and some obvious scores can be found in Tab.18 of the same Annex). The lowest score obtained is −0.297, for the couple building-bird (see Table 9). Hence, no couple in the database shows a strong negative correlation.

We also examined the conditional probabilities of couples of categories (A, B). For example, P(building|sidewalk) is very high (see Table 9). That means that, when detecting a sidewalk, we can expect to find a building in the same image. Such a relationship could be integrated with benefit in a knowledge-based system dedicated to artificial vision. Indeed, sidewalks are easy to detect because of their specific and universal visual appearance, while the variability of buildings makes them harder to detect. The prior detection of a sidewalk would then contribute to facilitating the detection of a building by reducing the number of images to process. This reasoning can be generalized to other couples of categories (see Table 9), making it possible to replace the detection step of one category by the detection step of another, easier one, to find images of that category. All these measures are available on the website 2 of this work. Their distribution is displayed in Fig.38 of Annex C.1.

Table 9 – Couples of categories having either a highest number of occurrences, a highest conditional probability or a highest correlation.
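These conditional probabilities come directly from image-level co-occurrence counts. A minimal sketch (per-image category sets as input; the function name is our own):

```python
def conditional_probabilities(images, a, b):
    """Return (P(a|b), P(b|a)) estimated from a list of per-image
    category sets: P(a|b) = #images with both / #images with b."""
    n_a = sum(1 for cats in images if a in cats)
    n_b = sum(1 for cats in images if b in cats)
    n_ab = sum(1 for cats in images if a in cats and b in cats)
    return n_ab / n_b, n_ab / n_a
```

With the counts reported above for window and car (480 joint images out of 519 car images), the first returned value would be 480/519 ≈ 0.92.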

Binary spatial relationship
In recent years, many approaches have been proposed for representing binary spatial relationships. They can be classified as topological, directional or distance-based approaches (see [4] for more details), and can be applied to symbolic objects or low-level features. Here, we have focused on relationships between the entities of the database described in terms of directional relationships with the 9DSpa approach [7], of topological relationships [3], and of a combination of them with 2D projections [9]. We do not use the orthogonal [2] and 9DLT [1] relationships because of their drawbacks mentioned in [7]. The details of each approach are explained in the following sections.

9DSpa relationships
9DSpa describes the directional relationships between a reference entity and another one based on the combination of 9 codes associated with areas orthogonally built around the MBR (Minimum Bounding Rectangle) of the reference entity. To complete this description, the authors take topological relationships into account. Because we want to study topological relationships separately, we examined only the directional part of this approach. In the original 9DSpa approach, the description of the code uses only 8 bits, the center (i.e. the MBR of the reference object) being coded by 0. With this type of code, we cannot tell whether the second entity of a couple overlaps the MBR of the reference one (see the example in Figure 16). Therefore, we use a new description based on 9 bits to recognize the intersection between two entities (see Table 10).
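The 9-bit code can be sketched as follows: extending the sides of the reference MBR splits the plane into a 3×3 layout, and one bit is set for each region the other entity intersects. The row-major bit assignment below (with bit 4 for the reference MBR itself, the region the original 8-bit 9DSpa coded by 0) is our own convention for illustration, not necessarily that of [7]:

```python
def ninedspa_code(ref, other):
    """9-bit directional code of `other` relative to the MBR of `ref`.
    Rectangles are (xmin, ymin, xmax, ymax). Bit k is set when `other`
    intersects region k of the 3x3 layout built around ref's MBR
    (row-major order; bit 4 = the reference MBR itself)."""
    rx1, ry1, rx2, ry2 = ref
    ox1, oy1, ox2, oy2 = other
    xs = [float("-inf"), rx1, rx2, float("inf")]
    ys = [float("-inf"), ry1, ry2, float("inf")]
    code = 0
    for i in range(3):       # rows (y bands)
        for j in range(3):   # columns (x bands)
            # open-interval intersection test between `other` and band (i, j)
            if (min(ox2, xs[j + 1]) > max(ox1, xs[j]) and
                    min(oy2, ys[i + 1]) > max(oy1, ys[i])):
                code |= 1 << (i * 3 + j)
    return code
```

The center bit is exactly what the 9-bit extension adds: an entity overlapping the reference MBR sets bit 4, which the 8-bit code could not express.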
Firstly, we present an overview of the 9DSpa codes that can be encountered for each category in Fig.17. The 9DSpa approach gives 511 possible codes, but several codes are never used and are not associated with any category. In fact, similarly to the 9-area splitting, with the 9DSpa approach we can build only 218 theoretically authorized codes. In DB, we found 206 codes among these theoretical ones. Interpreting Fig.17 horizontally, we see that a category C_j can be associated with only some 9DSpa codes. This information can be usefully integrated into a knowledge base dedicated to artificial vision. For example, in an image where an instance of category C_j was detected, suppose that another category C_z can appear and that we would like to localize it. We can then give priority to the search areas around C_j associated with the codes relevant to C_j, which can considerably reduce the search time. Interpreting Fig.17 vertically, we observe that the most frequent codes are 004, 016, 064 and 256, with respective probabilities 14%, 13%, 14%, and 13% (see the distribution of 9DSpa codes in Fig.18). Furthermore, we can use the probability of each 9DSpa code for each couple of categories. Some examples of these probabilities are listed in Tab.19 of Annex C.2.
We now examine some particular examples. For the reference category chimney, the 9DSpa relationships with the other categories are summarized in Fig.19 and 40(b) of Annex C.2. In accordance with reality, the statistics show that chimney is usually above the other categories. Some other examples concerning roof, car and road are also given in Fig.40 of Annex C.2. Moreover, we can study the 9DSpa relationship between chimney and a particular category, for example roof (see Figure 20). This couple obtains its three best probabilities of presence, 0.10, 0.14 and 0.17, in the three respective areas shown there. These results can provide an advantage by limiting the search area for a target entity when the location of the reference one is known. During an object detection and localization task, this knowledge makes it possible to constrain the search for the target object to priority search areas in the image and to the corresponding object size, given a reference object. All the associated statistics are available on the website 2 of this work.

Topological relationships

The description of topological relationships provides eight types of relationships (represented in Table 11). We remark that, in DB, "equal", "cover", and "coverby" do not appear (see Figure 22). "Disjoint" is very frequent, with a frequency of more than 94%. The second position is for "overlap", with a frequency of around 2.8%. "Contain" and "inside" are present in only 0.7% and 1.1% of the cases, respectively. The "meet" relationship is barely represented (0.3%): its number of occurrences is small because the notion of strict adjacency between high-level objects is not common in natural contents such as those of the database, and because of the manual annotation. Meanwhile, in the literature, "meet" is a popular relationship often used with image analysis techniques such as region segmentation, which generates adjacent regions by definition, with application to specific domains, e.g. satellite imagery.
The distribution of topological relationships clearly differs between couples of categories, so it can be useful in certain cases, for example in object localization. In Fig.23(a), we observe that a car mostly appears inside or overlapping the regions occupied by a road. Hence, to search for a car in a given image, it is possible to begin with the region of a road if the latter has already been located. Meanwhile, the "disjoint" information for the couple table-chair (see Fig.23(b)) does not provide any profitable information, and could even complicate the search for a chair based on the presence of a table, whose size in an image is usually small.
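For MBRs, these eight relations can be decided with coordinate comparisons. A simplified sketch over axis-aligned rectangles (the definitions of [3] apply to general regions, so this is only an MBR approximation written for illustration):

```python
def topological_relation(a, b):
    """One of the eight topological relations between two axis-aligned
    rectangles (xmin, ymin, xmax, ymax): disjoint, meet, overlap,
    contain, inside, cover, coverby, equal."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disjoint"
    if ax2 == bx1 or bx2 == ax1 or ay2 == by1 or by2 == ay1:
        return "meet"                      # boundaries touch, no interior overlap
    if (ax1, ay1, ax2, ay2) == (bx1, by1, bx2, by2):
        return "equal"
    if bx1 <= ax1 and by1 <= ay1 and ax2 <= bx2 and ay2 <= by2:
        # a lies within b: "coverby" if it touches b's boundary, else "inside"
        touches = ax1 == bx1 or ay1 == by1 or ax2 == bx2 or ay2 == by2
        return "coverby" if touches else "inside"
    if ax1 <= bx1 and ay1 <= by1 and bx2 <= ax2 and by2 <= ay2:
        touches = bx1 == ax1 or by1 == ay1 or bx2 == ax2 or by2 == ay2
        return "cover" if touches else "contain"
    return "overlap"
```

Applied to all entity pairs of DB, such a classifier yields the frequencies discussed above (e.g. the overwhelming share of "disjoint").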

More generally, we can say that statistics on topological relationships do not provide very discriminative information. With this approach, it is difficult to get a typical interpretation or conclusion for a couple of categories, except for some special categories like road and car. However, these statistical results can be used as supplementary information for other approaches.

2D projection relationships
Like the topological approach, the 2D projection approach is one of the basic approaches in the image domain. It associates 7 basic operators plus 6 symmetric ones (denoted by adding the symbol "*" to the basic ones, see Tab.12) to each image axis, leading to 169 possible 2D relationships between the MBRs of entities.

Table 12 – Codes of the 2D projection approach [9].
In the same way as in the previous sections, we can study the co-occurrences between 2D operators and categories (see Figure 24 for the x axis and Figure 25 for the y axis) and the frequency of occurrence of each 2D operator (see Figure 26(a) for the x axis and Figure 26(b) for the y axis). A concrete example is represented in Fig.27. We observe that the 1D relationships |, |*, ], ]*, [, [* and = are not present at all on the x or y axes. This result confirms that the adjacency relationship is not noticeable in DB, and it also shows that 2D projections do not describe this relationship well, since they are not able to detect it here. Operators < and <* are the most frequent. This partially confirms the high frequency of the "disjoint" relationship in the topological approach and, moreover, of the corresponding areas in 9DSpa: operator < on the x axis corresponds to some of the 9DSpa areas, so the intersection of the frequencies of < and <* on the x and y axes partially explains the frequency of the 9DSpa codes. Table 13 presents a summary of the statistics obtained with DB for the three representations of spatial relationships studied.
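The 13 operators per axis correspond to Allen-style interval relations between the projections of the MBRs. A sketch with generic relation names (only the mapping of '<' to "before" is stated in the text; the other symbol names are our assumption, and the paper denotes each symmetric relation by adding '*'):

```python
def interval_relation(a, b):
    """Allen-style relation between intervals a=(a1,a2) and b=(b1,b2) on one
    axis. The 2D projection relation of two MBRs is the pair of the x-axis
    and y-axis relations (13 x 13 = 169 combinations)."""
    a1, a2 = a
    b1, b2 = b
    if a2 < b1: return "before"        # '<' in the paper's notation
    if b2 < a1: return "before*"
    if a2 == b1: return "meets"
    if b2 == a1: return "meets*"
    if (a1, a2) == (b1, b2): return "equals"
    if a1 == b1: return "starts" if a2 < b2 else "starts*"
    if a2 == b2: return "finishes" if a1 > b1 else "finishes*"
    if b1 < a1 and a2 < b2: return "during"
    if a1 < b1 and b2 < a2: return "during*"
    return "overlaps" if a1 < b1 else "overlaps*"
```

The absence of the adjacency operators in DB means that interval pairs classified as "meets" (intervals sharing exactly one endpoint) simply never occur between MBR projections.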

Table 13 – Summary of the statistics obtained with DB for the three representations of spatial relationships studied.

| Approach | Nb of possible relationships | Nb of effective relationships | Relationships with best occurrences (and frequency in %) |
| 9DSpa | 511 | 206 | four directional codes (14%, 13%, 14%, 13%) |
| Topological rel. | 8 | 5 | disjoint (94%), overlap (2.8%) |
| 2D projections | 169 | – | combinations of < and <* |

This summary leads to the first conclusions that the digital codes of these relationships could be optimized, and that indexing them would benefit more from data-driven than from space-driven indexes. Moreover, among these three approaches, we think that 9DSpa is the one providing the most relevant statistical knowledge for future interpretations. In particular, it is possible to deduce from it the probability of presence of a given entity in an area having a given directional relationship with a reference entity, as well as an indication of its size, which can constrain the search for a target object to priority areas and sizes during detection and localization tasks.

Ternary relationships
A ternary relationship describes a relationship between a triplet of categories. As for binary relationships, we examined co-occurrence and spatial relationships.

Co-occurrence relationships
We continued by examining co-occurrence relationships for triplets of categories. We found 38031 triplets present in total, knowing that there are C(86,3) = 102340 possible triplets where the order does not matter. We computed the frequency of presence of each triplet; Fig.28 gives this frequency for each possible triplet. We observed that the most frequent triplets are (window-building-sidewalk) and (building-sidewalk-road), with frequencies of 0.5013 and 0.4872 respectively.
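Triplet frequencies can be obtained by enumerating the 3-combinations of the category set of every image. A compact sketch:

```python
from itertools import combinations
from collections import Counter

def triplet_frequencies(images):
    """Frequency of presence of each unordered triplet of categories,
    from per-image category sets (C(86,3) = 102340 possible triplets
    with the 86 categories of DB)."""
    counts = Counter()
    for cats in images:
        counts.update(combinations(sorted(cats), 3))
    n = len(images)
    return {t: c / n for t, c in counts.items()}
```

Sorting each category set before enumerating guarantees that every triplet is counted in a single canonical order.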
Then we calculated the correlation of each triplet by adapting the basic function (see equation 1) to the relationship between a category and a couple of other categories present in the database. For a triplet (C_j, (C_k, C_l)), x_i = 1 if C_j is present in image I_i, otherwise x_i = 0; and y_i = 1 if the couple (C_k, C_l) is present in image I_i, otherwise y_i = 0. We thereby examined 86 × (85 × 84/2) = 307020 possible combinations. We obtained the highest score, 0.9891, for the triplet (torso-(building-arm)) and the lowest score, −0.2494, for the triplet (water-(window-building)). Only 272 triplets have a correlation score higher than 0.5. In Tab.14, we present the 40 triplets having the highest or lowest correlations. We observed that there is a link between this correlation and the correlation of couples of categories presented in Section 4. In fact, the highest correlation in this section involves the two categories 64 (torso) and 65 (arm), which matches the result obtained for the correlation between couples.

Ternary spatial relationships
To our knowledge, only a few approaches have been proposed in recent years to describe the triangular relationships of three symbolic entities. We can mention the TSR approach [5] and our approach ∆-TSR [6]. When applied to a set of heterogeneous symbolic entities that do not have a fixed shape and size, these approaches cannot fully describe the triangular spatial relationships between symbolic entities, since they only take the center of each entity as its representation. However, to complete this study, based on the theory of ∆-TSR, we have studied the relationships between three different categories by using ∆-TSR 3D. This description is invariant to translation, 2D rotation, scale, and flip. Triangular relationships are built on the centers of the three entities. The first component of ∆-TSR 3D is the identification of the triplet of categories; the second and the third components are respectively the first and the second angles of the triangle obtained from the three centers. They correspond to the angles a_1 and a_2 in Fig.32.
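The two angle components can be computed directly from the three centers. A sketch (the angle-ordering convention of [6] is assumed; here a_1 and a_2 are taken at the first and second centers), together with the shape-free "between" test discussed below:

```python
import math

def delta_tsr_angles(p1, p2, p3):
    """First and second angles (a1, a2, in degrees) of the triangle built
    on the centers of three entities, as used by the Delta-TSR 3D
    description (angle ordering assumed: a1 at p1, a2 at p2)."""
    def angle(at, u, v):
        # angle at vertex `at` between the rays toward u and v
        d1 = math.hypot(u[0] - at[0], u[1] - at[1])
        d2 = math.hypot(v[0] - at[0], v[1] - at[1])
        dot = (u[0] - at[0]) * (v[0] - at[0]) + (u[1] - at[1]) * (v[1] - at[1])
        return math.degrees(math.acos(dot / (d1 * d2)))
    return angle(p1, p2, p3), angle(p2, p3, p1)

def between(a1, a2):
    """Shape-free 'between' test for the third entity: both base angles
    must be at most 60 degrees."""
    return a1 <= 60 and a2 <= 60
```

A third center close to the segment joining the first two gives small base angles and satisfies the test, while a third center far off to one side does not.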
Firstly, we present a general view of the approach's second component over the whole database in Fig.29 and 30. We observed that this component is distributed quasi-homogeneously over the interval [0..180]. Thus, with the ternary relationship, it is complex to give a direct interpretation, for example to predict a search area, by simply using an angle. However, this relationship can be useful for representing a fuzzy relationship like the "between" relationship. If we do not take into account the shape of a category's instances, the "between" relationship can be expressed by restricting the values of the two angles in ∆-TSR3D. For example, a third entity C3 can be viewed as "between" C1 and C2 when a1 <= 60 and a2 <= 60 (see Figure 31). If we take the entity's shape into account, we can combine the 9DSpa approach with ∆-TSR3D to obtain a more complete definition of the "between" relationship.

We have computed the probability of the third category being "between" the first two categories of a triplet (see Figure 32). We found 3376 triplets with a probability score greater than 0.5. For example, when we find a sidewalk and a chair in an image, if there is a motorbike in this image, we can expect that this motorbike is probably "between" these two first entities, since the corresponding probability is 0.978. In the same way, the same study can easily be done for the first or second category of the triplet.

Because of the limits of the spatial representation of ternary relationships for symbolic entities, we did not conduct an additional statistical study on this type of relationship. ∆-TSR provided many more advantages with low-level features like interest points. We think that this approach can be relevant for symbolic entities if we know how to associate other contextual information about the category with it. It can surely be useful in some domains, like the medical domain, where ∆-TSR could show its ability on homogeneous entities having a fixed size and shape.
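The angle-based "between" test above can be sketched from the three entity centers. This is a minimal illustration, assuming (as our reading of the description suggests) that a1 and a2 are the interior angles of the triangle at C1 and C2; the function names are ours:

```python
import math

def angle_at(p, q, r):
    """Interior angle, in degrees, of the triangle (p, q, r) at vertex p."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (n1 * n2)))))

def is_between(c1, c2, c3, threshold=60.0):
    """Crude shape-free "between" test: c3 is viewed as between c1 and
    c2 when the angles at c1 (a1) and at c2 (a2) are both <= threshold."""
    a1 = angle_at(c1, c2, c3)
    a2 = angle_at(c2, c1, c3)
    return a1 <= threshold and a2 <= threshold

print(is_between((0, 0), (10, 0), (5, 1)))   # True: c3 near the segment
print(is_between((0, 0), (10, 0), (20, 5)))  # False: c3 lies beyond c2
```

With threshold = 60, a point far off to one side fails the test because one of the two base angles grows past the bound, which matches the intuition of the fuzzy "between" region.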

Conclusion
We have presented a statistical study of spatial relationships between categories of entities from a public database of annotated images. This study provides a cartography of the spatial relationships that can be encountered in a database of heterogeneous natural contents. We think that it could be integrated with benefit into a knowledge-based system dedicated to artificial vision and CBIR, in order to enrich the description of the visual content as well as to help choose the most discriminant type of relationship for each use case. Here, we have focussed on the analysis of unary, binary, and ternary relationships. The study of unary relationships highlights trends in the location of categories of entities in the image. These measures allow determining the probability of the presence of a category in a given area, and performing spatial reasoning. In the same way, the study of binary relationships allows deducing the probability of presence of a category in an area with respect to the location of another, reference category. In addition, it gives indications on the relevance of the tested representations of these relationships. Ternary spatial relationships were also studied. Because of the limits of the spatial representation of ternary relationships for symbolic entities, we did not conduct a deeper statistical study on this type of relationship.
This work was done on a manually annotated database of one thousand images. Therefore, it is evident that these statistics will have to be confirmed or refined on other image databases of larger size. However, from now on, we think that these measures can help us, on the one hand, to better understand which kinds of spatial relationship should be employed for a given problem and how to model them. On the other hand, such statistics can help to start a knowledge base on these relationships, which could be applied quickly to some topical problems of artificial vision and CBIR, such as object detection, recognition or retrieval in a collection.

A Annotated image database

We were interested in how to define the function allowing to determine the theoretically authorized codes from a set of initial ones (the smallest atomic ones). Suppose that an image I is split into n atomic areas A_s. The code representing A_s is noted cod(A_s). The set of areas joined by an edge to A_s is noted edge(A_s).

For two atomic areas A_si and A_sj, we call comb(A_si, A_sj) the function combining these two areas to give a new complex area:

    comb(A_si, A_sj) = null                              if A_sj ∉ edge(A_si)
    comb(A_si, A_sj) = A_k such that A_k = A_si ∪ A_sj   otherwise            (12)

Now, we can define the function F_C allowing to indicate all theoretically authorized areas from a set of two atomic areas. Suppose that we have a complex area A_i containing more than two atomic areas; then we can define recursively the function F_C on A_i.

Figure 40 - Examples of statistical study on 9DSpa relationships between a reference category and others: (a) roof, (b) chimney, (c) car, and (d) road as reference category.
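The combination function of equation (12) can be sketched as follows. This is a minimal illustration under two assumptions of ours: atomic areas are represented by hashable identifiers, and edge(A_s) is given as a dict of adjacency sets; the grid example is hypothetical:

```python
def comb(a_i, a_j, edge):
    """Sketch of equation (12): combine two atomic areas into a
    complex area when they are adjacent by an edge.

    a_i, a_j: hashable atomic-area identifiers.
    edge: dict mapping each atomic area to the set of areas joined
    to it by an edge (our assumed encoding of edge(A_s)).
    Returns the union {a_i, a_j} as a frozenset, or None for the
    `null` case when the two areas are not adjacent.
    """
    if a_j not in edge.get(a_i, set()):
        return None
    return frozenset({a_i, a_j})

# Hypothetical 2x2 grid of atomic areas 1..4 (1-2 on top, 3-4 below):
edge = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
print(comb(1, 2, edge))  # frozenset({1, 2})
print(comb(1, 4, edge))  # None (diagonal areas share no edge)
```

Representing complex areas as frozensets of atomic identifiers makes the union in equation (12) a set union, which is also a natural base case for a recursive F_C over larger complex areas.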