Informal Versus Formal: A Panel Data Analysis of Earnings Gaps in Madagascar

Little is known about the informal sector's income structure vis-à-vis the formal sector, despite its predominant economic weight in developing countries. While most of the papers on this topic are drawn from (emerging) Latin American, Asian or some African countries, Madagascar represents an interesting case. So far, very few studies in general, even less so in Sub-Saharan Africa, used panel data to provide evidence of the informal sector heterogeneity. Taking advantage of the 1-2-3 Surveys in Madagascar, a four-wave panel dataset (2000-2004), we assess the magnitude of various formal/informal sector earnings gaps. Is there an informal sector job earnings penalty? Do some informal sector jobs provide pecuniary premiums and which ones? Do possible gaps vary along the earnings distribution? Ignoring distributional issues is indeed a strong limitation, given the compound question of how informality affects earnings inequality. We address heterogeneity issues at three different levels: the worker, the employment status (wage employment vs. self-employment) and the earnings distribution. Standard earnings equations are estimated at the mean and at various conditional quantiles of the earnings distribution. The results suggest that the sign and magnitude of the formal-informal sector earnings gaps highly depend on the workers' employment status and on their relative position in the earnings distribution. In the case of a poor and fragile country like Madagascar, these findings provide new and robust empirical backups for the existence of a mix between the traditional exclusion vs. exit hypotheses of the informal sector.


Introduction
Little is known about the informal sector's income structure vis-à-vis the formal sector, despite its predominant economic weight in developing countries. Some work has been carried out in this field using household surveys, but it only considers some emerging Latin American countries (Argentina, Brazil, Colombia and Mexico; Gong et al., 2004; Perry et al., 2007; Kwenda, 2011, 2014) and, more recently, South Africa, Ghana and Tanzania for Africa (Kwenda, 2011, 2014; Falco et al., 2010) and Vietnam for Asia (Nguyen et al., 2013). It is therefore hazardous to generalize these results, which sometimes diverge, to other parts of the developing world, in particular to very poor countries in Sub-Saharan Africa where the informal sector is the most widespread.
Empirical evidence shows that the existence of informality in poor countries can be understood by a mix of two traditional assumptions (Maloney, 1999, 2004; Perry et al., 2007): the exclusion and the exit hypotheses, following Hirschman's seminal work (Perry et al., 2007). The first hypothesis, also called the "dualist approach", considers a dual labor market model where the informal sector is viewed as a residual component of this market and is totally unrelated to the formal economy. It is a subsistence economy that only exists because the formal economy is incapable of providing enough jobs, and is condemned to disappear with the development process. Informal sector workers, suffering from poor labor conditions, are queuing for better jobs in the formal sector. The second assumption, also known as the "legalist approach", considers that the informal sector is made up of micro-entrepreneurs who prefer to operate informally to evade economic regulations (de Soto, 1989). This conservative school of thought is in sharp contrast to the former in that the choice of informality is voluntary, driven by the exorbitant legalisation costs associated with formal status and registration. Confirming Fields' stylized assessment (1990), a few studies stress the huge heterogeneity among informal sector jobs, which combine two main components: a lower-tier segment, where occupying an informal sector job is a constrained choice ("exclusion hypothesis"), and an upper-tier segment, in which informal sector jobs are chosen for better earnings and non-pecuniary benefits ("exit hypothesis"). Usually, the former segment is identified with informal sector wage jobs, while the latter is associated with self-employed jobs.
Therefore, whether one segment is predominant over the other remains an empirical question, depending on local circumstances. To test these alternative views, one major strand of literature focuses on the estimation of earnings gaps. Grounded in the revealed preference principle, and considering income as a proxy of individual utility, the approach assumes that if informal sector workers earn more than their formal counterparts, all else being equal, one can presume that they have deliberately chosen the informal sector. This may not be true for all informal sector workers. Thus, the challenge is to identify the job segments or the positions in the income distribution where informal sector workers get higher pay.
In this paper, this is the method we follow in the case of Madagascar. We take advantage of the rich 1-2-3 Surveys dataset for Antananarivo, specifically designed to capture the informal sector, and in particular its four-wave panel data (2000-2001-2002-2004), to ask the following questions: Is there an informal sector job earnings penalty? Do some informal sector jobs provide pecuniary premiums and which ones? Do possible gaps vary along the earnings distribution?
While most of the papers on this topic are drawn from (emerging) Latin American, Asian or some African countries, Madagascar represents an interesting case. To our knowledge, very few studies in general, even less so in Sub-Saharan Africa, and none in the case of Madagascar, used panel data to provide robust evidence of the informal sector heterogeneity.
Madagascar experienced an exceptional period of economic expansion between 1995 and 2001. Growth appeared to be associated with a decline in the share of the informal sector in urban employment (see Vaillant et al., 2014). But, in 2002, a major political crisis following presidential elections reversed this trend. This crisis had disastrous effects on the economy: exports and foreign direct investments fell sharply, GDP declined by about 13% and inflation was close to 16% in 2002 (Cling et al., 2005). The share of employment in the informal sector grew again, as workers were laid off from the private sector, in particular in the Export Processing Zones (EPZs). Despite the severity of the economic downturn, recovery was quick, with GDP growth of about 10% in 2003 and around 5% in the two following years, the period covered by our panel dataset. 1 However, the country remains today one of the poorest in the world.
Our empirical analysis consists of assessing the magnitude of different types of informal-formal earnings gaps using fixed effects OLS and quantile regressions. While many pieces of work rely on proxy variables to identify the informal sector, we use the official international definition of the informal sector elaborated by the ILO (1993), including all non-registered non-farm unincorporated enterprises (household businesses). Standard earnings equations are estimated at the mean and at various conditional quantiles of the earnings distribution. In particular, we estimate fixed effects quantile regressions to control for unobserved individual characteristics, focusing particularly on heterogeneity within both the formal and informal sectors employment categories. Our purpose is to address the important issue of heterogeneity at three levels: the worker level, taking into account individual unobserved characteristics; the job level, comparing wage workers with self-employed workers; and the earnings distribution. Ignoring distributional issues is indeed a strong limitation, given the compound question of how informality affects earnings inequality. A few studies still make use of quantile estimations to estimate informal-formal earnings gaps along the conditional earnings distribution (Kwenda, 2011, 2014). While there is neither formalized theory nor any definitive consensus about why the formal-informal earnings gaps should vary along the income distribution (and, if so, in which direction), this assumption is nonetheless a key element of the debate exploring the exclusion vs. exit hypothesis (see Perry et al., 2007, for an extensive discussion).
1 The political instability and consequential macroeconomic shocks around 2001 are particularly interesting phenomena to explore. However, this will not be the purpose of the present paper. Indeed, given the labour market adjustment mechanisms that Madagascar has known in this period (among which, a huge share of new entrants on the labour market, such as household additional workers who were pushed to work to compensate income losses), we believe that the panel data at hand are not, by construction, the best tool to assess the impact of the crisis, nor to use the crisis as an identification strategy for estimating the informal-formal gaps (no idiosyncratic shock, neither completely exogenous; see Razafindrakoto et al., 2015).
The remainder of this paper is organized as follows. Section 2 presents the context, the data and some descriptive elements of income dynamics in the recent period, while Section 3 focuses on the econometric approach to assess formal-informal earnings gaps. Empirical results are discussed in Section 4. Section 5 concludes.

Context
After a long period of economic recession, which started with the country's independence in 1960 and was interrupted only by very short periods of growth, Madagascar experienced an exceptional period of economic expansion between 1997 and 2001. Several factors, both economic and political, drove this favorable development. Firstly, the political stability since the election of Didier Ratsiraka in 1996 and agreements with the Bretton Woods institutions to reduce debt created a favorable environment for investment. Secondly, the development of EPZs attracted foreign industry, in particular textiles, which stimulated exports and employment. The rise of tourism also contributed to economic growth.
The presidential elections of December 2001 triggered a serious political crisis that lasted six months and had catastrophic economic effects (Razafindrakoto and Roubaud, 2002; Cling et al., 2005).
The general strikes, roadblocks and the vacancy of power caused by the political crisis in the first half of 2002 reversed this trend. In only one year, the informal sector gained nearly 8 percentage points (Table 1), erasing all the progress in the formalization process observed during the previous four years, absorbing the laid-off workers from closing formal enterprises and the new entrants, deprived of any alternative source of jobs. While both dependent and independent informal sector employment 6 increased, the growth in the number of informal entrepreneurs was much faster than the overall increase in the number of workers. This is a sign that informal sector employment growth is extensive rather than intensive, as it happens mainly through the creation of new firms rather than the expansion of employment in existing firms. Interestingly, in the period of growth (1998-2001), although dependent informal sector labor was absorbed in formal enterprises, the absolute number of firms continued to increase, even faster than the overall growth of the employed labor force. This suggests that the informal sector consists of both workers queuing for a formal sector job and voluntary entrepreneurs. Conversely, in the period of crisis and the following recovery, the decrease in formal sector employment seems to have been mainly compensated by an increase in informal independent labor (its share in total employment increased from 35% to 38.6%), rather than informal hired or family labor. This suggests that existing firms were not able to absorb the surplus labor released by the formal sector, and that most of these workers started an informal activity. Additionally, an important fraction of the fast growth in the number of informal firms is explained by new entries on the labor market. The EPZs paid the heaviest price during the crisis, with employment divided by nearly three. From 2002 onwards, the EPZs recovered their pre-crisis number of jobs. Yet, recovery of domestic formal enterprises seemed to be limited (Cling et al., 2009).
At the macro level, this counter-cyclical evolution of informal sector employment, taken as a whole, seems to confirm the dualistic hypothesis discussed in the introduction. This interpretation is reinforced by the subsequent trends. When a second political turmoil occurred in 2009, combined with the international financial crisis and resulting in a new drastic shock, the informal sector 're-colonized' the labor market. Informal sector employment absorbed nearly two thirds of the labor force in 2010 (65%), its highest share ever.

Table 1 about here
The growth process registered at the national level until 2001 is confirmed by the survey data that are used in this paper. Urban households benefited most from the situation. In Antananarivo, the real average labor income increased by 53% between 1995 and 2001, which corresponds to a huge 8% annual growth rate, an unprecedented pace in Madagascar's history (Roubaud, 2002, 2010). Consequently, the poverty incidence decreased from 39% to 19% while income inequality was also reduced. The 2002 crisis stopped this positive trend: the unemployment rate nearly doubled along with a massive increase in time-related underemployment 3 and child labor. Real incomes dropped by 5%. Thereafter, despite the quick macroeconomic recovery, household living conditions stagnated: in 2004, earnings were as low as in 2002 and, in 2006, they were only 2% higher than during the crisis.
In terms of labor income, the informal sector is, as expected, the lowest paying segment of the urban labor market, with jobs in the public sector at the top of the earnings ladder (first row of Table 2). Interestingly, although it is significant, the earnings gap with EPZ jobs is quite low, stressing the potential trade-offs in choosing one sector or the other for low-skilled workers, especially women (Glick and Roubaud, 2006). The decline in informal sector employment in the second half of the nineties was accompanied by large income gains from informal activities. Between 1995 and 2001, real average informal sector earnings increased by 66%, more than the 53% registered over all sectors taken together. Given that the informal sector is less exposed to international competition than the formal tradable sector, informal firms have been able to benefit from the increase in domestic demand. In spite of the lower income elasticity of their products and of a decreasing market share for consumption goods (-6 percentage points), informal goods still satisfied nearly three quarters of household consumption in 2001. If only food is considered, the share catered for by the informal sector was as high as 95%.
Conversely, in 2002, the average income in the informal sector was reduced by 11%, while the decline for the whole labor market was 'only' 5%. Shrinking aggregate demand combined with the absorption of labor quitting the formal sector are likely to be the main drivers of this sharp contraction. The shift from formal to informal consumption goods following the impoverishment of the population was not sufficient to counterbalance the two former effects. By contrast, the formal sector was able to maintain real wages, but at the expense of a massive reduction in jobs. These figures are consistent with the common belief that the formal sector adjusts during downturns through quantity, while price adjustment is the main mechanism at work in the informal sector. Subsequently, informal sector incomes progressively recovered part of their purchasing power, at least up to 2009, before a new drastic drop occurred.
Up to now, we have analyzed informal sector dynamics through repeated cross sections of labor force survey data. However, such data provide only an aggregate and partial view of the process at stake. Understanding the informal sector dynamics better requires looking beyond averages along two dimensions, by taking into account its intrinsic heterogeneity and individual mobility across sectors. We take advantage of the availability of panel data for the sub-period 2000-2004 to focus on our main objective, i.e. to assess the formal/informal earnings gaps.

Data description
The data used in this paper are drawn from the 1-2-3 Surveys conducted in the capital city, Antananarivo, since 1995 by the National Statistics Institute, with the technical assistance of DIAL, on behalf of the authors (Rakotomanana et al., 2003). The 1-2-3 Survey is a mixed household/enterprise survey specifically designed to capture the informal sector in all its dimensions. Phase 1 is an extended labor force survey, providing accurate labor market indicators, including, among others, main and secondary jobs of every member aged 10 years and over by status of firm (formal/informal). Phase 2 is an enterprise survey, carried out on a representative subsample of informal firms identified in Phase 1 and seeking to measure their main economic and productive characteristics. Phase 3 is an income and expenditure type household survey, whose sample is drawn from Phase 1 and whose aim is to estimate the weight of the formal and informal sectors in household consumption by product and household type.
In terms of sample design, the 1-2-3 Surveys are a classical two-stage stratified random survey, covering the ordinary households in the agglomeration of Antananarivo. 4 The sample size is constant over years and quite large for this kind of geographical coverage. Approximately 3,000 households and all household members have been interviewed each year (see details in Table 3). Among all individuals, more than 9,000 are aged 18 years and over, of whom around 5,500 held a job in the years considered. 5 For the purpose of the econometric analysis of this paper, we use exclusively four successive rounds of Phase 1 (2000, 2001, 2002 and 2004), which present the advantage of including a panel component. From 2000, 2,999 households have been re-interviewed during the three subsequent rounds. In order to keep the total number of households surveyed each year (3,000) constant, the disappeared or non-responding households have been randomly renewed from one round to the other. More importantly, no significant differences in labor market related variables, in particular in earnings or type of jobs (formal vs. informal) are observed (Rakotomanana, 2011).
4 The primary sample units are census enumeration areas and the secondary sample units correspond to households and individuals. For more details, see Rakotomanana et al. (2003).
5 The full sample consists of all members of the households surveyed in 2000. In this paper we restrict our analysis to the individuals aged 18 years old and over (in 2000), to better control for education achievement. Taking a lower threshold would lead to a censored education variable. Less than 5% of the individuals aged 18 years and over are still at school.
Being specifically designed to capture informal sector jobs, the 1-2-3 Surveys allow us to apply the concept of informal sector strictly following the international definition (ILO, 1993). In Madagascar, the informal sector is defined as all private unincorporated enterprises that produce at least some of their goods and services for sale or barter, and that are not registered (no statistics licence, which is supposed to be compulsory for all kinds of businesses) or do not keep book accounts. Beyond the formal/informal sector divide, special care is taken to obtain reliable measures of variables for which informality may lead to sampling and measurement errors. In particular, the questionnaire includes a detailed set of questions to capture information on activity status, as the classical procedures lead to the under-declaration of informal sector workers' participation for those with the weakest labor market attachment. We compute the labor income associated with each remunerated job. For wage workers, the survey captures their current monthly wage, while for self-employed workers earnings correspond to the disposable income (before taxation). For those who do not want to declare (or do not know) their precise earnings, a complementary question asks for intervals, proposed in detailed ranges (10). To our knowledge, the database used in this paper is one of the largest and highest quality labor market panels in Sub-Saharan Africa (apart from being one of the few available).

Econometric Approach to Measuring Informal-Formal Earnings Gaps
The econometric analysis consists of assessing the magnitude of different types of informal-formal earnings gaps using OLS and quantile regressions with log hourly earnings as dependent variable. Standard earnings equations are thus estimated at the mean and at various conditional quantiles of the earnings distribution. The models are estimated on a sample of formally and informally employed workers pooled over years. The covariates introduced into the regressions are the completed years of education, the years of potential experience (with quadratic profiles for these two regressors), a dummy for being married, a dummy for being a woman, ten industry dummies to account for technological differences between branches of activity 7, ten area dummies to capture local labor market specificities and four time dummies to control for macroeconomic trend effects on earnings. 8 A number of studies based on data on African manufacturing firms have shown that wages are positively correlated with firm size, conditional on standard human capital variables (Strobl and Thornton, 2002; Manda, 2002; Söderbom, Teal and Wambugu, 2005). The literature discusses numerous reasons for this correlation. One frequently made argument is that firm size is correlated with omitted worker quality because large firms usually attract more productive workers. Thus, not accounting for this demand side characteristic may induce severe biases in the usual Mincerian equations. In this paper, we are able to control for firm size, aggregated into four ordered ranges. However, given that firm size is highly correlated with informal/formal status, we systematically estimate our models with and without firm size in order to disentangle the effects of these two variables.
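The individual-level regressors listed above can be assembled as in the following minimal sketch. It assumes the definition of potential experience given later in the paper (age minus years of reported schooling minus five); the function and key names are hypothetical, and the industry, area, time and firm size dummies are left out for brevity.

```python
def potential_experience(age, years_schooling):
    """Potential experience: age minus years of schooling minus five."""
    return age - years_schooling - 5


def mincer_covariates(age, years_schooling, married, woman):
    """Individual regressors with quadratic education and experience terms.

    Hypothetical helper: the dictionary keys are illustrative names only.
    """
    exper = potential_experience(age, years_schooling)
    return {
        "educ": years_schooling,
        "educ_sq": years_schooling ** 2,
        "exper": exper,
        "exper_sq": exper ** 2,
        "married": int(married),
        "woman": int(woman),
    }


# Hypothetical worker: 30 years old with 10 completed years of education
row = mincer_covariates(age=30, years_schooling=10, married=True, woman=False)
```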
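To account for informal-formal differences in earnings at the mean earnings level, we rely on pooled OLS regressions across years and Fixed Effects (FE) OLS regressions of the form

$$y_{it} = x_{it}\beta + \gamma I_{it} + \alpha_i + \varepsilon_{it}, \qquad (1)$$

where $y_{it}$ is the log hourly earnings of individual $i$ at time $t$, $x_{it}$ is the vector of covariates, $I_{it}$ is a dummy equal to one if the job is in the informal sector, $\alpha_i$ is the individual effect controlled for in the FE specification and $\varepsilon_{it}$ is an error term. The estimated coefficient $\gamma$ is interpreted as a measure of the conditional earnings premium/penalty experienced by workers who change status from informal sector jobs to formal sector employment (or the reverse). However, as mentioned previously, informal sector employment is extremely heterogeneous and a finer job divide should be considered. We then define four categories of workers split by status in employment (wage workers vs. self-employed workers) and institutional sector (formal vs. informal) and create four dummies taking value one if the individual $i$ at time $t$ is an informal sector wage worker ($IW_{it}$), a formal sector wage worker ($FW_{it}$), an informal self-employed worker ($IS_{it}$) and a formal self-employed worker ($FS_{it}$). Taking the formal sector wage workers as the reference category, the model we estimate can be written as

$$y_{it} = x_{it}\beta + \gamma_1 IW_{it} + \gamma_2 IS_{it} + \gamma_3 FS_{it} + \alpha_i + \varepsilon_{it}. \qquad (2)$$

The estimated coefficients $\gamma_1$, $\gamma_2$ and $\gamma_3$ are interpreted, respectively, as the IW-FW, IS-FW and FS-FW conditional earnings gaps. Identification of these conditional earnings gaps relies on the presence in the sample of movers between employment states over time. Those movers can be compared to the stayers in terms of earnings. As an illustration, we consider a simple two-period example and eight cases of transitions out of the various possibilities of professional trajectories (which are 16 in a two-period example), with $\Delta y_i = y_{i2} - y_{i1}$ and likewise for $\Delta x_i$ and $\Delta \varepsilon_i$.

2 cases of stayers:

$$\Delta y_i = \Delta x_i \beta + \Delta \varepsilon_i \qquad (3), (4)$$

Equations (3) and (4) give examples of the changes in earnings for stayers, i.e. for workers that do not change their employment state between the two periods: the employment state dummies and the individual effect cancel out in the difference.

6 cases of movers:

$$\begin{aligned}
IW \to IS: \quad & \Delta y_i = \Delta x_i \beta + (\gamma_2 - \gamma_1) + \Delta \varepsilon_i & (5) \\
IW \to FW: \quad & \Delta y_i = \Delta x_i \beta - \gamma_1 + \Delta \varepsilon_i & (6) \\
FW \to IS: \quad & \Delta y_i = \Delta x_i \beta + \gamma_2 + \Delta \varepsilon_i & (7) \\
FW \to FS: \quad & \Delta y_i = \Delta x_i \beta + \gamma_3 + \Delta \varepsilon_i & (8) \\
IS \to FS: \quad & \Delta y_i = \Delta x_i \beta + (\gamma_3 - \gamma_2) + \Delta \varepsilon_i & (9) \\
IS \to FW: \quad & \Delta y_i = \Delta x_i \beta - \gamma_2 + \Delta \varepsilon_i & (10)
\end{aligned}$$

Equations (5) and (6) illustrate the changes in earnings for those workers coming from an informal sector wage job and moving, respectively, into an informal self-employed job and a formal sector wage job; equations (7) and (8) represent these earnings differentials for those coming from a formal sector wage employment and moving, respectively, into an informal self-employed job and a formal self-employed job. Finally, the cases of informal self-employed workers moving to, respectively, formal self-employed and formal sector wage jobs are considered in equations (9) and (10).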
The identification strategy of FE on movers is quite standard but, in practice, one should verify that the number of moves across employment states is sufficient for a valid use of this estimator. We verify that this is the case in Table 5 in the next section. More generally, the identification strategy supposes that movers change employment states more or less randomly, or at least that they do not systematically move for better earnings. However, people may change jobs in particular if they see an opportunity to earn more. We present in the following section earnings matrices showing that this is actually not the case (Table 6).
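The role of movers in this identification strategy can be illustrated with a small numerical sketch. It is not the paper's estimator: it uses a hypothetical two-period panel without covariates or noise, arbitrary gap values, and hypothetical names such as `TRUE_GAP` and `simulate`. It only shows that first-differencing removes the worker effect, that stayers identify the common trend, and that movers identify the gaps.

```python
from statistics import mean

# Illustrative (arbitrary) earnings gaps relative to formal wage work (FW)
TRUE_GAP = {"FW": 0.0, "IW": -0.30, "IS": 0.10}


def simulate(worker_effect, s1, s2, trend=0.05):
    """Log earnings in two periods: worker effect + state gap (+ trend in period 2)."""
    y1 = worker_effect + TRUE_GAP[s1]
    y2 = worker_effect + TRUE_GAP[s2] + trend
    return (s1, s2, y1, y2)


# Hypothetical panel rows: (state in t1, state in t2, log earnings t1, log earnings t2)
panel = [
    simulate(1.0, "FW", "FW"), simulate(2.0, "IW", "IW"),  # stayers
    simulate(0.5, "IW", "FW"), simulate(1.5, "IW", "FW"),  # movers IW -> FW
    simulate(0.8, "FW", "IS"), simulate(2.2, "FW", "IS"),  # movers FW -> IS
]


def mean_change(rows, s1, s2):
    """Average first difference of log earnings for one type of transition."""
    return mean(y2 - y1 for a, b, y1, y2 in rows if (a, b) == (s1, s2))


# Stayers: worker effect and state gap cancel, leaving only the common trend
trend_hat = mean(y2 - y1 for a, b, y1, y2 in panel if a == b)
# IW -> FW movers: change = trend - gap_IW, hence gap_IW = trend - change
gap_iw_hat = trend_hat - mean_change(panel, "IW", "FW")
# FW -> IS movers: change = trend + gap_IS, hence gap_IS = change - trend
gap_is_hat = mean_change(panel, "FW", "IS") - trend_hat
```

With no movers for a given transition, the corresponding `mean_change` call would have nothing to average, which is the small-sample concern raised in the text.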
Finally, to allow the earnings gaps between employment statuses to differ along the earnings distribution, we rely on Quantile Regressions (QR). Quantile earnings regressions consider specific parts of the conditional distribution of the hourly earnings and indicate the influence of the different explanatory variables on conditional earnings respectively at the bottom, at the median and at the top of the distribution.
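Using our previous notation, the model that we seek to estimate is:

$$Q_\theta(y_{it} \mid x_{it}) = x_{it}\beta_\theta + \gamma_{1\theta} IW_{it} + \gamma_{2\theta} IS_{it} + \gamma_{3\theta} FS_{it}, \qquad (11)$$

where $Q_\theta(y_{it} \mid x_{it})$ is the $\theta$th conditional quantile of the log hourly earnings $y_{it}$, $x_{it}$ is the vector of covariates, and $IW_{it}$, $IS_{it}$ and $FS_{it}$ are the dummies for informal sector wage workers, informal self-employed workers and formal self-employed workers (formal sector wage workers being the reference category). The set of coefficients $\beta_\theta$ provide the estimated rates of return to the different covariates at the $\theta$th quantile of the log earnings distribution and the coefficients $\gamma_{1\theta}$, $\gamma_{2\theta}$ and $\gamma_{3\theta}$ measure the parts of the earnings differentials that are due to informal-formal job differences at the various quantiles.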
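To make the quantile-specific gaps concrete, the sketch below uses the check-loss characterisation of a quantile on hypothetical data. It assumes a deliberately simplified model with a single informal dummy and no other covariates, in which case the quantile regression coefficient on the dummy reduces to the difference between the two groups' sample quantiles. The function names and the earnings values are illustrative assumptions, not the paper's data.

```python
def check_loss(u, tau):
    """Check function: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_by_check_loss(values, tau):
    """Sample tau-quantile as the data point minimising the summed check loss."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))


def quantile_gap(log_earn, informal, tau):
    """Informal minus formal tau-quantile (dummy-only model, no covariates)."""
    q_inf = quantile_by_check_loss(
        [y for y, d in zip(log_earn, informal) if d], tau)
    q_for = quantile_by_check_loss(
        [y for y, d in zip(log_earn, informal) if not d], tau)
    return q_inf - q_for


# Hypothetical log hourly earnings: informal earnings are more dispersed
formal = [1.0, 2.0, 3.0, 4.0, 5.0]
informal = [0.0, 1.0, 3.0, 5.0, 7.0]
log_earn = formal + informal
is_informal = [0] * len(formal) + [1] * len(informal)

gaps = {tau: quantile_gap(log_earn, is_informal, tau) for tau in (0.1, 0.5, 0.9)}
```

In this constructed example the gap is negative at the bottom, zero at the median and positive at the top, which is the kind of sign reversal along the distribution that a mean regression would hide.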
We then turn to Fixed Effects Quantile Regressions (FEQR). The extension of the standard QR model to longitudinal data was originally developed by Koenker (2004). More recently, Canay (2011) proposed an alternative and simpler approach which assumes that the unobserved heterogeneity terms have a pure location shift effect on the conditional quantiles of the dependent variable. In other words, they are assumed to affect all quantiles in the same way. It follows that these unobserved terms can be estimated in a first step by traditional mean estimations (for instance by FE). Then, the predicted individual effects $\hat{\alpha}_i$ are used to correct earnings, such that $\tilde{y}_{it} = y_{it} - \hat{\alpha}_i$, which are regressed on the other regressors by traditional QR. 10 When running the regressions (2) and (11), we always provide robust standard errors using bootstrap replications.

Table 4 presents some basic summary statistics of the main characteristics of the panel data used in our analysis. These descriptive statistics are reported for the sub-samples of wage/self-employed workers, broken down by formal and informal sector jobs.
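The earnings correction used in the two-step fixed effects quantile approach described above can be sketched as follows. This is a simplified illustration under strong assumptions: a single time-varying informal dummy as the only regressor, a tiny hypothetical panel without noise, and hypothetical function names. The block stops after the correction; the second step would run a standard quantile regression on the corrected earnings.

```python
from statistics import mean


def within_slope(y, d, ids):
    """First step: FE (within) slope of y on one time-varying dummy d."""
    ybar = {i: mean(v for v, j in zip(y, ids) if j == i) for i in set(ids)}
    dbar = {i: mean(v for v, j in zip(d, ids) if j == i) for i in set(ids)}
    num = sum((dv - dbar[i]) * (yv - ybar[i]) for yv, dv, i in zip(y, d, ids))
    den = sum((dv - dbar[i]) ** 2 for dv, i in zip(d, ids))
    return num / den


def corrected_earnings(y, d, ids):
    """Estimate worker effects from the FE fit and subtract them from earnings."""
    b = within_slope(y, d, ids)
    # Worker effect = individual mean of the part of y not explained by d
    alpha = {i: mean(yv - b * dv for yv, dv, j in zip(y, d, ids) if j == i)
             for i in set(ids)}
    return [yv - alpha[i] for yv, i in zip(y, ids)]


# Hypothetical panel: worker effects 1.0, 2.0, 3.0 and an informal gap of -0.2
ids = ["a", "a", "b", "b", "c", "c"]
d = [0, 1, 1, 0, 0, 0]                      # informal dummy by period
y = [1.0, 0.8, 1.8, 2.0, 3.0, 3.0]          # log hourly earnings

slope = within_slope(y, d, ids)
y_tilde = corrected_earnings(y, d, ids)
```

Once the worker effects are removed, the corrected earnings differ across observations only through the informal dummy in this noise-free example, so any quantile of the corrected earnings would recover the same gap.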

Table 4 about here
The results obtained for average earnings are in line with common findings in the literature. Workers holding formal sector jobs earn more on average than those engaged in informal sector jobs. Within each group of formal and informal sector workers, self-employed workers have higher earnings than wage earners. While the average age of the labor force is the same in the two sectors, informal sector wage workers tend to be younger than their formal sector counterparts. Self-employed workers exhibit on average longer potential experience in the labor market (which is calculated as age minus years of reported schooling minus five). As expected, workers with a higher level of education are less likely to be engaged in the informal sector and vice versa. The gender ratio varies significantly between formal and informal sector jobs. Female workers have more opportunity to get informal sector jobs: female participation is at its highest in informal self-employment and at its lowest in formal self-employment. Finally, formal and informal sector workers are differently allocated across branches of activity. Specifically, informal sector employment is found more in trade, restaurants and construction, while formal sector jobs are more concentrated in clothing and services (in particular public administration).
10 In Canay (2011), the most problematic assumption is that the estimator is consistent and asymptotically normal only as the number of time periods T goes to infinity. In the case of non-linear operators like quantile regressions, FE are not estimated consistently when the number of time periods (T) is fixed and small (<10), the inconsistency being transmitted to the estimators of the other covariates of interest (Koenker, 2004). As an alternative to Canay's method, we relied on a quantile regression model with correlated random effects (CRE), as suggested by Abrevaya and Dahl (2008), which sees the unobservable as a linear projection onto the observables plus a disturbance term. In other words, unobservables are linearly correlated with the explanatory variables. In the simplest approach à la Mundlak, unobservables are modeled as the mean values of time-varying covariates over all periods plus a normally distributed term. This approach is more restrictive in the sense that it requires using a balanced panel. The new estimates (available upon request from the authors) point to similar qualitative results, although the magnitudes of the gaps are often larger in the CRE.
Interestingly, the share of manufacturing is identical in informal and formal sector jobs (31% in both cases). Within institutional sectors, the distribution is even more unbalanced: informal sector wage workers are mainly engaged in personal services (51%), whereas informal self-employed workers hold trade jobs (36%). Formal sector wage workers are predominantly engaged in services (63%), while the job structure of the formal self-employed resembles that of the informal self-employed. In terms of firm size, formal sector wage workers are, as expected, over-represented in large enterprises, while the three other groups are almost exclusively engaged in micro-enterprises (informal self-employed workers operating the smallest ones). These significant differences in job structure underline the importance of controlling for sector of activity and firm size in our earnings estimations.
To save space and given the small number of observations, formal self-employed workers have been aggregated with informal ones (we will distinguish them in our estimations; see Section 5). The inactive and the unemployed are also aggregated into one broad category (not working). First, the proportion of movers (from one category to another) is far from negligible and is quite stable over time. From one year to the next, movers represent around one third of the three samples (from a minimum of 31% between 2000 and 2001 to a maximum of 36% between 2002 and 2004). If we consider only those holding a job, the target of our earnings gap estimations, the rate of movers falls to about one fourth (22% to 26% respectively for the same periods). Formal sector wage jobs are the most stable, followed by self-employed ones. Informal sector wage workers are the most mobile: only 30% keep their status from one year to the next. The flows between sectors follow a consistent pattern. Informal sector wage worker movers mainly get formal sector wage jobs and (informal) self-employed jobs, in equal proportions. Formal sector wage worker movers favor self-employment, but substantial flows go to informal wage jobs. Conversely, self-employed workers move more often to formal sector wage jobs than to informal ones, withdrawal from work (retirement) being their first option. On the methodological side, the substantial number of movers, in both directions and for all types of jobs, is key for our estimation strategy.
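As an illustration of how the transition shares and mover rates discussed above can be computed, the following is a minimal sketch in Python. The status labels, the toy data and the function names are hypothetical and are not taken from the 1-2-3 Surveys; the sketch only assumes that each individual contributes one pair (status in the base year, status in the end year).

```python
from collections import Counter, defaultdict

def transition_matrix(pairs):
    """Row-normalised transition shares from (origin, destination) pairs."""
    counts = Counter(pairs)
    row_totals = defaultdict(int)
    for (origin, _dest), n in counts.items():
        row_totals[origin] += n
    # Each share is the fraction of people starting in `origin` who end in `dest`.
    return {
        (origin, dest): n / row_totals[origin]
        for (origin, dest), n in counts.items()
    }

def mover_share(pairs):
    """Share of individuals whose status differs between the two dates."""
    pairs = list(pairs)
    return sum(1 for origin, dest in pairs if origin != dest) / len(pairs)

# Hypothetical toy panel (illustrative only, not survey data).
toy_pairs = [
    ("formal_wage", "formal_wage"),
    ("formal_wage", "formal_wage"),
    ("formal_wage", "self_employed"),
    ("informal_wage", "formal_wage"),
    ("informal_wage", "self_employed"),
    ("informal_wage", "informal_wage"),
    ("self_employed", "self_employed"),
    ("self_employed", "not_working"),
]

shares = transition_matrix(toy_pairs)
stay_informal_wage = shares[("informal_wage", "informal_wage")]
overall_mover_share = mover_share(toy_pairs)
```

The diagonal cells of such a matrix give the share of stayers in each status, and restricting the pairs to individuals who hold a job at both dates would give the mover rate relevant for the earnings gap estimations.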
Another striking finding is the surprisingly weak impact of the macroeconomic context on transition flows. Changes in year-to-year transition flows (direction and intensity) are limited, pointing to a robust structural pattern. This assessment is confirmed by the long-run transition matrix, as shown in the lower right panel of [...] ([...] respectively for those who kept a job). For each of the four initial positions, the distribution of movers between categories is surprisingly close to the year-to-year one. However, at the margins, the crisis spell [...]
To end this descriptive analysis, we turn to the earnings dynamics by institutional sector and status in employment. Table 6 presents the levels (in constant 2000 Ariary) and the changes (in %) in real earnings for the three year-to-year periods and the "long run" spell (2000-2004). Compared to Table 5, the panel sample is restricted to individuals holding a job and having positive earnings in both periods.
Consequently, those who are not working or unpaid family workers are excluded. The number of observations is around 3,000 for year-to-year matrices and 2,000 for the 2000-2004 matrix.
The left panel of Table 6 shows the level of real hourly earnings at the final date by transition status.
Consistent with Table 4, informal sector wage workers get the lowest pay, followed by the informal self-employed and formal sector wage workers, with formal self-employed workers at the top of the earnings ladder. If we now take transition status into account, informal sector wage worker stayers systematically earn less than those who moved to self-employment or formal sector wage jobs.
Symmetrically, self-employed stayers are better remunerated than those who move to formal or informal sector wage jobs, with the exception of the 2001-2002 period. This exception may be due to a crisis effect (shrinking demand and increased competition), while formal sector wages are more rigid.
Finally, formal sector wage worker stayers, as primary labor market insiders, are by far the best compensated workers (compared with the other eight transition statuses); the only exceptions are formal self-employed workers. This result suggests that, on average, setting up an informal firm after leaving a formal sector wage job induces a decline in earnings. Two potential reasons may be invoked: some workers have been constrained to set up an informal business because of a lay-off from a formal activity or other institutional factors (such as retirement age); or non-pecuniary considerations may be at stake, which come with lower pay than that of those who obtained a formal sector wage job.
These unconditional earnings in the end year say little about earnings dynamics, since initial conditions are only taken into account through the labor status in the base year. Considering growth rates is a first step to control for initial earnings (right panel of Table 6). Moving to informal sector wage jobs is associated with the lowest increase in earnings over all periods, whereas being able to move to a formal self-employed job is associated with the highest earnings growth. Moving out of an informal sector wage job ensures higher earnings growth rates, while abandoning self-employment for wage jobs, or moving from formal to informal sector wage jobs, yields lower growth rates. In terms of earnings growth, the picture for those who quit a formal sector wage job to create an informal business is mixed: in two cases out of four they perform better than their stayer counterparts (2000-2001 and 2001-2002), but they do worse in the two other cases (2002-2004 and 2000-2004). This suggests a potential trade-off between these two kinds of jobs, a stylized feature underlined in the literature, which we will investigate further in Section 5 for the case of Madagascar.
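The comparison of movers' and stayers' earnings growth described above can be sketched as follows. This is a hypothetical illustration: the records, the labels and the choice of averaging individual growth rates within each transition cell are assumptions, and the exact definition used for Table 6 may differ (for instance, growth of cell means rather than the mean of individual growth rates).

```python
from collections import defaultdict

def mean_growth_by_cell(records):
    """Average earnings growth (in %) for each (origin, destination) cell.

    Each record is (origin, destination, base_earnings, end_earnings).
    """
    cells = defaultdict(list)
    for origin, dest, y0, y1 in records:
        cells[(origin, dest)].append(100.0 * (y1 - y0) / y0)
    return {cell: sum(g) / len(g) for cell, g in cells.items()}

def movers_vs_stayers(growth):
    """Growth differential of each mover cell relative to stayers of the same origin."""
    return {
        (origin, dest): g - growth[(origin, origin)]
        for (origin, dest), g in growth.items()
        if origin != dest and (origin, origin) in growth
    }

# Hypothetical records in constant prices (illustrative only, not survey data).
toy_records = [
    ("informal_wage", "informal_wage", 100.0, 105.0),
    ("informal_wage", "formal_wage", 100.0, 120.0),
    ("formal_wage", "formal_wage", 200.0, 220.0),
    ("formal_wage", "self_employed", 200.0, 210.0),
]

growth = mean_growth_by_cell(toy_records)
differentials = movers_vs_stayers(growth)
# Number of mover cells that grew more slowly than their stayer counterparts.
n_worse = sum(1 for diff in differentials.values() if diff < 0)
```

Counting the mover cells with a negative differential gives the kind of tally of groups doing worse or better than their respective stayers that is used in the discussion of the identification strategy.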
Of course, these unconditional averages should be controlled for observed and unobserved characteristics, which is the purpose of the following section. Furthermore, changes in job status are not systematically associated with upward (or downward) trends in incomes. Out of the 24 groups of movers, 13 experienced lower income growth than their respective stayers, while 11 benefited from a relative increase. This supports the identification strategy of earnings gaps based on movers and stayers (see previous section).11 Finally, our analysis shows that earnings levels and changes are highly dependent on transitions. Transition and earnings matrices are very consistent, confirming the high quality of our data, a feature already stressed in previous methodological papers (Roubaud, 2000; Rakotomanana, 2011).
11 We also investigated statistics and tried to characterize between-sector movers and stayers by running probits of sector movements in both directions: informal to formal, and formal to informal (results available upon request). Statistics show that movers are not extremely different from the overall population of stayers in terms of their observed characteristics. Looking at the probits, we found that the pseudo-R2 of these regressions reaches only about 0.03, depending on the specification. Hence, although we controlled for a set of observed characteristics, and also for an unobserved (fixed) component of individuals using individual fixed effects, we were still not able to explain much of the reasons for transitions between sectors.

Earnings Gaps Analysis
In this section we discuss the earnings gaps between formal and informal sector jobs at the aggregate level, estimated using the four estimation procedures presented in Section 3. In the following discussion, we compare the three other work statuses with formal sector wage workers, who serve as our benchmark.
We also investigate the gender issue. For instance, UICs may have to do with more efficient social networks to get a formal sector job.
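To make concrete why movers are essential once individual fixed effects are introduced, the sketch below contrasts a pooled gap with a within-individual (fixed effects) gap. It is a deliberately simplified illustration, assuming a single regressor (an informal sector dummy) and no time-varying covariates, which is not the specification estimated in this paper; the data and names are hypothetical.

```python
from collections import defaultdict

def pooled_gap(rows):
    """Informal minus formal mean log earnings (OLS on a dummy, no controls)."""
    informal = [y for _, d, y in rows if d == 1]
    formal = [y for _, d, y in rows if d == 0]
    return sum(informal) / len(informal) - sum(formal) / len(formal)

def within_gap(rows):
    """Fixed effects (within) estimate of the coefficient on the dummy."""
    by_person = defaultdict(list)
    for pid, d, y in rows:
        by_person[pid].append((d, y))
    num = den = 0.0
    for obs in by_person.values():
        d_bar = sum(d for d, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for d, y in obs:
            # Stayers have d == d_bar in every wave, so they add zero to both sums.
            num += (d - d_bar) * (y - y_bar)
            den += (d - d_bar) ** 2
    return num / den

# Hypothetical rows: (person id, informal dummy, log hourly earnings).
toy_rows = [
    ("A", 0, 2.0), ("A", 0, 2.0),  # formal stayer with high unobserved ability
    ("B", 1, 1.0), ("B", 1, 1.0),  # informal stayer with low unobserved ability
    ("C", 1, 1.4), ("C", 0, 1.5),  # mover from informal to formal
    ("D", 0, 1.6), ("D", 1, 1.5),  # mover from formal to informal
]

gap_pooled = pooled_gap(toy_rows)
gap_within = within_gap(toy_rows)
gap_movers_only = within_gap([r for r in toy_rows if r[0] in ("C", "D")])
```

In this toy example the pooled gap is much larger in absolute value than the within gap, because part of the raw difference reflects time-invariant individual heterogeneity; the within estimate is unchanged when stayers are dropped, which shows that it is identified by movers only.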

Formal vs. informal sector workers
However, the remaining -10% gap, once we control for UICs, highlights that formal sector jobs provide higher earnings per se. Here again, this result can be due to various factors which result, at the firm level, in higher productivity or market power, and/or, at the worker level, in a stronger bargaining power of formal sector workers to negotiate higher earnings.
To go beyond the average, we ran quantile regressions. While informal sector workers suffer earnings penalties at almost all levels of the conditional distribution, the gap decreases sharply from the bottom to the upper part. Beginning with a huge -38% (quantile .10), the gap continuously shrinks to become insignificant around quantile .80. From there on, it even reverses to reach +7% at the upper tier of the distribution (quantile .90). The Fixed Effects Quantile Regression (FEQR) gap confirms not only the key role of UICs in reducing the "true" gap but also the pattern along the earnings distribution: from -28% for the bottom quantile (quantile .10) to +14% for the upper one (quantile .90).
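The idea of a gap that varies along the distribution can be illustrated with a minimal sketch. It relies on a standard property: in a quantile regression with only an intercept and a group dummy, the check-loss minimizer for the dummy is the difference between the two groups' empirical quantiles. The sketch therefore ignores the covariates and fixed effects used in this paper, and the data are hypothetical. The conversion of a log-point gap into a percentage through exp(gap) - 1 is a common convention and is an assumption here, not a statement about how the reported gaps were computed.

```python
import math

def check_loss(u, tau):
    """Quantile regression check (pinball) loss for a residual u."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def empirical_quantile(values, tau):
    """Order statistic that minimises the summed check loss at quantile tau."""
    ordered = sorted(values)
    k = max(math.ceil(tau * len(ordered) - 1e-9) - 1, 0)
    return ordered[k]

def quantile_gap(informal, formal, tau):
    """Informal minus formal log earnings at quantile tau (no covariates)."""
    return empirical_quantile(informal, tau) - empirical_quantile(formal, tau)

def log_gap_to_percent(gap):
    """Convert a gap in log points into a percentage difference."""
    return 100.0 * (math.exp(gap) - 1.0)

# Hypothetical log hourly earnings: informal earnings are more dispersed.
formal = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
informal = [0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.7, 1.9, 2.1, 2.3]

gaps = {tau: quantile_gap(informal, formal, tau) for tau in (0.1, 0.5, 0.9)}
gaps_percent = {tau: log_gap_to_percent(g) for tau, g in gaps.items()}
```

With these toy data the gap is a penalty at the bottom, smaller at the median, and a premium at the top, which is the qualitative profile described above: a more dispersed informal earnings distribution produces a gap that rises with the quantile.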
However, once we control for the size of the enterprises, the average earnings gap nearly disappears. The OLS gap is only -6.3% (Figure 1A), while the FEOLS gap is slightly negative but non-significant. Interestingly, the profiles of the earnings gap along the distribution remain unchanged, with a systematic penalty decline for informal sector workers from the lower to the upper tier (QR, Figure 1A and Tables A4 and A5). The QR estimates range from a -23% penalty for informal sector workers at the bottom (quantile .10) to an 11% bonus at the top (quantile .90), while the respective numbers are -13% and 10% for FEQR, the turning point (from penalty to premium) being around the third quartile in both cases.
12 Models without firm size are not reported. They are available from the authors upon request.
The interpretation of the size effect is not straightforward in our informal vs. formal perspective. First, conditional earnings grow with the size of the enterprise. This result is robust to all of our specifications and consistent with the literature in this respect. Second, as the informal sector is often defined as enterprises under a certain size threshold (fewer than 5 or 10 workers), introducing size in our estimation as an independent variable tends to absorb the impact of informality on earnings. This is all the more so since the two criteria used to identify the informal sector (size and registration) are highly correlated.
In the remainder of this paper we nevertheless comment on the earnings gaps based on the regressions including size as an independent variable. As a consequence, two important points should be kept in mind: our results focus on the impact of non-registration on earnings, net of the size effect; and the reported gaps should be interpreted as the most conservative estimates, since they are systematically higher without control for firm size.
Finally, whatever the earnings specification (with or without firm size), the huge gap variations along the distribution point to the intrinsic heterogeneity of the informal sector. This result is mainly due to the fact that the "dualistic assumption" is too crude, lumping together very diverse categories of workers within each sector, which we investigate below in more detail.

Formal vs. informal sector wage workers
As expected, among wage workers, those employed in the informal sector are on average worse off than their formal sector counterparts (Figure 1B, column (3)). The OLS gap (-18%) is significantly reduced to -9% when individual fixed effects are introduced, suggesting that informal sector wage workers may have a disadvantage in terms of their unobserved productive attributes. Whether or not fixed effects are taken into account, the gap decreases continuously (Figure 1B and Tables A4 and A5): from -30% (quantile .10) to -5% (quantile .90; non-significant) without fixed effects, and from -16% to 1% (non-significant) once UICs are controlled for. In both cases, formal sector wage workers retain an earnings advantage at any position on the pay ladder. Even if we cannot exclude that non-pecuniary disadvantages of formal sector wage jobs (such as poor working conditions) may be compensated by earnings,13 these results could be taken as an acceptable validation of the exclusion hypothesis (for this category of workers), according to which informal sector wage workers are constrained in their job choice and are probably queuing for formal sector jobs.

Formal sector wage vs. informal self-employed workers
For the bulk of the labor force, this alternative is probably the main trade-off, and also the most discussed in the literature. At odds with the previous case and more generally with the dualistic approach, the conditional OLS gap is positive, with a significant premium of +18% for the informal self-employed (Figure 1C, column (3)). Furthermore, the FEOLS model still shows a premium of +12% (column (5)). Again, this would mean that informal self-employed workers have an advantage in terms of their unobserved productive characteristics (probably their entrepreneurial skills), which produces an overestimation of the premium associated with being an informal self-employed worker rather than a formal sector wage worker if this individual heterogeneity is not accounted for. We should nevertheless be cautious before claiming that the exit option may be at stake, as self-employed earnings may be overestimated for at least two reasons: first, the measure of earnings we computed remunerates both labor and capital (mixed income), the latter being far from negligible in the informal sector; second, self-employed earnings include the share which should be attributed to the productive contribution of unpaid family workers. As we do not have any order of magnitude for these two phenomena, it is difficult to exclude the possibility that the premium we obtain may turn into a penalty once these two factors are taken into account.14
When turning to quantile regressions (Figure 1C and Tables A4 and A5), the distributional profile of the gap presents the same, by now clear, pattern as in the two previous cases. The gap steeply increases with the earnings level, and is in favour of informal self-employed workers. In absolute terms, informal self-employed workers suffer a penalty only at the lowest end of the conditional distribution (up to about the first quartile, where the gap is not significant). Afterwards, the gap is reversed into a significant premium, growing continuously up to 60% for the richest decile (quantile .90), crossing the OLS estimate at the median of the earnings distribution. FEQR confirms this trend, the only difference being that the range of variation of the gap along the distribution is attenuated. Once UICs are controlled for, informal self-employed workers are better off at all points of the pay scale above the first quartile, with a premium of up to 39% at quantile .90. All in all, and given the size of the premium, we can confidently conclude that informal self-employment may be more lucrative than formal sector wage alternatives, especially for the richest workers. Consequently, we have good grounds to assert that, in Madagascar, a substantial part of the labor force has deliberately chosen to work in the informal sector as non-wage workers, for pecuniary reasons.
13 For a detailed analysis of possible pecuniary compensations for working conditions along the earnings distribution, see Fernández and Nordman (2009) for the UK and Bocquier et al. (2010) for West Africa.
14 The definitive assessment is even more complex, as measurement errors in incomes are usually considered to be larger for self-employed than for wage workers: the former usually do not know their precise level of income (especially informal own-account workers who do not keep accounts), and the richest ones tend to understate their level of activity.

Formal sector wage vs. formal self-employed workers
The earnings comparison of formal sector wage workers and formal self-employed workers is clearly in favour of the latter, whatever the model chosen (Figure 1D and Table A1, columns (3) and (5)). The OLS estimate shows a +93% premium, which falls to +30% with fixed effects. As with informal self-employed workers, their unobserved productive attributes may be better than those of formal sector wage workers. As in the case of informal self-employed workers, the premium increases continuously with earnings levels, but is shifted upwards, a pattern in line with the empirical results obtained in the literature for developed countries. Whether or not UICs are controlled for, formal self-employed workers are always better off in terms of earnings than formal sector wage workers, the premium culminating at +149% (QR) and +69% (FEQR). Overall, it seems that the Malagasy labor market functions under a regime of wage repression. Whatever the reasons (macro pressures of international integration, deliberate policies to control inflation, or weak bargaining power of wage workers, the latter being the most plausible), it seems globally preferable to be self-employed (even in the informal sector) than to be a wage worker (at least in non-farm activities).15

Formal vs. informal self-employed workers
Lastly, we turn to the comparison between the two types of self-employed workers: formal and informal. Formal self-employed workers are rarely considered in the literature on LDCs, maybe because they are too few in the countries considered. But there are many reasons to focus on this category of workers: first, to compare our results with those obtained in developed countries on the earnings gap between salaried and non-salaried workers, as in these countries self-employed workers are almost exclusively formal; second, because it allows us to establish the link with the existing formal/informal sector literature from a business (rather than job) perspective. Finally, the comparison appears more legitimate, as the nature of incomes and the unobservables potentially at play are equivalent in both cases (which is not true of comparisons with wage workers).
15 In Madagascar, formal sector wage repression is well established. The legal minimum wage is very low (100,000 Ariary, the equivalent of 50 US dollars at the time of the last survey in 2012). It is not regularly updated, and it shows a steeply decreasing trend in real terms, in the short as in the long run. Official survey reports in 1996 and 2013 (INSTAT, DIAL, UNDP, ILO) show that, in constant terms, the minimum wage decreased by 55% between 1964 and 1996. More importantly, the minimum wage is not a binding constraint, as more than 81% of the labor force earned less than this threshold in 2013.
Formal self-employed workers are systematically in a better position than their informal counterparts, all along the pay scale (Figure 1E; the reference group is now informal self-employed workers; regression tables are not reported to save space). The return to firm formalization is always positive and increasing with net earnings, even when controlling for entrepreneurial skills and other unobserved characteristics, those best endowed in this respect disproportionately choosing the formal sector. This advantage of formal household businesses may be due to a higher initial level of physical capital or a more productive combination of factors (our models do not provide evidence on this point), but it is compatible with the potential causal benefits of becoming formal (access to credit and markets) found in the literature.

A gender perspective
Exploring the gender dimension associated with informality is crucial for various reasons. First, there are strong imbalances in the job structure, females being more likely to hold informal sector jobs than their male counterparts. Second, the raw gender earnings gap is in general significantly higher in the informal sector.16 Finally, and more importantly, the motivation to hold informal sector jobs is highly dependent on gender. Women may have a welfare function which is less dependent on income incentives, as they take more care of extra-professional activities (Grasmuck and Espinal, 2000; Duflo, 2003; Duflo and Udry, 2004). Cultural norms define the respective roles of women and men within the household and society, and may explain for instance why female-run businesses tend to stay small and more subsistence-oriented. As traditionally the primary caretakers of children and responsible for domestic chores, women could choose self-employment in the informal sector, not necessarily for the level of earnings, but because it offers flexible work arrangements and enables them to balance work and family activities. Gender-specific spending priorities also define the amount reinvested in the business, as females are known to devote a higher share of their earnings to the welfare of children. It has also been argued that women run their businesses in a subsistence-oriented manner to complement their husbands' income (Kevane and Wydick, 2001; Nichter and Goldmark, 2009; Nordman and Vaillant, 2014).
Firstly, whatever the model specification and the category of workers considered, females always suffer more financially (or benefit less) when they are employed in the informal sector. For instance, at the aggregate formal vs. informal level (Figures 2A and 3A), the OLS gap is slightly positive (but not significant) for men while reaching -19% for women; the FEOLS gaps are respectively 1% (non-significant) and -7% (significant at the 10% level). Such a feature is compatible with the idea mentioned above that women may accept lower wages in the informal sector because it provides other non-pecuniary advantages which are relatively more valuable to them. However, it can also reveal barriers or labor market segmentation, which would be more pronounced for women competing for formal sector jobs.
Quantile regressions shed an interesting light on the informal vs. formal earnings gap by gender. For men, working in the informal sector is financially penalizing below the median of the distribution and advantageous above it, whether UICs are taken into account or not. For women, holding an informal sector job is always associated with lower conditional earnings, or at best with earnings equivalent to those in the formal sector (for the last quartile of earnings). By contrast, while the penalty for being an informal sector wage worker remains substantial for women once UICs are controlled for (-18%, Figure 3B), it is no longer significant for men. For the latter, working informally is at least as financially rewarding as having a formal sector job, whether as a dependent (Figure 3B) or an independent worker (Figure 3C).
Secondly, in spite of differences in absolute levels, the distributional profile of the earnings gaps is quite similar across genders: no noticeable effect for formal self-employed workers compared to informal ones, and an increasing slope for the other categories with respect to formal sector wage workers. The only exception concerns informal sector wage workers: for men, their earnings are globally as rewarding as those of formal sector wage workers, while the penalty suffered by female informal sector wage workers decreases continuously and steeply, but never turns into a premium.
Thirdly, the sorting process in the allocation of men and women across employment statuses (which is partly revealed by the effect of controlling for UICs) does not differ substantially across genders: informal sector wage workers have UICs that are detrimental to obtaining a better income relative to formal sector wage workers, while unobserved skills are favourable for self-employed workers (whether formal or informal). The only exception is male wage workers, who have comparable UICs on both sides of the formal/informal divide.

Conclusion
In this paper, we study which of the exclusion and exit hypotheses regarding informality best fits the urban Malagasy labor market. To this end, we focus on the earnings gaps between formal and informal sector workers. Assuming that individual earnings are proxies of individual utilities, our approach considers that if informal sector workers earn more than their formal counterparts, this reflects a deliberate choice of the former to be informal sector workers. The rich 1-2-3 Surveys for Madagascar, a four-wave panel dataset (2000-2001-2002-2004), give us the unique opportunity to control for time-invariant unobserved individual characteristics. Using both standard and fixed effects earnings equations estimated at the mean and at various conditional quantiles of the earnings distribution, we address the key issue of heterogeneity at three different levels: the worker level, taking into account individual unobserved characteristics; the job level, comparing wage workers with self-employed workers; and the distributional level. Gender issues are also examined. To our knowledge, this approach is applied for the first time in Madagascar, and has rarely been applied in Sub-Saharan Africa.
Our results suggest that the informal earnings gap highly depends on the workers' employment status (wage employment vs. self-employment) and on their relative position in the earnings distribution. The main conclusions are often at odds with the exclusion hypothesis and with what the observed raw earnings gaps would suggest. First, in many cases, informal sector jobs are more rewarding (self-employment) or as rewarding (male wage workers) as formal sector wage jobs. This feature is due to the relatively low wages of formal sector wage jobs. The reason for this specificity should be investigated further (international competition pressure? wage repression policy?). Second, Madagascar's labor market seems more integrated than its development level would have predicted. The earnings gaps look more like those observed in emerging countries, characterized by a weak segmentation between formal and informal sector jobs, than like those of the standard dualistic Sub-Saharan labor markets. Third, the systematic premium of formal self-employed workers over their informal counterparts at all points of the distribution suggests that the formalization of non-farm household businesses is beneficial. Policies aiming at easing administrative procedures to register informal firms should be encouraged. Finally, females always suffer more financially (or benefit less) when they are informally employed. This feature opens space for specific policies to align the functioning of the labor market for women with that for men (reduction in entry barriers to formal sector jobs, improvement of access to physical capital, etc.).

In a nutshell, in the case of a poor and fragile country like Madagascar, these findings provide new and robust empirical support for the coexistence of the traditional exclusion and exit hypotheses of the informal sector.