1. Modules specification

The table below lists all the equations estimated in the model, identifying outcome variables and determinants. The equations are estimated on the 2005-2011 waves of the EU-SILC longitudinal panel.

We now discuss the most important assumptions in the construction of the model, which, as any model, is based on a number of simplifications.

Demography. The Demographic module ages all individuals in the population and then passively aligns the population to Eurostat projections, by gender, age and simulation time. This means that at each simulated time (e.g. 2022) the simulated population is partitioned by gender and age. The size of each resulting cell is then compared to the Eurostat projections: if a given cell contains too many simulated individuals, those in excess (randomly selected) are removed from the simulated population; if on the other hand there are too few simulated individuals, an appropriate number of randomly selected simulated individuals are cloned and added to the simulation. This also takes care of migration, under the assumption that (i) those who move out of the country do not self-select, and are on average similar, given their age and gender, to those who do not emigrate, and (ii) immigrants immediately become similar to those already living in the destination region. This is a simplifying assumption, motivated by the fact that projections on migration flows are not easily available. Internal mobility between regions is not modelled either, again because demographic projections at a regional level are not available from Eurostat. Because cell resizing (i.e. removing and cloning) is done randomly, the model implicitly assumes that the distribution of the population between different regions remains constant. Hence, any internal migration flow (for instance between the South and the North of Italy) is assumed to be persistent (constant over the years), so that what we observe in the data, and keep constant throughout the simulation, is an equilibrium distribution.
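The cell resizing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model's actual implementation: the function name align_population, the dictionary representation of individuals and the targets mapping are all hypothetical.

```python
import random
from collections import defaultdict

def align_population(population, targets, rng):
    """Resize each (gender, age) cell to its target size.

    population: list of dicts with "gender" and "age" keys (hypothetical layout).
    targets: dict mapping (gender, age) to the projected cell size.
    Cells that are empty in the simulation cannot be cloned and are skipped.
    """
    cells = defaultdict(list)
    for person in population:
        cells[(person["gender"], person["age"])].append(person)

    aligned = []
    for key, members in cells.items():
        # Assumption: a cell without a target is left unchanged.
        target = targets.get(key, len(members))
        if len(members) > target:
            # Too many simulated individuals: keep a random subset.
            members = rng.sample(members, target)
        elif len(members) < target:
            # Too few: clone randomly selected individuals (with replacement).
            missing = target - len(members)
            members = members + [dict(rng.choice(members)) for _ in range(missing)]
        aligned.extend(members)
    return aligned

rng = random.Random(0)
population = [{"gender": "F", "age": 30} for _ in range(5)] + [{"gender": "M", "age": 40}]
targets = {("F", 30): 3, ("M", 40): 4}
aligned = align_population(population, targets, rng)
```

Because removal and cloning are random within a cell, every characteristic other than gender and age (region included) keeps its within-cell distribution in expectation, which is the equilibrium assumption discussed above.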

Education. Individuals never re-enter formal education once they have left it. This is clearly an oversimplifying assumption, but it is compensated by the fact that education levels are aligned to external projections. This clear temporal distinction between time in education and adult life beyond formal education motivates the assumption that students do not enter the other modules of the simulation (though a flag for individuals who have just left education is present in all equations, see the table above). Hence, in the model, students cannot marry, have children, or work. Testing these specification restrictions in the EU-SILC data is difficult, because EU-SILC data only record whether individuals are in education or not, and not the number of hours per week spent in education, nor the type of education course. Moreover, apprentices are also considered to be in education. Therefore, many individuals show up as “in education” at later stages in life, possibly because they are following on-the-job or minor training programs.

Retirement. Given that the primary focus of the microsimulation exercise is not on the elderly, individual behaviour with respect to retirement decisions is kept as simple as possible, while preserving individual heterogeneity. When individuals first enter the simulation they draw a percentile in the distribution of retirement age. This determines a time-specific individual threshold for retirement (the threshold follows the evolution over time of the average retirement age). In each year, individuals check whether their age is above their (time-specific) threshold: if this is the case, they retire. This approach allows us to abstract from the details of the specific pension reforms implemented in each country, and is motivated by the observation that changes in the average age of retirement show a high degree of persistence, as depicted in the figure below (the within-country variance remains substantially constant):

Figure: Average effective age of retirement. Source: our computations on OECD data.
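The retirement rule can be illustrated with a minimal Python sketch. The shape of the retirement age distribution is not specified in the text, so the sketch assumes, purely for illustration, a uniform spread of half_width years around the year's average retirement age; the function names and the half_width parameter are hypothetical.

```python
import random

def retirement_threshold(percentile, average_retirement_age, half_width=3.0):
    """Map a fixed individual percentile to an age threshold for a given year.

    Assumption (illustrative only): thresholds are spread uniformly within
    +/- half_width years of the average retirement age of that year.
    """
    return average_retirement_age + (2.0 * percentile - 1.0) * half_width

def is_retired(age, percentile, average_retirement_age, half_width=3.0):
    """An individual retires once age exceeds the time-specific threshold."""
    return age > retirement_threshold(percentile, average_retirement_age, half_width)

# The percentile is drawn once, when the individual enters the simulation.
percentile = random.Random(1).random()

# The same individual faces a higher threshold when the average age rises.
threshold_early = retirement_threshold(percentile, 63.0)
threshold_late = retirement_threshold(percentile, 65.0)
```

Keeping the percentile fixed and shifting only the average preserves each individual's relative position in the distribution, which is how heterogeneity is retained without modelling pension rules.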

Maternity. The probability of having a child is estimated around an overall fertility rate coming from the Eurostat demographic projections, and it is aligned ex-post so that in the aggregate the official projections are met. This means that the total number of newborns in each year is externally given, and that the microsimulation is used to distribute the births among women in fertile years, given their individual characteristics. The underlying assumption is that the microsimulation is not relied upon to provide demographic projections (it is not constructed to be a “good” demographic model at the aggregate level), but can accurately model heterogeneity in outcomes (it is a “good” demographic model at the micro level, that is a “good” model of differential fertility around a given exogenous trend).
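As an illustration of how an externally given number of births can be distributed among women according to their individual probabilities, the sketch below ranks women by the difference between their predicted probability and a uniform random draw, and selects the top ranked ones. This is one common alignment technique, used here as an assumption; the text does not state which alignment algorithm the model applies, and the function name align_births is hypothetical.

```python
import random

def align_births(probabilities, target_births, rng):
    """Return the indices of the women who give birth this year.

    probabilities: predicted individual birth probabilities (from the
    estimated equation); target_births: number of newborns implied by the
    official projections. Exactly target_births women are selected.
    """
    # Score = predicted probability minus a uniform draw; a higher
    # probability makes selection more likely without making it certain.
    scores = [(p - rng.random(), i) for i, p in enumerate(probabilities)]
    scores.sort(reverse=True)
    return sorted(i for _, i in scores[:target_births])

rng = random.Random(0)
mothers = align_births([0.10, 0.02, 0.25, 0.08, 0.15], target_births=2, rng=rng)
```

The aggregate number of births is fixed by construction, while the individual probabilities only determine who is more likely to be selected, which matches the idea of a model of differential fertility around an exogenous trend.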

Participation and employment. Labour market outcomes are modelled with a double hurdle approach: first we determine participation; then employment, given participation (Blundell et al., 1987). The alternative is to have a discrete choice model of hours worked à la Aaberge et al. (1995) or van Soest (1995), where a set of possible choices is considered based on individual optimisation over leisure and consumption, and 0 hours are interpreted as non-employment. On the other hand, modelling participation as a separate process allows us to split non-employment into unemployment and non-participation.
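The sequence of the two hurdles can be sketched as follows. The two probabilities would come from the estimated participation and employment equations; here they are simply passed in as arguments, and the function name and state labels are hypothetical.

```python
import random

def labour_market_state(p_participation, p_employment, rng):
    """Classify an individual with a two-step (double hurdle) draw.

    p_participation: probability of being in the labour force.
    p_employment: probability of being employed, given participation.
    """
    # First hurdle: participation.
    if rng.random() >= p_participation:
        return "inactive"
    # Second hurdle: employment, evaluated only for participants.
    if rng.random() >= p_employment:
        return "unemployed"
    return "employed"

rng = random.Random(0)
states = [labour_market_state(0.7, 0.9, rng) for _ in range(1000)]
```

The three possible outcomes show why the separate participation step matters: a single hours-based choice would merge "inactive" and "unemployed" into one zero-hours state.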

Given that we do not have monetary variables (i.e. wages, other forms of income, savings, and wealth) and we do not explicitly model consumption, there is little advantage in having a “structural” model explaining hours worked. Consequently, employment is a dichotomous variable in the model, although the average number of hours worked can be backed out, at an aggregate level, given the predicted employment rates and the assumed evolution of the share of part-timers. This latter variable is a scenario parameter, computed at an aggregate level over the whole working population, and captures not only the structure of labour demand by firms, but also preferences and a general attitude towards work-family life balance. The underlying assumption is that this variable is exogenous: the overall availability of part-time job opportunities in the economy affects the participation rates of women with children, but their participation decisions do not affect (or have only a minor effect on) the overall share of part-timers, computed across genders, ages and family structures.
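A back-of-the-envelope version of this aggregate calculation is sketched below. The weekly hours attached to part-time and full-time jobs are illustrative assumptions, not values taken from the model, and the function name is hypothetical.

```python
def average_hours_per_person(employment_rate, part_time_share,
                             part_time_hours=20.0, full_time_hours=40.0):
    """Average weekly hours per working-age person, at the aggregate level.

    employment_rate: predicted share of employed individuals.
    part_time_share: scenario parameter, share of part-timers among workers.
    part_time_hours, full_time_hours: assumed weekly hours (illustrative).
    """
    hours_per_worker = (part_time_share * part_time_hours
                        + (1.0 - part_time_share) * full_time_hours)
    return employment_rate * hours_per_worker

hours = average_hours_per_person(employment_rate=0.6, part_time_share=0.25)
```

With these assumed values, a worker supplies 35 hours per week on average, so an employment rate of 60% corresponds to about 21 hours per working-age person.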

Demand side (business cycle) constraints are taken into consideration by including the overall unemployment rate among the determinants of individual employment. This effectively makes the employment module a model of differential employment (analogously to the maternity module), where the overall unemployment rate is a scenario parameter and captures the interaction between supply and demand, which remains outside the scope of the model. The microsimulation then distributes employment among the active population, based on individual characteristics.
The model further assumes that demand side constraints do not affect participation at the extensive margin (i.e. the first hurdle). In reality, discouragement can take place and deter individual participation when work opportunities are limited; conversely, a tight labour market can attract new people into the labour force. The justification for not including demand side variables (e.g. the unemployment rate) in the participation equations is that, at the aggregate level, this would imply a model of labour force participation that depends on unemployment, which is itself defined conditional on being part of the labour force: the unemployment rate is an endogenous variable, and including it would deteriorate the quality of the estimates of the other coefficients and hinder interpretation. Distinguishing between the two hurdles, where the second hurdle is a model of differential employment over the business cycle and the first hurdle is a model of participation that abstracts from business cycle considerations, is a good compromise, given that the effects of labour market conditions at the extensive margin are known to be smaller than those at the intensive margin (Attanasio et al., 2015).


Aaberge R, Dagsvik J, Strøm S (1995). Labor Supply Responses and Welfare Effects of Tax Reforms. Scandinavian Journal of Economics 97(4): 635–659.

Attanasio O, Levell P, Low H, Sánchez-Marcos V (2015). Aggregating Elasticities: Intensive and Extensive Margins of Female Labour Supply. University College London, mimeo.

Blundell R, Ham J, Meghir C (1987). Unemployment and female labour supply. The Economic Journal 97(Supplement): 44–64.

Van Soest A (1995). Structural models of family labor supply – a discrete choice approach. The Journal of Human Resources 30(1): 63–88.
