where $y_i^a$ is the outcome of individual $i$ living in Florence in the 15th
century, $X_i^a$ is a vector of controls, including age, age squared and gender,
$S_i^a$ is a set of dummies for each surname, and $\varepsilon_i^a$ is the error term.
In the second sample, we have information about pseudo-descendants, i.e.
taxpayers currently living in Florence. For reasons of data availability, the data are
aggregated at the surname level. The regression of interest is:
$$y_k^d = \beta\left(\hat{y}_k^a\right) + \gamma X_k^d + \varepsilon_k^d \qquad (2)$$
where $y_k^d$ is the average outcome of individuals with surname $k$ currently living in
Florence, $X_k^d$ is, as above, a vector of controls for (average) age, age squared and
gender, $\hat{y}_k^a$ is the log of ancestors’ outcomes, imputed using surnames and the
surname coefficients estimated in equation (1), and $\varepsilon_k^d$ is the residual; the
parameter $\beta$ is the TS2SLS estimate of the intergenerational elasticity. To replicate
the original population, the regressions are weighted by the frequency of the
surnames. The standard errors have been bootstrapped with 1,000 replications in
order to take into account the fact that the key regressor is generated.
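The two-step logic can be illustrated with a minimal sketch, written here in plain Python on simulated data. It is a simplification under stated assumptions, not the estimation code used in the paper: the controls (age, age squared, gender) are omitted, so the first-stage surname coefficients reduce to surname-level means; the function names and the simulated data are hypothetical; and the bootstrap resamples both samples, which is only one possible way of accounting for the generated regressor.

```python
import random
import statistics
from collections import defaultdict


def surname_means(ancestors):
    """First stage with surname dummies only (no controls): each dummy
    coefficient equals the mean log outcome of ancestors with that surname."""
    groups = defaultdict(list)
    for surname, log_outcome in ancestors:
        groups[surname].append(log_outcome)
    return {k: sum(v) / len(v) for k, v in groups.items()}


def weighted_slope(x, y, w):
    """Slope from weighted least squares of y on x with an intercept."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return s_xy / s_xx


def ts2sls(ancestors, descendants):
    """Second stage: regress the descendants' surname-level average outcome
    on the imputed ancestors' outcome, weighting by surname frequency.
    descendants: list of (surname, average log outcome, surname frequency)."""
    imputed = surname_means(ancestors)
    rows = [(imputed[k], y, n) for k, y, n in descendants if k in imputed]
    x, y, w = zip(*rows)  # ValueError if no surname can be matched
    return weighted_slope(x, y, w)


def bootstrap_se(ancestors, descendants, reps=1000, seed=0):
    """Assumption of this sketch: resample both samples with replacement
    and redo both stages in every replication."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        a = rng.choices(ancestors, k=len(ancestors))
        d = rng.choices(descendants, k=len(descendants))
        try:
            draws.append(ts2sls(a, d))
        except (ZeroDivisionError, ValueError):
            continue  # degenerate resample, skipped in this sketch
    return statistics.stdev(draws)


# Simulated data (hypothetical): 40 surnames, true elasticity 0.05.
sim = random.Random(42)
ancestors, descendants = [], []
for k in range(40):
    name = f"surname_{k}"
    status = sim.gauss(0.0, 1.0)
    ancestors += [(name, status + sim.gauss(0.0, 0.3)) for _ in range(10)]
    avg_outcome = 10.0 + 0.05 * status + sim.gauss(0.0, 0.02)
    descendants.append((name, avg_outcome, sim.randint(1, 50)))

beta_hat = ts2sls(ancestors, descendants)
se = bootstrap_se(ancestors, descendants, reps=200)
print(f"elasticity estimate: {beta_hat:.3f} (bootstrap s.e. {se:.3f})")
```

Re-running both stages inside each replication is what lets the bootstrap reflect the sampling error of the imputed regressor; bootstrapping the second stage alone would treat the imputed ancestors' outcome as if it were observed.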
In the second part of the paper, we complement the evidence on the long-run
elasticities with an empirical exercise that tests for persistence in belonging
to the following professions: lawyers, bankers, medical doctors and pharmacists,
and goldsmiths. We restrict the analysis to these professions because they are
affluent professions that already existed in 1427 and for which data are currently
publicly available (see Section 6.2 for details). By merging information drawn from
the surname distribution in the province of Florence with the public registers
containing the surnames of the above-mentioned professions, we built a dataset at
the individual level where, for each taxpayer, we are able to define a dummy
variable indicating whether or not she belongs to a given profession. Finally, for
each profession, we regress this dummy variable on the share of ancestors in the
same profession. Namely, for each profession $j$ ($j$ = lawyers, bankers, medical
doctors and pharmacists, and goldsmiths), we estimate a probit model whose
estimating equation reads:
$$\Pr\left(P_{ikj} = 1\right) = \Phi\left(\lambda \pi_{kj}\right) \qquad (3)$$
where $P_{ikj}$ is a dummy variable that equals 1 if individual $i$ with surname $k$
belongs to profession $j$ in 2005 and 0 otherwise, $\pi_{kj}$ is the share of ancestors with
surname $k$ belonging to profession $j$ and $\Phi(\cdot)$ is the cumulative distribution
function of the standard normal distribution. Since the estimation combines
individual-level data for the dependent variable and aggregate, surname-level data
for the covariate, the standard errors are clustered at the surname level (Moulton,
1990).
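As an illustration of a probit with surname-clustered standard errors, the sketch below fits the model by Fisher scoring and computes a cluster-robust sandwich variance in plain Python. It rests on several assumptions that are not taken from the paper: an intercept is included, the expected information matrix is used as the "bread" of the sandwich, no small-sample correction is applied, and the data are simulated with hypothetical parameter values.

```python
import math
import random
from collections import defaultdict


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def score_and_info(theta, data):
    """Return the score summed within each surname and the expected
    information matrix for the index a + b * share."""
    a, b = theta
    scores = defaultdict(lambda: [0.0, 0.0])
    info = [[0.0, 0.0], [0.0, 0.0]]
    for surname, share, d in data:
        z = a + b * share
        p = min(max(norm_cdf(z), 1e-10), 1.0 - 1e-10)
        f = norm_pdf(z)
        g = f * (d - p) / (p * (1.0 - p))  # score with respect to z
        h = f * f / (p * (1.0 - p))        # expected information weight
        x = (1.0, share)
        for r in range(2):
            scores[surname][r] += g * x[r]
            for c in range(2):
                info[r][c] += h * x[r] * x[c]
    return scores, info


def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul2(p, q):
    return [[sum(p[r][k] * q[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def probit_clustered(data, max_iter=50, tol=1e-10):
    """data: list of (surname, ancestors' share, profession dummy)."""
    theta = [0.0, 0.0]
    for _ in range(max_iter):
        scores, info = score_and_info(theta, data)
        total = [sum(s[r] for s in scores.values()) for r in range(2)]
        inv = inv2(info)
        step = [inv[r][0] * total[0] + inv[r][1] * total[1] for r in range(2)]
        theta = [theta[r] + step[r] for r in range(2)]
        if max(abs(s) for s in step) < tol:
            break
    scores, info = score_and_info(theta, data)
    bread = inv2(info)
    # "Meat": outer products of the scores summed within each surname.
    meat = [[sum(s[r] * s[c] for s in scores.values()) for c in range(2)]
            for r in range(2)]
    cov = matmul2(matmul2(bread, meat), bread)
    return theta, [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]


# Simulated data (hypothetical): 200 surnames with 20 individuals each,
# true intercept -1 and true slope 2 on the ancestors' share.
sim = random.Random(7)
data = []
for k in range(200):
    share = sim.uniform(0.0, 0.5)
    prob = norm_cdf(-1.0 + 2.0 * share)
    data += [(f"surname_{k}", share, 1 if sim.random() < prob else 0)
             for _ in range(20)]

theta_hat, cluster_se = probit_clustered(data)
print(f"slope: {theta_hat[1]:.3f} (clustered s.e. {cluster_se[1]:.3f})")
```

Summing the scores within each surname before taking outer products is what allows for arbitrary correlation among individuals who share the same surname-level covariate.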
3. Data and descriptive analysis
3.1 Data sources
Florence originated as a Roman city, and later, after a long period as a
flourishing medieval trading and banking commune, it was the birthplace of the
Italian Renaissance. According to the Encyclopedia Britannica, it was politically,
economically and culturally one of the most important cities in the world from the
14th to 16th centuries.⁵ In 1427, in the midst of a fiscal crisis provoked by the
protracted wars with Milan, the Priors of the Republic decreed an entirely new tax
survey that applied to the citizens of Florence and to the inhabitants of the
Florentine districts (1427 Census, henceforth). The assessments were entrusted to
a commission of ten officials and their staff, and were largely complete within a
few months, although revisions continued during 1428 and 1429. The 1427 Census has been
acknowledged as one of the most comprehensive tax surveys to be conducted in
pre-modern Western Europe. The documentary sources are fully described in
Herlihy and Klapisch-Zuber (1985).
The 1427 Census represents our first sample, containing information on the
socioeconomic status of the ancestors. Indeed, the dataset reports, for each
household, among other variables, the name and the surname of the head of the
household, occupation at a 2-digit level, assets (i.e. value of real property and of
private and public investments), age and gender. The data were enriched with
estimates of the earnings attributed to each person on the basis of the occupations
and the associated skill group.⁶
The Florence 2011 tax records represent our second sample, containing
information on the socioeconomic status of the pseudo-descendants. From the tax
records, we draw information on incomes and the main demographic
characteristics (age and gender). The income items reported on personal tax
returns include salaries and pensions, self-employment income, real estate income,
and other smaller income items. In order to comply with the privacy protection
⁵ The Medici, the most renowned rulers, drew to their court the best artists, writers and
scientists of the time, such as Botticelli, Dante, Galileo, Leonardo da Vinci, Michelangelo and Machiavelli.
⁶ The data on earnings were kindly provided by Peter Lindert (University of California, Davis). See the
document gpih.ucdavis.edu/files/BLW/Tuscany_1427.doc for further information. The same data
were also used in Milanovic et al. (2011) for an analysis of inequality in pre-industrial societies.