Linkage Criteria Time Requirements of the Linking Process

7. Linkage Results

7.1 Completeness of life courses

When records from multiple sources are linked simultaneously, there is no definitive answer to the question: how many times does each person appear in the sources? One way to evaluate linkage success is to analyse the individual event records that remain unlinked and to look for patterns that might explain why they were not linked. An overview is given in Table 2.

TABLE 2

Altogether 14% of the individual event records were not linked; put another way, of the 31,230 person numbers generated by the computer, 45% were registered with one event only. This seems a high number, but the composition of the raw data (see Table 1) explains many of these non-links. One event type that stands out with a high number of unlinked events is children at baptism. For children born after 1865 the reason is obvious: as there is no computerized census after 1865, children born after 1865 will, unless they die, necessarily have only one event before 1878 in their life course. But even before 1865 the percentage not linked is quite high, 16%. This is because there were no censuses between 1835 and 1865. Among the baptisms in the period 1836-45, 33% are not linked. In general, events from periods with many sources are the most easily linked; see, for example, the censuses of 1825 and 1835.

The 1801 census, on the other hand, contains many unlinked people. Since the computerized ministerial records only start in 1814, and the census of 1815 covers only part of the parish, the 1801 records of people who died in the period 1801-1813 are not linked. We can estimate their number using national demographic information. The age composition of the 1801 population in Asker and Bærum was in line with that of the Norwegian population as a whole. The average crude death rate for the years 1801-1813 at the national level was slightly above 25 per thousand. Given the same rate in Asker and Bærum, about 1300 of the 1868 unlinked people had probably died by 1814. Most of the rest had moved out.
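The order of magnitude of this estimate can be checked with a short calculation. The sketch below is not the author's computation: it assumes a closed 1801 population exposed to a constant crude death rate of 25 per thousand for 13 years (1801-1813), and the function names are made up for illustration. The 1801 population total is not given in this section, so it is left as an argument and not filled in.

```python
def share_dead(crude_death_rate_per_1000: float, years: int) -> float:
    """Share of an initial population expected to have died after `years`.

    Assumption (not from the article): the same annual death rate applies
    to every survivor in every year, with no in- or out-migration.
    """
    annual_survival = 1.0 - crude_death_rate_per_1000 / 1000.0
    return 1.0 - annual_survival ** years


def expected_deaths(population: int, crude_death_rate_per_1000: float, years: int) -> float:
    """Expected number of deaths among `population` people after `years`."""
    return population * share_dead(crude_death_rate_per_1000, years)


# 25 per thousand is the lower bound quoted in the text ("slightly above 25").
share = share_dead(25.0, 13)
print(f"Share of the 1801 population expected to die by 1814: {share:.1%}")
```

Multiplying this share by the number of people enumerated in 1801 (see Table 1) gives the expected number of deaths, for example `expected_deaths(n_1801, 25.0, 13)`, where `n_1801` stands for whatever census total is used.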

Other events that stand out with a high percentage of non-links are those of the fathers at weddings: 29% of the grooms' fathers and 22% of the brides' fathers were not linked. The difference is explained by the fact that marriages more often took place in the bride's home parish than in the groom's. Many of the fathers never lived in the parish. Owners of land also belong to the easily explained 'life courses' with only one record. These must belong to absentee landowners, or to people who moved in after 1865. The last land register was from 1888.

Among the deceased, 15% were not linked, 1226 persons altogether. This is probably where the problem of underlinkage is greatest. The entries in the burial lists sometimes give the name of the husband of a deceased woman, but never vice versa. Ages tend to be stated inaccurately for the elderly. During the linkage process there were several examples of life courses where the age implied that the person in question should have died within the period of study, but where there was either no apparent candidate record or several equally plausible candidates.

Stillbirths were registered for analytical purposes, but of course there will be only one event record for a stillborn child. A preliminary study of infant mortality also revealed that many children who died soon after birth were registered only in the burial lists, and not in the baptismal lists, of the church books.

The seemingly high percentage of non-links, 14%, can thus largely be explained by the composition of the sources used. If we accept that children born after 1865, most of the fathers of brides or grooms, and people who are mentioned only in a land register have just this one event in their life course in the Asker and Bærum records, this amounts to almost 5000 people. The stillbirths and the estimated number of deaths before 1814 add 462 and 1300, respectively, to that number.
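The tally in this paragraph can be written out explicitly. The sketch below only adds up the rounded figures quoted in the text ("almost 5000" is treated as 5000), so the totals are approximate; the variable names are illustrative.

```python
# Rounded figures quoted in the text; "almost 5000" is taken as 5000.
explained_single_events = {
    "born after 1865, fathers at weddings, land register only": 5000,
    "stillbirths": 462,
    "estimated deaths 1801-1813": 1300,
}

explained_total = sum(explained_single_events.values())

# 45% of the 31,230 person numbers had one event only (rounded percentage).
single_event_persons = round(0.45 * 31230)

print(f"Explained by source composition: about {explained_total}")
print(f"Persons with a single event:     about {single_event_persons}")
print(f"Share explained:                 about {explained_total / single_event_persons:.0%}")
```

Because the inputs are rounded, the result should be read as an approximate share of the single-event life courses, not as an exact reconciliation of the figures.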

Among the remaining unlinked events, more than 7500 (7%), there are surely some that are underlinked, either because of errors that were not discovered, or because the records contained too few or too weak identification items. The main reason why they are not linked must, however, be that the people moved out of the parish. The parish is situated close to Oslo, the capital, and this must have been important for migratory movements. Norwegian church books in the 19th century list migrants into and out of the parish, but they notoriously understate the number. For Asker and Bærum the migration registers were not computerized, but were counted manually for 1815-1865. According to the church books the total number of outmigrants was 2311.

7.2 Name frequency and linkage results

It was easier to link people with uncommon names than people with common ones, but how much did name frequency affect the result of the linkage? All first names in the data, except those of people born after 1865, were counted. Double names were counted twice, once for each name. The total number of names was 116,112, distributed among 1052 different male and 1321 different female names. The names were grouped into three categories: common, average and rare names. People with double first names formed a fourth category. Each name category contained approximately the same number of people. Among the people with a single common first name, about 45% had only one event in the life course. The frequencies among those with average or rare names were 6-7 percentage points lower. Of those with double first names, 32% had only one event in their life course. Thus people with common names tend to be overrepresented among the non-links. There are, however, many confounding factors in such an analysis. One is that names were shared more widely at the beginning than at the end of the period. The space and purpose of this article do not allow a more thorough presentation [Fure 1990a, 1990b].

In general, the variation in the number of events linked according to name frequency is not alarming. However, when so many people share the same first name, this is a problem even with an interactive system like Demolink.
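The grouping of names described in this section can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the article does not give the cut-off points, so the sketch ranks names by frequency and splits them where the cumulative count passes one third and two thirds of all name occurrences. The function names and the example names are hypothetical.

```python
from collections import Counter


def group_names_by_frequency(first_names):
    """Map each distinct name to 'common', 'average' or 'rare'.

    Names are ranked from most to least frequent, and the category is set
    by where a name starts in the cumulative distribution, so that each
    category covers roughly one third of all name occurrences.
    """
    counts = Counter(first_names)
    total = sum(counts.values())
    categories = {}
    cumulative = 0
    for name, n in counts.most_common():
        if cumulative < total / 3:
            categories[name] = "common"
        elif cumulative < 2 * total / 3:
            categories[name] = "average"
        else:
            categories[name] = "rare"
        cumulative += n
    return categories


def person_category(full_first_name, categories):
    """People with double first names form a fourth category of their own."""
    parts = full_first_name.split()
    if len(parts) > 1:
        return "double"
    return categories[parts[0]]


# Made-up example; double names are counted twice, once for each name.
people = ["Ole"] * 4 + ["Hans"] * 2 + ["Lars"] * 2 + ["Anders", "Nils", "Anne Marie"]
all_names = [part for person in people for part in person.split()]
categories = group_names_by_frequency(all_names)

for person in sorted(set(people)):
    print(person, "->", person_category(person, categories))
```

With real data the three frequency categories would rarely be of exactly equal size, since a very common name cannot be split across categories.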

7.3 Social group and linkage results

Earlier family reconstruction studies have been criticized because stable families have been more successfully reconstructed than the more mobile ones. Such a bias is likely to have consequences for analyses of other types of demographic and social behavior. Stability in this period and context is related to wealth and higher social class: farmers moved less frequently than cottars, craftsmen and workers.

The social status of the heads of households given in the Norwegian sources was coded into three social groups. Group one consists mostly of farmers who owned their land and of tenant farmers; group two consists of craftsmen, masters of small vessels and small-scale traders; group three consists of workers, servants, fishermen, cottars and sailors, as well as people who received poor relief. No social grouping was made for a person's whole lifetime, because of the problem of handling changes in social status. In order to examine the relationship between the number of events in the life course and social group, the social status at a specific time must therefore be chosen, i.e. the status given in one of the censuses. The relationship was examined for all of the censuses, and the pattern was homogeneous. Table 3 shows the information for the 1835 census [Note 4].

TABLE 3

There is a slight tendency for farmers to have more events registered than heads of household in the lower social groups. This need not reflect a poorer linkage result for the lower social groups; it might just as well be that there were more events, or more registered events, in the farmers' life courses. Age at marriage was lower for farmers than for cottars, so farmers may have experienced more baptisms and more burials. They also remarried more often. On the whole the difference between the groups is not dramatic, and it seems safe to conclude that Demolink's interactive linkage of records from all the sources simultaneously has given good results for all social groups.
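The comparison behind Table 3 can be sketched in a few lines. This is only an illustration: the English status labels, the function name and the example records are invented, and the sketch computes a simple mean number of events per group, which may not be the statistic reported in Table 3.

```python
from collections import defaultdict

# Three-group coding as described in the text; the English labels used as
# keys are illustrative, not the terms found in the Norwegian sources.
SOCIAL_GROUP = {
    "farmer": 1, "tenant farmer": 1,
    "craftsman": 2, "master of small vessel": 2, "small-scale trader": 2,
    "worker": 3, "servant": 3, "fisherman": 3, "cottar": 3,
    "sailor": 3, "poor relief": 3,
}


def mean_events_by_group(heads):
    """Mean number of linked events per social group.

    `heads` is an iterable of (status, number_of_events) pairs, one per
    head of household, with the status taken from a single census.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of events, head count]
    for status, n_events in heads:
        group = SOCIAL_GROUP[status]
        totals[group][0] += n_events
        totals[group][1] += 1
    return {group: s / c for group, (s, c) in sorted(totals.items())}


# Invented example records, not data from Table 3.
example_heads = [
    ("farmer", 12), ("tenant farmer", 10),
    ("craftsman", 9),
    ("cottar", 8), ("servant", 6),
]
means = mean_events_by_group(example_heads)
print(means)
```

Fixing the status at one census, as the text describes, avoids having to decide how to classify people whose status changed during their lifetime.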

 


Interactive Record Linkage: The Cumulative Construction of Life Courses
Eli Fure
© 2000 Max-Planck-Gesellschaft ISSN 1435-9871
http://www.demographic-research.org/Volumes/Vol3/11