Project Plan

(Extended version, see also the short version)

Since 1991 the Historical Sample of the Population of the Netherlands (HSN) has been working at the International Institute of Social History to construct a database containing micro-level data on the Dutch population. Since 1997 the Netherlands Institute for Scientific Information Services (NIWI), in collaboration with Statistics Netherlands (and more recently with the Historical Database of Dutch Municipalities), has been digitising the census data published in countless volumes of official statistics. The work on Life Courses in Context is thus being carried out under the auspices of two KNAW institutes. Through these projects, both institutes have built up substantial experience in this field in recent years and have been able to interest or recruit researchers from various universities to take part.

A. Life Courses
B. Census data


Life courses (HSN)

The life-history database is based on the existing HSN database. The HSN aims to reconstruct life courses as completely as possible for a representative segment of the nineteenth- and twentieth-century population. The sample required for this purpose (N=77,000) has been drawn from the birth registers for the period 1812-1922. Earlier grants from the Programme for NWO Medium-Sized Investments have already enabled this base sample to be entered. A large number of death certificates and marriage certificates, and all the personal record cards of the subjects in the sample, have also been added to the database. A wide range of research has been, and is still being, carried out using the database. Research based on the HSN has so far resulted in a large number of publications in the Netherlands and abroad, including two PhD theses.

The dynamic population registration system introduced in the Netherlands in 1850 has resulted in an archive that is unique in international terms. It is unique because it recorded where in-migrants came from and where out-migrants went to, thus enabling the complete life courses of migrants to be included in the dataset. The project thus involves the systematic retrieval of life-course data from the population registers. Although the population registers were already in use in 1850, we will input life-course data only for subjects born in 1863 or later. There are two reasons for this. First, the registers did not function properly everywhere in the Netherlands during the first few years of their existence. Second, new regulations concerning population registration were introduced in the course of 1862. As a result, the design of the system was modified, and every household was then re-registered.

Once the population registers have been input, the HSN database will contain the following information for the 40,000 subjects born after 1862:
  • composition of the family into which the subject is born, and the changes in that family before the subject ultimately leaves home
  • illiteracy of the subject's father (evidenced by the absence of signature on birth certificate or on death certificate of the subject)
  • migration history, including information on boarding houses, etc.
  • occupational title, marital status and religion of all the relatives with whom the subject co-resides
  • occupational title of the parents, parents-in-law, four witnesses, subject and partner of the subject (where a marriage certificate exists)
  • literacy in the subject's social environment (signatures of the parents, parents-in-law, four witnesses, subject and partner of the subject, where a marriage certificate exists)
  • composition of the subject's own nuclear family and the changes in that family prior to the subject's death
  • relief or care arrangements for the subject in old age.
These features will make the dataset a fundamental source for scholars investigating historical issues in demography, sociology, epidemiology, social economics and human geography. The nationwide coverage of the dataset ensures that regional variations can be identified, whereas current research involving such topics is usually local in nature and necessarily excludes migrants.

The database
For the HSN database, data for each individual are systematically collected from the records kept in the public archives. Primarily, these are birth certificates, death certificates, and personal record cards. The birth certificates include information on the person born, as well as the names, addresses, ages and occupations of the parents. The death certificates include the most recent place of residence and most recent occupation of the deceased, and information on his/her spouse(s); certificates of deceased children provide a second indication of the occupational title of the father (the person reporting the death) as well as a double-check on illiteracy. The personal record cards for all subjects who died between 1 January 1940 and 1 October 1994 have now been input. The cards include data on occupation (from 1940), cause of death (up to 1953), a full migration history (all addresses), family composition, and religion. On the basis of these data, it is now already possible to research topics such as childhood mortality and migration patterns for the whole of the Netherlands.

The above data are now to be supplemented by information from marriage certificates. These give details of the occupational titles, literacy (signature), and place of residence of the bride and bridegroom, their parents and the witnesses (usually friends or family of the couple). These certificates will enable scholars to research topics such as social and geographical mobility, marital mobility and literacy. A substantial proportion of the marriage certificates in the provinces of Utrecht, Friesland, Limburg, Gelderland, Groningen and Zeeland have already been entered. Much useful "pilot" experience has been gained through this data input. Furthermore, various studies based on the certificates have already been published.

At a later stage, information from the population registers, land registers and tax records will be input. These sources are extremely rich, providing information on the family structure, pattern of migration, further occupational history, and the income and wealth of the subject (and sometimes of his or her relatives). From 1850 the Dutch population registers were maintained as dynamic records. By this we mean that the registers did not merely record the situation at a particular moment in time (a snapshot): all changes in a subject's address, family size and migration were noted, creating a longitudinal record. From 1870 onwards the records are fairly accurate. Many subjects born after 1870 can also be found in the personal record cards archive at the Central Bureau for Genealogy, so that their migration history can be traced in the reverse direction, thus minimising the risk of "losing" a subject.
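The difference between a snapshot and a dynamic register can be illustrated with a small data-structure sketch. The class names, field names and example values below are hypothetical and are not taken from the HSN database design; the sketch only shows how a dated list of changes lets the situation at any moment be reconstructed.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RegisterEvent:
    """One dated change noted in a register (hypothetical structure)."""
    when: date
    kind: str    # e.g. "address" or "occupation"; labels are illustrative
    value: str


@dataclass
class SubjectRecord:
    """Longitudinal record: an ordered list of dated changes for one subject."""
    subject_id: int
    events: List[RegisterEvent] = field(default_factory=list)

    def add(self, event: RegisterEvent) -> None:
        # Keep events in chronological order so later lookups stay simple.
        self.events.append(event)
        self.events.sort(key=lambda e: e.when)

    def state_at(self, kind: str, when: date) -> Optional[str]:
        """Return the most recent value of `kind` on or before `when`."""
        current = None
        for event in self.events:
            if event.kind == kind and event.when <= when:
                current = event.value
        return current


# Invented example subject: two address changes.
record = SubjectRecord(subject_id=1)
record.add(RegisterEvent(date(1895, 5, 1), "address", "Amsterdam"))
record.add(RegisterEvent(date(1880, 3, 15), "address", "Utrecht"))

print(record.state_at("address", date(1890, 1, 1)))
print(record.state_at("address", date(1900, 1, 1)))
```

A snapshot source would store only one of these values; the dated list is what allows a migration history to be followed forwards or backwards in time.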

The subjects were selected by taking a simple random sample from the birth registers for 1812-1922. The aim was to secure a sample size of 77,000. This is just over half of one per cent of the total number of births, assuming around 14.5 million people were born in the Netherlands in this period. A sample of 77,000 is sufficient for drawing statistically reliable conclusions on subpopulations of two per cent or more of the population born in the Netherlands during the period. So far (as of 1 June 2003) the following have been entered: birth certificates for the entire group, 22,000 death certificates, 16,000 personal record cards (available only for persons alive on 1 January 1940), 9000 marriage certificates, and details of 4000 initial registrations in the population register.
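The sampling figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the standard-error calculation is an added illustration that assumes a simple random sample and a worst-case proportion of 0.5, and is not a criterion stated by the HSN.

```python
import math

SAMPLE_SIZE = 77_000        # target sample size quoted in the text
TOTAL_BIRTHS = 14_500_000   # approximate births 1812-1922, as assumed in the text
SUBPOPULATION_SHARE = 0.02  # smallest subpopulation mentioned (two per cent)


def sampling_fraction(sample: int, population: int) -> float:
    """Share of the population that ends up in the sample."""
    return sample / population


def expected_subsample(sample: int, share: float) -> float:
    """Expected number of sampled subjects belonging to a subpopulation."""
    return sample * share


def proportion_standard_error(p: float, n: float) -> float:
    """Standard error of an estimated proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


fraction = sampling_fraction(SAMPLE_SIZE, TOTAL_BIRTHS)
subsample = expected_subsample(SAMPLE_SIZE, SUBPOPULATION_SHARE)
# Illustrative assumption: worst-case proportion of 0.5 within the subpopulation.
se = proportion_standard_error(0.5, subsample)

print(f"sampling fraction: {fraction:.2%}")
print(f"expected subjects in a 2% subpopulation: {subsample:.0f}")
print(f"worst-case standard error of a proportion: {se:.4f}")
```

On these figures a two per cent subpopulation corresponds to roughly 1,500 sampled subjects, which is the scale at which the text considers conclusions statistically reliable.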

Through the collaborative projects with other researchers, the HSN database has been further enriched with around 20,000 birth certificates, 4000 marriage certificates, 4000 personal record cards and information from the population registers for 5000 individuals.

In addition to being an important source for research and a control database that can be used in interpreting findings on specific groups, the HSN database also acts as a foundation for the collection of new data. In practice, this is achieved by maintaining a data structure that can be used by individual researchers, and by consistently using the database as a starting point in subsequent research, both by expanding the number of subjects included (oversampling) and by enriching the database with supplementary data for specific groups of subjects. For researchers, the arrangement cuts both ways. Not only can they use the material already input, they also have access to the software and expertise developed by the HSN. This expertise can be seen as an important by-product of the data-entry work carried out over the past ten years. In return for the use of the software and the data already recorded, the HSN requires researchers to add to the dataset any new data they collect in the course of their research, thus ultimately making it available to other researchers too.

Public access to the data is subject to a dedicated set of privacy regulations (Dutch Data Protection Authority, number O-0030426; Personal Data Protection Act: Wet Bescherming Persoonsgegevens, 6 July 2000, Stb. 302). In accordance with the Personal Data Protection Act, the premise embodied in these regulations is that public access to data for research purposes is governed by the same arrangements as those of the archive from which the data were derived. This might mean that some data can only be made available in an anonymised form. The work on the database is carried out at the International Institute of Social History; ownership of the data rests with the HSN Foundation.

Literature:
T. van den Brink, 'The Netherlands population registers', Sociologia Neerlandica 3, 51-63
C. Gordon, The Bevolkingsregisters and their use in analyzing co-residential behaviour of the elderly (NIDI report no. 9: Den Haag 1989)
A. Knotter and A.C. Meijer (ed.), De gemeentelijke bevolkingsregisters, 1850-1920 (Den Haag 1995)
K. Mandemakers, 'The Netherlands: Historical Sample of the Netherlands', P.K. Hall, R. McCaa and G. Thorvaldsen (ed.), Handbook of international historical microdata for population research (Minneapolis 2000)
R.F. Vulsma, Burgerlijke stand en bevolkingsregister (Den Haag 2002)

Census data (NIWI)

The population census
National population censuses are one of the fundamental sources of information on conditions in a country. In addition to the population size, the population census generally contains information on the structural characteristics of a population, such as age, gender, marital status, religion, household status, occupational activity and nationality. In some years the Dutch censuses were combined with an occupational census and a housing census.

The first general Dutch population census was held in 1795 under the Batavian Republic. From 1829 onward, censuses were held every ten years. The 1940 census was postponed until 1947 because of the war. No population census has been held in the Netherlands since 1971 because of growing privacy consciousness (and refusal to take part) among the general public.

Only a limited number of original copies of the 200 or so published volumes of the 1795-1971 Dutch population censuses have survived. These censuses have always played an important part in historical and social-science research. Many of the published census volumes are now in poor condition. Digitisation can help preserve this material while also increasing access to it. The c. 42,500 pages contained in these volumes have now been scanned and are available digitally on CD-ROM. About 35,000 pages relate to the period 1859-1947. About 10,000 pages of the 1899 census have been published electronically and about 5,000 more pages have been converted to digital form but are not yet available. About 20,000 pages of tables still need to be converted in order to make all published population censuses up to 1947 available in digital form.
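The page counts quoted above are rounded, but they can be tallied to see how they fit together. The short sketch below uses only the figures from the text; the variable names are illustrative.

```python
# Approximate page counts quoted in the text (all figures are rounded).
TOTAL_SCANNED = 42_500          # pages scanned for the 1795-1971 censuses
PERIOD_1859_1947 = 35_000       # pages relating to 1859-1947
PUBLISHED_1899 = 10_000         # 1899 census, already published electronically
CONVERTED_UNPUBLISHED = 5_000   # converted to digital form, not yet available
TO_CONVERT = 20_000             # pages of tables still to be converted

# The three conversion categories together cover the 1859-1947 pages.
accounted_for = PUBLISHED_1899 + CONVERTED_UNPUBLISHED + TO_CONVERT

# Scanned pages that fall outside the 1859-1947 period.
outside_period = TOTAL_SCANNED - PERIOD_1859_1947

# Share of the 1859-1947 pages that still needs content conversion.
share_remaining = TO_CONVERT / PERIOD_1859_1947

print(f"accounted for within 1859-1947: {accounted_for}")
print(f"scanned pages outside 1859-1947: {outside_period}")
print(f"share still to convert: {share_remaining:.0%}")
```

On these rounded figures, somewhat more than half of the 1859-1947 material still awaits content conversion.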

The digital census databases will be important resources for historical and social-science research. National historical census projects have been, or are being, carried out in a number of countries, including the US, the UK, Ireland, France, Norway, Denmark, Germany, Russia and Austria. Some of these projects are based on the original source material, which means the databases can be constructed at the level of the individual.

Full digitisation
The purpose of this part of the programme is to construct and make accessible a comprehensive database containing the 1859-1947 national population censuses (including occupational and housing censuses). The population censuses, which form the basis for so many other statistics, will be a key research resource for social and economic historians, demographers, social scientists and epidemiologists. Secondary target groups are amateur historians, local authorities, the media and education. The project will comprise the following elements:
  • Making accessible the figures from the 1859 and 1930 census publications that have already been input, and publishing them digitally on CD-ROM and the Internet; the software for retrieval and access to the data will be StatLine, a package developed by the CBS (Statistics Netherlands).
  • Content conversion (by manual data entry and/or optical character recognition) of the figures from the 1869, 1879, 1889, 1909, 1920 and 1947 censuses (ca. 20,000 pages) and publishing them in digital form. The material needs to be checked and, where necessary, corrected, then documented and converted for retrieval in StatLine. Wherever possible, the data will be standardised or harmonised to enable comparisons over time.
  • A historical database of co-ordinates is available, enabling data at the municipal level to be represented visually for any desired point in time during the past two centuries. The project will also provide additional information for the Historical Database of Dutch Municipalities (HDNG), which is currently being developed and in which the universities of Nijmegen and Amsterdam, the Netherlands Interdisciplinary Demographic Institute and the Historical Sample of the Population of the Netherlands collaborate. A number of existing historical-statistical and demographic databases (the historical-ecological database at the University of Amsterdam, the 'Hofstee' database at the NIDI, and databases at the University of Nijmegen) will be integrated with the HDNG.