Introduction [i]
It can be difficult to grasp the scale of newspaper publishing in the United Kingdom. Taken as a whole, the huge and diverse production of newspapers since 1700 provides an enormous resource for research on all subjects for all of the UK, both urban and rural. Newspapers have a long history of being copied after publication. For decades, even hundreds of years, after their publication, researchers of all kinds, all over the world, may turn to newspapers for information relating to their needs. For those libraries that have collected newspapers (particularly national libraries), the need to provide ready access to newspaper texts has posed a dilemma, given the often poor quality of the paper on which newspapers were printed. The need to prevent undue wear and tear upon the paper has provided an impetus for copying texts, to allow continuous public access.

Systematic collection of newspapers at the British Museum (the precursor of the British Library) did not really begin until 1822. At that time publishers were obliged to supply copies of their newspapers to the Stamp Office so that they could be taxed. In 1822 it was agreed that these copies would be passed to the British Museum after a period of three years. From 1869 onwards newspapers were included in the legal deposit legislation and were then deposited directly at the British Museum. This systematic application of legal deposit requirements means that many thousands of complete runs of newspapers have accumulated. The majority of newspapers collected are those published since 1800.

Today, British Library Newspapers aims to acquire all newspapers published in the United Kingdom and in the Republic of Ireland, including free newspapers, and our collections of newspapers from the British Isles are the finest in the world. Currently about 2,600 UK and Irish newspaper and weekly/fortnightly periodical titles are received, which represents about 90 percent of current acquisitions. This includes the main London edition of the national daily and Sunday newspapers, and free newspapers, with the exception of those consisting entirely of advertising.

The newspaper collections in the British Library at Colindale, London, occupy some 45 kilometres (or 28 miles) of shelving:

  • more than 664,000 bound volumes and parcels, on 32 kilometres (20 miles) of shelving;
  • over 370,000 reels of microfilm, on 13 kilometres (8 miles) of shelving.

More than 52,000 separate newspaper, journal, and periodical titles are currently held in British Library collections at Colindale. [ii]

In the past fifty years, microfilm has been the preferred medium for copying newspapers. The durability of the photography is well established, provided that appropriate procedures are followed in the creation of the film, in its processing and copying, and in the storage of all copies of the microfilm. When these procedures are followed, microfilm negatives are likely to last for more than 200 years. Of course, microfilm has its limitations, some of which also apply to the original paper medium: only one person can use a film at a time; users have to load the reels of film onto reading machines; pages have to be read sequentially (any skipping of pages must also be done in sequence); the entire contents of an article, or of a page, have to be read if the user is to avoid missing relevant information; and paper copies of an article, a photograph, or a page must be paid for.

Digitisation of Newspapers: Some Problems
Since 1999, the pace of technological change has been rapid. It has become possible to have information indexed from all kinds of sources. A practical distinction has emerged between current newspapers, which are "born digital", and newspapers that were created in the letterpress era, which may now require conversion to digital formats, using scanning equipment. It is possible now to plan to have microfilms scanned digitally, with the digital images capable of further manipulation by software to permit a good degree of readability—with the texts also being made searchable by the use of optical character recognition (OCR) software.

Huge efforts are being invested in the development of software to perform these functions automatically, rather than to link searches for text via keywords that are typed manually. Texts created digitally using a computer have been the simplest for the application of such OCR software. On the whole, texts in manuscript or in letterpress have been more difficult for software to deal with, as the variables that exist in handwriting, and also in older printed texts, are significant. These must be defined to enable optical character recognition software to work successfully.

The new technologies have made it possible for libraries such as the British Library to plan to digitise large quantities of runs of older newspapers. In the first instance, it was decided to digitise newspapers published before 1900. This was done to minimise the difficulties that arise with copyright in the publication of newspapers, which can last for 70 years or more. The British Library sought funds for this work, and the Joint Information Systems Committee (JISC) awarded funds in 2004 for the digitisation of up to 2 million pages of newspapers.

Early in 2004, the British Library secured funding of 2 million pounds from JISC. [iii] Under its Digitisation Programme, JISC enabled a small number of large-scale digitisation projects that would bring significant benefits to UK Further and Higher Education communities, one of which is the British Library Newspapers 1800-1900 (BN) project. [iv, v]

Aims of the project: The overall goal of the BN project is to provide a mass of historic newspaper content on the web for full text searching by UK academic and further-education communities. The main aim has been to digitise up to 2 million pages of out-of-copyright UK printed material, consisting of regional and local newspapers, the majority digitised from new microfilm, and to offer free access to that collection via a sophisticated searching and browsing interface on the web. The project plan has included the following areas:

1. New filming accounts for 90% of all pages copied, to ensure consistency of images.

2. Filming one page per frame for optimum digitisation.

3. Introduction of an in-house Quality Assurance Team to prepare and repair the volumes; collect both issue-level and condition-level metadata; filter out duplicates and variants; and identify missing pages, issues, and the last timed edition at the start of the project.

4. Placing an academic User Panel at the core of the project to steer the selection of newspapers and advise on the website design.

The project aimed to deliver the following: the scanning of the entire microfilmed content; article zoning and page extraction; OCR of the page images; and production of the required metadata. Searching online will include searching by names and dates, obituaries, advertisements, and regional and local perspectives on national news.
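As a rough illustration of how zoned articles, OCR text, and metadata can combine to support this kind of searching, the following is a minimal sketch only. It does not describe the project's actual software: the Article record, its fields, and the build_index and search functions are all hypothetical names chosen for the example, and the technique shown is a simple inverted index (a map from each word to the articles that contain it).

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
import re


@dataclass(frozen=True)
class Article:
    """One zoned article; every field name here is an assumption."""
    article_id: int
    newspaper: str
    published: date
    category: str   # e.g. "news", "obituary", "advertisement"
    ocr_text: str   # text produced by OCR of the page image


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(articles: list[Article]) -> dict[str, set[int]]:
    """Map each word to the ids of the articles that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for article in articles:
        for word in tokenize(article.ocr_text):
            index[word].add(article.article_id)
    return index


def search(index, articles, query, category=None, start=None, end=None):
    """Return articles containing every query word, filtered by metadata.

    Results are ordered by publication date, then by article id.
    """
    words = tokenize(query)
    if not words:
        return []
    # An article matches only if it appears under every query word.
    matching_ids = set.intersection(*(index.get(w, set()) for w in words))
    by_id = {a.article_id: a for a in articles}
    hits = [
        by_id[i]
        for i in matching_ids
        if (category is None or by_id[i].category == category)
        and (start is None or by_id[i].published >= start)
        and (end is None or by_id[i].published <= end)
    ]
    return sorted(hits, key=lambda a: (a.published, a.article_id))
```

In this sketch the full-text part of a search comes from the OCR output, while restrictions such as "obituaries only" or a date range come from the metadata recorded for each article. Real OCR text from nineteenth-century print contains recognition errors, which a production system would need to tolerate; this example does not attempt that.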

The selection made for this database is but a small fraction of the newspapers available in the British Library collections for digitisation. Nevertheless, there is a great wealth of information contained in the pages that you will find here. The importance of this selection will be immediately obvious to those already interested in researching nineteenth-century history, and using newspapers to do this. Upon further examination in the online environment, it is likely that researchers will find significant differences of opinion reported by different newspapers. This will be especially so in the field of political reporting. At this time of development of the digital resource from the British Library's huge collections, what is presented here is only a portion of what could potentially be made available. To be truly comprehensive, all of the newspaper titles of the Library will need to be digitised, and also some additional ones held in other libraries in the UK. This must remain an ultimate goal for now; however, the appeal of the selection made for this project will be very wide. Researchers in the fields of sociology, economics, religion, political science, genealogy, and literature will all find great quantities of text to capture and retain their interest, and the results of this endeavour will enhance their work. It is more than likely that some individual issues of the newspapers selected will be very hard to find elsewhere. Newspapers then (as now) covered all subjects, often in minute detail. They found a ready audience, a mass audience, after they became significantly cheaper from the 1850s. They encapsulate the enormous variety of life, in matters both small and great.

It is this very juxtaposition of small and momentous events, presented in different newspapers published at the same time and invariably accompanied by large quantities of advertisements (in themselves fascinating as a social record), that is the unique appeal of these texts. In the second half of the nineteenth century, illustrations became increasingly important both in the composition of newspaper articles and as stand-alone items that caught the attention of the reader. The impact of illustrations must have greatly assisted sales. Nor should we forget the scale of the physical achievement in the production of a newspaper. The text all had to be set by hand, laid out in page format, and proofread, all to frequently tight timetables. When one considers the much more limited technology at the disposal of newspaper publishers at this time, one marvels at the scale and enormity of the enterprise.

NOTES

i A fuller version of this Introduction appears in: The Serials Librarian 49, no. 1/2 (2005), pp. 165-181.

ii More information about the British Library Newspapers can be found at: http://www.bl.uk/collections/newspapers.html

iii The JISC agreed to support the project from April 2004 to December 2006 at a total cost of 2,022,131 pounds. See: http://www.jisc.ac.uk

iv My thanks to Jane Shaw, Project Manager, British Library Newspapers 1800-1900 project, for permission to reproduce passages from her paper: 10 Billion Words... This was read at the 71st IFLA General Conference, Oslo, 2005. See session 97: http://www.ifla.org/IV/ifla71/Programme.htm

v For details of the project online, see: http://www.jisc.ac.uk/whatwedo/programmes/programme_digitisation/digitisation_bln.aspx

CITATION: King, Ed: "Digitisation of British Newspapers 1800-1900." British Library Newspapers. Detroit: Gale, 2007.

          

DISCLAIMER

Any views and opinions expressed in these essays are those of the author in question, and any views or opinions from the original source material are those of the publication in question. Gale, a Cengage Company, provides facsimile reproductions of original sources and does not endorse or dispute the content contained in them. Author affiliation and information within them are correct as of the original publication date.

These essays, unless otherwise stated, are © Gale, a Cengage Company. Further reproduction of this content is prohibited.