Python Jupyter notebooks + Pandas, and D3.js
Press Picker code, open source licensed, on GitHub
Press Picker visualisation demo (JavaScript) on Observable
D3 JS viz in a Python Jupyter notebook on GitHub
Press Picker code published, Living with Machines blog
Press Picker: visualising formats and title name changes in the British Library’s newspaper holdings, Living with Machines blog
D3 JavaScript visualisation in a Python Jupyter notebook, Living with Machines blog
British and Irish Newspapers. Dataset credit: British Library Contemporary British and Collections Metadata teams
The British Library holds over 700 million pages of historical newspapers stretching back to the 17th century. There are so many, in fact, that they are housed in a dedicated building with shelving 20 metres high, where the newspapers are retrieved by robotic cranes. Digitising these newspapers to make them available online is a long-term effort, given the scale, so which titles should be digitised first? And how can we balance cost, speed, and historical interest in the decision making? Developed as part of the Living with Machines project, Press Picker is a new tool to support this process.
Press Picker consists of two Python Jupyter notebooks. The first notebook filters and processes the data. The second visualises the newspaper data. The per-year holdings of undigitised titles (e.g. The Times or The Blackpool Herald) are shown as line graphs, styled according to format: hardbound volumes are shown in red, and copies on microfilm (a kind of film reel) as black dashed lines. Format is important for planning digitisation: the hardcopy volumes take longer and cost more to digitise, and there can be delays if conservation is needed. However, some early microfilms were made on a material that has since degraded, and these cannot be used for digitisation.
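As a rough illustration of the filtering step, the sketch below uses Pandas to reduce a small table of holdings to one series per undigitised title and format, with the line style described above attached. The column names (`title`, `year`, `format`, `digitised`), the sample rows, and the `undigitised_series` helper are all hypothetical; the real notebooks and the British Library metadata may be organised quite differently.

```python
import pandas as pd

# Hypothetical holdings records: one row per title, year and format.
holdings = pd.DataFrame(
    [
        {"title": "Title A", "year": 1850, "format": "hardcopy", "digitised": True},
        {"title": "Title B", "year": 1890, "format": "hardcopy", "digitised": False},
        {"title": "Title B", "year": 1891, "format": "hardcopy", "digitised": False},
        {"title": "Title B", "year": 1891, "format": "microfilm", "digitised": False},
        {"title": "Title B", "year": 1892, "format": "microfilm", "digitised": False},
    ]
)

# Line styles following the description in the text:
# hardcopy in red, microfilm in black and dashed.
FORMAT_STYLE = {
    "hardcopy": {"colour": "red", "dashed": False},
    "microfilm": {"colour": "black", "dashed": True},
}


def undigitised_series(df):
    """Return one record per (title, format) listing the years held."""
    undigitised = df[~df["digitised"]]
    series = []
    for (title, fmt), group in undigitised.groupby(["title", "format"]):
        years = sorted(int(y) for y in group["year"].unique())
        series.append(
            {"title": title, "format": fmt, "years": years, **FORMAT_STYLE[fmt]}
        )
    return series


series = undigitised_series(holdings)
for record in series:
    print(record)
```

Each record could then be handed to the visualisation as one line, drawn solid red or dashed black depending on its format.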
To complicate things, newspapers sometimes change their names over time. For example, The Athletic Reporter in 1886 becomes The Reporter, which in 1888 becomes The Midland Counties Reporter and General Advertiser, which in 1889 becomes The Reporter and General Advertiser, which in 1890 becomes The Coventry Reporter and General Advertiser. Title name changes are common in this collection, but in the British Library data each new publication name is treated as a separate record. We customised Press Picker to bring connected titles together with a branching design at the left, so that these relationships are apparent.
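One way to bring such records together is to treat each name change as a link between two records and group everything that is connected. The sketch below does this with a small union-find structure, using the Reporter example from the paragraph above. The `(earlier, later, year)` link format and the `group_connected_titles` helper are assumptions made for illustration, not the structure of the British Library data or the method Press Picker itself uses.

```python
from collections import defaultdict

# (earlier title, later title, year of change), from the example in the text.
SUCCESSIONS = [
    ("The Athletic Reporter", "The Reporter", 1886),
    ("The Reporter", "The Midland Counties Reporter and General Advertiser", 1888),
    ("The Midland Counties Reporter and General Advertiser",
     "The Reporter and General Advertiser", 1889),
    ("The Reporter and General Advertiser",
     "The Coventry Reporter and General Advertiser", 1890),
]


def group_connected_titles(successions, other_titles=()):
    """Group titles into families linked by name changes (union-find)."""
    parent = {title: title for title in other_titles}

    def find(title):
        # Follow parent links to the family root, shortening the path as we go.
        while parent[title] != title:
            parent[title] = parent[parent[title]]
            title = parent[title]
        return title

    for earlier, later, _year in successions:
        parent.setdefault(earlier, earlier)
        parent.setdefault(later, later)
        parent[find(later)] = find(earlier)

    families = defaultdict(list)
    for title in parent:
        families[find(title)].append(title)
    return [sorted(members) for members in families.values()]


# A title with no recorded name changes stays in a family of its own.
families = group_connected_titles(SUCCESSIONS, other_titles=["The Blackpool Herald"])
for family in families:
    print(family)
```

Grouping by connectivity rather than by following a single chain also copes with branching histories, such as titles that merge or split, which is what a branching design needs to represent.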
Press Picker helps give an overview of the newspaper metadata. Titles can be selected in the tool, and data about these selections exported.
Read about Press Picker and the context behind its creation on the Living with Machines blog. We created the custom visualisation in JavaScript / D3.js, and embedded this in a Jupyter notebook. You can read about how to embed a D3.js visualisation in a Python Jupyter notebook in this blog post and see this demo notebook.
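For readers who want a feel for the general idea, here is a minimal sketch of one common pattern: build an HTML fragment in Python, inject the data as JSON, and let a small D3 script draw it. This is not necessarily the approach taken in the blog post or demo notebook. The `d3_line_chart_html` helper, the element id, and the sample data are hypothetical, and loading D3 as an ES module from the jsDelivr CDN is an assumption that requires network access and a notebook front end that runs scripts in HTML output.

```python
import json


def d3_line_chart_html(points, element_id="press-picker-sketch"):
    """Return an HTML fragment that draws `points` as a line with D3.

    `points` is a list of {"year": int, "count": int} dicts (hypothetical shape).
    The values are embedded as JSON, so they should be plain numbers or
    strings that do not contain "</script>".
    """
    data_json = json.dumps(points)
    return f"""
<div id="{element_id}"></div>
<script type="module">
import * as d3 from "https://cdn.jsdelivr.net/npm/d3@7/+esm";
const data = {data_json};
const width = 400, height = 120, margin = 20;
const x = d3.scaleLinear()
    .domain(d3.extent(data, d => d.year))
    .range([margin, width - margin]);
const y = d3.scaleLinear()
    .domain([0, d3.max(data, d => d.count)])
    .range([height - margin, margin]);
const line = d3.line().x(d => x(d.year)).y(d => y(d.count));
d3.select("#{element_id}").append("svg")
    .attr("width", width).attr("height", height)
  .append("path")
    .datum(data)
    .attr("fill", "none").attr("stroke", "red")
    .attr("d", line);
</script>
"""


# Hypothetical per-year counts for a single title.
points = [
    {"year": 1890, "count": 2},
    {"year": 1891, "count": 3},
    {"year": 1892, "count": 1},
]
html = d3_line_chart_html(points)

# In a notebook cell, the fragment can then be displayed with:
#   from IPython.display import HTML
#   HTML(html)
```

Keeping the HTML generation in a plain function means the data preparation stays in Python, while the drawing code stays in JavaScript where D3 is most at home.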
Try a demo of the Press Picker visualisation below: