April 7, 2017, by Stuart Moran

Emerging Research Opportunities from Digitised Newspaper Data

The Digital Research team are facilitating the digitisation of some paper-based research data in the form of a newspaper article archive.

We have hired an administrative assistant who has made a great start on scanning the thousands of articles that were collated over the last decade. Roughly speaking, it takes about one hour to scan, quality-check and rename 100 newspaper clippings. Early estimates suggest it will take about four more weeks to scan all of the data, with about one fifth of it completed so far.
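
For anyone planning a similar digitisation effort, the arithmetic behind this kind of estimate can be sketched in a few lines of Python. The only figures taken from our work are the rate (about 100 clippings per hour) and the fraction completed (about one fifth). The function name and the archive size of 5,000 are purely illustrative assumptions, since we only know the archive runs to "thousands" of articles.

```python
def remaining_scan_hours(total_clippings, fraction_done, clippings_per_hour=100):
    """Estimate the hours of scanning left, assuming a steady rate."""
    if not 0 <= fraction_done <= 1:
        raise ValueError("fraction_done must be between 0 and 1")
    remaining_clippings = total_clippings * (1 - fraction_done)
    return remaining_clippings / clippings_per_hour


# Hypothetical archive size: the real total is not known precisely.
hours = remaining_scan_hours(total_clippings=5000, fraction_done=0.2)
print(f"Roughly {hours:.0f} hours of scanning remaining")
```

Dividing the result by the hours worked per week gives a rough finish date, which can be checked against the actual pace as the scanning continues.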

The great thing is that this one fifth is immediately usable, and the benefits are already starting to emerge. Meeting with Christian Karner earlier today, I helped set up the software he will need to search through the digitised data. It was fascinating to see how his thinking about the data began to change as he searched for different key terms. For example, he was curious about the word ‘Church’, expecting it to be common across the articles, but only 6 instances came up. He was also particularly intrigued by the specific places and countries being named, which became apparent when looking at the frequency of different words.
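
To give a flavour of this kind of key-term searching, here is a minimal sketch in Python. It is not the software we set up, and it assumes the scanned clippings have already been converted to plain text. The function names and the two sample sentences are made up for illustration and are not taken from the archive.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def term_count(articles, term):
    """Count whole-word, case-insensitive occurrences of a term."""
    term = term.lower()
    return sum(tokenize(article).count(term) for article in articles)


def top_terms(articles, n=10, stopwords=frozenset()):
    """Return the n most frequent words, skipping any stopwords given."""
    counts = Counter()
    for article in articles:
        counts.update(w for w in tokenize(article) if w not in stopwords)
    return counts.most_common(n)


# Placeholder text standing in for digitised articles.
sample_articles = [
    "The church bell rang across the town.",
    "A new church opened near the border.",
]
print(term_count(sample_articles, "Church"))
print(top_terms(sample_articles, n=3, stopwords=frozenset({"the", "a"})))
```

Passing a list of common words as `stopwords` stops terms such as "the" from crowding out the place names and other words of interest.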

This led us to a discussion about how a map could be created once all the data is scanned, showing the different places that have been referred to, in what context and over what time period. It was great to see new digital opportunities start to emerge. In fact, the hope is that the digital archive will attract new student projects and internal collaborators to help analyse the data.

Our next immediate challenge is to keep the momentum going on the scanning of the data. Please note that this work is being conducted in full consideration of the copyright principles of “fair dealing/usage”. Only one clipping per newspaper is being scanned, only one digital copy is being stored, and the archive is to be used only privately, for non-commercial personal research.


Next blog in series: Digitally Preserving the Hennessey Collection

Previous blog in series: Xerox Printers as a Research Tool 

Stuart Moran, Digital Research Specialist for Social Sciences

