// Archives

Text Translation using Azure Cognitive Services Translator

In this blog, we describe how to translate text using some simple Python code and the Azure Translator service. Azure Cognitive Services Translator is a cloud-based service that enables quick and accurate translation across many languages. The Translator service can be used for: language detection; one-to-one or one-to-many translation; script transliteration (text …

Automated anonymisation of texts and transcripts

In this blog, we discuss an automated process for anonymising interview transcripts, patient notes, or other free-text data containing personal information.  Colleagues wishing to share participant notes or interview transcripts, for example as publication appendices or in a research data repository, will likely need to anonymise the data. Anonymisation also comes with a number of …

Train a Custom Computer Vision Model using Detectron2

Training a custom computer vision model to work on your research dataset may be daunting due to the imagined complexities and effort involved. However, some very powerful open-source frameworks, such as Detectron2, have recently been made available to simplify the process. Detectron2 is an open-source framework that implements state-of-the-art computer vision algorithms. Detectron2 comes with …

Better OCR of Research Papers and Newspaper Articles

Optical Character Recognition (OCR) describes a technique whereby computers can read printed text. OCR of newspapers, magazines and research papers has always been challenging due to the unconventional and unusual formatting of the material. Text boxes, images and tables are often overlaid and are difficult (for a computer) to distinguish. Conventional, out-of-the-box OCR software is …

Robinhood and the revenge of the reviewer

Last week saw the share price of the US retailer GameStop rise from US$65 to US$325. This extraordinary jump was fuelled by private investors who rallied behind the shorted stock, thus pitting themselves against corporate hedge funds. The narrative of the ‘little man’ battling Wall Street financiers chimed nicely with the name of the …

Learn how to train a Simpsons character classifier at home using Ludwig

Researchers may be wondering what tools are available for them to work on their datasets from home. In this blog, we highlight an open-source toolkit that allows researchers to implement machine learning architectures and deep learning models on their home computers with little to no coding. In this blog post you will learn about the …

An Attempt at Automatically Classifying Political Leaflets Using Ludwig

In this blog, we discuss some innovative AI-driven work being undertaken in collaboration with Prof. Caitlin Milazzo from the School of Politics and International Relations. Ludwig is an open-source toolbox that allows the training and testing of deep learning networks without the need to write any code. In this pilot study we wanted to see …

Trouble Finding Wally? Train a Custom Object Detection Framework to Find Him For You

As part of the Discovery Programme vision, which is to support and empower researchers by providing an advanced digital environment, this blog post aims to introduce researchers to an open-source object detection framework that can be re-trained using their respective datasets. Open-source libraries and APIs have made it possible for researchers to easily train …

NHS Hack Days: Big Data on the Frontlines

“The future,” William Gibson famously declared, “is here. It’s just unevenly distributed.” Nowhere does this feel more evident than in the NHS. In one part of the building where I’m writing this – the Queen’s Medical Centre – research oncologists are experimenting with Artificial Intelligence (AI) and Machine Learning (ML) techniques to diagnose cancer from …

My fieldwork affair with the Neo-Smart Pen

My research projects are typically ethnographies in settings where carrying laptops, tablets, cameras and other relatively heavy or expensive devices for data collection is not always feasible, safe or practical. I have used alternatives such as mobile phones, as many now incorporate large storage spaces, high-end hardware and apps for taking pictures, recording and …