July 6, 2020, by The Digital Research Specialist Team
Automated Transcription Service – from idea to launch
This post reveals more about the Automated Transcription Service and discusses a rapid-response “early access” trial for researchers impacted by COVID-19 restrictions.
Do you record people’s speech as part of your research? Do you transcribe this for analysis or pass on the audio for others to transcribe? Our researchers will soon be able to use the University of Nottingham Automated Transcription Service to upload MP3 audio files and generate a text transcript of the speech. In comparison to human transcribers:
- The service is fast: transcription typically takes less time than the duration of the audio, so a 10-minute recording will generally be transcribed in under 10 minutes.
- The service is secure, with the data processed and produced by this service staying on secure storage within the EU, helping you comply with GDPR data protection legislation.
- The service is cost-effective, costing significantly less per hour of audio (around £0.75 per hour when the service is launched), and costs are fully recoverable under Research Council funding guidelines.
Why automate transcription?
Since 2016, the push to understand and comply with the EU General Data Protection Regulation (GDPR) has thrown our data-handling practices and those of our research partners into the spotlight, revealing plenty of room for improvement around transcription. A Digital Research blog post from 2017 outlined the gap between some of our practices and the high standards expected in the new GDPR world. It highlighted the fact that transferring sensitive audio to another person, e.g. an external transcription company, was creating risk.
As well as making our research quicker, we were interested in using automation to reduce this risk: by automating the task of transcribing, no-one except the researcher ever needs to hear the audio.
Developing an automated transcription service
By late 2019, the first functional version of the University of Nottingham Automated Transcription Service (ATS) had been built and was ready to pilot. Over forty researchers helped put the service through its paces, uploading MP3 recordings of speech with different accents and varying audio quality. While the service did make mistakes – particularly where audio quality was poor or multiple speakers spoke simultaneously – its speed surprised many pilot testers.
“Overall, I think it’s worked well, it transcribed [the audio] quickly and it didn’t take me too long to go back through it and amend any errors. It’s a great tool, thanks for letting me use it!”
“I was very impressed that it picked up on my accent. Overall, I think it is great, it makes transcribing much simpler and faster. Thank you for involving me in the pilot. It saves so much time!”
“The service significantly helped me to speed up interview transcription … this is a life-saver!”
Our pilot testers provided valuable feedback and have given lots of constructive input into the ATS guidance, now available in draft form. This SharePoint page includes an example of a transcript produced by the service.
So where are we now? The ongoing COVID-19 pandemic has sparked another shift in the way we think about conducting research: our spending is restricted, and we are now engaging with partners and participants online, rather than face-to-face. Instead of putting research on hold, the Automated Transcription Service will allow us to adapt and continue transcribing. It will also provide researchers with an opportunity to revisit audio they didn’t previously have the time or funding to transcribe. And it is a quick and easy way to jump into qualitative research methods if that hasn’t previously been part of your repertoire. The ATS pairs well with tools such as Microsoft Teams or Skype for Business, allowing us to work safely and quickly with remote participants and analyse transcripts. We aim to launch the service for all researchers over the summer.
Prior to the general launch, we have introduced a final cohort of researchers into a test of the service. These researchers have all seen their ways of working impacted by the pandemic: in some cases, it has been impossible to work with the companies that would once have transcribed for them; in other cases, they have shifted their research methods towards interviews and now need a way to turn audio into text for analysis. The ATS is making research possible for this cohort, and will be available for general use in a few months. If you have funded research requiring transcription that is currently being blocked or delayed due to COVID-19, please contact a Digital Research Specialist.
We also hosted a webinar on 13th July, including a quick demo and a panel discussion featuring “early access” users. UoN staff and students can watch a recording of the webinar via the Digital Research Stream channel. Details of other Digital Research webinars can be found here.