Why You Should Choose Human Transcription Services


It’s easy to assume that with the rapid advancement of Automatic Speech Recognition [ASR] software, the human transcription services industry faces uncertainty and decline.

However, with current ASR accuracy rates ranging from only 3% to 80% [PC Magazine Survey, 2019], here are some compelling reasons why we believe that choosing humans for an audio transcription service will always remain an important option.

Accents and Dialects

Interviews and focus group discussions can contain a huge variety of accents and dialects.  Automated transcription software struggles to cope with this aspect of audio transcription.

Human transcribers are continually exposed to varying accents and dialects.  Their discerning ear can listen closely to a digital audio recording, accurately deciphering and distinguishing different inflections, enunciations, twangs and drawls.

Research, Jargon and Acronyms

ASR will only recognise words and phrases that it has specifically been trained to recognise.  Highly technical vocabulary, slang, made-up terms and acronyms may not be recognised.

Human transcription services will ensure they research any industry jargon, acronyms and relevant vocabulary to capture them accurately.  They have the grammar and punctuation skills that are imperative to producing the highest-quality transcripts.

Multiple Speakers

ASR cannot produce accurate transcripts when there are more than two speakers.  Throw in overtalking, quietly spoken respondents and a lively focus group discussion or conference call, and you end up with a hodgepodge of a transcript.

Humans offer an audio transcription service that not only distinguishes but identifies multiple different speakers within a digital audio recording, transforming audio into text in a professional, intelligent manner.

There, they’re, their

One of the quickest ways to tell if an audio transcript has been produced by ASR is to check the homophones.  Homophones are two or more words that have the same pronunciation but different meanings, origins or spellings.  ASR relies on sentence structure to predict the most probable word to use, so it can easily pick the wrong one.

High-quality human transcription services don’t just hear a discussion; they listen to it, using their expertise to produce a first-class transcript that is a clear written record of the discussion that has taken place.

Bespoke Levels of Service

ASR delivers a best-effort, full verbatim transcript that struggles with speaker identification, punctuation and grammar, accents and dialects, and multiple speakers. It cannot incorporate special instructions [omitting introductions, tea breaks and “what a lovely dog” side-line discussions, for example].

The beauty of human transcription services is you can choose from any number of service levels and options:

Intelligent verbatim transcription:

Intelligent verbatim includes everything that is said, but fillers, repetitions and stumbling over words are omitted.  It produces an easy-to-read transcript that is a true reflection of the audio recording, so you can be certain that no content is missed out:

“I did a degree in criminal justice and psychology and then after that, got a job working for a drug and alcohol charity.  I was based in a Probation Service and I was a case worker for people who are on probation and needed support with their drug and alcohol use.”

Strict verbatim:

Strict verbatim captures everything that is said and how it is said.  It is often used by academics and psychologists who wish to analyse every pause, sigh and hesitation.

“I-I did a degree in criminal justice and psychology and then erm, after that, after that I g-got a job working for a, a drug and alcohol charity.  I-I was based in a Probation Service.  [Pause] I was a, a case worker – for um, people who are on probation and need- needed support with their er, drug and alcohol use.”


A further option provides a more concise read, summarising the audio.  Dialogue is still transcribed and some verbatim quotes are included, but repetition and non-relevant discussion are omitted and some sentences are edited.

“I did a degree in criminal justice and psychology and got a job working for a drug and alcohol charity based in a Probation Service.  I was a case worker for people on probation, who needed support with their drug and alcohol use.”


Under Data Protection legislation and GDPR, voice recordings are considered personal data.  How and when to anonymise qualitative interviews is a growing issue for many researchers.

Human transcription services can be a valuable tool in anonymising qualitative data, recognising when and how to strip identifiers from the transcript, and identifying a distinctive event or combination of descriptions that could make somebody recognisable.


In summary, human transcription services remain an important and relevant option for many industries and sectors.   ASR has its place and will no doubt improve in accuracy, but will it ever be able to offer a bespoke, personalised service that replicates the human ear?  Not yet, that’s for sure.

