The world of AI is developing at a rapid pace, and many organisations in the research sector are moving to Automatic Speech Recognition (ASR), understandably in an attempt to save costs. However, many are finding it to be a false economy.
ASR software has been demonstrated to produce, at best, 80% accuracy with a “vanilla” recording, in which people speak very clearly and slowly so that the speech can be accurately deciphered. In the real world of human conversation, however, that is rarely the case, and some researchers are turning back to human transcription services, realising that “cheap transcription” is not, in fact, cost-saving at all.
But just why is it that transcription is so difficult for AI software to get right? What is it about human transcribers that makes their transcriptions still absolutely vital?
In three little words, it’s The Human Ear. This amazing piece of kit that we’re all armed with brings so much to bear when transcribing audio recordings.
- Distinguishing between accents and dialects.
Even if a transcriber is unfamiliar with a particular accent or dialect within an audio clip, they can use their natural ability to understand human language and make sense of it. As long as we understand the base language used, humans can make connections to comprehend unfamiliar sounds within words and sentences. Human transcribers can use this skill to put together a clear and accurate transcription that conveys meaning.
Automated AI software will listen to the recording and try to match the sounds to the closest-sounding word in its chosen language – which may not be correct!
- Audio quality and background noise.
The quality of the audio and the amount of background noise can create a problem for both human and AI transcribers.
Human transcriptionists can find the words and conversations amidst the background noise. They can rewind, play back, and go over the recording section by section to uncover the parts that they find particularly difficult to make out – and they can do it!
- Technical and industry-specific terms.
Human transcribers will be experienced enough to deal with different niches. They can Google unfamiliar place names and medical or academic jargon. Human transcriptionists build expertise in an area the more they encounter it.
AI software can struggle to understand the lexicon associated with these specialist areas. Therefore, it will not be able to create as comprehensive a transcript from the recording as a human transcriber would.
- Multiple and overlapping speakers.
Humans have the ability to keep up with multiple threads of conversation, all within one recording. The dialogue can move back and forth quickly, with some speakers attempting to cut in, but human ears can differentiate between voices and pick out interruptions.
AI software can end up getting confused by multiple layers of conversation, and a professional debate could be transcribed as nonsense.
Transcribe It is the UK’s longest-established human transcription services provider. For the past 29 years, we have continuously adapted and flexed what we offer to meet the needs of our clients, and we will continue to do so. Please contact us with any transcription queries; we are always happy to help!
You can give us a call on 01992 445411 if you would like to discuss how we can help you with your UK transcription needs.
Alternatively, you can choose to email us your transcription enquiry at email@example.com and we’ll be in touch with a quote as soon as possible.