Medical transcription is often seen as one of the more mundane tasks in the doctor’s office. Yet it is vitally important for ensuring that medical records are accurate, and that all of the physician’s observations, orders, and conversations with patients are properly documented.
Google wanted to see whether the voice recognition technologies already available in Google Assistant, Google Home, and Google Translate could be used to automate the transcription process and help doctors, as well as medical scribes, take notes more quickly. In a recent proof-of-concept study, Google built a system around two automatic speech recognition models, a Connectionist Temporal Classification (CTC) phoneme-based model and a Listen, Attend and Spell (LAS) grapheme-based model, and trained them on over 14,000 hours of recorded speech. The result was a respectable word error rate of 20.1% for the CTC model and 18.9% for the LAS model, although the CTC model required the researchers to clean up noise in the recordings before processing them.
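For readers unfamiliar with the metric: word error rate is conventionally computed as the word-level edit distance (insertions, deletions, and substitutions) between the recognizer’s output and a reference transcript, divided by the number of words in the reference. The sketch below shows that standard calculation; the example sentences are illustrative and not taken from the study.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table: d[i][j] is the edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("reports" -> "report") and one deletion ("chest")
# against a six-word reference gives 2/6.
print(wer("the patient reports mild chest pain",
          "the patient report mild pain"))
```

So a WER of 18.9% means roughly one word in five of the reference transcript was inserted, dropped, or substituted in the recognizer’s output.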
Based on these favorable results, Google will soon start working with physicians and researchers at Stanford University to investigate what types of clinically relevant information can be automatically extracted from medical conversations, with the goal of reducing the time spent on documentation and increasing productive time with patients.
Journal article in arXiv (PDF): Speech Recognition for Medical Conversations…
More info at the Google Research Blog: Understanding Medical Conversations…