Self-hosted web application for painless speech-to-text transcription of audio files (Spring 2016)
Technologies used: Go (for concurrent transcription requests), Sphinx (to transcribe audio files for free), Backblaze (to store audio files in the cloud after transcription is complete), IBM Speech-To-Text API (to transcribe audio files), MongoDB (to store transcription information such as timestamps, confidence, and keywords)
About the Project
“One of our greatest challenges in tracking topics is dealing with the sheer volume of content generated online. The most tedious of these are the audio and video because there is no effective way to quickly review the material without losing crucial detail.”
In the best-case usage scenario, organizations feed the application audio and video content (in a variety of formats) hosted on services such as YouTube, SoundCloud, and other sources, and receive transcriptions that can be fed into existing analytical engines.
Spring 2016 Team members:
Features:
Transcribe audio given an audio URL and an optional search word, using either the free Sphinx library or the IBM Speech-To-Text API
Send the transcription to a given email address
Perform concurrent transcriptions
Store audio files in the cloud after transcription is complete
Store transcription information (such as timestamps, confidence, and keywords)
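The concurrency and record-keeping features above can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the Word and Transcription types, the TranscribeFunc signature, transcribeAll, and the fakeTranscribe placeholder are all hypothetical names, and the placeholder returns canned words instead of calling Sphinx or the IBM Speech-To-Text API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
	"time"
)

// Word is one recognized word with its timing and confidence.
// Field names are illustrative, not the project's real schema.
type Word struct {
	Text       string
	Start, End time.Duration
	Confidence float64
}

// Transcription is the record kept for one audio file.
type Transcription struct {
	AudioURL string
	Words    []Word
	Keywords []string // search words that were found in the transcript
	Err      error
}

// TranscribeFunc stands in for a call to a real engine (assumed signature).
type TranscribeFunc func(audioURL string) ([]Word, error)

// transcribeAll starts one goroutine per URL and returns results in input order.
func transcribeAll(urls []string, searchWord string, transcribe TranscribeFunc) []Transcription {
	results := make([]Transcription, len(urls))
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			words, err := transcribe(u)
			t := Transcription{AudioURL: u, Words: words, Err: err}
			// Record the optional search word the first time it appears.
			for _, w := range words {
				if searchWord != "" && strings.EqualFold(w.Text, searchWord) {
					t.Keywords = append(t.Keywords, w.Text)
					break
				}
			}
			results[i] = t // each goroutine writes only its own slot
		}(i, u)
	}
	wg.Wait()
	return results
}

// fakeTranscribe is a placeholder engine that returns canned words.
func fakeTranscribe(audioURL string) ([]Word, error) {
	if audioURL == "" {
		return nil, errors.New("empty audio URL")
	}
	return []Word{
		{Text: "hello", Start: 0, End: 400 * time.Millisecond, Confidence: 0.93},
		{Text: "world", Start: 450 * time.Millisecond, End: 900 * time.Millisecond, Confidence: 0.88},
	}, nil
}

func main() {
	urls := []string{"https://example.com/a.wav", "https://example.com/b.wav"}
	for _, t := range transcribeAll(urls, "world", fakeTranscribe) {
		fmt.Printf("%s: %d words, keywords=%v, err=%v\n", t.AudioURL, len(t.Words), t.Keywords, t.Err)
	}
}
```

Starting one goroutine per URL keeps the sketch short. A real deployment would probably cap the number of simultaneous requests, for example with a buffered channel used as a semaphore, before storing each record and uploading the audio.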
A major technical challenge was providing automated audio transcription and storage that is fast, accurate, and customizable. We researched and tested different technologies to provide that functionality.
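One way to keep the choice of engine open while testing different technologies is to hide each engine behind a small interface. The Go sketch below is only an assumption about how that could look: Engine, Result, stubEngine, and bestResult are hypothetical names, and the stub engines return made-up text and confidence values. Real implementations would wrap Sphinx or the IBM Speech-To-Text API.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a minimal transcription result used to compare engines.
type Result struct {
	Engine     string
	Text       string
	Confidence float64
}

// Engine is a hypothetical abstraction over a speech-to-text backend.
type Engine interface {
	Name() string
	Transcribe(audioPath string) (Result, error)
}

// stubEngine returns canned output so the sketch runs without any
// speech library or API credentials. Its values are placeholders.
type stubEngine struct {
	name       string
	text       string
	confidence float64
}

func (e stubEngine) Name() string { return e.name }

func (e stubEngine) Transcribe(audioPath string) (Result, error) {
	if audioPath == "" {
		return Result{}, errors.New("empty audio path")
	}
	return Result{Engine: e.name, Text: e.text, Confidence: e.confidence}, nil
}

// bestResult runs every engine on the same file and keeps the result
// with the highest reported confidence, skipping engines that fail.
func bestResult(audioPath string, engines []Engine) (Result, error) {
	var best Result
	found := false
	for _, e := range engines {
		r, err := e.Transcribe(audioPath)
		if err != nil {
			continue
		}
		if !found || r.Confidence > best.Confidence {
			best, found = r, true
		}
	}
	if !found {
		return Result{}, errors.New("no engine produced a transcription")
	}
	return best, nil
}

func main() {
	engines := []Engine{
		stubEngine{name: "engine-a", text: "hello word", confidence: 0.71},
		stubEngine{name: "engine-b", text: "hello world", confidence: 0.90},
	}
	r, err := bestResult("clip.wav", engines)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %q (confidence %.2f)\n", r.Engine, r.Text, r.Confidence)
}
```

Confidence scores reported by different engines are not necessarily comparable, so picking the highest value is a simplification for illustration only.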