Manual Transcription Tools?

Hi,

I’m still looking for manual transcription tools. Really at a loss. I can’t type nearly as fast as the recordings run, so I have to stop, type up what I can, rewind, start, etc.

If I’m doing everything on the computer, I have to switch focus between apps, which slows things down even further. If I use a separate CD player, it’s slightly better.

I’ve tried using automatic transcription tools, but they struggle with background noise, interruptions, numbers, names, sentence fragments, etc., and seem to turn everything into an extended nonsensical monologue.

I’m wondering if there are audio tools, preferably for Linux, that would let me type the text, and stop and resume the recordings, without switching focus, would let me slow down the recordings, and would make it easier to back up without getting lost.

I found some apps that vary playback speed, mainly aimed at musicians. Slowing the audio down that way made the speech harder to understand, so it wasn’t an improvement.
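For anyone who wants to experiment with slowing a recording from a script, here is a minimal sketch that only builds an ffmpeg command line using its `atempo` audio filter, which changes tempo without shifting pitch. The helper name and file names are made up for illustration, and the 0.5 to 2.0 limit is a deliberately conservative range for a single `atempo` stage. Whether the result is any easier to follow than the musician apps above is not something this thread settles.

```python
def slowdown_command(src, dst, tempo=0.8):
    """Build (but do not run) an ffmpeg command that changes playback tempo.

    `tempo` below 1.0 slows the audio down; 0.8 means 80% of normal speed.
    The name and default are illustrative assumptions, not from any tool
    mentioned in the thread.
    """
    if not 0.5 <= tempo <= 2.0:
        raise ValueError("tempo must be between 0.5 and 2.0 for a single atempo stage")
    # -filter:a applies an audio filter; atempo keeps the pitch unchanged.
    return ["ffmpeg", "-i", src, "-filter:a", f"atempo={tempo}", dst]


if __name__ == "__main__":
    # Print the command so it can be inspected or passed to subprocess.run().
    print(" ".join(slowdown_command("interview.wav", "interview_slow.wav", 0.75)))
```

The list form can be handed to `subprocess.run()` as-is if ffmpeg is installed.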

After someone suggested “Whisper,” I tried that in Speech Note; it worked a lot better than the other models I’ve tried.

I can use that as a starting point, and then go over the results by hand.
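In case it helps someone with the "go over the results by hand" step, here is a rough sketch of turning Whisper-style output into a timestamped draft, so each line can be found again in the recording. It assumes segments shaped like the ones the open-source `whisper` Python package returns from `transcribe()` (dicts with `start`, `end`, and `text`); whether Speech Note exports anything in this shape is an assumption, and the function names are made up.

```python
def format_timestamp(seconds):
    """Render a position in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_draft(segments):
    """Turn a list of {'start', 'end', 'text'} dicts into a reviewable draft.

    Each non-empty segment becomes one line prefixed with its start time,
    which makes it easier to jump back to the right spot while correcting.
    """
    lines = []
    for segment in segments:
        text = segment["text"].strip()
        if text:
            lines.append(f"[{format_timestamp(segment['start'])}] {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data, not real transcription output.
    sample = [
        {"start": 0.0, "end": 4.2, "text": " Hello there."},
        {"start": 65.5, "end": 70.0, "text": " Second line."},
    ]
    print(segments_to_draft(sample))
```

One line per segment keeps the draft easy to correct in any plain text editor.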

And as an aid to manual transcription:

https://speechandtech.eu/tech-tools/transcription

I know this thread is old but just in case, here’s what I use:

I usually use the Azure Cognitive Services (well, now they’ve renamed it Azure AI, because OF COURSE) speech-to-text tools. They’re not the cheapest, and of course they’re cloud-based, but they handle “accented English” and “foreigner English” better than other tools, the support for Spanish and Catalan is leagues better (which may or may not be relevant to you), and for interviews and meetings they even support voice tagging (basically, they assign a tag to each distinct voice they hear, so you can select which ones you want).
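To give an idea of what "select which ones you want" looks like once the results come back, here is a small sketch of filtering tagged utterances by speaker. The data layout (a list of dicts with `speaker` and `text` keys) and the tag names are assumptions for illustration only; this is not the format Azure returns, and the real response would need to be mapped into it first.

```python
def select_speakers(utterances, keep):
    """Keep only the utterances whose speaker tag is in `keep`.

    `utterances` is assumed to be a list of {'speaker': str, 'text': str}
    dicts in the order they were spoken; that order is preserved.
    """
    wanted = set(keep)
    return [u for u in utterances if u["speaker"] in wanted]


def render(utterances):
    """Format utterances as 'TAG: text' lines for reading or editing."""
    return "\n".join(f"{u['speaker']}: {u['text']}" for u in utterances)


if __name__ == "__main__":
    # Hypothetical sample data with made-up speaker tags.
    meeting = [
        {"speaker": "Guest-1", "text": "Shall we start?"},
        {"speaker": "Guest-2", "text": "One moment."},
        {"speaker": "Guest-1", "text": "Sure."},
    ]
    print(render(select_speakers(meeting, ["Guest-1"])))
```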

It may not be for you, though: for example, I don’t know whether there’s a ready-made tool built on Azure (I send the packages using Python scripts on Linux), and of course there’s the whole cloud-based, Microsoft-owned thing, etc.

I tried Whisper but did not like it very much.