Introducing DictaBot, a custom GPT for recording and transcribing ideas
Recently, I’ve been using voice memos for the early stages of writing and creating. I often find speaking out loud is the best way to capture fuzzy thoughts without the pressure of composing perfect prose. These idea kernels are a great starting point for bloggies like this one, emails, thank-you notes, journal entries — you name it.
To make this process easier, I created a custom GPT: DictaBot, a digital assistant for recording and transcribing ideas.
Here’s a rundown of how to use DictaBot, and the problems it solves for me.
How to Use DictaBot
To use DictaBot (and other custom GPTs), you’ll need a ChatGPT Plus subscription and the ChatGPT mobile app.
In the ChatGPT mobile app, start a new DictaBot chat, then press the microphone button to start recording your voice.
When you’re done recording, press the Stop button. (I don’t know the max time limit; I’ve tested as long as 5 minutes with no problem.) DictaBot will transcribe the recording and populate the transcription in the chat’s Message field. Press the submit button to add the transcription to the chat itself. You can copy the transcription from there and paste it into other apps (or other ChatGPT chats) depending on what you want to do with it.
After adding the transcription to the chat, DictaBot should respond by saying only: “I hope the transcript was accurate. You can record another idea, or ask me to apply my GPT powers to the transcription.”
By default, ChatGPT automatically summarizes, edits, or otherwise riffs on whatever you submit via the message field. But I’m using DictaBot for capturing raw ideas, not fleshing them out, so usually I just want the transcript and nothing more.
To prevent DictaBot from summarizing, editing, or riffing, I crafted the GPT’s configuration prompt so that it replies only with the above response. I’ve found DictaBot returns only this response as expected about 90 percent of the time. (Anecdotally, hiccups seem to occur after a ChatGPT update.) Note that you can still type in messages or prompts if you want DictaBot to summarize or perform other standard ChatGPT functions.
Here’s the full DictaBot configuration prompt (aka “Instructions”):
This GPT’s main role is to act as a digital assistant for recording and transcribing ideas. It should be capable of quickly understanding and organizing spoken or written thoughts into a clear, coherent format. The GPT should prioritize accuracy and clarity in transcription. After a user submits their transcript, the GPT should respond with these words exactly (omitting the quotation marks): “I hope the transcript was accurate. You can record another idea, or ask me to apply my GPT powers to the transcription.” Do not respond to the transcript submission by repeating the transcript, or with any other words except for (again omitting the quotation marks): “I hope the transcript was accurate. You can record another idea, or ask me to apply my GPT powers to the transcription.” The GPT can respond in other ways if the user submits a text prompt that is not a recording transcription. In this case, the GPT should maintain a professional, supportive tone when responding.
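If you want to put a number on how often DictaBot sticks to the script, here’s a minimal Python sketch. It assumes you’ve copied DictaBot’s replies out of your chats by hand; nothing here talks to ChatGPT. The function names (normalize, is_canned_reply, compliance_rate) are my own hypothetical helpers, not part of any ChatGPT tooling.

```python
# Canned reply copied from the DictaBot instructions above.
EXPECTED_REPLY = (
    "I hope the transcript was accurate. You can record another idea, "
    "or ask me to apply my GPT powers to the transcription."
)


def normalize(text: str) -> str:
    """Collapse runs of whitespace, then trim wrapping quotes and spaces."""
    collapsed = " ".join(text.split())
    return collapsed.strip("\"'“” ")


def is_canned_reply(reply: str) -> bool:
    """True if the reply is the canned response and nothing else."""
    return normalize(reply) == EXPECTED_REPLY


def compliance_rate(replies: list[str]) -> float:
    """Fraction of replies that match the canned response (0.0 if empty)."""
    if not replies:
        return 0.0
    return sum(is_canned_reply(r) for r in replies) / len(replies)


if __name__ == "__main__":
    # Hypothetical replies pasted in by hand, for illustration only.
    sample_replies = [
        EXPECTED_REPLY,
        "“" + EXPECTED_REPLY + "”",
        "Here is a summary of your idea: ...",
    ]
    print(f"Compliance: {compliance_rate(sample_replies):.0%}")
```

The check is deliberately strict: a reply that repeats the transcript or tacks on a summary counts as a miss, which matches what the instructions ask for. Wrapping quotation marks are forgiven, since the prompt has to tell the GPT to omit them.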
Problems DictaBot Solves
The main problem DictaBot solves for me is enabling what I call Still Processing mode in the creation process, vs. Composing mode.
Still Processing is the stage/state when I’m working through rough ideas that I haven’t fully articulated in my head. In contrast, Composing mode is when I already have a clear sense of what I want to say, and I’m crafting structure and prose. (Shaping mode is roughly in between.)
In Still Processing mode, I find formats like handwriting and voice recording best for drawing out fuzzy thoughts with minimal friction or pressure. The problem with voice recording is you need a transcription for recordings to be useful.
Pre-LLM transcription solutions were pricey and/or a hassle, so I tried dictation features in Google Docs and Apple Notes, thinking they’d be a cheap record-and-transcribe solution. But I kept getting stuck and couldn’t make progress — because it turns out there’s a subtle distinction between dictation and recording.
Dictation’s format and functionality are really for Composing mode:
The format of a blank doc with a blinking cursor signals “write!”
Dictation functionality can only transcribe a line or two at a time, not a long recording. So you really need to already know what you want to say; dictation can’t handle lengthy stream-of-consciousness riffing.
DictaBot solves this by keeping me solidly in Still Processing mode. I can thought-vom until I’ve captured the kernel of an idea, without fiddling with prose. As a bonus, I can do this on walks or in the car as well as at my desk. Later, the transcripts give me a starting point to shape and compose a draft.
Try It Out!
Give DictaBot a try and let me know what you think!
And yes, I used DALL-E to create the cute DictaBot logo.