ADHD · Accessibility

The ADHD survival guide to voice messages.

If you keep deferring voice notes for days and feeling guilty about it, it's not a character flaw. It's working-memory load.

Maksim Shin · April 18, 2026 · 6 min read

This post is personal. I have ADHD. I failed at voice messages for two years before I did anything about it. My wife sent me four-minute voice notes on WhatsApp. I'd see them, feel a flash of pre-emptive exhaustion, swipe away, and then never come back. Repeat. Relationship tax compounding.

I'm going to try to explain why voice messages are uniquely hard on ADHD brains, and describe the workflow that finally fixed it for me.

Why voice notes are harder than text

Three reasons, stacked.

1. Listening is linear and costly for working memory

Listening to a voice message is a strictly linear, time-locked task. You cannot skim. You cannot re-read paragraph three without re-listening to paragraphs one and two. Every second the speaker spends not getting to the point, you're holding the open question "what do they want?" in working memory.

Research on working memory in ADHD (Kofler et al., 2018; Alderson et al., 2013) suggests that sustained auditory working memory is one of the most reliably impaired executive functions in ADHD. A four-minute voice note is a four-minute working-memory tax. Text is not.

2. No "skim first" affordance

Before I read a text message of any length, my eyes involuntarily scan for the point. "When are we meeting?" — two seconds and I have it. With a voice note, there is no equivalent. You either listen to the whole thing or you don't know.

This is why a 3-line text summary of a 4-minute voice note isn't a convenience — it's a different cognitive task entirely. Reading the summary is the skim that voice messages structurally deny you.

3. Initiation cost

ADHD isn't primarily an inability to focus. It's primarily an inability to start. A voice message requires active initiation: find headphones, make sure the environment is quiet, commit 3 to 5 uninterrupted minutes. Text requires none of this. You just… read it.

Every time you see that 4:17 voice note and swipe away, your brain is doing a cost-benefit calculation and correctly concluding "not now." The problem is that "not now" becomes "not today" becomes "oh god it's been a week" becomes guilt.

What doesn't work

"Just listen to them faster." Speed-listening works for podcasts with redundant content. It doesn't help with a voice note that's already terse — you're speeding through something that already requires full attention, which is the opposite of what you need.
"Schedule a 'voice notes' time block." You won't. The same mechanism that makes you defer each note individually will make you defer the block.
"Ask people to text you instead." Some will. Most won't. You're trying to renegotiate a cultural norm unilaterally.
"Just deal with it, others don't struggle with this." This is the one that kept me stuck for two years. It's not actually true, and even if it were, telling an ADHD brain to "just deal with it" is exactly the intervention that never works.

What works

Remove the initiation cost. Shift the task from linear audio to skimmable text. That's the whole mechanism.

My workflow now:

Voice message arrives in WhatsApp, Telegram, or iMessage.
I long-press the bubble → Share → VSkip.
Three seconds later I have a 3-line summary and a list of extracted action items.
If I need more detail, I tap to expand the full transcript and skim it in 10 seconds.
If there's ambiguity I'd have asked about, I use the AI follow-up chat instead of re-listening.

Typical time per voice message: under 10 seconds. Total guilt: zero.

I built the app that does this because I couldn't find one that specifically targeted this workflow. Otter and Rev are for meeting recordings. iOS 26 transcription is for iMessage only. Nothing existed for the exact case of "consume other people's voice notes from any messenger, fast, as text."

For your workflow, specifically

You don't have to use VSkip. The mechanism is the point. If you can find any way to transform an incoming voice message into skimmable text before you consume it, you've cut the cognitive load 10x. Options, from lowest friction to highest:

VSkip: purpose-built for this, Share-sheet workflow, ~3s.
iOS 26 Transcribe action on iMessage: free and on-device, but it only works in iMessage and only gives you a full transcript (no summary).
Otter.ai: you have to upload each audio file manually and export is slow, but it's a good fit if you already have lots of meetings to transcribe.

The key is that any of these is better than the cycle of swipe-defer-guilt. Pick whichever has the lowest friction for you and commit to it for two weeks.

Try the workflow

VSkip's 7-day free trial includes 2 summaries a day. That's enough to test the pattern on your real voice notes for a week.

Download on the App Store · iOS 26+ · No account · OpenDyslexic font support built in