Still tinkering with this one
a terminal app i made for fun

tinytalk

Push-to-talk transcription that lives in your terminal. Hold space, talk, get text. It runs Whisper right on your machine, so nothing you say ever leaves it. I built it to mess around with on-device models and curses, and to see if I could make a terminal app that actually looks good.

Version: v0.3.0
Language: Python
Cloud calls: Zero

Here it is, running

A quick capture of the real thing mid-transcription.

[Image: tinytalk transcribing speech live in the terminal]

Pick a model, any model

There are five to choose from, and you just tap m to cycle through. It'll tell you which ones are installed, cached, or warmed up and ready to go.

Model    Notes                                  Status
TINY     fastest, rough edges                   cached
BASE     quick notes, short clips               cached
SMALL    the everyday pick                      cached
MEDIUM   long-form accuracy                     not installed
TURBO    large-model quality, near-realtime     hot

The bits I had fun with

A few parts of this I really enjoyed building, so here they are.

01

Making it run anywhere

I didn't want to write this thing three times, so it picks a backend depending on what you're on. Mac runs Whisper through MLX, Windows and Linux use faster-whisper, and it grabs the GPU if there's one going. The annoying part was hunting down models already sitting on your machine, and Windows being Windows about basically all of it.

MLX · faster-whisper
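
Here's a rough idea of what the backend pick could look like. This is only a sketch, not tinytalk's actual code: the function name `pick_backend` and the backend labels are made up for illustration, and the GPU check and the hunt for local models are left out entirely.

```python
import platform


def pick_backend(system=None):
    """Return a backend label for an OS name (hypothetical helper)."""
    # platform.system() reports "Darwin" on macOS, "Windows", or "Linux".
    system = system or platform.system()
    if system == "Darwin":
        return "mlx-whisper"      # Mac: Whisper through MLX
    return "faster-whisper"       # Windows and Linux


print(pick_backend())
```

Passing the OS name in as an argument keeps the choice easy to test on a single machine.
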
02

Everything stays yours

I really like the idea that what you say just stays with you, so every transcript gets locked up (AES-256-GCM) before it ever touches the disk. You can flip back through your last five with [ and ], and scroll the long ones with the arrows.

AES-256-GCM · local log
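
For a feel of the encrypt-before-disk step, here's a minimal sketch using the `cryptography` package's `AESGCM` class. The helper names, the nonce-in-front layout, and generating the key on the spot are all assumptions for the example; how tinytalk really stores its key and log isn't covered here.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # 96-bit nonce, the usual size for GCM


def encrypt_transcript(key, text):
    """Encrypt text and return nonce + ciphertext (assumed layout)."""
    nonce = os.urandom(NONCE_SIZE)  # must be fresh for every message
    ciphertext = AESGCM(key).encrypt(nonce, text.encode("utf-8"), None)
    return nonce + ciphertext


def decrypt_transcript(key, blob):
    """Split off the nonce, then decrypt and verify the rest."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(key).decrypt(nonce, ciphertext, None).decode("utf-8")


key = AESGCM.generate_key(bit_length=256)  # 32-byte key, AES-256
message = "hold space, talk, get text"
blob = encrypt_transcript(key, message)
print(decrypt_transcript(key, blob))
```

GCM also authenticates the data, so a tampered blob fails on decrypt instead of quietly returning garbage.
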
03

The waveform is my favourite part

When you talk, that live waveform is drawn in pure curses out of box-drawing characters (with a plainer ASCII version for terminals that throw a fit). This is the part I kept coming back to. I wanted it to feel really good to watch, and I think it does.

curses · RMS meter
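
The core of a meter like this can be sketched without curses at all: take the RMS of each audio chunk and map it to a glyph. Everything named below is an assumption for the example, including the glyph sets (block characters, plus an ASCII ramp as the fallback) and samples being floats between -1 and 1.

```python
import math

BLOCKS = " ▁▂▃▄▅▆▇█"  # assumed glyph ramp, quiet to loud
ASCII = " .:-=+*#@"   # assumed plainer fallback, same length


def rms(samples):
    """Root mean square of a chunk of samples; 0.0 for an empty chunk."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_glyph(samples, glyphs=BLOCKS):
    """Pick the glyph matching the chunk's loudness."""
    level = min(rms(samples), 1.0)  # clamp in case of clipping
    return glyphs[round(level * (len(glyphs) - 1))]


def render_row(chunks, glyphs=BLOCKS):
    """One character per chunk, left to right."""
    return "".join(level_glyph(chunk, glyphs) for chunk in chunks)


print(render_row([[0.0], [0.25, -0.25], [0.5, -0.5], [1.0, -1.0]]))
```

In a real curses app the returned string would be handed to something like `addstr` on each refresh; the sketch stops at building the row.
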
04

It does files too

It's not only live. Point it at something with --input file.mp3 and it runs that through the exact same pipeline. There's also a little dev overlay that shows timing, the realtime ratio, and word counts, which is genuinely fun to watch while it works.

--input · dev overlay
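
The numbers in an overlay like that are simple to work out. In this sketch the function name and the dictionary keys are invented, and the realtime ratio is assumed to mean seconds of audio per second of processing (so higher is faster); the real overlay may define it the other way round.

```python
def overlay_stats(audio_seconds, processing_seconds, transcript):
    """Compute timing, realtime ratio, and word count (hypothetical helper)."""
    if processing_seconds > 0:
        ratio = audio_seconds / processing_seconds
    else:
        ratio = float("inf")  # avoid dividing by zero on instant runs
    return {
        "processing_seconds": processing_seconds,
        "realtime_ratio": ratio,
        "words": len(transcript.split()),
    }


stats = overlay_stats(10.0, 2.5, "hold space talk get text")
print(stats)
```
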
why i made it

I wanted to play with a few things at once.

Mostly I wanted to mess around with on-device Whisper models and learn curses properly, and see if I could make a terminal app that actually looks nice instead of just functional. Everything staying on your machine was the whole point, so the encryption came along naturally. It's the kind of thing I build because it's fun, and it just happens to be useful too.

The stuff it's made of

Python
curses
MLX Whisper
faster-whisper
sounddevice
cryptography