AI Speaker Diarisation
Split a transcript by speaker. Useful for interviews and podcasts — labels every line "Speaker 1 / Speaker 2 / ...".
| Mode | Speed | Quality | Best for |
|---|---|---|---|
| VAD + embeddings | Fast | Who-spoke-when labels | Interview/podcast diarisation |
AI Speaker Diarisation needs the standard AI tier (0.1 GB model).
Quick start
Drop in as many audio files as you like, or click to browse. Files are processed one at a time, and results are output as srt and json.
You must opt into the required tier before this capability runs.
How it works
1. Add your audio
   Drop or select the audio you want to process. It stays on your device.
2. Run the model in-browser
   AI Speaker Diarisation loads its model (~100 MB) once, caches it, then runs locally in a worker. No upload.
3. Download the srt
   Preview the result and download the srt. Re-run with different settings anytime.
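To make the last step concrete, here is a minimal sketch of how speaker-labelled segments could be written out as srt. The `SpeakerSegment` shape and the function names are hypothetical, chosen for illustration; the tool's actual internal format is not documented on this page.

```typescript
// Hypothetical segment shape (an assumption, not the tool's documented output).
interface SpeakerSegment {
  speaker: number;  // 1-based speaker index, as in "Speaker 1 / Speaker 2"
  startSec: number; // segment start, in seconds
  endSec: number;   // segment end, in seconds
  text: string;     // transcript text for this segment
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Build an SRT document: numbered cues separated by blank lines.
function toSrt(segments: SpeakerSegment[]): string {
  const cues = segments.map((seg, i) =>
    [
      String(i + 1),
      `${toSrtTimestamp(seg.startSec)} --> ${toSrtTimestamp(seg.endSec)}`,
      `Speaker ${seg.speaker}: ${seg.text}`,
    ].join("\n")
  );
  return cues.join("\n\n") + "\n";
}
```

Each cue carries its index, a time range, and the text prefixed with the speaker label, which is the layout most subtitle players expect from an srt file.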
Why it’s different
100% Private
Every model runs in your browser. Your files never leave your device — nothing is uploaded to a server.
True Alpha Channel
Exports preserve a real straight-alpha transparency channel (PNG / WebP / AVIF), not a baked-on background.
Free Forever
No account, no watermark, no credits. Open the tool and use it.
Works Offline
After the model downloads once it is cached, so the tool keeps working with no connection.
FAQ
How many speakers?
There is no fixed number. It separates voices using voice activity detection (VAD) plus speaker embeddings, and works best with a few distinct speakers.
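As a rough illustration of the "activity detection + speaker embeddings" idea, the sketch below groups per-segment embeddings into speakers with a simple greedy cosine-similarity rule. This is one common approach, not necessarily the one this tool uses; the function names and the `0.7` threshold are assumptions, and it presumes an earlier stage has already produced one embedding vector per detected speech segment.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  if (na === 0 || nb === 0) return 0;
  return dot / Math.sqrt(na * nb);
}

// Greedy clustering: assign each segment embedding to the most similar
// existing speaker, or start a new speaker if none reaches the threshold.
// The threshold value is an illustrative assumption.
function assignSpeakers(embeddings: number[][], threshold = 0.7): number[] {
  // Running sum of embeddings per speaker. Cosine similarity ignores scale,
  // so the sum can stand in for the centroid.
  const sums: number[][] = [];
  const labels: number[] = [];
  for (let e = 0; e < embeddings.length; e++) {
    const emb = embeddings[e];
    let best = -1;
    let bestSim = threshold;
    for (let k = 0; k < sums.length; k++) {
      const sim = cosine(emb, sums[k]);
      if (sim >= bestSim) {
        best = k;
        bestSim = sim;
      }
    }
    if (best === -1) {
      sums.push(emb.slice());
      best = sums.length - 1;
    } else {
      for (let i = 0; i < emb.length; i++) sums[best][i] += emb[i];
    }
    labels.push(best + 1); // 1-based, matching "Speaker 1 / Speaker 2"
  }
  return labels;
}
```

A rule like this also suggests why a few distinct voices work best: when two speakers' embeddings are very similar, they can fall on the same side of the threshold and be merged into one label.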
Is it free to use?
Yes — AI Speaker Diarisation is completely free. No account, no watermark, no credits, and no usage limits.
Do my files or prompts ever leave my device?
No. Everything runs locally in your browser via WebAssembly/WebGPU — there is no server that receives your files, prompts, or results.
Which browser and hardware do I need?
A modern browser. Chrome and Edge get WebGPU acceleration for the fastest results; Firefox and Safari run via WebAssembly. The model (~100 MB) downloads once, then is cached for offline use.
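The WebGPU-or-WebAssembly choice described above can be sketched as a small feature check. This is a simplified illustration, not the tool's actual logic: `Backend`, `NavigatorLike`, and `pickBackend` are hypothetical names, and the check only looks for the presence of the `gpu` property that WebGPU-capable browsers expose on `navigator`.

```typescript
type Backend = "webgpu" | "wasm";

// Minimal shape of the part of `navigator` this sketch inspects.
interface NavigatorLike {
  gpu?: unknown;
}

// Prefer WebGPU when the browser exposes it; otherwise use WebAssembly.
function pickBackend(nav: NavigatorLike): Backend {
  return nav.gpu !== undefined && nav.gpu !== null ? "webgpu" : "wasm";
}
```

In a page, this would be called with the browser's `navigator` object. A fuller check would also await `navigator.gpu.requestAdapter()`, which can resolve to `null` when no suitable adapter is available, and fall back to WebAssembly in that case.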
Can I use the results commercially?
Yes. You own everything you create — NSS makes no claim to the images, videos, or text you process or export.
Does it work on mobile?
Lightweight tools run on phones; heavier models prefer a desktop with a GPU. The tool picks the best path for your device and falls back gracefully where needed.
Where can I see a step-by-step guide?
There is a full walkthrough at /how-it-works/ai-speakers.
Ready to try AI Speaker Diarisation?
Free, private, no signup — runs right in your browser.