A new speech engine and a major new capability: Hedy can now identify who is speaking and show it in the transcript.
Speaker-Attributed Transcripts
Transcripts now show who said what. Each speaker gets a unique color and label, and the transcript is displayed turn by turn, with consecutive lines from the same speaker grouped together. This makes multi-person conversations much easier to follow.
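For readers curious what turn-by-turn grouping looks like in practice, here is a minimal Python sketch. It is an illustration only, not Hedy's actual implementation: the `Segment` type and the `group_into_turns` function are hypothetical names, and the sketch assumes each transcript segment already carries a speaker label.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Segment:
    """One transcribed chunk with its speaker label (hypothetical structure)."""
    speaker: str
    text: str


def group_into_turns(segments):
    """Merge consecutive segments from the same speaker into a single turn."""
    turns = []
    # groupby only merges adjacent items, so a speaker who talks again
    # later in the conversation starts a new turn.
    for speaker, run in groupby(segments, key=lambda s: s.speaker):
        turns.append((speaker, " ".join(s.text for s in run)))
    return turns


segments = [
    Segment("Alice", "Let's get started."),
    Segment("Alice", "First item is the budget."),
    Segment("Speaker 1", "Sounds good."),
    Segment("Alice", "Great."),
]
turns = group_into_turns(segments)
```

Here `turns` holds three entries: Alice's first two segments merged into one turn, then the reply from "Speaker 1", then a new turn for Alice.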
How Speaker Identification Works
- During recording: Speaker labels appear in real time as the conversation happens
- After recording: A more accurate pass runs on the complete audio, refining speaker assignments for the best result
- In summaries and notes: Anonymous labels like "Speaker 1" are cleaned up, so only named speakers appear
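The label cleanup in summaries and notes can be pictured with a short Python sketch. This is an assumption about the behavior, not a description of Hedy's code: the `format_for_summary` function is a hypothetical name, the sketch assumes anonymous labels follow the pattern "Speaker" plus a number, and it assumes the spoken text is kept while only the anonymous label is dropped.

```python
import re

# Assumed pattern for anonymous labels such as "Speaker 1" or "Speaker 12".
ANONYMOUS_LABEL = re.compile(r"^Speaker \d+$")


def format_for_summary(turns):
    """Return note lines, attributing text only to named speakers."""
    lines = []
    for speaker, text in turns:
        if ANONYMOUS_LABEL.match(speaker):
            # Anonymous speaker: keep the content, drop the label.
            lines.append(text)
        else:
            lines.append(f"{speaker}: {text}")
    return lines


notes = format_for_summary([
    ("Alice", "The budget is approved."),
    ("Speaker 1", "We still need a timeline."),
])
```

In this sketch, `notes` keeps the attribution for Alice and leaves the second line unattributed.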
Parakeet TDT v3 Engine
Speaker identification is powered by a new speech engine based on NVIDIA's Parakeet TDT v3 model, which also offers improved transcription accuracy compared to Whisper.
How to enable:
- Open Settings and go to Transcription
- Select "Parakeet (Beta)" from the speech engine dropdown
- Download the required models (a progress indicator shows each stage)
- Start a session as normal
Parakeet also works when importing audio files, with full speaker diarization applied to the imported audio.
Beta notes:
Parakeet is currently available only on macOS with Apple Silicon. You can switch back to Whisper at any time in Settings.