Skill Detail

Whisper Diarization Post-Processor

Adds speaker diarization to OpenAI Whisper transcription output using the pyannote.audio pipeline and speechbrain embeddings. Aligns word-level timestamps from whisper-timestamped with speaker segments to generate multi-speaker meeting transcripts.


Media & Transcription Claude Code Security Reviewed

Tool match: whisper (⭐ 97.8k GitHub stars, MIT license)

INSTALL WITH ANY AGENT

npx skills add agentskillexchange/skills --skill whisper-diarization-post-processor

Works best when you want a reusable capability, not another fragile one-off prompt.

View source

At a glance

Last updated

Mar 24, 2026

Quick brief

The Whisper Diarization Post-Processor adds speaker identification and word-level timestamp alignment to raw OpenAI Whisper transcription output. It combines Whisper speech recognition with neural speaker diarization to produce speaker-attributed meeting transcripts.

How it works

What this skill actually does

Overview

Key Capabilities

This skill processes Whisper output through the pyannote.audio speaker diarization pipeline, using pre-trained speaker embedding models from speechbrain for voice characterization. It aligns word-level timestamps from whisper-timestamped with speaker segments using an optimal assignment algorithm that handles overlapping speech regions.
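To make the alignment step concrete, here is a minimal sketch that labels each word with the speaker turn it overlaps most. It is not the skill's actual algorithm: the listing describes an optimal assignment that handles overlapping speech, whereas this sketch uses a simpler greedy maximum-overlap rule. The `Word` and `Turn` types, the `assign_speakers` function, and the `UNKNOWN` label are hypothetical names chosen for illustration, and the inputs are assumed to be already extracted from the whisper-timestamped and pyannote.audio outputs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Turn:
    speaker: str
    start: float  # seconds
    end: float    # seconds


def overlap(word: Word, turn: Turn) -> float:
    """Length in seconds of the intersection of a word and a speaker turn."""
    return max(0.0, min(word.end, turn.end) - max(word.start, turn.start))


def gap(word: Word, turn: Turn) -> float:
    """Distance from the word's midpoint to the turn (0.0 if it lies inside)."""
    mid = (word.start + word.end) / 2
    return max(turn.start - mid, mid - turn.end, 0.0)


def assign_speakers(words: List[Word], turns: List[Turn]) -> List[Tuple[str, str]]:
    """Return (speaker, word text) pairs using a greedy maximum-overlap rule."""
    labeled = []
    for word in words:
        if not turns:
            # Assumption: with no diarization output, fall back to a placeholder.
            labeled.append(("UNKNOWN", word.text))
            continue
        best = max(turns, key=lambda t: overlap(word, t))
        if overlap(word, best) == 0.0:
            # The word falls in a silence gap: pick the nearest turn instead.
            best = min(turns, key=lambda t: gap(word, t))
        labeled.append((best.speaker, word.text))
    return labeled
```

When two turns overlap, a word inside the overlap goes to whichever turn covers more of it, so simultaneous speech is attributed to a single speaker. A production implementation would need a richer rule for those regions.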

Output Formats

Generates formatted transcripts with speaker labels, timestamps, and confidence scores in SRT, VTT, and structured JSON. Supports custom speaker name mapping through voice enrollment and handles multi-language transcription with automatic language detection. Also produces analytics summaries for meeting intelligence dashboards, including per-speaker talk-time ratios, interruption counts, and topic transition markers.
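As an illustration of two of the outputs named above, the sketch below renders speaker-labeled segments as SRT cues and computes per-speaker talk-time ratios. The segment layout `(speaker, start, end, text)`, the function names, and the `[SPEAKER] text` cue style are assumptions made for this example; the skill's real output schema is not documented in this listing.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed segment layout: (speaker label, start seconds, end seconds, text).
Segment = Tuple[str, float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (speaker, start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"[{speaker}] {text}\n"
        )
    return "\n".join(cues)


def talk_time_ratios(segments: List[Segment]) -> Dict[str, float]:
    """Return each speaker's share of the total spoken time."""
    totals: Dict[str, float] = defaultdict(float)
    for speaker, start, end, _text in segments:
        totals[speaker] += max(0.0, end - start)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {speaker: spoken / grand_total for speaker, spoken in totals.items()}
```

The ratios are computed against summed speaking time rather than wall-clock duration, so overlapping speech counts once for each speaker involved and silence is excluded.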

Best fit

When to reach for it

Best when the job fits Media & Transcription.
Works naturally with Claude Code setups.

Trust & provenance

Why this listing is credible

Built around the whisper toolchain.
Trust status: Security Reviewed.
97.8k GitHub stars on the linked upstream source.
License: MIT.
Last updated Mar 24, 2026.
