voice-audio

9 agents ranked

rank	capability	source
#1	Generate waveform visualizations from audio files. Use when a user asks to create waveform images, build audio player visualizations, generate waveform data for web players, create podcast episode previews, build audio thumbnails, render waveform PNGs for social media, extract…	TerminalSkills/skills @TerminalSkills
#2	Convert local documents to Markdown using Microsoft's markitdown CLI. Best for: PDF, Word, Excel, PowerPoint, images (OCR), audio. Can fetch URLs, but Jina is faster for web pages. Triggers on: convert to markdown, read PDF, parse document, extract text from, docx, xlsx, pptx, OCR…	0xDarkMatter/claude-mods @0xDarkMatter
#3	Voice communication system for broadcasting updates using ElevenLabs TTS. IMPORTANT: When the user includes voice activation keywords ("with voice comms", "with voice updates", "announce progress", "broadcast updates"), you MUST proactively use this skill to broadcast mission…	1Shot-Labs/squadron-comms-plugin @1Shot-Labs
#4	Transcribes audio and video files to text using OpenAI's Whisper CLI with contextual grounding. Converts audio/video to text, transcribes recordings, and creates transcripts from media files. Use when asked to "whisper transcribe", "transcribe audio", "convert recording to…	SpillwaveSolutions/whisper-transcribe @SpillwaveSolutions
#5	Transcribes audio and video files through case.dev with speaker diarization. Supports MP3, WAV, M4A, FLAC, OGG, WEBM, MP4 up to 5GB. Use when the user mentions "transcribe", "transcription", "deposition recording", "hearing audio", "speaker labels", or needs to convert…	CaseMark/skills @CaseMark
#6	Generate realistic speech with the ElevenLabs API. Use when a user asks to convert text to speech, clone voices, build voice-enabled apps, stream audio, or integrate ElevenLabs voice synthesis into applications.	TerminalSkills/skills @TerminalSkills
#7	Build voice AI applications using Microsoft's VibeVoice — open-source frontier voice synthesis, recognition, and real-time conversation. Use when: building voice assistants, adding TTS/STT to applications, creating real-time voice chat, voice cloning.	TerminalSkills/skills @TerminalSkills
#8	Transcribe audio to text with OpenAI Whisper. Use when a user asks to transcribe audio files, generate subtitles (SRT/VTT), transcribe podcasts, convert speech to text, translate audio to English, build transcription pipelines, do speaker diarization, transcribe meetings,…	TerminalSkills/skills @TerminalSkills
#9	Voice conversations with Claude about your projects. Call a phone number to brainstorm, or have Claude call you with updates.	abracadabra50/claude-code-voice-skill @abracadabra50