Summarize audio, YouTube, and documents in OpenClaw.
Use SpeakNotes in OpenClaw to summarize recordings, YouTube videos, and uploaded documents in one agent workflow.
- Summarize audio files, YouTube links, and documents in one flow
- Structured outputs for notes, meeting recaps, and handoffs
- Built on the same SpeakNotes stack used in production
How it works
Three steps from media input to polished output.
Get your API key
Create a SpeakNotes API key and keep it available in your OpenClaw environment.
Install the skill
Use the downloadable SKILL.md or publish it to your own skill registry.
Run prompts
Specify the source, the summary style, and the destination in a single instruction.
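As a rough illustration of the first step, the key can be read from the environment once and checked before any prompt runs. This is a minimal sketch, not the official setup: the variable name `SPEAKNOTES_API_KEY` and the helper `load_api_key` are assumptions made for the example, so substitute whatever name your OpenClaw environment actually uses.

```python
import os

# Hypothetical variable name (an assumption for this sketch);
# use the name your OpenClaw environment actually expects.
ENV_VAR = "SPEAKNOTES_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, or fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{ENV_VAR} is not set; create a SpeakNotes API key and export it first."
        )
    return key
```

Failing early like this surfaces a missing key at startup instead of halfway through a transcription request.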
Prompt examples
Use direct, outcome-based prompts to get production-ready outputs.
You
Transcribe the interview at this URL and return it in meeting-notes format with action owners.
Agent
Done. I generated a structured meeting summary with owners, deadlines, and risks.
You
Summarize this YouTube link into bulletpoints for our weekly product update.
Agent
Done. I extracted key updates, grouped by product area, and wrote a concise recap.
You
Take this transcript text and generate a clean note plus a one-paragraph executive brief.
Agent
Done. I returned a polished note and an executive summary ready to share.
What you get
Everything needed for AI-first media workflows.
Media and text inputs
Handle audio URLs, transcript text, and YouTube links in a unified interface.
50+ language support
Pass a language hint when needed, or let the transcription service auto-detect the language.
Structured summary styles
Generate note, transcript, bulletpoints, or meeting-notes outputs with consistent formatting.
Meeting-ready outputs
Produce action-focused recaps that fit team standups, customer calls, and retros.
Workflow handoff
Push the resulting content into your internal docs, CRM notes, or publish pipeline.
Production infrastructure
Runs on the same backend services powering SpeakNotes web and meeting flows.
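To make the input types, language hints, and summary styles above concrete, here is a minimal sketch of how a request might be assembled before it is handed to the skill. The four style names come from this page; everything else (the `build_request` helper and the `input_type`, `input`, `style`, and `language` field names) is hypothetical and does not describe the actual SpeakNotes API.

```python
from typing import Optional
from urllib.parse import urlparse

# Style names as listed on this page.
STYLES = {"note", "transcript", "bulletpoints", "meeting-notes"}

# Hosts treated as YouTube links in this sketch.
YOUTUBE_HOSTS = ("youtube.com", "youtu.be")


def build_request(source: str, style: str, language: Optional[str] = None) -> dict:
    """Build a request payload. All field names are hypothetical."""
    if style not in STYLES:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(STYLES)}")
    source = source.strip()
    if not source:
        raise ValueError("source must not be empty")

    if source.startswith(("http://", "https://")):
        host = urlparse(source).hostname or ""
        is_youtube = any(host == h or host.endswith("." + h) for h in YOUTUBE_HOSTS)
        kind = "youtube" if is_youtube else "audio_url"
    else:
        # Anything that is not a URL is treated as transcript text.
        kind = "text"

    payload = {"input_type": kind, "input": source, "style": style}
    if language:
        # Optional hint; omitting it leaves detection to the service.
        payload["language"] = language
    return payload
```

For example, `build_request("https://youtu.be/abc", "bulletpoints")` classifies the input as a YouTube link, and passing an unlisted style raises a `ValueError` before anything is sent.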
Ready to wire SpeakNotes into OpenClaw?
Use the skill and MCP server together for fast transcript-to-summary automation.
