YouTube to Skill

Transform YouTube tutorials into AI agent skills that actually work.

The Problem

Video Knowledge is Trapped

YouTube contains millions of programming tutorials, but this knowledge is:

Inaccessible to AI assistants — AI agents cannot watch videos
Time-consuming to extract — A 10-minute video takes 10 minutes to watch
Often outdated — APIs change, libraries deprecate, best practices evolve
Not actionable — Narrative format ("so what we're gonna do...") ≠ executable instructions

The Skill Quality Problem

AI agent skills extend an agent's capabilities. However, creating effective skills requires understanding:

How agents discover skills (via description matching)
The difference between narrative and imperative content
Current state of mentioned technologies

Without this knowledge, generated skills are often:

Never activated (poor descriptions)
Unusable (incomplete instructions)
Harmful (outdated information)

The Solution

This skill automates the transformation pipeline:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   EXTRACT   │────▶│  IDENTIFY   │────▶│   VERIFY    │────▶│  TRANSFORM  │────▶│  GENERATE   │
│  transcript │     │technologies │     │  up-to-date │     │   content   │     │    skill    │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘

1. Extract

Pulls transcripts from YouTube using the captions API, falling back to Whisper when a video has no captions:

python scripts/extract_youtube.py "https://youtube.com/watch?v=..."

Output: JSON with metadata (title, channel, date) and full transcript.
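The internals of extract_youtube.py are not shown here. As a minimal sketch of one step any extractor needs, the snippet below pulls the video ID out of a URL using only the standard library. The function name parse_video_id and the set of URL shapes it accepts are assumptions for illustration, not the script's actual code.

```python
from urllib.parse import urlparse, parse_qs


def parse_video_id(url: str) -> str:
    """Return the video ID from a watch, short-link, embed, or shorts URL.

    Hypothetical helper; the real script may handle more URL shapes.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower().removeprefix("www.")
    video_id = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in {"youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the "v" query parameter.
            video_id = parse_qs(parsed.query).get("v", [""])[0]
        elif parsed.path.startswith(("/embed/", "/shorts/")):
            video_id = parsed.path.split("/")[2]
    if not video_id:
        raise ValueError(f"Could not find a video ID in {url!r}")
    return video_id
```

Failing loudly on an unrecognized URL keeps a bad link from reaching the caption or Whisper steps.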

2. Identify

Scans transcript for technologies:

Languages and versions
Frameworks and libraries
APIs and services
CLI tools
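One simple way to picture this scan is a keyword catalog matched against the transcript. The sketch below is an assumption about how such a pass could look, not the skill's actual method (the agent may do this step without any code). CATALOG and identify_technologies are hypothetical names, and the catalog is deliberately tiny.

```python
import re

# Hypothetical catalog: category -> names to look for. A real list would be far longer.
CATALOG = {
    "language": ["Python", "TypeScript", "Rust"],
    "framework": ["React", "Django", "FastAPI"],
    "cli": ["npm", "pip", "docker"],
}


def identify_technologies(transcript: str) -> dict[str, list[str]]:
    """Return catalog names found in the transcript, grouped by category.

    If the first mention of a name is followed by a version number
    (for example "React 18"), the version is appended to the label.
    """
    found: dict[str, list[str]] = {}
    for category, names in CATALOG.items():
        for name in names:
            # Whole-word match, optionally followed by a version like "18" or "v3.12".
            pattern = rf"\b{re.escape(name)}\b(?:\s+v?(\d+(?:\.\d+)*))?"
            match = re.search(pattern, transcript, flags=re.IGNORECASE)
            if match:
                version = match.group(1)
                label = f"{name} {version}" if version else name
                found.setdefault(category, []).append(label)
    return found
```

Only the first mention of each name is inspected, so a version stated later in the video would be missed; that is acceptable for a sketch but worth knowing.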

3. Verify

Videos get outdated. Skills must not.

Uses the available tools to check the current state of each identified technology:

Source	Tool	Use Case
Library docs	Context7 MCP¹	React, Node.js, Python packages
General search	WebSearch	Tools, services, APIs
Specific docs	WebFetch	Official documentation URLs

When outdated content is detected, the user is asked how to proceed:

Update to current version
Keep original with warnings
Skip the section
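The three choices above map naturally onto a small decision function. The following is a sketch under assumptions: Resolution, Section, and resolve_outdated are hypothetical names, and the warning text is invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Resolution(Enum):
    UPDATE = "update"  # Update to current version
    KEEP = "keep"      # Keep original with warnings
    SKIP = "skip"      # Skip the section


@dataclass
class Section:
    title: str
    body: str


def resolve_outdated(
    section: Section, current_body: str, choice: Resolution
) -> Optional[Section]:
    """Apply the user's choice to a section flagged as outdated.

    Returns the section to include in the skill, or None if it is skipped.
    """
    if choice is Resolution.UPDATE:
        # Replace the video's content with the verified current content.
        return Section(section.title, current_body)
    if choice is Resolution.KEEP:
        # Keep the original text but prepend a visible warning (wording is illustrative).
        warning = "Warning: this section reflects the video and may be outdated."
        return Section(section.title, f"{warning}\n\n{section.body}")
    return None
```

Returning None for a skipped section lets the caller filter it out without a separate flag.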

4. Transform

Converts narrative video content to actionable skill instructions:

Video Content	Skill Content
"So basically what we're doing..."	(removed)
"You want to run this command..."	Fenced bash code block containing the command
"The important thing here is..."	Prerequisites section
"If you get this error..."	Troubleshooting section

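The mapping in the table can be read as a set of routing rules. The sketch below shows that idea with regular expressions. It is an illustration only: the skill relies on the agent's judgment rather than fixed patterns, and the names RULES, DEFAULT_SECTION, and route, plus the "Commands" and "Instructions" section titles, are assumptions.

```python
import re

# Hypothetical rules: (pattern, destination section). None means "drop the sentence".
RULES = [
    (re.compile(r"^so basically", re.IGNORECASE), None),
    (re.compile(r"^you want to run", re.IGNORECASE), "Commands"),
    (re.compile(r"^the important thing", re.IGNORECASE), "Prerequisites"),
    (re.compile(r"^if you get (this|an|the) error", re.IGNORECASE), "Troubleshooting"),
]

# Assumed destination for sentences that match no rule.
DEFAULT_SECTION = "Instructions"


def route(sentences: list[str]) -> dict[str, list[str]]:
    """Group transcript sentences by the skill section they belong in."""
    sections: dict[str, list[str]] = {}
    for sentence in sentences:
        text = sentence.strip()
        for pattern, target in RULES:
            if pattern.search(text):
                break  # first matching rule wins
        else:
            target = DEFAULT_SECTION
        if target is not None:
            sections.setdefault(target, []).append(text)
    return sections
```

Filler sentences disappear entirely, which is the point of the first table row: narrative framing carries no instruction for the agent.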
5. Generate

Produces a SKILL.md following the standard skill format:

---
name: lowercase-with-hyphens
description: Action verb + what + when to use (max 1024 chars)
---
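The two frontmatter constraints shown above (a lowercase, hyphenated name and a description of at most 1024 characters) are easy to check before writing the file. This is a minimal sketch; render_frontmatter is a hypothetical helper, and allowing digits in the name is an assumption.

```python
import re

# Lowercase words joined by single hyphens. Allowing digits is an assumption.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
MAX_DESCRIPTION_CHARS = 1024


def render_frontmatter(name: str, description: str) -> str:
    """Validate name and description, then return the frontmatter block."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"name must be lowercase-with-hyphens, got {name!r}")
    # Collapse whitespace so the description stays on a single line.
    description = " ".join(description.split())
    if not description:
        raise ValueError("description must not be empty")
    if len(description) > MAX_DESCRIPTION_CHARS:
        raise ValueError(
            f"description is {len(description)} chars, max is {MAX_DESCRIPTION_CHARS}"
        )
    # No YAML quoting is applied here; plain-text descriptions are assumed.
    return f"---\nname: {name}\ndescription: {description}\n---\n"
```

A description containing a colon followed by a space would need quoting to remain valid YAML, so a real generator would likely serialize with a YAML library instead of string formatting.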

File Structure

youtube-to-skill/           # github.com/yfe404/youtube-to-skill
├── SKILL.md                 # Main skill (174 lines)
│                            # - Workflow overview
│                            # - Tool integration (Context7, WebSearch)
│                            # - Quality checklist
│
├── reference.md             # Deep knowledge (289 lines)
│                            # - How agents discover skills
│                            # - Transformation patterns
│                            # - Common mistakes
│
├── templates/
│   └── skill_template.md    # Output template (94 lines)
│
├── scripts/
│   ├── extract_youtube.py   # Transcript extractor (337 lines)
│   └── requirements.txt     # Dependencies
│
└── examples/
    └── good_vs_bad.md       # Contrasting examples (354 lines)

Total: 1,250 lines of skill content

Installation

# Install the skill
npx skills add yfe404/youtube-to-skill

# Install dependencies
pip install youtube-transcript-api yt-dlp

Restart your AI agent to load the skill.

Usage

Start a new session and request:

Create a skill from this YouTube video: https://youtube.com/watch?v=...

The agent will:

Extract the transcript
Identify technologies
Verify current state (ask about updates)
Ask about enrichment from official docs
Generate the skill
Ask where to save it

Dependencies

Package	Purpose	Install
youtube-transcript-api	Caption extraction	`pip install youtube-transcript-api`
yt-dlp	Metadata + fallback	`pip install yt-dlp`
openai-whisper	Audio transcription (optional)	`pip install openai-whisper`
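To see which of these packages are present before running the extractor, a quick check with the standard library is enough. Note that the import names differ from the pip names. The helper missing_packages is a hypothetical name used for this sketch.

```python
from importlib.util import find_spec

# pip package name -> module name used in import statements.
IMPORT_NAMES = {
    "youtube-transcript-api": "youtube_transcript_api",
    "yt-dlp": "yt_dlp",
    "openai-whisper": "whisper",  # optional
}


def missing_packages(import_names: dict[str, str] = IMPORT_NAMES) -> list[str]:
    """Return the pip names of packages whose modules cannot be found."""
    return [
        package
        for package, module in import_names.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    for package in missing_packages():
        print(f"Missing: pip install {package}")
```

find_spec only looks the module up and does not import it, so the check stays fast even for heavy packages like Whisper.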

Limitations

Requires captions: Videos without captions need Whisper, which is slower and needs a GPU to run at a practical speed
English-focused: Translation available but may reduce accuracy
Single video or playlist: Does not process channel-wide content
Verification depends on tools: Context7 coverage varies by library

References

License: MIT

¹ Context7 MCP: Library documentation retrieval tool providing up-to-date API references and code examples.
