AI video toolkit — 5 tools, 1 codebase. Shared engine for text-to-speech, scene planning, rendering, and social posting.
⚠️ Alpha. APIs may break between minor versions while we stabilise. The VCM local renderer is the most polished entry point — start there.
| Tool | Input | Output | Best for |
|---|---|---|---|
| 🎬 VCM (VideoContentMaker) | A topic sentence | 1-15 min MP4 (9:16 or 16:9) with AI narration + images | YouTube long-form, TikTok shorts, batch content |
| 📰 AINews | URL or article text | 60-90 s vertical short with AI summary | TikTok / Reels news rewrites |
| 📺 NewsEditor | Multi-source URLs | Neutral multi-angle news video with citations | Balanced commentary, factual recaps |
| 📚 LingoFeeder | Language config | Vocab / grammar / dialogue lesson shorts | Language teaching channels |
| 🎤 ViralEditor | Upload + summary text | Re-edit with cautionary voiceover | Reaction-style commentary |
All five share one engine: Gemini for scene planning, Google Chirp 3 HD / Edge / gTTS for voice, HyperFrames (HTML→MP4) for rendering, Cloudflare Workers AI / Pollinations / Pexels for images, and one-click posting to Facebook Page + TikTok.
The fastest way to see what this toolkit does is the VCM local 1-click renderer. No Telegram bot, no server — type a topic, click a button, get an MP4.
| Dep | Where |
|---|---|
| Python 3.12+ (with "Add to PATH" checked on Windows) | https://www.python.org/downloads/ |
| Node.js 20+ | https://nodejs.org/en/download/ |
| ffmpeg in PATH | choco install ffmpeg (Windows) · brew install ffmpeg (Mac) · apt install ffmpeg (Linux) |
| A FREE Gemini API key | https://aistudio.google.com/apikey |
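The prerequisites in the table can be checked before running setup. The following is a minimal sketch, not part of the repo: `missing_prerequisites` is a hypothetical helper that checks the Python version and looks for `node` and `ffmpeg` on PATH using only the standard library. It does not verify the Node.js version or the Gemini key.

```python
import shutil
import sys

# Minimum Python version taken from the dependency table above.
MIN_PYTHON = (3, 12)
# Executables the table says must be installed and reachable on PATH.
REQUIRED_TOOLS = ("node", "ffmpeg")


def missing_prerequisites(python_version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration; `which` is injectable for testing.
    """
    version = tuple(python_version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and PATH; printing the returned list before running SETUP gives a quick go/no-go.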
# 1. Clone
git clone https://github.com/ktbteam/ktb-studio
cd ktb-studio
# 2. Setup (creates venv, installs deps, creates .env + channels/local/.env)
local\SETUP.bat # Windows
./local/SETUP.sh # Mac / Linux
# 3. Paste your keys into the two .env files SETUP just created.
# Minimum required: GEMINI_API_KEY in .env at repo root.
# Strongly recommended for nicer voice: GOOGLE_TTS_API_KEY + GOOGLE_TTS_VOICE
# in channels/local/.env (Google Cloud TTS Chirp 3 HD has a free tier).
# 4. Render
local\RUN_VCM.bat # Windows
./local/RUN_VCM.sh # Mac / Linux
The console asks for topic → format (9:16 vs 16:9) → image source (AI cartoon vs Pexels real photo), then runs the pipeline. The output folder opens automatically when the MP4 is ready.
→ Full local renderer docs, troubleshooting, and the "which key goes where" reference: local/README.md
Different keys live in different places. The local launcher reads them via python-dotenv; the bots read them via systemd EnvironmentFile.
GEMINI_API_KEY=AIza... # REQUIRED — free at aistudio.google.com/apikey
CF_ACCOUNT_ID=... # optional — Cloudflare Workers AI free tier (10k Neurons/day)
CF_AI_TOKEN=cfut_... # paired with above
PEXELS_API_KEY=... # optional — Pexels stock photos (free 200 req/hour, 20k/month)
→ Step-by-step Cloudflare setup (where to click, which permission to
scope, smoke test, quota math): docs/setup-cloudflare-workers-ai.md
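To illustrate the root `.env` rules above (one required key, and a Cloudflare pair that only works together), here is a small sketch. It is a standard-library stand-in for python-dotenv, not the project's loader; `parse_env` and `check_root_env` are hypothetical names, and the parser handles only simple `KEY=value # note` lines.

```python
# Keys listed as REQUIRED in the root .env template above.
REQUIRED_KEYS = ("GEMINI_API_KEY",)
# Keys that only make sense together, per the "paired with above" note.
PAIRED_KEYS = (("CF_ACCOUNT_ID", "CF_AI_TOKEN"),)


def parse_env(text):
    """Parse KEY=value lines, skipping blanks, comments, and inline ' # ...' notes."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment separated from the value by whitespace.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def check_root_env(values):
    """Return a list of problems found in the parsed root .env values."""
    problems = []
    for key in REQUIRED_KEYS:
        if not values.get(key):
            problems.append(f"{key} is required")
    for first, second in PAIRED_KEYS:
        # Exactly one of the pair being set is treated as a mistake.
        if bool(values.get(first)) != bool(values.get(second)):
            problems.append(f"{first} and {second} must be set together")
    return problems
```

Running `check_root_env(parse_env(open(".env").read()))` before a render would flag a missing Gemini key or a half-configured Cloudflare pair early.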
Copy a template first: cp channels/local.example/.env.example channels/<slug>/.env (or channels/example/.env.example for AINews-style brands).
# Identity
CHANNEL_NAME=Your Brand
THEME_VARIANT=dark # dark | bright | vivid | corporate
# Voice (this is where the magic of "doesn't sound robotic" happens)
GOOGLE_TTS_API_KEY=AIza... # Google Cloud TTS — strongly recommended
GOOGLE_TTS_VOICE=vi-VN-Chirp3-HD-Achernar
# Optional — only for the Telegram bots, not the local renderer
FB_PAGE_ID=... # Facebook Page posting (skip for local)
FB_PAGE_TOKEN=...
Without GOOGLE_TTS_API_KEY, the router falls back to Microsoft Edge TTS, then to gTTS — both work without keys but sound noticeably less natural than Chirp 3 HD.
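The key-dependent voice order described above can be sketched as a small function. This is an illustration under assumptions, not the router's actual code: `tts_provider_order` and the provider labels are hypothetical, and ElevenLabs is left out because its configuration key is not documented here.

```python
def tts_provider_order(channel_env):
    """Return provider labels in the order a router might try them.

    channel_env is a dict of values parsed from channels/<slug>/.env.
    The labels are illustrative, not identifiers from the codebase.
    """
    order = []
    if channel_env.get("GOOGLE_TTS_API_KEY"):
        order.append("google_chirp3_hd")
    # Edge TTS and gTTS need no key, so they are always available as fallbacks.
    order.extend(["edge_tts", "gtts"])
    return order
```

With a Google key set, the result starts with `google_chirp3_hd`; without one, the list begins at `edge_tts`.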
Each tool can also run as a long-poll Telegram bot for on-demand video generation from your phone. This is how the production deployment runs:
# In a separate .env, set the bot tokens you created via @BotFather:
KTB_VCM_TOKEN=8924300440:AAE...
KTB_NEWS_EDITOR_TOKEN=...
KTB_VIRAL_EDITOR_TOKEN=...
TELEGRAM_BOT_TOKEN=... # AINews uses the base name
# Then launch the bot for the tool you want:
ktb-vcm-bot
ktb-ainews
ktb-news-editor-bot
ktb-viral-editor
systemd unit files for all bots are in deploy/systemd/. Each one expects an EnvironmentFile=/your/path/.env. Don't reuse production bot tokens across local + server: Telegram polling is single-instance, and the newer client steals updates from the older one.
core/ ← shared engine
├── tts/ Google Chirp 3 HD → ElevenLabs → Edge TTS → gTTS router
├── planner/ Gemini scene planning + voice gender policy
├── renderer/ HyperFrames HTML→MP4 + 4 theme variants (dark/bright/vivid/corporate)
├── images/ Cloudflare Workers AI → Pollinations → Pexels chain
├── sources/ URL extract / GitHub README / multi-source / vision
├── poster/ Facebook Page + TikTok upload
└── channel.py Brand config loader (per-channel .env → ChannelConfig)
tools/ ← thin tool wrappers, each independent runtime
├── ainews/ URL → MP4, Telegram bot + style picker (4 themes)
├── vcm/ Topic → MP4, batch CLI + Telegram bot
├── news_editor/ Multi-source → MP4, daemon + Telegram bot
├── lingo_feeder/ Language drills → MP4
└── viral_editor/ Upload → re-edit, Telegram bot
channels/ ← brand presets
├── example/ ship template for new channels
└── local.example/ ship template for local renderer
(real channels — ainews/khuetran/news/local — are GITIGNORED, never pushed)
deploy/ ← systemd unit files + sync scripts (the maintainer's VPS workflow)
local/ ← 1-click Windows/Mac/Linux renderer for VCM
docs/ ← per-tool migration notes (mostly internal)
Rule: core/ never imports from tools/. Tools never import from each other.
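A layering rule like this can be checked mechanically. The sketch below is one possible approach using the standard `ast` module, not a script that ships with the repo. `forbidden_imports` is a hypothetical helper, the module paths in the usage note are made up for illustration, and relative imports and `from tools import x` forms between tools are not analysed.

```python
import ast


def forbidden_imports(source, package):
    """Return imported module names that break the layering rule.

    package is the dotted location of the file being checked, for
    example "core.tts.router" or "tools.vcm.cli" (hypothetical paths).
    """
    parts = package.split(".")
    imported = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            imported.append(node.module)

    violations = []
    for name in imported:
        target = name.split(".")
        if target[0] != "tools":
            continue  # imports of core/ or third-party code are always allowed
        if parts[0] == "core":
            violations.append(name)  # core/ must never import tools/
        elif (
            parts[0] == "tools"
            and len(parts) > 1
            and len(target) > 1
            and target[1] != parts[1]
        ):
            violations.append(name)  # a tool must not import a sibling tool
    return violations
```

For example, `forbidden_imports("import tools.vcm.cli", "core.tts.router")` reports `tools.vcm.cli`, while a file inside `tools/vcm/` importing from `core` or from its own package reports nothing. Run over every `.py` file in CI, a check of this kind would catch violations before review.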
The whole codebase is paranoid about provider failures because the user pressing "render" should get a video, not a stack trace. Every external dependency has a graceful fallback:
- TTS: Google Chirp 3 HD → ElevenLabs → Edge TTS → gTTS (in router)
- Images: Cloudflare Workers AI → Pollinations → Pexels (with auto-keyword extraction from cartoon prompts) → gradient placeholder
- Vision: Gemini Flash → Flash Lite → returns hint string
- Caption: FB caption / TikTok hashtags / first comment are all built from the same plan
When a provider returns 429 / 402 / 403, the router logs the failure once and moves on — no user-visible error.
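The "log once and move on" behaviour can be sketched as a generic fallback loop. This is a minimal illustration, not the project's router: `ProviderError`, `first_success`, and the provider names in the usage note are hypothetical, and re-raising on other status codes is an assumption rather than documented behaviour.

```python
import logging

logger = logging.getLogger("fallback_sketch")

# Status codes the text above names as "skip this provider" signals.
SKIP_STATUSES = {429, 402, 403}


class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-like status code."""

    def __init__(self, status):
        super().__init__(f"provider returned {status}")
        self.status = status


def first_success(providers, request, already_logged=None):
    """Try (name, callable) pairs in order and return (name, result).

    already_logged is a set shared across calls so that each failing
    provider is logged only the first time it is skipped.
    """
    already_logged = set() if already_logged is None else already_logged
    for name, call in providers:
        try:
            return name, call(request)
        except ProviderError as exc:
            if exc.status not in SKIP_STATUSES:
                raise  # assumption: unexpected failures still surface
            if name not in already_logged:
                logger.warning("%s unavailable (%s), trying next", name, exc.status)
                already_logged.add(name)
    raise RuntimeError("all providers failed")
```

Ending the chain with a provider that cannot fail, such as a locally generated gradient placeholder, means the final `RuntimeError` is never reached in practice, which matches the goal of always producing a video.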
- Python 3.12+ · Pydantic v2 · asyncio
- TTS: Google Cloud TTS · Microsoft Edge TTS · ElevenLabs · gTTS
- Scene planner: Google Gemini 2.5 Flash / Flash Lite (auto-routes long-form to bigger model)
- Renderer: HyperFrames 0.6.52 (headless Chrome screenshot + ffmpeg encode)
- Image gen: Cloudflare Workers AI (flux-schnell) · Pollinations · Pexels stock
- Posting: Facebook Graph API · TikTok upload
- Per-video cost on the free tier: $0 (Google free quotas + Edge + Pexels free).
This repo is built primarily with Claude Code (Anthropic) and the Claude Agent SDK. Source is open so you can study, fork, or hire the team for custom builds.
MIT — Free to use, modify, sell. See LICENSE.