An interactive CLI tool that fetches Gemini conversation transcripts from Gmail or a Google Drive folder and analyzes them to generate content suggestions. The content focus is fully configurable - by default it targets AI strategy and innovation for business leaders, but can be customized for any domain.
- Dual Source Modes - Fetch content from Gmail emails OR scan Google Drive folders
- Flexible AI Model Selection:
- Test Mode (default): Fast and economical with Gemini 2.5 Flash
- Production Mode: High quality with GPT-4o
- Custom Models: Override with any specific AI model (Claude, GPT, Gemini)
- Supports OpenAI, Anthropic, and Google providers
- Configurable Content Focus - Customize the analysis angle for any domain (AI strategy, product management, marketing, DevOps, etc.)
- Gmail Mode:
- Fetches transcripts from Gmail with subject pattern:
Notes: "[Subject]" [MMM DD, YYYY] - Extracts full transcripts from Google Docs (prefers Transcript tab over Summary)
- Label-based filtering to focus on specific conversations
- Fetches transcripts from Gmail with subject pattern:
- Drive Mode:
- Scan any Google Drive folder for transcript documents
- Recursive subfolder scanning (optional)
- Date filtering and name pattern matching
- Smart Output:
- Default: Saves each topic as a separate Google Doc (named MMDDYYYY_Topic_N_Title)
- Optional: Save all topics combined in one file with
--combined-topicsflag - Optional: Save as local markdown files with
--save-localflag - Automatic folder organization in Google Drive
- Interactive CLI with ASCII art logo and rich terminal UI
- AI-powered content analysis
- Generates:
- Recommended article topics (2-5 topics per analysis)
- Compelling topic titles geared to target audience
- 2-5 key insights specific to each topic
- 1-3 notable verbatim quotes with speaker attribution
- Optional evidence/data (up to 5 items, 3-5 sentences each)
- Optional real-world examples (up to 5 items, 3-5 sentences each)
- Batch processing support with per-topic file output (default) or combined file option
- Python 3.8 or higher
- Gmail account
- AI Provider API key (choose one):
- OpenAI (default, most cost-effective): Get from platform.openai.com
- Anthropic: Get from console.anthropic.com
- Google Gemini (FREE tier available): Get from aistudio.google.com
- Google Cloud project with Gmail API, Google Docs API, and Google Drive API enabled
git clone https://github.com/stephenhsklarew/article-idea-generator.git
cd article-idea-generatorCreate a virtual environment (recommended):
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activateInstall required packages:
pip install -r requirements.txt- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable required APIs:
- Go to "APIs & Services" > "Library"
- Search for and enable "Gmail API"
- Search for and enable "Google Docs API"
- Search for and enable "Google Drive API"
- Go to "APIs & Services" > "Credentials"
- Click "Create Credentials" > "OAuth client ID"
- If prompted, configure the OAuth consent screen:
- Choose "Internal" (if using Google Workspace) or "External"
- Fill in app name: "Article Idea Generator"
- Add your email as a developer contact
- Add scopes:
gmail.readonly,drive.readonly,documents.readonly
- Choose "Desktop app" as application type
- Download the credentials JSON file
- Save it as
credentials.jsonin the project root directory
Important: The credentials.json file is required for the application to authenticate with Google APIs. This file is automatically excluded from git via .gitignore to keep your credentials secure.
-
Copy the example environment file:
cp .env.example .env
-
Edit
.envand configure your AI provider:# ===== AI PROVIDER CONFIGURATION ===== # Note: The default mode is TEST (uses Gemini 2.5 Flash - FREE) # You can override with --mode or --model command-line arguments # Choose: anthropic, openai, or google AI_PROVIDER=google # Add your API key (only need the one for your chosen provider) GOOGLE_API_KEY=your_google_key_here # OR # OPENAI_API_KEY=your_openai_api_key_here # OR # ANTHROPIC_API_KEY=your_anthropic_key_here # Optional: Override default model (usually not needed) # Command-line --mode and --model arguments are recommended instead # Defaults: gpt-4o-mini (openai), claude-3-5-haiku (anthropic), gemini-2.5-flash (google) # AI_MODEL=gemini-2.5-flash # Optional: Content focus for article generation # Default: AI strategy and innovation for business leaders # Customize this to change the angle/perspective of generated articles # Example: CONTENT_FOCUS=product management best practices for SaaS companies CONTENT_FOCUS= # Optional: Set a start date to only analyze transcripts from this date forward # Format: MMDDYYYY (e.g., 10232025 for October 23, 2025) # Leave blank to analyze all transcripts START_DATE= # Optional: Filter settings # Comma-separated list of people to exclude from analysis EXCLUDE_PEOPLE= # Comma-separated list of subject keywords to exclude EXCLUDE_SUBJECTS=
-
Get your API key from your chosen provider (links above)
Important: The .env file contains sensitive API keys and is automatically excluded from git via .gitignore.
For a typical 12K-word transcript analysis:
| Mode | Provider | Model | Cost per Analysis | Monthly Cost (100 analyses) | Notes |
|---|---|---|---|---|---|
| Test (default) | gemini-2.5-flash | FREE | FREE | Generous free tier limits | |
| Production | OpenAI | gpt-4o | ~$0.10 | ~$10 | High quality, reasonable cost |
| Custom | OpenAI | gpt-4o-mini | ~$0.005 | ~$0.50 | Best value, excellent quality |
| Custom | OpenAI | gpt-4-turbo | ~$0.30 | ~$30 | Very high quality |
| Custom | Anthropic | claude-3-5-haiku | ~$0.035 | ~$3.50 | Premium quality, good value |
| Custom | Anthropic | claude-3-5-sonnet | ~$0.91 | ~$91 | Highest quality |
| Custom | gemini-1.5-pro | ~$0.15 | ~$15 | Good quality, decent cost |
Recommendations:
- For testing/development: Use default Test Mode (Gemini 2.5 Flash) - completely free within generous limits
- For production: Use Production Mode (GPT-4o) - excellent quality at reasonable cost
- For maximum value: Use custom
--model gpt-4o-mini- best cost/quality ratio - For highest quality: Use custom
--model claude-3-5-sonnet-20241022- best analysis quality
On your first run, the application will:
- Open a browser window to authenticate with Google
- Ask you to select your Gmail account
- Request permission to read emails and documents
- Save the authentication token as
token.picklefor future use
Important: The token.pickle file is automatically generated after your first authentication and is excluded from git via .gitignore. If you encounter authentication issues, delete this file and re-run the application to re-authenticate.
After setup, your project should have these sensitive files (all excluded from git):
article-idea-generator/
├── credentials.json # Google OAuth credentials (you download)
├── token.pickle # Google OAuth token (auto-generated on first run)
├── .env # Your API keys and config (you create from .env.example)
└── .env.save # Backup of .env (optional)
Qwilo supports two source modes:
Gmail Mode (Default):
- Fetches emails from Gmail with specific subject patterns
- Requires Gmail authentication
- Supports label filtering
Drive Mode:
- Scans Google Drive folders for transcript documents
- Works with any plain Google Docs (no special formatting required)
- Supports recursive subfolder scanning
Gmail Mode (default):
python3 cli.py # Uses test mode (Gemini 2.5 Flash)
python3 cli.py --mode production # Uses production mode (GPT-4o)Drive Mode:
python3 cli.py --source drive # Test mode
python3 cli.py --source drive --mode production # Production modeOr make it executable:
chmod +x cli.py
./cli.py --source driveThe tool supports three ways to select AI models:
1. Test Mode (default) - Fast & Economical:
python3 cli.py # Uses Gemini 2.5 Flash
python3 cli.py --mode test # Explicit test mode2. Production Mode - High Quality:
python3 cli.py --mode production # Uses GPT-4o3. Custom Model - Override Any Model:
# Auto-detect provider from model name
python3 cli.py --model gpt-4o-mini
python3 cli.py --model claude-3-5-sonnet-20241022
python3 cli.py --model gemini-1.5-pro
# Explicitly specify provider
python3 cli.py --model claude-3-opus-20240229 --provider anthropic
python3 cli.py --model gpt-4-turbo --provider openai
# Works with all other options
python3 cli.py --source drive --model claude-3-5-sonnet-20241022
python3 cli.py --email "Meeting" --model gpt-4o --separate-filesAvailable Models:
- OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, o1-preview, o1-mini
- Anthropic: claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, claude-3-opus-20240229
- Google: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-1.5-flash, gemini-1.5-pro
List all available transcripts:
python3 cli.py --listAnalyze a specific email by subject (supports partial matching):
python3 cli.py --email "Meeting Notes"Customize content focus (overrides CONTENT_FOCUS environment variable):
python3 cli.py --focus "product management for SaaS companies"
python3 cli.py --focus "DevOps best practices"
python3 cli.py --focus "marketing strategies for B2B"Filter by Gmail label:
python3 cli.py --label "blog-potential"
python3 cli.py --list --label "AIQ"Filter by start date:
python3 cli.py --start-date 10232025Output options:
python3 cli.py --combined-topics # Save all topics in one file (default: separate topic files)Combine multiple options:
python3 cli.py --label "blog-potential" --email "Strategy" --mode production
python3 cli.py --focus "engineering leadership" --combined-topics --model gpt-4o-mini
python3 cli.py --email "Product" --focus "product management" --label "priority"
python3 cli.py --model claude-3-5-sonnet-20241022 --email "Meeting"Use a specific Google Drive folder:
python3 cli.py --source drive --folder-id 1a2b3c4d5e6f7g8h9i0jHow to find your folder ID:
- Open the folder in Google Drive
- Copy the ID from the URL:
https://drive.google.com/drive/folders/FOLDER_ID - The FOLDER_ID is the long string after
/folders/
Combine with content focus and AI models:
python3 cli.py --source drive --focus "product management best practices"
python3 cli.py --source drive --mode production
python3 cli.py --source drive --model claude-3-5-sonnet-20241022
python3 cli.py --source drive --model gpt-4o-mini --focus "DevOps practices"Save to combined file:
python3 cli.py --source drive --combined-topics
python3 cli.py --source drive --combined-topics --model gpt-4oConfiguration via .env:
# Set default source mode and folder in .env
SOURCE_MODE=drive
DRIVE_FOLDER_ID=your_folder_id_here
DRIVE_RECURSIVE=true # Enable recursive subfolder scanningThen simply run:
python3 cli.py # Will use Drive mode with configured folderDefault: Google Docs (Automatic)
python3 cli.py # Saves to Google Doc in OUTPUT_FOLDER_ID
python3 cli.py --source drive # Same for Drive mode- Analysis saved as Google Doc with name:
MMDDYYYY(e.g.,11062024) - Automatically placed in configured output folder
- Returns clickable URL to view document
Optional: Local Markdown Files
python3 cli.py --save-local # Saves as .md file locally
python3 cli.py --source drive --save-local- Analysis saved as
analysis_TopicName_YYYYMMDD_HHMMSS.md - Stored in current directory
Configure Output Folder (.env):
# Set default output folder for Google Docs
OUTPUT_FOLDER_ID=1MVUaQyfyzaaBt-hOeUC9JFSiQeqE3no5Once the application starts (in either Gmail or Drive mode), you'll see:
-
List of available transcripts/documents - Shows all content from your selected source
-
Analysis options:
- Enter a number (e.g.,
1) - Analyze a specific transcript - Enter
all- Analyze all transcripts sequentially (with display and prompts) - Enter
batch- Batch process all transcripts (skip display, auto-save) - Enter
range(e.g.,1-5) - Analyze a range of transcripts - Enter
q- Quit the application
- Enter a number (e.g.,
-
After each analysis:
- View the results in formatted markdown
- Choose to save the analysis to a file
- Continue to the next transcript or return to the menu
Batch Mode: Selecting batch processes all transcripts without displaying analysis or prompting for confirmations. Perfect for processing many transcripts quickly - the tool will automatically save results to a combined file when done.
The analysis can be customized to focus on any domain or perspective. This changes how Claude analyzes the transcripts and generates article suggestions.
Three ways to set content focus (in priority order):
-
Command-line flag (highest priority) - Overrides all other settings:
python3 cli.py --focus "product management for SaaS companies" -
Environment variable - Set in
.envfile for persistent custom focus:CONTENT_FOCUS=DevOps best practices for enterprise teams -
Default - If not specified, defaults to "AI strategy and innovation for business leaders"
Example use cases:
- Product Management:
--focus "product management best practices for SaaS" - Engineering Leadership:
--focus "engineering leadership and team management" - DevOps:
--focus "DevOps and infrastructure automation strategies" - Marketing:
--focus "B2B marketing strategies for tech companies" - Sales:
--focus "enterprise sales techniques and customer success" - Design:
--focus "UX design principles and user research"
The content focus affects:
- Article topic recommendations
- Key insights extraction
- The angle and perspective of the analysis
- Which aspects of the conversation are emphasized
You can filter transcripts by date in three ways:
Option 1: Set in .env file
START_DATE=10232025
Option 2: Use command-line argument
python3 cli.py --start-date 10232025Option 3: Interactive prompt
If no START_DATE is configured and you're not using --label, you'll be prompted whether you want to set one.
The date format is MMDDYYYY:
10232025= October 23, 202501152025= January 15, 2025
Note: When using --label to filter by Gmail labels, the date filter is automatically disabled to show all emails with that label regardless of date.
Each analysis includes 2-5 topics, with each topic containing:
**Source:** [Original email subject]
## TOPIC 1: [Topic Title]
**Description:** [1-3 sentences on why this would make a good article for the target audience]
**Key Insights:**
• [Insight 1 related to this topic]
• [Insight 2 related to this topic]
• [Insight 3 related to this topic]
• [Insight 4 related to this topic]
• [Insight 5 related to this topic]
**Notable Quotes:**
> **[Speaker Name]:** "[Exact verbatim quote from transcript]"
> **[Speaker Name]:** "[Another exact quote]"
> **[Speaker Name]:** "[Third exact quote]"
**Evidence/Data:** (Optional - when relevant data is mentioned, max 5 items)
• [Specific statistic, data point, or research finding - 3-5 sentences max]
• [Another piece of supporting evidence - 3-5 sentences max]
**Real-World Examples:** (Optional - when stories or examples are shared, max 5 items)
• [Real-world story, case study, or example from the conversation - 3-5 sentences max]
• [Another relevant example or perspective - 3-5 sentences max]
---
## TOPIC 2: [Topic Title]
...What's included in each topic:
- Compelling topic title geared towards the target audience
- 1-3 sentence description explaining why this would make a good article
- 2-5 key insights specifically related to that topic
- 1-3 notable quotes with speaker attribution (verbatim from transcript)
- Evidence/Data (optional) - up to 5 items, each 3-5 sentences max
- Real-World Examples (optional) - up to 5 items, each 3-5 sentences max
Google Docs (Default - Per-Topic Files):
- Each topic saved as separate file
- Named:
MMDDYYYY_Topic_N_TopicTitle(e.g.,11082025_Topic_1_AI_Strategy_Evolution) - Saved to configured OUTPUT_FOLDER_ID
- Returns URL for each document created
Google Docs (with --combined-topics):
- All topics in one file
- Named by date:
MMDDYYYY(e.g.,11062024) - Saved to configured OUTPUT_FOLDER_ID
- Returns URL for immediate viewing
Local Markdown (with --save-local):
analysis_[topic]_[timestamp].md
Example: analysis_AI_Strategy_Implementation_20251104_143022.md
The tool intelligently extracts content from Google Docs:
- ✅ Prefers "Transcript" tab - Contains full conversation with timestamps and speakers
⚠️ Falls back to "Notes" tab - If no Transcript tab exists (summary content only)- 📊 Reports tab selection - Shows which tab is being used during processing
For best results with verbatim quotes, use recordings that have a full Transcript tab. The tool will indicate when it's using a summary instead of a full transcript.
article-idea-generator/
├── cli.py # Main interactive CLI application
├── gmail_client.py # Gmail API integration with label filtering
├── google_drive_client.py # Google Drive API for folder scanning
├── google_docs_client.py # Google Docs API for transcript extraction
├── content_analyzer.py # Claude API integration for analysis
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
├── .gitignore # Git ignore rules (protects sensitive files)
├── README.md # This file
├── qwilo_logo.png # Application logo
├── credentials.json # Google OAuth credentials (you provide, not in git)
├── token.pickle # Google OAuth token (auto-generated, not in git)
└── .env # Your environment variables (you create, not in git)
Solution: Download OAuth credentials from Google Cloud Console:
- Go to Google Cloud Console
- Navigate to "APIs & Services" > "Credentials"
- Download your OAuth 2.0 Client ID credentials
- Save the file as
credentials.jsonin the project root directory
Solution: Create a .env file:
- Copy
.env.exampleto.env:cp .env.example .env - Edit
.envand add your Anthropic API key - Get your API key from console.anthropic.com
Solution: Verify that your emails have the correct subject format:
Notes: "Your Topic Here" Jan 15, 2025
The tool supports both straight quotes (") and curly quotes ("").
Solution: Delete token.pickle and re-run the application:
rm token.pickle
python3 cli.pyThis will trigger a new authentication flow.
Solution: Ensure you've activated your virtual environment and installed dependencies:
source venv/bin/activate
pip install -r requirements.txtSolution:
- Make sure you're using the exact label name from Gmail (case-insensitive)
- Spaces, hyphens, and underscores are normalized (e.g., "blog-potential" matches "Blog Potential")
- The label is filtered at the Gmail API level, not post-processing
Solution: This occurs when the document only has a "Notes" (summary) tab instead of a full "Transcript" tab. Select a different transcript that has the full conversation recorded, or use the analysis for insights rather than quotes.
- Choose the right source mode for your workflow:
- Gmail mode: Best when transcripts arrive via email with consistent formatting
- Drive mode: Best when transcripts are stored in a dedicated folder or managed outside email
- Customize content focus for your domain - Use
--focusto tailor analysis to your specific industry or content area - Gmail mode tips:
- Choose transcripts with full Transcript tabs - These contain speaker names and verbatim dialogue
- Use label filtering - Tag important conversations in Gmail with labels like "Blog potential"
- Drive mode tips:
- Organize transcripts in dedicated folders for easy scanning
- Use DRIVE_RECURSIVE=true for nested folder structures
- Name documents descriptively - the document name becomes the "topic"
- Review transcripts before analyzing - The quality of analysis depends on transcript quality
- Save important analyses - Use the save feature to keep analyses you want to reference
- Batch process related topics - Use the
batchoption to analyze many conversations quickly - Combine options effectively - Mix
--focus,--source, and mode-specific flags - Refine topics - The AI suggestions are starting points; use your editorial judgment
- ✅ Never commit
credentials.json,token.pickle, or.envto version control - ✅ All sensitive files are automatically excluded via
.gitignore - ✅ Keep your API keys secure and rotate them periodically
- ✅ The application only reads emails (readonly scope)
- ✅ Authentication tokens are stored locally on your machine
For issues related to:
- Gmail API: Google Gmail API Documentation
- Google Docs API: Google Docs API Documentation
- Claude API: Anthropic Documentation
- Application bugs: GitHub Issues
This project is provided as-is for personal use.