Investigate Nano-Banana MCP for AI image generation #48

@nille

Overview

Investigate whether Nano-Banana MCP should be included in shipkit's MCP installation flow.

What it is: An MCP server for AI image generation and editing, built on Google's Gemini 2.5 Flash Image API.

Capabilities

From the README:

  • 🎨 Generate images from text descriptions
  • ✏️ Edit existing images with text prompts
  • 🔄 Iterative editing (refine images)
  • 🖼️ Reference image style transfer
  • 📁 Auto file management

Usage:

"Generate an image of a sunset over mountains"
"Edit this image to add birds in the sky"
"Continue editing to make it more dramatic"

Questions to Answer

1. Use Cases

  • Who benefits? (frontend devs, designers, documentation writers?)
  • Real-world value? (mockups, assets, prototypes?)
  • How often would a typical user need this?

2. Quality & Maintenance

  • Well-maintained or abandoned?
  • Production-ready or experimental?
  • Active community?

3. Setup Complexity

  • Requires: Google Gemini API key
  • Cost: What's the API cost per image?
  • Free tier: Does Gemini have free limits?
  • Setup friction: Easy or complex?
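To make the cost and free-tier questions concrete once real numbers are known, a minimal sketch is below. The function name and all of its parameters are hypothetical, and the numbers in the usage example are placeholders, not actual Gemini pricing.

```python
def estimate_monthly_cost(images_per_month: int,
                          price_per_image: float,
                          free_images_per_month: int = 0) -> float:
    """Estimate monthly spend, in the same currency as price_per_image.

    All inputs are assumptions to be filled in from the pricing
    investigation; nothing here reflects real Gemini rates.
    """
    if images_per_month < 0 or free_images_per_month < 0 or price_per_image < 0:
        raise ValueError("inputs must be non-negative")
    # Only images beyond the free allowance are billed.
    billable = max(0, images_per_month - free_images_per_month)
    return billable * price_per_image


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(estimate_monthly_cost(images_per_month=200,
                                price_per_image=0.25,
                                free_images_per_month=50))
```

If the result for a "typical user" volume is zero or close to it, the API key requirement is likely the main setup cost rather than the bill.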

4. Where Does It Belong?

Option A: Core
✅ Universally useful (every dev needs images?)
❌ Niche (not everyone does UI/UX work)
❌ Requires API key + setup

Option B: Experimental
✅ New/unproven in shipkit
✅ Users can try it out
❌ Might be mature enough for Advanced

Option C: Advanced
✅ Specialized for frontend/design work
✅ Opt-in for those who need it
✅ Matches browser-test pattern
❓ Is it niche or broadly useful?

Option D: Marketplace
✅ Keep it opt-in, no core commitment
✅ User discovers when needed
❌ Less discoverable

5. Integration Strategy

If we include it:

During /install skill:

Would you like to install any MCP servers?

Essential (free):
  [ ] Brave Search
  [ ] GitHub
  
Development:
  [ ] Playwright
  
Creative/UI:
  [ ] Nano-Banana - AI image generation
      Requires: Gemini API key (free tier available)
      Use case: Generate mockups, assets, UI prototypes

Or: add it to the Advanced layer with a /generate-image skill that wraps it.
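One way the install prompt above could be modeled is as plain data plus a small renderer, so adding or moving Nano-Banana between categories is a one-line change. This is only a sketch: `McpOption`, `OPTIONS`, and `render_menu` are hypothetical names, not existing shipkit code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McpOption:
    """One selectable MCP server in the install prompt (hypothetical type)."""
    name: str
    category: str
    description: str = ""
    requires: str = ""
    use_case: str = ""


# Entries mirror the prompt proposed in this issue.
OPTIONS = [
    McpOption("Brave Search", "Essential (free)"),
    McpOption("GitHub", "Essential (free)"),
    McpOption("Playwright", "Development"),
    McpOption(
        "Nano-Banana",
        "Creative/UI",
        description="AI image generation",
        requires="Gemini API key (free tier available)",
        use_case="Generate mockups, assets, UI prototypes",
    ),
]


def render_menu(options: list[McpOption]) -> str:
    """Render options as a checklist, grouped by category in list order."""
    lines = ["Would you like to install any MCP servers?"]
    current_category = None
    for opt in options:
        if opt.category != current_category:
            lines.append("")
            lines.append(f"{opt.category}:")
            current_category = opt.category
        label = f"{opt.name} - {opt.description}" if opt.description else opt.name
        lines.append(f"  [ ] {label}")
        if opt.requires:
            lines.append(f"      Requires: {opt.requires}")
        if opt.use_case:
            lines.append(f"      Use case: {opt.use_case}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_menu(OPTIONS))
```

Keeping the requirement and use-case text next to each entry means the prompt can warn about the API key before the user selects the server.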

Recommendation Needed

After investigation:

Placement: [Core / Experimental / Advanced / Marketplace / Skip]

Rationale:
- [Use case frequency]
- [Setup complexity]
- [Who benefits]

Implementation:
- [Add to install flow? Y/N]
- [Create wrapper skill? Y/N]
- [Which layer?]

Similar Decisions

  • ✅ Browser automation → Advanced (niche, complex setup)
  • ✅ Playwright MCP → Opt-in during install (requires Node.js)
  • ✅ Brave Search → Essential (free, enhances /research)

Where does image generation fit?

Investigation Tasks

  • Test Nano-Banana MCP with Claude Code
  • Check Gemini API pricing and free tier
  • Assess setup friction
  • Determine real-world use case frequency
  • Decide on layer placement
  • Write integration plan
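For the first task, the server has to be registered with Claude Code before it can be tested. A rough sketch of generating a project-scoped `.mcp.json` entry follows. The `mcpServers` layout with `command`, `args`, and `env` is the shape Claude Code reads; everything specific to Nano-Banana is an assumption to verify against its README: that it launches via `npx`, the package name (left as a placeholder), and the `GEMINI_API_KEY` variable name. `build_mcp_entry` is a hypothetical helper.

```python
import json


def build_mcp_entry(package: str, api_key_var: str = "GEMINI_API_KEY") -> dict:
    """Build a .mcp.json-style config for one server.

    `package` and `api_key_var` are assumptions; confirm both in the
    Nano-Banana README before relying on them.
    """
    return {
        "mcpServers": {
            "nano-banana": {
                # Assumes the server is published as an npm package.
                "command": "npx",
                "args": ["-y", package],
                # Reference the key from the environment instead of
                # writing the secret into the config file.
                "env": {api_key_var: f"${{{api_key_var}}}"},
            }
        }
    }


if __name__ == "__main__":
    # "<nano-banana-package>" is a placeholder, not a real package name.
    print(json.dumps(build_mcp_entry("<nano-banana-package>"), indent=2))
```

How many manual steps it takes to get from this entry to a first generated image is a reasonable measure for the "assess setup friction" task.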

Labels: enhancement (New feature or request)
