Outline2Narrative: Transforming Story Outlines into Rich Multimodal Narratives

Live Demo

For a live demo, open https://carter-pay-regular-pencil.trycloudflare.com/ in your browser. This is a temporary deployment, so please let me know if the website is down. gemini-2.5 is recommended, since gemini-3 gets overloaded easily.

Introduction

Outline2Narrative is an AI-powered story creation system that transforms simple story outlines into rich, multimodal narratives perfect for audiobooks, children's stories, educational content, and creative storytelling projects. The system combines advanced language models with image and audio generation to create complete stories with consistent visuals and engaging narratives.

Unlike traditional AI storytelling systems that generate content incrementally, Outline2Narrative employs a blueprint-first architecture that ensures visual consistency, narrative coherence, and structural completeness from the very beginning. The system enables human-AI co-creation, making story creation easier by providing different story directions at each step, allowing creators to guide, edit, and refine stories at every stage of generation.

Features

Blueprint-First: Establish canonical world information before generation
Separation of Concerns: Each agent has a specialized role
Human-in-the-Loop: Enable editing and refinement at every stage
Consistency Through Architecture: System design ensures coherence, not just model capability
Structured Narratives: Pre-planned scene graphs guarantee story completeness
Story Creation Focus: Options and choices serve to guide story development, making creation easier

Blueprint-First Architecture

World Bible Creation: Establishes canonical character blueprints, environment templates, and style guides before generating any story content
Visual Consistency: Maintains character appearance, clothing, and environmental details across all scenes using reusable visual tokens
Style Preservation: Ensures consistent art style throughout the entire narrative
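
As a rough illustration of the blueprint-first idea, the sketch below fixes character, environment, and style information once and then builds every scene's image prompt from those same canonical tokens. The class names, fields, and the image_prompt helper are hypothetical and chosen for this example; they are not taken from the project's backend.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CharacterBlueprint:
    """Canonical description of one character (hypothetical fields)."""
    name: str
    appearance: str
    clothing: str


@dataclass(frozen=True)
class WorldBible:
    """World information that is fixed before any scene is generated."""
    style_guide: str
    characters: dict[str, CharacterBlueprint] = field(default_factory=dict)
    environments: dict[str, str] = field(default_factory=dict)

    def image_prompt(self, scene_text: str, character: str, environment: str) -> str:
        """Compose a scene prompt that reuses the same visual tokens every time."""
        c = self.characters[character]
        return (
            f"{self.style_guide}. {c.name}: {c.appearance}, wearing {c.clothing}. "
            f"Setting: {self.environments[environment]}. Scene: {scene_text}"
        )


bible = WorldBible(
    style_guide="soft watercolor illustration",
    characters={
        "mina": CharacterBlueprint(
            "Mina", "a small girl with curly black hair", "a yellow raincoat"
        )
    },
    environments={"forest": "a misty pine forest at dawn"},
)
print(bible.image_prompt("Mina finds a glowing stone", "mina", "forest"))
print(bible.image_prompt("Mina crosses a creek", "mina", "forest"))
```

Because both prompts are assembled from the same blueprint, the character and style descriptions are identical across scenes and only the scene text changes.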

Multi-Modal Generation

Text Generation: Creates engaging narrative prose with story direction options to guide narrative development
Image Generation: Produces scene images that maintain visual consistency using a hybrid reference strategy
Audio Narration: Generates audio narration perfect for audiobook creation and immersive storytelling

Story Direction Options

Directional Choices: At each scene, choose from different story directions to explore various narrative paths
Structured Narratives: Stories follow clear narrative arcs with beginning, middle, and end
Scene Graph Planning: Pre-planned scene structure ensures story completeness and satisfying conclusions
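
One way to picture why a pre-planned scene graph helps with completeness: if the whole graph exists up front, it can be checked before generation starts. The sketch below is an assumption about how such a check might look, not the project's code. It represents the graph as a plain dictionary (scene id to the scenes each story direction leads to, with an empty list marking an ending) and verifies that every path ends within the scene limit.

```python
from collections import deque

# Hypothetical representation: scene id -> scene ids reachable via each direction.
# An empty list marks an ending scene.
SceneGraph = dict[str, list[str]]


def every_path_ends(graph: SceneGraph, start: str, max_scenes: int) -> bool:
    """Return True if every path from `start` reaches an ending within `max_scenes` scenes."""
    queue = deque([(start, 1)])  # (scene id, scenes on the path so far, including this one)
    while queue:
        scene, depth = queue.popleft()
        if depth > max_scenes:
            return False  # path is too long; this also catches cycles
        for next_scene in graph[scene]:
            queue.append((next_scene, depth + 1))
    return True


graph = {
    "intro": ["cave", "river"],
    "cave": ["ending"],
    "river": ["ending"],
    "ending": [],
}
print(every_path_ends(graph, "intro", max_scenes=3))
print(every_path_ends(graph, "intro", max_scenes=2))
```

The check walks every path rather than every node, which is fine for small story graphs but would need memoization for large ones.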

Human-in-the-Loop Co-Creation

Text Editing: Edit narrative text and choices at any point
Image Regeneration: Regenerate images with custom prompts or automatic updates
Character Customization: Select and regenerate main characters before story begins
Real-Time Refinement: Iterate on content without restarting the entire generation process

State Management

Continuity Tracking: Maintains character states, locations, and plot threads across scenes
World State Updates: Dynamically adds new characters, locations, and plot elements
Session Persistence: Saves story sessions for later continuation, editing, or export
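
The three points above can be sketched as a small state object that is updated after each scene and serialized for later sessions. The field names and the apply_scene and to_json methods are hypothetical, used only to illustrate the idea; the project's actual state schema is not documented here.

```python
import json
from dataclasses import dataclass, field


@dataclass
class WorldState:
    """Running continuity state (hypothetical fields, not the project's schema)."""
    character_locations: dict[str, str] = field(default_factory=dict)
    open_plot_threads: set[str] = field(default_factory=set)

    def apply_scene(self, moves: dict[str, str], opened: set[str], resolved: set[str]) -> None:
        """Fold one scene's changes into the state."""
        self.character_locations.update(moves)  # new names are added, known ones are moved
        self.open_plot_threads |= opened
        self.open_plot_threads -= resolved

    def to_json(self) -> str:
        """Serialize for session persistence; the set becomes a sorted list."""
        return json.dumps(
            {
                "character_locations": self.character_locations,
                "open_plot_threads": sorted(self.open_plot_threads),
            }
        )


state = WorldState()
state.apply_scene({"Mina": "forest"}, opened={"find the stone"}, resolved=set())
state.apply_scene(
    {"Mina": "cave", "Fox": "cave"},
    opened={"befriend the fox"},
    resolved={"find the stone"},
)
print(state.to_json())
```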

Installation

Prerequisites

Python 3.10+ for the backend
Node.js 16+ and npm for the frontend
Gemini API Key

Backend Setup

Create a conda environment (recommended):

conda create -n outline2narrative python=3.10
conda activate outline2narrative

Install dependencies:

pip install -r requirements.txt

Frontend Setup

Navigate to the frontend directory:

cd frontend

Install Node.js dependencies:

npm install

How to Run

Backend Server

Start the FastAPI backend server:

cd backend
python main.py

The backend will start on http://localhost:8000 by default.

Frontend Application

In a separate terminal, start the frontend development server:

cd frontend
npm run dev

The frontend will typically be available at http://localhost:5173.
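
If the page does not load, it can help to confirm that both servers are actually listening. The following is a minimal sketch using only the Python standard library; the is_listening helper is made up for this example, and the ports are the defaults mentioned above, so adjust them if your setup differs.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Default ports from this README: backend on 8000, frontend on 5173.
    for name, port in (("backend", 8000), ("frontend", 5173)):
        status = "up" if is_listening("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {status}")
```

This only checks that something accepts connections on the port, not that it is the expected server.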

Access the Application

Open your web browser and navigate to the frontend URL (e.g., http://localhost:5173)
On the first page, enter your Gemini API key
Once configured, you can start creating your stories!

Quick Deployment to the Internet (Temporary Cloudflare Tunnel)

Install cloudflared:

wget https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
sudo dpkg -i cloudflared-linux-amd64.deb

Update the server block in frontend/vite.config.ts:

host: true
allowedHosts: ['your-tunnel-host.trycloudflare.com']

Restart the frontend dev server:

cd frontend
npm run dev -- --host --port 5173

Start backend (separate terminal):

cd backend
uvicorn main:app --host 0.0.0.0 --port 8000

Start tunnel (third terminal):

cloudflared tunnel --url http://localhost:5173

Open the https URL printed by cloudflared (your-tunnel-host.trycloudflare.com) and keep all three processes running while you use it.

Creating a Story

Enter Story Outline: Provide a brief description of your story idea
Set Story Goal (optional): Choose from preset goals (Education, Adventure, Horror, etc.) or define a custom goal
Configure Story Length: Set the maximum number of scenes (at least 10 scenes are recommended to ensure a complete story)
Add Reference Images (optional): Upload style reference images to guide visual generation
Start Creation: The system will generate the World Bible, create character blueprints, and begin the story

Developing Your Story

Review Each Scene: Each scene presents narrative text with an accompanying image
Choose Story Direction: Select from 3 different story directions to explore various narrative paths
Progress Through Scenes: Navigate through the pre-planned scene graph
Reach Conclusion: Stories always conclude with satisfying endings

Editing and Refinement

Edit Text: Click edit buttons to modify narrative text and story direction options
Regenerate Images: Regenerate scene images with custom prompts or automatic updates
Regenerate Text: Provide instructions to regenerate narrative content
Adjust Scenes: Modify the maximum number of scenes during story creation
Export Story: Save your completed story as a JSON file for sharing, audiobook production, or later use

Load Your Created Story

After exporting your story as a JSON file, you can reload it later for review with the "Load Story" button on the main page.
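
An exported story is a regular JSON file, so it can also be inspected outside the app. The sketch below makes no assumption about the export schema, which is not documented here; it only lists the top-level keys and the type of each value. The summarize_export helper and the my_story.json filename are placeholders for this example.

```python
import json
from pathlib import Path


def summarize_export(path: str) -> dict[str, str]:
    """Map each top-level key of an exported story file to the type of its value."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # The export might not be a JSON object; report the root type instead.
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}


if __name__ == "__main__":
    export_path = "my_story.json"  # placeholder: path to your exported file
    if Path(export_path).exists():
        for key, kind in summarize_export(export_path).items():
            print(f"{key}: {kind}")
```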

Use Cases

Audiobook Creation: Generate complete narratives with audio narration ready for audiobook production
Children's Stories: Create engaging, illustrated stories with consistent characters and settings
Educational Content: Develop structured educational narratives with specific learning goals
Creative Writing: Use story direction options to explore different narrative paths and refine your story
Content Production: Generate multimodal content (text, images, audio) for various storytelling projects
Video Story Production: Use consistent story narratives and images as the foundation for creating video stories and animated narratives
