VisualTalk Junior 🎈✨

VisualTalk Junior is an AI-driven voice companion built for early childhood engagement (ages 3–5). It is a child-friendly web application that holds a real-time voice conversation with a young child about a visual scene: the AI describes the image, asks simple questions, listens to the child's responses, and rewards correct answers with interactive star feedback.

The project is designed to demonstrate natural human-AI interaction, real-time speech processing, and an engaging user experience for early childhood learning.

VisualTalk Junior creates an immersive linguistic environment where static imagery becomes a gateway to conversation. By leveraging cutting-edge large language models (LLMs) and native browser speech capabilities, it provides a safe, responsive, and educational experience that mimics the interaction style of a compassionate preschool teacher.

🏗️ Project Structure

The repository follows a clean, decoupled full-stack architecture designed for seamless scalability and ease of deployment.

Directory Overview

/api: Serverless entry point for production. Contains the Lambda-style function that Vercel runs to forward chat requests to the Groq API.
/backend: Traditional Express-based server used for local development and testing. It provides a mirror of the production API behavior.
/frontend: The React core. A high-speed Vite-powered single-page application (SPA) that manages the user interface, state, and client-side voice processing.
vercel.json: The orchestration layer that routes traffic between the frontend static assets and the backend serverless endpoints.
package.json: The root manifest used for cross-environment build orchestration.

Detailed Tree

VisualTalk-Junior/
├── api/                  # Production Serverless Backend
│   └── chat.js           # Groq AI implementation
├── backend/              # Local Development Backend
│   ├── server.js         # Entry point for local testing
│   └── .env              # Sensitive environment variables
├── frontend/             # Desktop/Mobile User Interface
│   ├── src/
│   │   ├── components/   # Modular UI blocks (Image, Controls, Stars)
│   │   ├── services/     # Hardware abstraction (Speech API, Network)
│   │   └── assets/       # Visual design tokens and media
│   └── vite.config.js    # Build optimization config
├── vercel.json           # Cloud routing & rewrite rules
├── package.json          # Root build scripts
└── .gitignore            # Version control exclusions

🌟 Core Features & Interaction Design

🧒 Pedagogical Tone

The AI personality is strictly bounded by rules designed for 3- to 5-year-olds. It uses age-appropriate vocabulary, limits sentences to 5-7 words, and maintains a warm "Teacher Persona."
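The sentence-length rule lends itself to a simple automated check. The sketch below is a hypothetical helper, not code from the repository: the function names and the sentence-splitting rule are assumptions, and only the 7-word ceiling comes from the description above.

```javascript
// Hypothetical guard for the sentence-length rule described above.
// Assumption: a "sentence" ends at ., ! or ? followed by whitespace.
const MAX_WORDS = 7;

function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

function countWords(sentence) {
  return sentence.split(/\s+/).filter((w) => w.length > 0).length;
}

// Returns every sentence in the reply that exceeds the word limit.
function findLongSentences(reply, maxWords = MAX_WORDS) {
  return splitSentences(reply).filter((s) => countWords(s) > maxWords);
}
```

A check like this could run on each model reply before it is spoken, for example to log or retry replies that come back too long.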

🗣️ Real-Time Voice Processing

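The system uses the Web Speech API for low-latency interaction:

Recognition: Captured in the browser through the SpeechRecognition interface to minimize bandwidth and protect privacy. Some browsers send audio to a vendor speech service for transcription, so fully on-device processing is not guaranteed.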
Synthesis: High-quality, female-leaning voices are selected to ensure the AI sounds approachable.
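Voice selection can be sketched as a pure function over the list returned by `speechSynthesis.getVoices()`. This is an illustrative sketch, not the project's actual logic: the name hints are assumptions, and the voices available vary by browser and operating system.

```javascript
// Assumption: these name hints are illustrative and not taken from the project.
const PREFERRED_NAME_HINTS = ["female", "samantha", "zira"];

// Picks a voice matching the language, preferring names that contain a hint.
// `voices` is any array of objects with `name` and `lang`, such as
// SpeechSynthesisVoice instances.
function pickVoice(voices, lang = "en") {
  const sameLang = voices.filter((v) => v.lang.toLowerCase().startsWith(lang));
  const hinted = sameLang.find((v) =>
    PREFERRED_NAME_HINTS.some((hint) => v.name.toLowerCase().includes(hint))
  );
  return hinted || sameLang[0] || voices[0] || null;
}

// Browser-only usage (speechSynthesis does not exist in Node):
// const utterance = new SpeechSynthesisUtterance("Hello! What do you see?");
// utterance.voice = pickVoice(window.speechSynthesis.getVoices());
// window.speechSynthesis.speak(utterance);
```

Keeping the selection logic separate from the browser call makes it easy to test without a microphone or speakers.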

🍱 Conversational Scaffolding

Every interaction uses "scaffolding," a technique from early education that builds confidence in three steps:

The Hook: A warm, enthusiastic greeting.
Contextualization: A simple summary of what the child is seeing.
The Prompt: A single, focused question to encourage speech.
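The three-part structure above can be illustrated with a small helper that assembles an opening turn. This is a hypothetical sketch: in the application the language model presumably generates this text, and the function name and validation rules here are assumptions.

```javascript
// Hypothetical helper: joins hook, context, and prompt into one opening turn.
function buildOpeningTurn({ hook, context, prompt }) {
  const parts = [hook, context, prompt].map((p) => String(p || "").trim());
  if (parts.some((p) => p.length === 0)) {
    throw new Error("hook, context, and prompt are all required");
  }
  // The prompt should be a single question, so require a trailing "?".
  if (!parts[2].endsWith("?")) {
    throw new Error("prompt must be a question");
  }
  return parts.join(" ");
}
```

For example, a hook of "Hi there, friend!", a context of "I see a brown dog.", and a prompt of "What is the dog doing?" combine into one short spoken turn.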

🧠 Intelligent Guidance (Gentle Correction)

The application includes a specialized "Correction Engine." If a child provides an incorrect or unrelated answer, the system avoids negative reinforcement. Instead, it uses kind redirection: "Nice try! The dog is actually brown. What color do you see?"
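The redirection pattern can be sketched as follows. This is a simplified stand-in, assuming a keyword match decides correctness; in the real application the language model likely makes that judgment, and all names here are hypothetical.

```javascript
// Hypothetical sketch of gentle correction: praise the attempt, state the
// fact, then re-ask the question. No negative wording is ever produced.
function gentleCorrection({ fact, question }) {
  return `Nice try! ${fact} ${question}`;
}

// Assumption: a case-insensitive keyword match stands in for the model's judgment.
function respondToAnswer({ childAnswer, expected, fact, question }) {
  const isCorrect = childAnswer.toLowerCase().includes(expected.toLowerCase());
  return isCorrect
    ? { text: "Yes! Great job!", star: true }
    : { text: gentleCorrection({ fact, question }), star: false };
}
```

The `star` flag mirrors the star feedback described earlier: a star is awarded only for a correct answer, while an incorrect one triggers the kind retry.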

🏗️ Technical Architecture

The technology stack is selected for speed, cost-efficiency, and a "premium" feel.

graph TD
    A[React SPA] -->|JSON/POST| B[Vercel Function]
    B -->|Groq Protocol| C[Llama 3.3 70B]
    C -->|Completion| B
    B -->|Stream/JSON| A
    A -->|Hardware Access| D[Microphone/Speakers]

Groq AI: Uses LPU (Language Processing Unit) inference to return responses quickly enough that the conversation feels immediate to the child.
Tailwind CSS: A utility-first CSS framework used to build the glassmorphic, responsive layout.
Framer Motion: Smooth entry/exit animations for stars and UI transitions.
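The request flow in the diagram can be sketched as a Vercel-style function. This is a minimal sketch under stated assumptions, not the contents of api/chat.js: the endpoint and model id follow Groq's OpenAI-compatible API, while the request body shape (`message`, `scene`) and the system prompt are hypothetical.

```javascript
// Assumptions: endpoint and model id per Groq's OpenAI-compatible API;
// the { message, scene } body shape and the prompt wording are hypothetical.
const GROQ_URL = "https://api.groq.com/openai/v1/chat/completions";
const MODEL = "llama-3.3-70b-versatile";

// Builds the upstream request without sending it, so it can be tested offline.
function buildGroqRequest({ message, scene }, apiKey) {
  return {
    url: GROQ_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: MODEL,
        messages: [
          {
            role: "system",
            content: `You are a kind preschool teacher. The picture shows: ${scene}. Use short, simple sentences.`,
          },
          { role: "user", content: message },
        ],
      }),
    },
  };
}

// In a Vercel project this function would be the file's default export.
async function handler(req, res) {
  if (req.method !== "POST") {
    return res.status(405).json({ error: "Method not allowed" });
  }
  const { url, options } = buildGroqRequest(req.body, process.env.GROQ_API_KEY);
  const upstream = await fetch(url, options);
  if (!upstream.ok) {
    return res.status(502).json({ error: "Upstream request failed" });
  }
  const data = await upstream.json();
  return res.status(200).json({ reply: data.choices[0].message.content });
}
```

Keeping the API key on the server side, as here, means the browser never sees it; the React app only posts JSON to the function and receives the reply text.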

🛠️ Installation & Local Setup

VisualTalk Junior can be run locally in two phases: start the backend, then start the frontend.

Prerequisites

Runtime: Node.js v18.x or above.
API Access: An active Groq API Key (Secret).

Phase 1: The Brain (Local Backend)

Navigate to the backend directory: cd backend
Install dependencies: npm install
Configure environment: Create a .env file and add:
```
GROQ_API_KEY=your_secret_key_here
```
Boot the server: npm start (Runs on port 3000)
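A missing or placeholder key is a common cause of failed requests, so a startup check can help. The sketch below is a hypothetical helper, not code from backend/server.js; it assumes the .env file has already been loaded into `process.env` by whatever mechanism the project uses.

```javascript
// Hypothetical startup guard: fail fast if a required variable is unset,
// blank, or still the placeholder value from the setup instructions.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  if (value.trim() === "your_secret_key_here") {
    throw new Error(`${name} still contains the placeholder value`);
  }
  return value.trim();
}

// Example usage at server startup:
// const apiKey = requireEnv("GROQ_API_KEY");
```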

Phase 2: The Face (Local Frontend)

Open a new terminal and navigate to: cd frontend
Install dependencies: npm install
Launch development server: npm run dev
Access the app at: http://localhost:5173

🚀 One-Click Deployment (Vercel)

This repository is optimized for deployment on the Vercel Hobby Plan.

1. Project Import

Click "New Project" in Vercel and import your GitHub repository.

2. Required Project Settings

To ensure the hybrid frontend/backend builds correctly, use these settings in the Vercel dashboard:

Framework Preset: Other (or let it auto-detect Vite)
Root Directory: ./ (Leave as default)
Build Command: npm run build
Output Directory: frontend/dist
Install Command: npm install

3. Environment Variables

Add the following secret in the Environment Variables section:

Key: GROQ_API_KEY
Value: Your Groq API Key

4. Deploy

Click Deploy. Vercel will build your React app and automatically host the serverless function in the /api directory.

📜 Usage Guide

Permissions: The browser will request Microphone access. Click "Allow."
Engagement: Click the "START TALKING" button to begin the session.
Response Cycle: Wait for the "Listening..." status to appear before the child speaks.
Conclusion: Use the "I am done! 👋" button to end the conversation early.
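The session flow in these steps can be modeled as a small state machine. This is an illustrative sketch only: the status and event names are assumptions and may not match the frontend's actual state handling.

```javascript
// Hypothetical session states and events; names are illustrative.
const TRANSITIONS = {
  idle: { START: "speaking" },
  speaking: { AI_FINISHED: "listening", DONE: "ended" },
  listening: { CHILD_ANSWERED: "speaking", DONE: "ended" },
  ended: {},
};

// Returns the next status, or the current one if the event is not allowed.
function nextStatus(status, event) {
  const allowed = TRANSITIONS[status];
  if (allowed && Object.prototype.hasOwnProperty.call(allowed, event)) {
    return allowed[event];
  }
  return status;
}
```

Under this model, the "Listening..." indicator corresponds to the `listening` status, which is why the child should wait for it before speaking.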

📜 License

Distributed under the MIT License. Developed for the next generation of digital learning.
