MadtorXD/VisualTalk-Junior

VisualTalk Junior 🎈✨

Welcome to VisualTalk Junior, an AI-driven voice companion designed for early childhood engagement (ages 3–5). It is a child-friendly web application that holds a real-time voice conversation with a young child about a visual scene. The AI describes the image, asks simple questions, listens to the child’s responses, and rewards correct answers with interactive star feedback.

The project is designed to demonstrate natural human-AI interaction, real-time speech processing, and engaging user experience for early childhood learning.

VisualTalk Junior turns a static image into a starting point for conversation. By combining a large language model (LLM) with the browser's native speech capabilities, it aims to provide a safe, responsive, and educational experience that mimics the interaction style of a compassionate preschool teacher.


🏗️ Project Structure

The repository keeps the frontend, the local development backend, and the production serverless function in separate directories, so each part can be built and deployed on its own.

Directory Overview

  • /api: Serverless entry point for production. Contains the function that Vercel runs to forward chat requests to the Groq API.
  • /backend: Express-based server used for local development and testing. It mirrors the behavior of the production API.
  • /frontend: The React core. A Vite-powered single-page application (SPA) that manages the user interface, state, and client-side voice processing.
  • vercel.json: The routing configuration that directs traffic between the frontend static assets and the backend serverless endpoints.
  • package.json: The root manifest that holds the build scripts shared across environments.

Detailed Tree

VisualTalk-Junior/
├── api/                  # Production Serverless Backend
│   └── chat.js           # Groq AI implementation
├── backend/              # Local Development Backend
│   ├── server.js         # Entry point for local testing
│   └── .env              # Sensitive environment variables
├── frontend/             # Desktop/Mobile User Interface
│   ├── src/
│   │   ├── components/   # Modular UI blocks (Image, Controls, Stars)
│   │   ├── services/     # Hardware abstraction (Speech API, Network)
│   │   └── assets/       # Visual design tokens and media
│   └── vite.config.js    # Build optimization config
├── vercel.json           # Cloud routing & rewrite rules
├── package.json          # Root build scripts
└── .gitignore            # Version control exclusions

🌟 Core Features & Interaction Design

🧒 Pedagogical Tone

The AI personality is strictly bounded by rules designed for 3-5 year olds. It uses a lexile-appropriate vocabulary, restricts sentence length to 5-7 words, and maintains a warm "Teacher Persona."
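A minimal sketch of how the sentence-length rule could be checked on a model reply before it is spoken. The helper name `findLongSentences` and the constant `MAX_WORDS` are hypothetical and do not come from the repository; the upper bound of 7 words is taken from the rule stated above.

```javascript
// Hypothetical helper: returns the sentences in a reply that exceed the word limit.
// MAX_WORDS mirrors the upper bound of the 5-7 word guideline described above.
const MAX_WORDS = 7;

function findLongSentences(reply, maxWords = MAX_WORDS) {
  return reply
    .split(/[.!?]+/) // split on sentence-ending punctuation
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0)
    .filter((sentence) => sentence.split(/\s+/).length > maxWords);
}
```

An empty result means every sentence fits the limit. A non-empty result could be used to ask the model for a shorter reply, although how the project actually enforces the rule is not stated here.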

🗣️ Real-Time Voice Processing

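The system uses the browser's Web Speech API for low-latency interaction:

  • Recognition: Speech is captured through the browser's SpeechRecognition interface, so only the transcribed text needs to reach the backend. Some browsers, such as Chrome, forward the audio to a remote recognition service, so recognition is not guaranteed to stay on the device.
  • Synthesis: The app prefers a female-sounding voice from those the browser offers, so the AI sounds approachable.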
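The voice handling described above could be sketched roughly as follows. This is not the repository's code: `pickVoice`, `speak`, `createRecognizer`, the name hints, and the speech rate are all assumptions for illustration. Only `speechSynthesis`, `SpeechSynthesisUtterance`, and `SpeechRecognition` (prefixed as `webkitSpeechRecognition` in some browsers) are real Web Speech API names.

```javascript
// Assumption: these name fragments are illustrative; installed voices differ by OS and browser.
const PREFERRED_VOICE_HINTS = ["female", "samantha", "zira"];

// Pure selection logic: prefer an English voice whose name matches a hint.
function pickVoice(voices, hints = PREFERRED_VOICE_HINTS) {
  const english = voices.filter((v) => v.lang && v.lang.toLowerCase().startsWith("en"));
  const match = english.find((v) =>
    hints.some((hint) => v.name.toLowerCase().includes(hint))
  );
  return match || english[0] || voices[0] || null;
}

// Browser-only: speaks the text, or returns false where speech synthesis is unavailable.
function speak(text) {
  if (typeof window === "undefined" || !("speechSynthesis" in window)) return false;
  const utterance = new SpeechSynthesisUtterance(text);
  const voice = pickVoice(window.speechSynthesis.getVoices());
  if (voice) utterance.voice = voice;
  utterance.rate = 0.9; // assumed value: slightly slower for young listeners
  window.speechSynthesis.speak(utterance);
  return true;
}

// Browser-only: builds a recognizer that reports the first transcript it hears.
function createRecognizer(onResult) {
  if (typeof window === "undefined") return null;
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (!Recognition) return null;
  const recognizer = new Recognition();
  recognizer.lang = "en-US";
  recognizer.interimResults = false;
  recognizer.onresult = (event) => onResult(event.results[0][0].transcript);
  return recognizer;
}
```

In some browsers `getVoices()` returns an empty list until the `voiceschanged` event has fired, so a real implementation would likely wait for that event before choosing a voice.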

🍱 Conversational Scaffolding

Every interaction utilizes "scaffolding"—a technique used in early education to build confidence:

  1. The Hook: A warm, enthusiastic greeting.
  2. Contextualization: A simple summary of what the child is seeing.
  3. The Prompt: A single, focused question to encourage speech.
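The three scaffolding steps above could be encoded in the opening prompt along these lines. This is a sketch under assumptions: the function name `buildOpeningMessages` and the prompt wording are hypothetical, and the only thing taken as given is the `role`/`content` message format used by OpenAI-compatible chat APIs such as Groq's.

```javascript
// Hypothetical prompt builder: turns a scene description into chat messages
// that ask the model for a hook, a contextualization, and a single question.
function buildOpeningMessages(sceneDescription) {
  const systemPrompt = [
    "You are a warm preschool teacher talking to a child aged 3 to 5.",
    "Use short sentences of 5 to 7 words.",
    "Reply in exactly three parts:",
    "1. A warm, enthusiastic greeting.",
    "2. A simple summary of the picture.",
    "3. One focused question about the picture.",
  ].join("\n");

  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: `The picture shows: ${sceneDescription}` },
  ];
}
```

For example, `buildOpeningMessages("a brown dog in a park")` yields a two-message array that can be sent as the `messages` field of a chat completion request.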

🧠 Intelligent Guidance (Gentle Correction)

The application includes a specialized "Correction Engine." If a child provides an incorrect or unrelated answer, the system avoids negative reinforcement. Instead, it uses kind redirection: "Nice try! The dog is actually brown. What color do you see?"
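As an illustration of the redirection pattern only, the sketch below matches the child's answer against a keyword and builds a kind reply. The function `respondToAnswer` and the shape of its `expected` argument are hypothetical; the project may well leave this judgment to the language model rather than to keyword matching.

```javascript
// Hypothetical correction helper. `expected` bundles the keyword to listen for,
// the fact to restate gently, and the question to repeat.
function respondToAnswer(childAnswer, expected) {
  const heard = childAnswer.toLowerCase();
  if (heard.includes(expected.keyword.toLowerCase())) {
    // Correct answers earn a star and plain praise.
    return { correct: true, stars: 1, reply: "Yes! Great job!" };
  }
  // Incorrect or unrelated answers get encouragement, the fact, and another try.
  return {
    correct: false,
    stars: 0,
    reply: `Nice try! ${expected.fact} ${expected.question}`,
  };
}
```

With `{ keyword: "brown", fact: "The dog is actually brown.", question: "What color do you see?" }`, an answer of "blue" produces the redirection quoted above.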


🏗️ Technical Architecture

The technology stack is selected for speed, cost-efficiency, and a "premium" feel.

graph TD
    A[React SPA] -->|JSON/POST| B[Vercel Function]
    B -->|Groq Protocol| C[Llama 3.3 70B]
    C -->|Completion| B
    B -->|Stream/JSON| A
    A -->|Hardware Access| D[Microphone/Speakers]
  • Groq AI: Uses LPU (Language Processing Unit) inference to keep response latency low enough for natural conversational turn-taking.
  • Tailwind CSS: A utility-first CSS framework used to build the glassmorphic, responsive layout.
  • Framer Motion: Smooth entry/exit animations for stars and UI transitions.
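The request flow in the diagram could look roughly like the sketch below for the serverless function. It is not the contents of api/chat.js: the function names, the `{ reply }` response shape, and the error handling are assumptions. The endpoint is Groq's OpenAI-compatible chat completions URL, and `llama-3.3-70b-versatile` is assumed to be the model ID behind "Llama 3.3 70B".

```javascript
const GROQ_URL = "https://api.groq.com/openai/v1/chat/completions";
const MODEL = "llama-3.3-70b-versatile"; // assumed model ID for Llama 3.3 70B

// Builds the upstream HTTP request without sending it.
function buildGroqRequest(messages, apiKey) {
  return {
    url: GROQ_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: MODEL, messages }),
    },
  };
}

// Vercel-style Node.js handler. In a real api/ file this function would be
// the module's default export.
async function handler(req, res) {
  if (req.method !== "POST") {
    return res.status(405).json({ error: "Method not allowed" });
  }
  const { messages } = req.body || {};
  if (!Array.isArray(messages)) {
    return res.status(400).json({ error: "messages must be an array" });
  }
  const { url, options } = buildGroqRequest(messages, process.env.GROQ_API_KEY);
  const upstream = await fetch(url, options);
  if (!upstream.ok) {
    return res.status(502).json({ error: "Upstream request failed" });
  }
  const data = await upstream.json();
  // Hypothetical response shape: only the assistant text is returned to the client.
  return res.status(200).json({ reply: data.choices[0].message.content });
}
```

Keeping the API key on the server side, as here, means the browser never sees `GROQ_API_KEY`.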

🛠️ Installation & Local Setup

VisualTalk Junior can be run locally in two phases: start the backend, then start the frontend.

Prerequisites

  • Runtime: Node.js v18.x or above.
  • API Access: An active Groq API Key (Secret).

Phase 1: The Brain (Local Backend)

  1. Navigate to the backend directory: cd backend
  2. Install dependencies: npm install
  3. Configure environment: Create a .env file and add:
    GROQ_API_KEY=your_secret_key_here
  4. Boot the server: npm start (Runs on port 3000)
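A missing key is an easy mistake at this stage, so a startup check along these lines can make the failure obvious. The helper `requireEnv` is hypothetical and is not part of the repository. It also assumes the .env file has already been loaded into `process.env`, for example by the dotenv package, since Node.js does not read .env files by default.

```javascript
// Hypothetical startup check: returns the trimmed value or throws a clear error.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value.trim();
}
```

Calling `requireEnv("GROQ_API_KEY")` before the server starts listening would stop the process early instead of failing on the first chat request.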

Phase 2: The Face (Local Frontend)

  1. Open a new terminal and navigate to: cd frontend
  2. Install dependencies: npm install
  3. Launch development server: npm run dev
  4. Access the app at: http://localhost:5173
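Once both servers are running, the frontend needs a way to reach the backend. The sketch below shows one possible client helper; the names `buildChatRequest` and `sendChat`, the `/api/chat` path (inferred from the api/chat.js file name), and the `{ reply }` response shape are all assumptions. In local development the path would also need to resolve to the backend port, for example through a dev-server proxy or an absolute URL.

```javascript
// Builds the fetch options for a chat request without sending it.
function buildChatRequest(messages) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  };
}

// Sends the conversation so far and resolves with the assistant's reply text.
// `fetchImpl` is injectable so the helper can be exercised without a network.
async function sendChat(messages, fetchImpl = fetch, endpoint = "/api/chat") {
  const response = await fetchImpl(endpoint, buildChatRequest(messages));
  if (!response.ok) {
    throw new Error(`Chat request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.reply; // hypothetical response field
}
```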

🚀 One-Click Deployment (Vercel)

This repository is optimized for deployment on the Vercel Hobby Plan.

1. Project Import

  • Click "New Project" in Vercel and import your GitHub repository.

2. Required Project Settings

To ensure the hybrid frontend/backend builds correctly, use these settings in the Vercel dashboard:

  • Framework Preset: Other (or let it auto-detect Vite)
  • Root Directory: ./ (Leave as default)
  • Build Command: npm run build
  • Output Directory: frontend/dist
  • Install Command: npm install

3. Environment Variables

Add the following secret in the Environment Variables section:

  • Key: GROQ_API_KEY
  • Value: Your Groq API Key

4. Deploy

Click Deploy. Vercel will build your React app and automatically host the serverless function in the /api directory.


📜 Usage Guide

  1. Permissions: The browser will request Microphone access. Click "Allow."
  2. Engagement: Click the "START TALKING" button to begin the session.
  3. Response Cycle: Wait for the "Listening..." status to appear before the child speaks.
  4. Conclusion: Use the "I am done! 👋" button to end the conversation early.
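The session flow in this guide can be pictured as a small state machine. The sketch below is an illustration only: the state names, event names, and the `nextState` and `statusLabel` helpers are hypothetical, and only the "Listening..." status text comes from the guide.

```javascript
// Hypothetical session states and the events that move between them.
const TRANSITIONS = {
  idle: { START: "speaking" }, // "START TALKING" pressed
  speaking: { SPEECH_END: "listening", DONE: "ended" },
  listening: { HEARD: "thinking", DONE: "ended" },
  thinking: { REPLY: "speaking", DONE: "ended" },
  ended: {},
};

// Returns the next state, ignoring events that are not valid in the current state.
function nextState(state, event) {
  const next = TRANSITIONS[state] && TRANSITIONS[state][event];
  return next || state;
}

// The child should speak only while this label is shown.
function statusLabel(state) {
  return state === "listening" ? "Listening..." : "";
}
```

Here the `DONE` event stands for the "I am done! 👋" button, which ends the session from any active state.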

📜 License

Distributed under the MIT License. Developed for the next generation of digital learning.
