
# lite-chat

A lightweight, local AI chat application powered by Ollama. Clean dark UI with streaming responses, image support, and smart context management.


## Features

- **Streaming Responses** — Real-time token streaming via SSE
- **Smart Context** — Auto-summarizes old messages to stay within context limits
- **Image Support** — Attach and analyze images with vision models
- **Model Switching** — Use any Ollama model, pull new ones from the UI
- **Thinking Display** — Collapsible thinking blocks for reasoning models
- **Token Stats** — Live tokens/sec and usage display
- **Dark Mode** — Clean, modern dark UI built with shadcn/ui

## Tech Stack

| Layer | Tech |
| --- | --- |
| Frontend | Next.js 14, TypeScript, Tailwind CSS, shadcn/ui, Zustand |
| Backend | FastAPI, LangChain, SQLite |
| LLM | Ollama (local) |

## Prerequisites

- Python 3.10+
- Node.js 18+
- [Ollama](https://ollama.com) installed and running locally

## Getting Started

### 1. Clone the repo

```bash
git clone https://github.com/MaharshPatelX/lite-chat.git
cd lite-chat
```

### 2. Backend

```bash
cd backend
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn app.main:app --reload
```

The backend runs at http://localhost:8000

### 3. Frontend

```bash
cd frontend
npm install
npm run dev
```

The frontend runs at http://localhost:3000

### 4. Start Ollama

Make sure Ollama is running with at least one model pulled:

```bash
ollama run qwen3.5
```

## API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/api/chat` | Send message (SSE stream) |
| GET | `/api/conversations` | List conversations |
| POST | `/api/conversations` | Create conversation |
| PUT | `/api/conversations/{id}` | Rename conversation |
| DELETE | `/api/conversations/{id}` | Delete conversation |
| GET | `/api/models` | List available models |
| POST | `/api/models/pull` | Pull a new model |
| GET | `/api/health` | Health check |
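The `/api/chat` endpoint streams Server-Sent Events. As a rough sketch of consuming that stream — the exact JSON payload shape (a `type` field of `thinking`, `content`, or `done`) is an assumption, not taken from the repo — each `data:` line can be parsed like this:

```python
import json

def parse_sse_line(line: str):
    """Parse one SSE line into a (event_type, payload) tuple.

    Returns None for blank keep-alive lines and comments. The payload
    shape (a JSON object with a "type" field) is an assumption.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = json.loads(line[len("data:"):].strip())
    return payload.get("type"), payload

# Simulated lines as they might arrive from POST /api/chat
stream = [
    'data: {"type": "thinking", "text": "Let me check..."}',
    'data: {"type": "content", "text": "Hello"}',
    '',
    'data: {"type": "done"}',
]

events = [e for e in (parse_sse_line(line) for line in stream) if e]
print([t for t, _ in events])  # -> ['thinking', 'content', 'done']
```

In the real app the frontend does this with `fetch` and an SSE reader; the same line-by-line parsing applies.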

## Architecture

### System Overview

```mermaid
graph TB
    User([User]) --> Frontend

    subgraph Frontend["Frontend (Next.js 14)"]
        UI[React Components<br/>shadcn/ui + Tailwind]
        Stores[Zustand Stores<br/>chat · model · user]
        API[API Client<br/>fetch + SSE]
        UI <--> Stores
        Stores <--> API
    end

    API -- "REST + SSE" --> Backend

    subgraph Backend["Backend (FastAPI)"]
        Routes[Routes<br/>chat · conversations · models · user]
        Services[Services<br/>ChatService · ConversationService<br/>ModelService · UserService]
        Routes --> Services
    end

    Services --> DB[(SQLite<br/>users · conversations · messages)]
    Services --> LangChain[LangChain<br/>ChatOllama]
    LangChain --> Ollama[Ollama Server<br/>localhost:11434]
    Ollama --> LLM[Local LLMs<br/>llama · qwen · etc.]
```

### Chat Message Flow

```mermaid
sequenceDiagram
    actor User
    participant FE as Frontend
    participant BE as FastAPI
    participant LLM as Ollama

    User->>FE: Type message + send
    FE->>FE: Add message (optimistic)
    FE->>BE: POST /api/chat

    BE->>BE: Save user message to DB
    BE->>BE: Build context (summary + last 5 exchanges)

    alt Messages > 10 & unsummarized exist
        BE->>LLM: Summarize old messages
        LLM-->>BE: Summary text
        BE->>BE: Save summary to DB
    end

    BE->>LLM: Stream response
    loop SSE Stream
        LLM-->>BE: Token
        BE-->>FE: SSE event (thinking/content)
        FE->>FE: Append token to UI
    end

    BE-->>FE: SSE event (done)
    BE->>BE: Save response to DB
    FE->>FE: Finalize message
```

### Smart Context Strategy

```mermaid
graph LR
    subgraph Context["What gets sent to LLM"]
        direction TB
        SP["System Prompt"]
        SM["Summary of older messages<br/>(auto-generated)"]
        R["Last 5 exchanges<br/>(10 messages in full)"]
        NM["New user message"]
        SP --> SM --> R --> NM
    end

    subgraph DB["In Database"]
        OLD["Old messages<br/>(marked as summarized)"]
        SUM["Rolling summary<br/>(updated each time)"]
    end

    OLD -.->|"LLM summarizes"| SUM
    SUM -->|"injected as"| SM
```
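Putting the diagram together, the context sent to the LLM can be assembled roughly as follows. The ordering and the 10-message window come from the diagram; the message dict shape is an assumption:

```python
def build_context(system_prompt, summary, messages, new_message, window=10):
    """Assemble the prompt in diagram order: system prompt, rolling
    summary (if any), the last `window` stored messages (5 exchanges
    in full), then the new user message."""
    context = [{"role": "system", "content": system_prompt}]
    if summary:
        context.append({
            "role": "system",
            "content": f"Summary of earlier conversation: {summary}",
        })
    context.extend(messages[-window:])  # last 5 user/assistant exchanges
    context.append({"role": "user", "content": new_message})
    return context

msgs = [{"role": "user", "content": f"q{i}"} for i in range(12)]
ctx = build_context("You are helpful.", "User asked about SSE.", msgs, "And images?")
print(len(ctx))  # -> 13  (system + summary + 10 recent + new message)
```

Older messages never reach the model directly; only their rolling summary does, which keeps the prompt bounded regardless of conversation length.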

### Database Schema

```mermaid
erDiagram
    USERS ||--o{ CONVERSATIONS : has
    CONVERSATIONS ||--o{ MESSAGES : contains

    USERS {
        text id PK
        text name
        text default_model
        text system_prompt
        timestamp created_at
    }

    CONVERSATIONS {
        text id PK
        text user_id FK
        text title
        text model_name
        text summary
        text summary_upto_msg_id
        timestamp created_at
        timestamp updated_at
    }

    MESSAGES {
        text id PK
        text conversation_id FK
        text role
        text content
        text image_base64
        text thinking
        int tokens_used
        bool is_summarized
        timestamp created_at
    }
```
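The ER diagram maps directly onto SQLite tables. A minimal sketch with `sqlite3` — column names and types follow the diagram, while the foreign-key constraints and defaults are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id TEXT PRIMARY KEY,
    name TEXT,
    default_model TEXT,
    system_prompt TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE conversations (
    id TEXT PRIMARY KEY,
    user_id TEXT REFERENCES users(id),
    title TEXT,
    model_name TEXT,
    summary TEXT,
    summary_upto_msg_id TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE messages (
    id TEXT PRIMARY KEY,
    conversation_id TEXT REFERENCES conversations(id),
    role TEXT,
    content TEXT,
    image_base64 TEXT,
    thinking TEXT,
    tokens_used INTEGER,
    is_summarized BOOLEAN DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
""")

# One user with one conversation, as in the 1-to-many relations above
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", ("u1", "alice"))
conn.execute(
    "INSERT INTO conversations (id, user_id, title) VALUES (?, ?, ?)",
    ("c1", "u1", "First chat"),
)
title, = conn.execute(
    "SELECT title FROM conversations WHERE user_id = 'u1'"
).fetchone()
print(title)  # -> First chat
```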

## Project Structure

```
lite-chat/
├── backend/
│   ├── app/
│   │   ├── main.py          # FastAPI app entry
│   │   ├── config.py        # Configuration
│   │   ├── database.py      # SQLite setup
│   │   ├── routes/          # API endpoints
│   │   ├── services/        # Business logic
│   │   ├── schemas/         # Pydantic models
│   │   └── prompts/         # LLM prompts
│   └── requirements.txt
│
├── frontend/
│   ├── src/
│   │   ├── app/             # Next.js pages
│   │   ├── components/      # UI components
│   │   ├── lib/             # API clients & types
│   │   └── stores/          # Zustand state
│   └── package.json
│
├── .gitignore
└── README.md
```

## Documentation

Detailed docs are in the `docs/` folder.

## License

MIT
