Purpose: Local AI research assistant. Users upload PDFs, take notes in Markdown, and chat with an LLM (Ollama) that can use the current note and (eventually) RAG over uploaded documents.
Origami/
├── backend/ # FastAPI (Python 3.13)
│ ├── main.py # App entry, CORS, routers
│ ├── routes/
│ │ ├── chat.py # POST /api/chat (streaming)
│ │ └── upload.py # POST /api/upload (PDF)
│ └── services/
│ ├── ollama.py # Ollama streaming + <think> tag parsing
│ └── rag.py # hybrid_search() — stub, no Chroma yet
├── frontend/ # Next.js 16 (App Router), React 19
│ ├── app/
│ │ ├── layout.tsx, page.tsx
│ │ └── api/chat/route.ts # Proxies /api/chat → backend
│ ├── components/
│ │ ├── layout/ workspace-layout.tsx (resizable editor + chat)
│ │ ├── editor/ markdown-editor.tsx (CodeMirror)
│ │ ├── chat/ chat-panel, chat-input, message, thought, animated-text
│ │ └── sidebar/ document-drawer, document-list, upload-dropzone
│ ├── lib/api/ upload.ts (calls backend /api/upload)
│ └── types/ Document, UploadResponse
- Entry:
main.py— FastAPI app, CORS forhttp://localhost:3000, mounts routers under/api. - Endpoints:
GET /health— health check.POST /api/chat— streaming chat. Body:{ messages: [{ role, content }], current_note: string }. Usesservices/ollama.stream_completion()andservices/rag.hybrid_search()(stub). Streams Vercel AI SDK data stream format:0:"text"\n,g:"reasoning"\n, thene:...,d:....POST /api/upload— PDF upload. Saves touploads/{uuid}.pdf. Returns{ id, filename, size, status: "processing" }. No ingestion yet (no text extraction, chunking, or vector store).
- Ollama (
services/ollama.py):stream_completion(messages, model)—httpxstream tohttp://localhost:11434/api/chat, default modeldeepseek-r1:7b. Parses<think>...</think>in the stream and yields{"type": "reasoning"|"text", "content": "..."}. - RAG (
services/rag.py):hybrid_search(query)— stub: returns[]. Planned: ChromaDB + BM25 + reciprocal rank fusion; not implemented. - Dependencies (pyproject.toml): fastapi, uvicorn, httpx, python-multipart, chromadb, langchain, langchain-community, langchain-ollama, langgraph, pymupdf, rank-bm25.
- Stack: Next 16, React 19, TypeScript, Tailwind 4, Vercel AI SDK (
@ai-sdk/react,ai), Motion, CodeMirror (@uiw/react-codemirror), react-dropzone, react-resizable-panels, Radix/shadcn-style UI. - Fonts: Inter, JetBrains Mono (layout.tsx).
- Single page:
app/page.tsxrendersWorkspaceLayout. - WorkspaceLayout: Header with sidebar toggle; resizable horizontal panels: left = Markdown editor, right = Chat panel. Sidebar = slide-out drawer with upload dropzone and document list. Editor content is debounced (300ms) and passed to chat as
currentNote. - Chat:
ChatPanelusesuseChatfrom@ai-sdk/reactwithDefaultChatTransportpointing atapi: "/api/chat"andbody: () => ({ current_note: noteRef.current }). Messages rendered byMessage; each message hasparts—reasoning→Thought(collapsible),text→AnimatedText. - API proxy:
app/api/chat/route.tsforwards POST toNEXT_PUBLIC_API_URL(defaulthttp://localhost:8000)/api/chat and streams the response back. - Upload:
lib/api/upload.tsPOSTs toNEXT_PUBLIC_API_URL/api/uploadwith FormData;UploadDropzoneusesuploadPDF()and pushes aDocumentintoDocumentDrawerstate (documents are in-memory only, no persistence).
- Chat: User types in
ChatInput→sendMessage→ Next.js/api/chat→ FastAPI/api/chat→hybrid_search(userMessage)(empty) +_build_system_prompt(current_note, context_chunks)+stream_completion(conversation)→ stream with0:,g:lines → frontend parses into message parts (text/reasoning) and renders. - Notes: Editor content (debounced) is sent as
current_notein every chat request body; backend injects it into the system prompt as “Active Research Notes”. - Upload: PDF → backend saves to
uploads/{id}.pdf, returns id/filename/size/status; frontend adds aDocumentto sidebar list. No DB, no RAG ingestion yet.
- Backend runs on port 8000 (uvicorn default). Frontend expects backend at
NEXT_PUBLIC_API_URL(dev:http://localhost:8000). - Chat stream format is Vercel AI SDK data stream: text =
0:JSON\n, reasoning =g:JSON\n, then end tokens. - from Ollama is parsed server-side and emitted as reasoning parts so the UI can show “thought” blocks separately.
- UI:
border-thin,text-muted-foreground,ScrollArea,ResizablePanelGroup, Motion for drawer and buttons.
- Backend: Implement RAG: PDF text extraction (e.g. PyMuPDF), chunking, embeddings, ChromaDB persistence, then real
hybrid_search(vector + BM25 + RRF). Optional: background job after upload. - Frontend: Document list is ephemeral; could sync with backend or add a “list documents” API and persistence.
Use this as the codebase context when continuing with ChatGPT or other tools.