NexusAI is a powerful, local laboratory for experimenting with Large Language Models (LLMs). It bridges the gap between complex Python scripts and a user-friendly interface, allowing you to chat with, fine-tune, and customize AI models directly from your desktop.
It is designed for developers and hobbyists who want to explore LoRA (Low-Rank Adaptation) fine-tuning, persona engineering, and local inference without relying on cloud providers.
- Local Inference: Run models like `Zephyr-3B` or `Qwen` entirely offline.
- Chain-of-Thought Visualization: See the "hidden" reasoning steps of the model with a collapsible Thinking Process UI.
- Customizable UI: Dark Mode, Resizable Sidebar, and Font Size zooming for accessibility.
- Train Custom Adapters: Upload a `training_data.jsonl` file and train the model on your own data.
- LoRA Support: Uses Parameter-Efficient Fine-Tuning (PEFT) to create lightweight adapters (~10 MB) instead of retraining the whole model.
- Real-time Metrics: Watch the training loss and progress bar update live in the UI.
- System Prompts: Define the "soul" of your AI (e.g., "You are a pirate", "You are a coding assistant").
- Adapter Management: Load and unload different personality adapters on the fly without restarting the app.
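Swapping adapters without a restart is possible because each trained adapter lives in its own directory in PEFT's standard on-disk format. The sketch below is illustrative, not NexusAI's actual code: the helper names are made up, but `adapter_config.json` is the marker file PEFT really writes, and `PeftModel.from_pretrained` is the standard PEFT loading call.

```python
# Sketch of adapter discovery: PEFT saves each adapter with an
# adapter_config.json, which is what we look for here.
from pathlib import Path

def list_adapters(root: str = "nexus_adapters") -> list[str]:
    """Return the names of adapter directories under `root`."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.iterdir()
                  if (p / "adapter_config.json").exists())

def attach_adapter(base_model, adapter_dir: str):
    # Standard PEFT call; wraps the base model with the LoRA weights.
    from peft import PeftModel
    return PeftModel.from_pretrained(base_model, adapter_dir)
```

Because the base model stays in memory, attaching or detaching an adapter is cheap compared to reloading the full model.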
- Frontend: React, Vite, TailwindCSS, Lucide React (Icons).
- Backend: Python, FastAPI, HuggingFace Transformers, PEFT.
- Training: PyTorch, BitsAndBytes (Quantization).
- Python 3.10+ installed.
- Node.js 16+ installed.
- (Optional) NVIDIA GPU (CUDA) or Mac M1/M2/M3 (MPS) for faster training/inference.
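Once PyTorch is installed (see below), you can check which backend will actually be used. A minimal sketch; the function name is illustrative, and `main.py`/`train.py` may handle device selection differently:

```python
# Minimal device-detection sketch (assumes PyTorch is installed).
import torch

def pick_device() -> str:
    """Prefer CUDA, then Apple MPS (M1/M2/M3), then CPU."""
    if torch.cuda.is_available():
        return "cuda"
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return "mps"
    return "cpu"

print(pick_device())
```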
Navigate to the root directory and install Python dependencies.
```bash
# It is recommended to create a virtual environment first
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install fastapi uvicorn torch transformers peft datasets scipy accelerate bitsandbytes
```

Start the backend server:

```bash
python main.py
```

The server will start at http://localhost:8000.
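With the server running, you can also exercise the API outside the UI. The `/chat` route and payload keys below are assumptions for illustration only; check `main.py` for the real endpoint names:

```python
# Hypothetical API client; the /chat route and payload keys are assumptions.
import json
import urllib.request

def build_payload(prompt: str) -> bytes:
    return json.dumps({"prompt": prompt}).encode("utf-8")

def chat(prompt: str, base_url: str = "http://localhost:8000") -> dict:
    req = urllib.request.Request(
        f"{base_url}/chat",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example usage (requires the backend to be running):
#   reply = chat("Hello!")
```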
Open a new terminal, navigate to the UI folder, and start the development server.
```bash
cd nexus-lab-ui

# Install dependencies
npm install

# Start the UI
npm run dev
```

Access the application at http://localhost:5173.
- Open the Chat tab.
- By default, it loads a base model.
- Type your message and hit Enter.
- Click the "T" icon in the navbar to adjust font size if needed.
- Prepare Data: Create a file named `training_data.jsonl` with one JSON record per line:

```jsonl
{"prompt": "hi", "response": "Hey you 😊 I was hoping you'd show up. How’s your day going so far?", "score": 10, "source": "human_feedback"}
{"prompt": "hello", "response": "Hi! It’s nice to see you here. What are we talking about today?", "score": 9, "source": "human_feedback"}
```

- Train: Go to the Train tab and click "Start Training".
- Wait: Monitor the progress bar.
- Load: Once finished, go to the Model tab and select your new adapter from the list.
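Malformed JSON lines are a common cause of failed runs, so it can help to sanity-check the data file first. A stdlib sketch; the required keys are inferred from the example records above, and `train.py` may expect additional fields:

```python
# Validate training_data.jsonl: every non-empty line must be JSON
# with at least "prompt" and "response" keys.
import json

REQUIRED_KEYS = {"prompt", "response"}

def validate_jsonl(lines):
    records = []
    for lineno, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # allow blank lines
        record = json.loads(raw)  # raises ValueError on broken JSON
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {lineno}: missing keys {sorted(missing)}")
        records.append(record)
    return records
```

Run it as `validate_jsonl(open("training_data.jsonl"))` before hitting "Start Training".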
- In the Model tab, you can enter a custom System Prompt (e.g., "You are a helpful coding assistant").
- This works in tandem with your loaded adapter to steer the model's behavior.
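By convention, a system prompt is sent as the first message in the chat history that Transformers' `apply_chat_template` consumes. A sketch of that layout (NexusAI's backend may assemble prompts differently):

```python
# Conventional chat-message layout: the system prompt leads the history.
def build_messages(system_prompt: str, history: list[dict], user_msg: str) -> list[dict]:
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_msg}]
    )
```

The adapter shapes *how* the model talks (learned from your training data), while the system prompt steers *what role* it plays at inference time.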
In the Model tab you can tune how the model generates text. Use the Help tab for full descriptions.
| Parameter | What it does |
|---|---|
| Temperature | Higher = more creative/random; lower = more deterministic (e.g. 0.1 for code, 0.7 for chat). |
| Top P | Nucleus sampling: only sample from the top fraction of likely tokens (e.g. 0.9). |
| Max New Tokens | Maximum length of each reply in tokens. |
| Top K | Only sample from the top K tokens (0 = no limit). |
| Repetition Penalty | Discourages repeating the same tokens (e.g. 1.1). |
| Min New Tokens | Don’t stop before generating at least this many tokens. |
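To build intuition for how these knobs interact, here is a toy, pure-Python version of the token-filtering step. The real decoding in the backend is done by Transformers on tensors; this is just an illustration:

```python
# Toy illustration of temperature, top-k, and top-p (nucleus) filtering.
import math

def sample_filter(logits, temperature=0.7, top_k=0, top_p=1.0):
    """Return the indices of tokens that survive filtering, most likely first."""
    scaled = [l / temperature for l in logits]   # temperature scaling
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]     # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    if top_k > 0:
        order = order[:top_k]                    # keep only the top-k tokens
    kept, cumulative = [], 0.0
    for i in order:                              # nucleus (top-p) cut-off
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept
```

Lower temperature sharpens the distribution, so fewer tokens survive the top-p cut; top-k caps the candidate pool regardless of probability mass.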
```
NexusAI/
├── main.py                # Backend API Entry Point
├── train.py               # Standalone Training Script
├── requirements-train.txt # Python Dependencies
├── nexus_adapters/        # Storage for trained LoRA adapters
├── nexus-lab-ui/          # Frontend React Application
│   ├── src/
│   │   ├── App.jsx        # Main UI Logic
│   │   └── ...
│   └── ...
└── README.md              # You are here
```
"Transformers architecture not recognized"
- If you see errors about `qwen3` or other new architectures, your `transformers` library might be outdated.
- Run: `pip install --upgrade transformers`
"CUDA out of memory"
- Try reducing the `batch_size` in `train.py` or training on a smaller model.
Built with ❤️ by the NexusAI Team.