Ollama Copilot integrates Ollama code completion models into Neovim, providing GitHub Copilot-style tab completions.

- Suggestion streaming: completions stream into your editor as the model generates them
- Debouncing of subsequent completion requests, avoiding a flood of Ollama requests that leads to CPU over-utilization
- Full control over triggers, using textChange events instead of Neovim client requests
- A language server that provides code completions from an Ollama model
- Ghost text completions that can be inserted into the editor
- Streamed ghost text completions that populate in real time
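The debouncing behavior above can be sketched in a few lines: rapid keystrokes repeatedly reset a timer, so only the last request within the delay window actually reaches the model. This is an illustrative sketch (the `Debouncer` class is hypothetical, not the plugin's actual implementation):

```python
import threading
import time

class Debouncer:
    """Illustrative debouncer: of several calls made within `delay`
    seconds, only the last one actually fires."""

    def __init__(self, delay):
        self.delay = delay
        self._timer = None
        self._lock = threading.Lock()

    def call(self, fn, *args):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # drop the superseded request
            self._timer = threading.Timer(self.delay, fn, args)
            self._timer.start()

# Simulate rapid keystrokes: only the final payload triggers a "completion".
fired = []
d = Debouncer(0.05)
for text in ("d", "de", "def "):
    d.call(fired.append, text)
time.sleep(0.2)
print(fired)  # only the last payload survives
```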
To use Ollama-Copilot, you need to have Ollama installed (github.com/ollama/ollama):

```sh
curl -fsSL https://ollama.com/install.sh | sh
```

The language server runs on Python and requires two libraries (also listed in `python/requirements.txt`):

```sh
pip install pygls ollama
```

Make sure you have the model you want to use installed; a catalog can be found at ollama.com/library:
```sh
# To view your available models:
ollama ls
# To pull a new model:
ollama pull <Model name>
```
Lazy:
```lua
-- Default configuration
{ "Jacob411/Ollama-Copilot", opts = {} }
```

```lua
-- Custom configuration (defaults shown)
{
  'jacob411/Ollama-Copilot',
  opts = {
    -- Prefer base code models for autocomplete, not *-instruct chat variants.
    model_name = "qwen2.5-coder:3b",
    ollama_url = "http://localhost:11434", -- URL for the Ollama server; leave blank to use the default local instance
    stream_suggestion = false,
    python_command = "python3",
    filetypes = { 'python', 'lua', 'vim', 'markdown' },
    capabilities = nil, -- LSP capabilities, auto-detected if not provided
    ollama_model_opts = {
      temperature = 0.1, -- keep entropy low for stable tab completion
      top_p = 0.9,
      num_predict = 128, -- 64-256 is usually best for autocomplete
      num_ctx = 8192,
      fim_enabled = true, -- include prefix + suffix (fill-in-the-middle)
      fim_mode = "auto", -- "auto" | "template" | "manual" | "off"
      context_lines_before = 80,
      context_lines_after = 40,
      max_prefix_chars = 8000,
      max_suffix_chars = 3000,
      stop = { "<|im_start|>", "<|im_end|>", "<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>", "```" },
      -- Internal payload/response logging (or set OLLAMA_COPILOT_DEBUG=1).
      -- debug = true,
      -- debug_log_file = "/tmp/ollama-copilot-debug.log",
    },
    keymaps = {
      suggestion = '<leader>os',
      reject = '<leader>or',
      insert_accept = '<Tab>',
    },
  },
}
```

For more Ollama customization, see github.com/ollama/ollama/blob/main/docs/modelfile.md
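With `fim_enabled`, a fill-in-the-middle completion sends the text before and after the cursor to the model, wrapped in sentinel tokens, and truncated to `max_prefix_chars`/`max_suffix_chars`. A minimal sketch of how such a prompt can be assembled (the token names match Qwen2.5-Coder's FIM format, as in the `stop` list above; the function itself is illustrative, not the plugin's exact code):

```python
def build_fim_prompt(prefix: str, suffix: str,
                     max_prefix_chars: int = 8000,
                     max_suffix_chars: int = 3000) -> str:
    """Assemble a fill-in-the-middle prompt in Qwen2.5-Coder token style.

    Truncation keeps the characters closest to the cursor: the *end* of
    the prefix and the *start* of the suffix.
    """
    prefix = prefix[-max_prefix_chars:]
    suffix = suffix[:max_suffix_chars]
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

prompt = build_fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
print(prompt)
```

The model then generates the "middle" text, which the plugin shows as a ghost-text suggestion.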
The plugin automatically detects and configures LSP capabilities for optimal completion support:
- **Auto-detection (default):** If `capabilities` is not specified, the plugin will:
  - Try to use `cmp_nvim_lsp.default_capabilities()` if nvim-cmp is installed
  - Fall back to `vim.lsp.protocol.make_client_capabilities()` if nvim-cmp is not available
- **Custom capabilities:** You can override the auto-detection by providing your own capabilities:

```lua
opts = {
  capabilities = require('cmp_nvim_lsp').default_capabilities(),
  -- or use custom capabilities
  capabilities = vim.tbl_deep_extend('force',
    vim.lsp.protocol.make_client_capabilities(),
    { your_custom_capability = true }
  ),
}
```
This ensures backward compatibility while allowing the plugin to work without requiring nvim-cmp as a dependency.
The Ollama Copilot language server attaches when you enter a buffer and can be inspected with:

```vim
:LspInfo
```

Prefer base coder models for completion quality (qwen2.5-coder:*, deepseek-coder:*) and avoid *-instruct variants unless you explicitly want chat-like behavior. 3B models are fast but can be weak or unstable on instruction-heavy files (markdown, docs), so 7B is often a better default if your machine can handle it.
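When streaming is enabled, Ollama's `/api/generate` endpoint returns newline-delimited JSON chunks, each carrying a `response` fragment and a final `done` flag; the plugin concatenates the fragments into the ghost-text suggestion. A minimal sketch of reassembling such a stream (the sample chunks below are fabricated for illustration):

```python
import json

def assemble_stream(lines):
    """Concatenate the `response` fragments from an Ollama NDJSON
    stream, stopping at the chunk marked done."""
    out = []
    for line in lines:
        chunk = json.loads(line)
        out.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(out)

# Fabricated sample of what /api/generate streams back:
sample = [
    '{"response": "return a", "done": false}',
    '{"response": " + b", "done": false}',
    '{"response": "", "done": true}',
]
print(assemble_stream(sample))  # -> return a + b
```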
To inspect exact requests sent to Ollama and raw streamed chunks:
```sh
OLLAMA_COPILOT_DEBUG=1 nvim
```

or set in `ollama_model_opts`:

```lua
debug = true
debug_log_file = "/tmp/ollama-copilot-debug.log"
```

Use the included payload test script to verify prompt shape and suffix usage:

```sh
cd ~/path/to/Ollama-Copilot
python3 python/payload_debug_demo.py
```

Contributions are welcome! If you have any ideas for new features, improvements, or bug fixes, please open an issue or submit a pull request.
I also hope to do more on the model side: I am interested in fine-tuning models and implementing RAG techniques, moving beyond just Ollama.
This project is licensed under the MIT License.
