neocaml is a new Emacs package for programming in OCaml. It features
two major modes (for OCaml and OCaml Interface), built on TreeSitter,
and integration with an OCaml toplevel (a.k.a. REPL).
It's also as cool as Neo from "The Matrix". ;-)
Because caml-mode is ancient, and tuareg-mode is a beast (very powerful, but also very complex).
The time seems ripe for a modern, leaner, TreeSitter-powered mode for
OCaml.
There have been two other attempts to create TreeSitter-powered major modes for Emacs, but they didn't get very far:
- ocaml-ts-mode (first one, available in MELPA)
- ocaml-ts-mode (second one)
Looking at the code of both modes, I inferred that the authors were probably knowledgeable in OCaml, but not very familiar with Emacs Lisp and Emacs major modes in general. For me it's the other way around, and that's what makes this a fun and interesting project:
- I enjoy working on Emacs packages
- I want to do more work with TreeSitter, now that it's getting more traction
- I really like OCaml and it's one of my favorite "hobby" languages
They say that third time's the charm, right?
One last thing - we really need more Emacs packages with fun names! :D
Build a modern Emacs major mode for OCaml, powered by TreeSitter for font-locking and indentation.
Note: This primary goal has been fully fulfilled, as of neocaml 0.1 - the
mode's first "official" release.
Secondary goal - port this functionality to Tuareg, if feasible.
The project is still young, but already usable for day-to-day OCaml editing. Font-locking, indentation, navigation, imenu, and REPL integration all work well. Bug reports and contributions are welcome!
neocaml is available on MELPA. If you have
MELPA in your package-archives, install it with:
M-x package-install <RET> neocaml <RET>
Or with use-package:
(use-package neocaml
  :ensure t
  :config
  ;; Register neocaml modes with Eglot
  (with-eval-after-load 'eglot
    (add-to-list 'eglot-server-programs
                 '((neocaml-mode neocaml-interface-mode) . ("ocamllsp")))))

On Emacs 29+ you can install directly from the repository:
M-x package-vc-install <RET> https://github.com/bbatsov/neocaml <RET>
Or with use-package on Emacs 30+:
(use-package neocaml
  :vc (:url "https://github.com/bbatsov/neocaml" :rev :newest)
  :config
  ;; Register neocaml modes with Eglot
  (with-eval-after-load 'eglot
    (add-to-list 'eglot-server-programs
                 '((neocaml-mode neocaml-interface-mode) . ("ocamllsp")))))

Note: If the required tree-sitter grammars are not installed, run
M-x neocaml-install-grammars to install them.
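
If you would rather check from Lisp whether the grammars are already present, a minimal sketch along these lines should work. It assumes the grammar language symbols are ocaml and ocaml-interface (the names used elsewhere in this README) and relies only on the built-in treesit-available-p and treesit-language-available-p. The helper name my/neocaml-missing-grammars is hypothetical and not part of neocaml.

```emacs-lisp
(require 'seq)
(require 'treesit)

(defun my/neocaml-missing-grammars ()
  "Return the OCaml grammar symbols that Emacs cannot load.
Hypothetical helper, not part of neocaml."
  (let ((grammars '(ocaml ocaml-interface)))
    (if (not (treesit-available-p))
        ;; Emacs was built without tree-sitter support: nothing can load.
        grammars
      (seq-remove #'treesit-language-available-p grammars))))

(let ((missing (my/neocaml-missing-grammars)))
  (when missing
    (message "Missing grammars: %s (try M-x neocaml-install-grammars)"
             missing)))
```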
The neocaml package bundles two major modes - one for OCaml code
and one for OCaml interfaces (.mli). Both modes will be auto-enabled
when you open the respective type of files.
You can use C-c C-a to toggle between implementation and interface files.
To use neocaml with Eglot you'll need to register the modes with ocamllsp:
(with-eval-after-load 'eglot
  (add-to-list 'eglot-server-programs
               '((neocaml-mode neocaml-interface-mode) . ("ocamllsp"))))

Note: neocaml sets the eglot-language-id symbol property on both modes
("ocaml" for .ml and "ocaml.interface" for .mli), so the correct
language IDs are sent to the server automatically.
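
To see which language ID Eglot will pick up for each mode, you can read the symbol property directly. This is only a quick inspection sketch; it assumes the package provides a feature named neocaml, which this README does not state explicitly.

```emacs-lisp
;; Assumption: the package provides the `neocaml' feature.
(require 'neocaml)

;; According to the note above, these should be "ocaml" and
;; "ocaml.interface" respectively.
(message "neocaml-mode: %S, neocaml-interface-mode: %S"
         (get 'neocaml-mode 'eglot-language-id)
         (get 'neocaml-interface-mode 'eglot-language-id))
```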
ocaml-eglot is a lightweight minor
mode that enhances the Eglot experience for OCaml by exposing custom LSP
requests from ocamllsp — type enclosing, case analysis, hole navigation, and
more. It works with neocaml out of the box:
(use-package ocaml-eglot
  :ensure t
  :hook
  ((neocaml-mode neocaml-interface-mode) . ocaml-eglot)
  (ocaml-eglot . eglot-ensure))

The modes provide 4 levels of font-locking, as is the standard for TreeSitter-powered modes. The default font-locking level in Emacs is 3, and you can change it like this:
;; this font-locks everything neocaml supports
(setq treesit-font-lock-level 4)

See the documentation for treesit-font-lock-level and treesit-font-lock-features for more details.
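
treesit-font-lock-level is a global user option, so the setq above applies to every tree-sitter mode. If you only want the higher level in OCaml buffers, one approach is to set it buffer-locally from the mode hooks and then ask Emacs to recompute the enabled features. This is a sketch: my/neocaml-max-font-lock is a hypothetical name, and neocaml-interface-mode-hook is assumed to exist by the usual mode-hook naming convention.

```emacs-lisp
(defun my/neocaml-max-font-lock ()
  "Use font-lock level 4 in the current buffer only."
  (setq-local treesit-font-lock-level 4)
  ;; The mode has already computed its feature set by the time the hook
  ;; runs, so recompute it from the new buffer-local level.
  (treesit-font-lock-recompute-features))

(add-hook 'neocaml-mode-hook #'my/neocaml-max-font-lock)
;; Assumption: the interface mode runs a hook with this conventional name.
(add-hook 'neocaml-interface-mode-hook #'my/neocaml-max-font-lock)
```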
You can "prettify" certain symbols (see neocaml-prettify-symbols-alist) by
enabling prettify-symbols-mode via a hook:
(add-hook 'neocaml-mode-hook #'prettify-symbols-mode)

When it comes to indentation you've got several options:
- Use the built-in TreeSitter indentation
  - Supports let bindings, let...in chains, match/try expressions, if/then/else, variant and record types, modules, signatures, loops, fun/function expressions, lists, arrays, sequences, and more
  - It still needs some work, so it might not always behave the way you'd like it to
- Use the built-in Emacs function indent-relative that simply indents the next line relative to the previous line and allows you to manually indent/outdent further. Very simple, but kind of bullet-proof.
- Use the indent function of ocp-indent.el (this requires you to have ocp-indent.el and ocp-indent installed)
- Use the indent function of Tuareg.
You can change the indentation function used by neocaml like this:

(defun my-neocaml-mode-setup ()
  "Set up my custom indentation for neocaml-mode."
  (setq-local indent-line-function 'indent-relative))

(add-hook 'neocaml-mode-hook 'my-neocaml-mode-setup)

neocaml provides integration with the OCaml toplevel (REPL). This allows you to evaluate OCaml code directly from your source buffer and see the results.
You can also start an OCaml REPL (toplevel) and interact with it using
neocaml-repl-minor-mode. You can enable the mode like this:
(add-hook 'neocaml-mode-hook #'neocaml-repl-minor-mode)

If you're using use-package you'd probably do something like:

(use-package neocaml
  :vc (:url "https://github.com/bbatsov/neocaml" :rev :newest)
  :config
  (add-hook 'neocaml-mode-hook #'neocaml-repl-minor-mode)
  ;; other config options...
  )

The following commands are available for interacting with the OCaml toplevel:
| Keybinding | Command | Description |
|---|---|---|
| C-c C-z | neocaml-repl-switch-to-repl | Start OCaml REPL or switch to it if already running |
| C-c C-c | neocaml-repl-send-definition | Send the current definition to the REPL |
| C-c C-r | neocaml-repl-send-region | Send the selected region to the REPL |
| C-c C-b | neocaml-repl-send-buffer | Send the entire buffer to the REPL |
| C-c C-p | neocaml-repl-send-phrase | Send the current phrase (code up to next ;;) to the REPL |
| C-c C-i | neocaml-repl-interrupt | Interrupt the current evaluation in the REPL |
| C-c C-k | neocaml-repl-clear-buffer | Clear the REPL buffer |
You can customize the OCaml REPL integration with the following variables:
;; Change the OCaml toplevel program
(setq neocaml-repl-program-name "utop") ; Use utop instead of ocaml
;; Add command-line arguments
(setq neocaml-repl-program-args '("-short-paths" "-color=never"))
;; Change the REPL buffer name
(setq neocaml-repl-buffer-name "*OCaml-REPL*")

utop is an improved toplevel for OCaml with many features like auto-completion, syntax highlighting, and a rich history. To use utop with neocaml-repl:
(setq neocaml-repl-program-name "utop")
(setq neocaml-repl-program-args '("-emacs"))

- Tree-sitter based font-locking (4 levels) for .ml and .mli files
- Tree-sitter based indentation with cycle-indent support
- Navigation (beginning-of-defun, end-of-defun, forward-sexp)
- Imenu with language-specific categories for .ml and .mli
- Toggling between implementation and interface via ff-find-other-file (C-c C-a)
- OCaml toplevel (REPL) integration (neocaml-repl)
- Easy installation of ocaml and ocaml-interface tree-sitter grammars via M-x neocaml-install-grammars
- Eglot integration (with ocaml-eglot support)
- Prettify-symbols for common OCaml operators
- Integration with dune
You can install the required ocaml and ocaml-interface grammars by running
M-x neocaml-install-grammars.
You can configure neocaml--debug to get more debug information from TreeSitter:
- When you set this to t it will output indentation debug data and enable treesit-inspect-mode (this shows the current node in the modeline)
- When you set this to 'font-lock it will also output some font-lock debug info (note that this can get very noisy)
;; enable all TreeSitter debug information
(setq neocaml--debug 'font-lock)

As combobulate doesn't support OCaml yet, it seems the best way to test TS queries is the following:
If you don’t want to use Combobulate to help you, the builtin method – the only method – is to call treesit-query-capture with a starting node (often the one from treesit-buffer-root-node or treesit-parser-root-node) and the query and then manually inspect the output to see if it’s right. Ugh. It’s messy, and it’s hard work. Trust me, I know. I recommend you learn how to use IELM if you decide to go this route.
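
As a concrete starting point for that workflow, here is a small sketch of a helper that runs a query against the current buffer and returns the captured text for inspection. The helper name my/ocaml-query-captures is hypothetical and not part of neocaml; it assumes the buffer already has an ocaml parser (for example because neocaml-mode is active), and the sample query simply captures let_binding nodes.

```emacs-lisp
(require 'treesit)

(defun my/ocaml-query-captures (query)
  "Run QUERY against the current buffer's ocaml parse tree.
Return a list of (CAPTURE-NAME . TEXT) pairs for manual inspection.
Hypothetical helper, not part of neocaml."
  (let ((root (treesit-buffer-root-node 'ocaml)))
    (mapcar (lambda (capture)
              ;; Each capture is a (NAME . NODE) pair.
              (cons (car capture)
                    (treesit-node-text (cdr capture) t)))
            (treesit-query-capture root query))))

;; Evaluate in an .ml buffer, e.g. with M-: or from IELM after selecting
;; that buffer as the working buffer:
;; (my/ocaml-query-captures '((let_binding) @binding))
```

Returning the node text rather than the raw nodes keeps the output readable in IELM; swap in the nodes themselves if you need positions.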
Emacs doesn't support directly using .scm (TreeSitter queries) files, so we currently need
to manually code both the font-locking and indentation queries.
Emacs 31 will introduce define-treesit-generic-mode that will make it possible to
use .scm for font-locking.
The ocaml-interface tree-sitter grammar inherits all rules from the base
ocaml grammar and only overrides compilation_unit (accepting _signature_item
instead of _structure_item). Both grammars expose the same set of named node
types, which means:
- Font-lock queries that reference .ml-only constructs (e.g. application_expression, let_binding) simply produce no matches in .mli files; they are harmless no-ops.
- Indentation rules work identically because the node types used for anchoring (structure, signature, value_specification, type_binding, etc.) exist in both grammars.
This lets neocaml use a single set of font-lock and indentation rules for both
neocaml-mode and neocaml-interface-mode, keeping the code simple and maintainable.
The only place where the two modes diverge is imenu, which uses tailored categories
for each grammar (e.g. "Val" and "External" for .mli vs "Value" for .ml).
You can control the amount of fontification applied by Font Lock mode of
major modes based on tree-sitter by customizing the variable
treesit-font-lock-level. Its value is a number between 1 and 4:
- Level 1: This level usually fontifies only comments and function names in function definitions.
- Level 2: This level adds fontification of keywords, strings, and data types.
- Level 3: This is the default level; it adds fontification of assignments, numbers, etc.
- Level 4: This level adds everything else that can be fontified: operators, delimiters, brackets, other punctuation, function names in function calls, property look ups, variables, etc.
Note that the 4 levels are defined by each major-mode and the above are just recommendations.
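
In code terms, a major mode declares its levels through treesit-font-lock-feature-list, a list with one sublist of feature symbols per level; level N enables the features in sublists 1 through N. The sketch below uses made-up feature names to show the mechanics. It is not neocaml's actual feature list, and both names starting with my/ are hypothetical.

```emacs-lisp
(require 'seq)

(defvar my/ocaml-feature-list
  '((comment definition)                    ; level 1
    (keyword string type)                   ; level 2
    (number constant)                       ; level 3
    (operator bracket delimiter variable))  ; level 4
  "Hypothetical list shaped like `treesit-font-lock-feature-list'.")

(defun my/ocaml-features-at-level (level)
  "Return the features from `my/ocaml-feature-list' enabled at LEVEL."
  (apply #'append (seq-take my/ocaml-feature-list level)))

(my/ocaml-features-at-level 2)
;; => (comment definition keyword string type)
```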
Tree-sitter indentation in Emacs is driven by treesit-simple-indent-rules — a
list of (MATCHER ANCHOR OFFSET) triples tried in order. The first matching rule
wins. MATCHER decides if a rule applies, ANCHOR provides a reference position,
and OFFSET is added to that position's column to produce the final indentation.
The rules in neocaml--indent-rules are roughly ordered as follows:
- Empty-line handling: (no-node ...) must come first (see below).
- Top-level: (parent-is "compilation_unit") pins everything at column 0. compilation_unit is tree-sitter's root node representing the entire source file.
- Closing delimiters: ), ], }, done, end align with the opening construct via parent-bol 0.
- Keyword alignment: with, then_clause, else_clause, match | align with their enclosing keyword.
- Body indentation: children of let_binding, match_case, structure, do_clause, etc. are indented by neocaml-indent-offset.
- Error recovery: (parent-is "ERROR") indents by offset so that typing inside incomplete code gets reasonable indentation.
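
The ordering above can be sketched as a small rule set. This is an illustration, not a copy of neocaml--indent-rules: the variable name is hypothetical, the offsets are hard-coded to 2, and only a few node types are covered. It uses the built-in no-node, node-is and parent-is matchers and the prev-line, column-0 and parent-bol anchors.

```emacs-lisp
(defvar my/ocaml-indent-rules
  '((ocaml
     ;; 1. Empty lines: no node at point, so anchor to the previous line.
     (no-node prev-line 0)
     ;; 2. Top-level items sit at column 0.
     ((parent-is "compilation_unit") column-0 0)
     ;; 3. Closing delimiters line up with the opening construct.
     ((node-is ")") parent-bol 0)
     ((node-is "end") parent-bol 0)
     ;; 4. Bodies are indented relative to their parent's line.
     ((parent-is "let_binding") parent-bol 2)
     ((parent-is "structure") parent-bol 2)
     ;; 5. Incomplete code: still give a reasonable indentation.
     ((parent-is "ERROR") parent-bol 2)))
  "Hypothetical rules in the shape expected by `treesit-simple-indent-rules'.")

;; A major mode would install them buffer-locally before calling
;; `treesit-major-mode-setup':
;; (setq-local treesit-simple-indent-rules my/ocaml-indent-rules)
```

Because the first matching rule wins, moving the compilation_unit rule above the no-node rule would reintroduce the empty-line problem described in this section.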
When the cursor is on an empty line, tree-sitter has no node at point. In
Emacs 30 the indentation engine (treesit--indent-1) sets node=nil and
resolves parent via treesit-node-on, which returns compilation_unit — the
only node spanning the empty position. This means the
(parent-is "compilation_unit") rule would fire first, always giving column 0
— even inside incomplete constructs like let x =.
We solve this with a single no-node rule placed before all other rules:
(no-node prev-line neocaml--empty-line-offset)

- prev-line anchors to the previous line's indentation (first non-whitespace column).
- neocaml--empty-line-offset is a custom offset function that inspects the last token on the previous line. If it's a "body-expecting" token (listed in neocaml--indent-body-tokens: =, ->, then, else, do, struct, sig, begin, object, in, with, fun, function, try), the offset is neocaml-indent-offset; otherwise it's 0.
This gives the right result in all common cases:
| Previous line ends with | Anchor (prev-line) | Offset | Result |
|---|---|---|---|
| let x = | col 0 | +2 | col 2 |
| let x = 42 | col 0 | 0 | col 0 |
| module M = struct | col 0 | +2 | col 2 |
| let x = (inside struct) | col 2 | +2 | col 4 |
| let x = 42 (inside struct) | col 2 | 0 | col 2 |
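
To make the mechanism concrete, here is a simplified, text-based sketch of such an offset function. It is not the actual neocaml--empty-line-offset implementation: all names starting with my/ are hypothetical, the offset is assumed to be 2, and the "last token" is approximated as the last whitespace-delimited word on the previous line (so a line ending in "(fun" would not be recognized).

```emacs-lisp
(defvar my/ocaml-indent-offset 2
  "Hypothetical stand-in for `neocaml-indent-offset'.")

(defvar my/ocaml-body-tokens
  '("=" "->" "then" "else" "do" "struct" "sig" "begin" "object"
    "in" "with" "fun" "function" "try")
  "Tokens after which the next line is expected to hold a body.")

(defun my/ocaml-empty-line-offset (_node _parent bol &rest _)
  "Return the extra indentation for the empty line starting at BOL.
The signature matches what tree-sitter indentation passes to a function
used as OFFSET."
  (save-excursion
    (goto-char bol)
    (if (/= 0 (forward-line -1))
        0                               ; no previous line
      (end-of-line)
      (skip-chars-backward " \t")
      (let* ((end (point))
             (start (progn (skip-chars-backward "^ \t\n") (point)))
             (token (buffer-substring-no-properties start end)))
        (if (member token my/ocaml-body-tokens)
            my/ocaml-indent-offset
          0)))))
```

With this sketch, an empty line after "let x =" gets an offset of 2 and one after "let x = 42" gets 0, matching the first two rows of the table.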
- Use treesit-explore-mode and treesit-inspect-mode to see node types at point. Set neocaml--debug to t to enable verbose indentation logging.
- Order matters: more specific rules must come before general ones.
- The parent-bol anchor resolves to the first non-whitespace column on the parent node's starting line. This is almost always what you want.
- When a parent node starts on the same line as its first child (common with variant declarations), parent-bol shifts unexpectedly after the child is indented. Use neocaml--grand-parent-bol to go one level up instead.
- Test new rules with eldev test. The indentation test suite uses when-indenting-it specs that assert exact indentation for multi-line OCaml snippets.
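
For readers curious what a "one level up" anchor might look like, here is a minimal sketch modelled on how the built-in parent-bol anchor behaves. It is an assumption about the general shape, not the actual neocaml--grand-parent-bol code; the function name is hypothetical, and variant_declaration in the sample rule is assumed to be the relevant node type in the ocaml grammar.

```emacs-lisp
(require 'treesit)

(defun my/ocaml-grand-parent-bol (_node parent &rest _)
  "Anchor at the first non-whitespace position of the grandparent's line.
Fall back to PARENT itself when it has no parent node."
  (save-excursion
    (goto-char (treesit-node-start
                (or (treesit-node-parent parent) parent)))
    (back-to-indentation)
    (point)))

;; Hypothetical rule using it: indent constructors two columns from the
;; line on which the enclosing type definition starts.
;; ((parent-is "variant_declaration") my/ocaml-grand-parent-bol 2)
```

Anchoring on the grandparent keeps the reference column stable, because that node starts on a line whose indentation does not change when the child is re-indented.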
Based on ideas and code from:
- clojure-ts-mode
- ocaml-ts-mode
- nvim-treesitter's OCaml TreeSitter queries
- https://www.masteringemacs.org/article/lets-write-a-treesitter-major-mode
- https://www.gnu.org/software/emacs/manual/html_node/elisp/Parsing-Program-Source.html
- https://www.gnu.org/software/emacs/manual/html_node/elisp/Tree_002dsitter-Major-Modes.html
- https://www.gnu.org/software/emacs/manual/html_node/elisp/Parser_002dbased-Indentation.html#index-treesit_002dsimple_002dindent_002dpresets (indentation)
- https://www.gnu.org/software/emacs/manual/html_node/elisp/Faces-for-Font-Lock.html
- https://www.jonashietala.se/blog/2024/03/19/lets_create_a_tree-sitter_grammar/
- https://archive.casouri.cc/note/2024/emacs-30-tree-sitter/
Copyright © 2025-2026 Bozhidar Batsov and contributors.
Distributed under the GNU General Public License, version 3 or later. See LICENSE for details.