
GCX1 — Gortex Compact Wire Format

Status: Draft v1. Shipped in Gortex v0.9.0.

GCX1 is a tab-delimited, line-oriented, round-trippable wire format for Gortex MCP tool responses. It is an opt-in alternative to JSON selected per-call via format: "gcx". On the benchmark bundled at bench/wire-format/ it cuts tiktoken token counts by a median of 27.4 % vs JSON, with 100 % round-trip integrity across 20 representative tool responses.

Goals

  • Round-trippable. Every GCX payload decodes back to an equivalent Go value. No lossy text.
  • Tokenizer-aware. Field delimiters, escape sequences, and header syntax are chosen so tiktoken (cl100k_base) counts them as whitespace or single tokens — matching the LLM budget users care about, not just raw bytes.
  • Per-tool tunable. Hot-path tools (search_symbols, find_usages, analyze, ...) ship hand-tuned encoders with fixed field layouts. Everything else falls through to a generic fallback so no tool ever produces invalid GCX.
  • Versioned. The header carries a protocol version. Decoders reject unknown versions and agents can fall back to JSON transparently.

Non-goals

  • Binary encoding. GCX1 is text-only; a future GCX2 may carry binary payloads (CBOR / MessagePack) under the same version prefix, but v1 stays text so agents can read raw payloads during debugging.
  • Schema evolution inside a major version. The field layout for a given tool is fixed for the lifetime of GCX1. New fields ship as GCX2.
  • Streaming. GCX1 is full-response. GCX1-stream is a reserved future extension.

Grammar (EBNF)

payload       = section { section } ;
section       = header { row-line | comment } ;
header        = TAG SP "tool=" token SP "fields=" field-list { SP key-value } LF ;
key-value     = token "=" value ;
field-list    = token { "," token } ;
row-line      = value { TAB value } LF | LF ;
comment       = "#" [ SP text ] LF ;
value         = { escaped-char | safe-char } ;
escaped-char  = "\\" ( "\\" | "t" | "n" ) ;
safe-char     = any UTF-8 codepoint except TAB, LF, "\\" ;
TAG           = "GCX1" ;
TAB           = U+0009 ;
LF            = U+000A ;
SP            = U+0020 ;

Header

Each section begins with a single-line header:

GCX1 tool=<name> fields=<a>,<b>,... [k=v]...
  • tool= is the MCP tool name (or a dot-suffixed sub-section name like get_callers.edges).
  • fields= is a comma-separated list declaring the column order for subsequent rows. At least one field is required.
  • Additional space-separated k=v pairs carry metadata (total, truncated, etag, rows, ms, ...). Keys are emitted in sorted order so fixtures stay deterministic.

Header values that contain spaces, =, tabs, newlines, or backslashes must be escaped exactly as row values are escaped.

Example:

GCX1 tool=search_symbols fields=id,kind,name,path,line,sig rows=3 total=7 truncated=false
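A header like the one above can be parsed with a few strings.Cut calls. This is a minimal sketch, not the gcx-go API; the name parseHeader is an assumption, and header-value escaping is ignored for brevity:

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeader splits a GCX1 header line into the tool name, the
// declared field order, and the remaining k=v metadata. It is lenient
// about key order; escaped header values are not handled here.
func parseHeader(line string) (tool string, fields []string, meta map[string]string, err error) {
	parts := strings.Split(line, " ")
	if len(parts) < 3 || parts[0] != "GCX1" {
		return "", nil, nil, fmt.Errorf("not a GCX1 header: %q", line)
	}
	meta = map[string]string{}
	for _, p := range parts[1:] {
		k, v, ok := strings.Cut(p, "=")
		if !ok {
			return "", nil, nil, fmt.Errorf("bad key-value %q", p)
		}
		switch k {
		case "tool":
			tool = v
		case "fields":
			fields = strings.Split(v, ",")
		default:
			meta[k] = v
		}
	}
	if tool == "" || len(fields) == 0 {
		return "", nil, nil, fmt.Errorf("header missing tool= or fields=")
	}
	return tool, fields, meta, nil
}

func main() {
	tool, fields, meta, _ := parseHeader("GCX1 tool=search_symbols fields=id,kind,name,path,line,sig rows=3 total=7 truncated=false")
	fmt.Println(tool, len(fields), meta["total"]) // search_symbols 6 7
}
```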

Rows

After the header, each non-blank, non-comment line is a row of tab-separated values in the order declared by fields=.

  • Fewer values than declared fields: missing trailing columns default to "".
  • More values than declared fields: decoder returns an error.
  • Blank lines between rows are ignored.
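The padding and error rules above can be sketched as follows — a hedged example, not the shipped decoder; cells would still need unescaping per the escape table:

```go
package main

import (
	"fmt"
	"strings"
)

// decodeRow splits one row line on raw tabs (safe, because tabs
// inside cells are always escaped) and maps cells onto the declared
// field list. Missing trailing columns default to ""; extra columns
// are an error, per the spec.
func decodeRow(line string, fields []string) (map[string]string, error) {
	cells := strings.Split(line, "\t")
	if len(cells) > len(fields) {
		return nil, fmt.Errorf("row has %d cells, header declares %d fields", len(cells), len(fields))
	}
	row := make(map[string]string, len(fields))
	for i, f := range fields {
		if i < len(cells) {
			row[f] = cells[i]
		} else {
			row[f] = "" // fewer values than fields: pad with empty
		}
	}
	return row, nil
}

func main() {
	fields := []string{"id", "kind", "name"}
	row, _ := decodeRow("n1\tfunction", fields)
	fmt.Println(row["id"], row["kind"], row["name"] == "") // n1 function true
}
```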

Comments

Lines beginning with # are comments. Comments carry no data; any intermediary may drop them. The encoder uses them to annotate the first row of a section (e.g. # 3 matches).

Escape rules

A row value may contain the following characters by escaping them:

Character Escape
\ (backslash) \\
TAB (U+0009) \t
LF (U+000A) \n

Any other \x sequence decodes to the literal byte x so a pathological payload cannot wedge the decoder. Callers should treat decoded values as untrusted input.

CR (U+000D) is stripped on encode so Windows CRLF input round-trips as \n-only output.

Multi-section payloads

A GCX1 payload may contain multiple sections concatenated back-to-back. Each new section begins with its own GCX1 header line. Decoders detect section boundaries by scanning for the header tag once the current section's rows are exhausted.

Multi-section is used by:

  • get_callers, get_call_chain, get_dependencies, get_dependents, find_implementations — emit <tool>.nodes then <tool>.edges.
  • get_editing_context — emits target, callers, dependencies, tests sections.
  • get_repo_outline — one section per top-level key (languages, communities, hotspots, most_imported, entry_points).
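The boundary-scanning rule above amounts to treating every line that starts with the header tag as the start of a new section. A minimal sketch (splitSections is an assumed name, not a gcx-go export):

```go
package main

import (
	"fmt"
	"strings"
)

// splitSections cuts a GCX1 payload into sections, each returned as
// its raw lines (header first). Any line beginning with "GCX1 " is a
// section boundary, per the multi-section rule.
func splitSections(payload string) [][]string {
	var sections [][]string
	for _, line := range strings.Split(strings.TrimRight(payload, "\n"), "\n") {
		if strings.HasPrefix(line, "GCX1 ") {
			sections = append(sections, []string{line})
		} else if len(sections) > 0 {
			cur := len(sections) - 1
			sections[cur] = append(sections[cur], line)
		}
	}
	return sections
}

func main() {
	p := "GCX1 tool=get_callers.nodes fields=id\nn1\nGCX1 tool=get_callers.edges fields=from,to\nn1\tn2\n"
	s := splitSections(p)
	fmt.Println(len(s), s[1][0]) // 2 GCX1 tool=get_callers.edges fields=from,to
}
```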

Per-tool field layouts (GCX1 v1)

search_symbols

field type description
id string node ID
kind string function, method, type, interface, variable, contract
name string short name
path string file path
line int start line
sig string extracted signature, optional

Header meta: total, truncated.

get_symbol_source

field type description
id string
kind string
name string
path string
start_line int
end_line int
from_line int first line of returned source (may precede start_line by context_lines)
sig string
etag string content hash for if_none_match caching
source string full source text, tab/newline-escaped

Exactly one row.

batch_symbols

field type
id string
kind string
name string
path string
start_line int
end_line int
sig string
source string
error string

find_usages

field type description
from string caller symbol ID
to string called symbol ID (the query subject)
edge_kind string calls, references, implements, ...
origin string tier: lsp_resolved, lsp_dispatch, ast_resolved, ast_inferred, text_matched
confidence float 0..1
from_name string caller short name
from_path string caller file path
from_line int caller start line

get_file_summary

field type
id string
kind string
name string
line int
sig string

Header meta: total_nodes, total_edges, truncated, etag.

get_callers / get_call_chain / get_dependencies / get_dependents / find_implementations

Two sections: <tool>.nodes then <tool>.edges.

  • .nodes fields: id, kind, name, path, line.
  • .edges fields: from, to, kind, origin, confidence, label.

get_editing_context

Four sections. Fields:

  • .target: id, kind, name, path, start_line, end_line, sig, etag. One row.
  • .callers: id, kind, name, path, line.
  • .dependencies: same as .callers.
  • .tests: path.

smart_context

Two sections: .task (one row, field task) and .symbols with fields id, kind, name, path, line, score, reason.

analyze

Kind-polymorphic header tag (analyze.dead_code, analyze.hotspots, analyze.cycles, analyze.<other>):

  • analyze.dead_code: id, kind, name, path, line, reason.
  • analyze.hotspots: id, name, path, line, fan_in, fan_out, cross_cut, score.
  • analyze.cycles: size, severity, nodes (comma-separated).
  • Anything else falls through to the generic fallback encoder.

contracts

  • contracts.list: id, type, method, path, service, providers, consumers (comma-separated lists).
  • contracts.orphans (only when action=check): contract_id, side, repo, symbol.

Workspace-aware MCP shapes

GCX1 v1 also defines three protocol-level shapes that travel alongside tool responses: a tool-definitions registry section, a tool-request envelope, and an error envelope. Every MCP tool definition carries an explicit scope, so the legality of an inbound call can be decided by combining that scope with the request's repo parameter. All three shapes are first-class GCX1 sections and must round-trip byte-identically across gcx-go and gcx-ts.

tool_definitions

Section for the per-tool scope registry. Layout:

GCX1 tool=tool_definitions fields=name,scope
<name>\t<scope>\n
...
  • name is the MCP tool name (one row per tool).
  • scope is one of the three string literals repo, workspace, fan-out. Anything else is a schema error in both codecs.
  • Rows are emitted in ascending name order so the bytes are reproducible regardless of the encoder's input order.

A definition without scope (empty cell, missing column, or unknown value) is a schema error and both codecs reject it on encode and on decode.
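An encoder obeying the sorted-name and legal-scope rules might look like this — a sketch under assumed names, not the gcx-go implementation:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// encodeToolDefinitions emits the tool_definitions section with rows
// in ascending name order (so the bytes are reproducible) and rejects
// any scope outside the three legal literals, per the schema rule.
func encodeToolDefinitions(scopes map[string]string) (string, error) {
	names := make([]string, 0, len(scopes))
	for n := range scopes {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	b.WriteString("GCX1 tool=tool_definitions fields=name,scope\n")
	for _, n := range names {
		s := scopes[n]
		if s != "repo" && s != "workspace" && s != "fan-out" {
			return "", fmt.Errorf("tool %s: invalid scope %q", n, s)
		}
		fmt.Fprintf(&b, "%s\t%s\n", n, s)
	}
	return b.String(), nil
}

func main() {
	out, _ := encodeToolDefinitions(map[string]string{
		"search_symbols": "repo",
		"analyze":        "fan-out",
	})
	fmt.Print(out)
}
```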

tool_request

Envelope for one inbound MCP call. Layout:

GCX1 tool=tool_request fields=tool,scope,repo
<tool>\t<scope>\t<repo-cell>\n

Exactly one row. The repo cell is a union shape decided by scope:

scope repo cell
repo a non-empty repo name (plain string, e.g. gortex)
workspace empty string (the repo parameter is absent)
fan-out a compact JSON-array literal, e.g. ["*"] or ["gortex","gortex-cloud"]

Rationale for the cell encoding choices:

  • scope=repo → plain string. A single repo name is the most common case and never needs structure; a plain string keeps the cell tokenizer-friendly.

  • scope=workspace → empty. The repo parameter MUST NOT be present for workspace-level tools. The empty cell — already how GCX1 represents an absent column under the "fewer values than declared fields default to empty" rule — is the correct on-wire signal for that absence.

  • scope=fan-out → compact JSON array. This re-uses the generic-fallback nested-value rule already used elsewhere in GCX1 ("nested values inside a cell serialise to compact JSON"). Callers decode the cell with JSON.parse (TypeScript) or json.Unmarshal (Go) without learning a new escape format. Alternative encodings considered:

    • Comma-joined string (e.g. gortex,gortex-cloud): rejected because some namespaces legitimately contain commas (gRPC method paths, generic type parameters).
    • Repeated cells across multiple rows: rejected because the request envelope is single-row by contract; multi-row would overload the section's identity.
    • Tab-joined string: rejected because tab is the GCX1 column delimiter; any in-cell use would force an escape and break the "tabs never appear in cells" property the format relies on for fast scanning.

    Compact JSON wins on three axes simultaneously: it is unambiguous (every list value round-trips), it composes with the existing generic-fallback rule, and it stays on a single physical line.

The ["*"] sentinel — a JSON array holding the single one-character string * — is the only legal way to spell "fan out across every repo in this workspace". Omitting repo for a fan-out tool is a protocol error, surfaced as an error section with code missing_repo_list (see below).
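The scope-to-cell union can be validated with a small switch. This is a hedged sketch — validateRepoCell is an invented name, and it does not cover unknown_repo, which additionally needs the active workspace's member list:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validateRepoCell checks the repo cell of a tool_request row against
// the tool's declared scope, returning the resolved repo list or a
// GCX1 error code drawn from the spec's table.
func validateRepoCell(scope, cell string) (repos []string, errCode string) {
	switch scope {
	case "repo":
		if cell == "" {
			return nil, "missing_repo"
		}
		return []string{cell}, "" // plain string, most common case
	case "workspace":
		if cell != "" {
			return nil, "repo_not_allowed" // repo MUST NOT be present
		}
		return nil, ""
	case "fan-out":
		if cell == "" {
			return nil, "missing_repo_list"
		}
		// compact JSON-array literal, e.g. ["*"] or ["gortex","gortex-cloud"]
		if err := json.Unmarshal([]byte(cell), &repos); err != nil || len(repos) == 0 {
			return nil, "wrong_repo_shape"
		}
		return repos, ""
	}
	return nil, "wrong_repo_shape"
}

func main() {
	repos, _ := validateRepoCell("fan-out", `["gortex","gortex-cloud"]`)
	fmt.Println(repos) // [gortex gortex-cloud]
	_, code := validateRepoCell("workspace", "gortex")
	fmt.Println(code) // repo_not_allowed
}
```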

error

Envelope for protocol-level rejections returned by the server in lieu of a tool result. Layout:

GCX1 tool=error fields=code,message,detail
<code>\t<message>\t<detail>\n

Exactly one row. code MUST be non-empty; message and detail are free-form strings (escape rules apply per the standard table). The codes defined in GCX1 v1:

code when
unknown_repo a fan-out request lists a name not present in the active workspace
missing_repo_list a scope: fan-out request omits repo in workspace mode
missing_repo a scope: repo request omits repo in workspace mode
repo_not_allowed a scope: workspace request includes repo (any value)
wrong_repo_shape the repo parameter has the wrong type for the tool's declared scope

Both codecs expose these as named constants (ErrCodeUnknownRepo / ERR_CODE_UNKNOWN_REPO, etc.) so call sites do not stringly type the code value.

Conformance

The fixtures under gcx-ts/test/golden/scope_*.gcx cover one fixture per scope kind (repo, workspace, fan-out with ["*"], fan-out with a named subset) plus the two named protocol-error shapes. The Go-side gcx-go parity test (scope_golden_test.go) re-encodes the same logical inputs and asserts byte-for-byte equality against the committed fixtures. Any drift between gcx-go and gcx-ts MUST fail that test before any other CI step.

Generic fallback

Any tool without a hand-tuned encoder routes through the generic fallback. The fallback inspects the canonical JSON shape:

Input shape Output
{} object one section, one row, fields = sorted keys
[] array of objects one section, one row per element, fields = union of keys (sorted)
[] array of scalars one section, field value, one row per element
scalar one section, field value, one row

Nested values (arrays / objects) inside a cell serialise to compact JSON so the cell stays on a single physical line. Decoders may re-hydrate by JSON.parse on such cells.

Versioning

  • The literal header prefix GCX1 is stable for the lifetime of version 1.
  • A decoder that sees a different prefix (e.g., GCX2) must treat the payload as unknown and MAY fall back to JSON by re-issuing the MCP call without format: "gcx".
  • Field layouts for declared tools are frozen within GCX1. Additions ship as GCX2 — renaming a tool's field set is a breaking change.
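The version gate reduces to checking the first token of the payload. A sketch (acceptsVersion is an assumed name):

```go
package main

import (
	"fmt"
	"strings"
)

// acceptsVersion reports whether a GCX1 decoder should attempt the
// payload at all. On false the caller falls back to JSON by
// re-issuing the MCP call without format: "gcx".
func acceptsVersion(payload string) bool {
	line, _, _ := strings.Cut(payload, "\n")
	tag, _, ok := strings.Cut(line, " ")
	return ok && tag == "GCX1"
}

func main() {
	fmt.Println(acceptsVersion("GCX1 tool=x fields=a\n")) // true
	fmt.Println(acceptsVersion("GCX2 tool=x fields=a\n")) // false
}
```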

Rationale

  • Tab delimiter (not comma): symbol names routinely contain commas ((int, string)) and parentheses. Tab is rare in source and absent from identifiers. Escape pressure stays low.
  • Newline-terminated rows: tokenizer-friendly and transport-transparent (no binary framing). SSE / chunked HTTP can forward one row per frame without re-parsing.
  • Minimal escape alphabet: two-byte \t / \n / \\ keeps the hot path cheap. Code payloads rarely contain raw tabs or unescaped backslashes, so escape overhead is a rounding error in practice.
  • Header-based metadata: total, truncated, etag live on the header rather than a per-row phantom column. That keeps the row schema flat and lets the encoder skip meta work when the tool doesn't care.

Reference implementations

  • Go encoder / decoder: MIT-licensed standalone module at github.com/gortexhq/gcx-go (go get github.com/gortexhq/gcx-go) — header + row + escape primitives + generic fallback. Per-tool hand-tuned encoders live in internal/mcp/gcx.go.
  • TypeScript decoder: MIT-licensed standalone package at github.com/gortexhq/gcx-ts (npm: @gortex/wire).

Benchmark

See bench/wire-format/. The harness scores bytes, tiktoken tokens, gzip bytes, and round-trip integrity across 20 representative tool responses and emits a markdown scorecard. Rerun after any change to the upstream gcx-go module or internal/mcp/gcx.go to catch regressions.