Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .changeset/link-authoring-contract.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
"@inkeep/open-knowledge": minor
---

Write-time broken-link validation in the Open Knowledge MCP. `write`/`edit` now return `brokenLinks` in the same response — outbound links that don't resolve are surfaced at write time, report-only so the write still lands and you can author a doc before its link target exists. Validation covers **every** local link, not just docs: the `./`-onto-a-content-root-path doubling footgun and missing `[[wiki]]` / markdown doc targets (`no-such-doc`), root-escaping paths from one `../` too many (`unresolvable`), and links to assets or source files (`[src](../../foo.py)`) that don't exist on disk at the resolved path (`no-such-file`). That last reason closes the gap a real codebase-wiki run hit: wrong-depth source-file links that 404 silently because the doc-only link graph never tracked them. The platform and pack skills are updated to point agents at `brokenLinks` as the primary write-time check and to clarify that a same-pass forward-reference reports as `no-such-doc` until its target lands (the `links({ kind: "dead" })` audit is the authoritative end-state check).
2 changes: 1 addition & 1 deletion docs/content/reference/core-concepts.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ The set of files the engine treats as your knowledge base is the configured cont

## Links and backlinks

Internal cross-references are written with **standard markdown links**: `[text](./relative/path.md)` or `[text](/absolute/from/content-root.md)`. Whenever document A links to document B, OpenKnowledge automatically records the inverse on B: a **backlink** from B back to A.
Internal cross-references are written with **standard markdown links**. The recommended form is **relative** — `[text](./sibling.md)`, `[text](../folder/doc.md)` — which stays portable across GitHub, Obsidian, VS Code, and published sites. A **root-absolute** form (`[text](/folder/doc.md)`, where the leading slash means the content root) is equally valid and convenient for cross-folder links. The two never mix: never glue `./` onto a content-root path, since `./folder/doc.md` written from a doc already inside `folder/` resolves to the doubled, broken `folder/folder/doc.md` — `write`/`edit` flag exactly this in their `brokenLinks` response. Whenever document A links to document B, OpenKnowledge automatically records the inverse on B: a **backlink** from B back to A.

You never write backlinks by hand. They are computed from the links you already write, and together they form the **link graph**: the network of relationships across your knowledge base.

Expand Down
286 changes: 286 additions & 0 deletions packages/app/tests/integration/link-authoring-contract.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,286 @@
import { afterAll, beforeAll, expect, test } from 'bun:test';
import { randomUUID } from 'node:crypto';
import { mkdirSync, readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';
import { HARNESS_BOOT_TIMEOUT_MS } from './harness-boot-timeout';
import { awaitFileWatcherIndexed, createTestServer, type TestServer } from './test-harness.ts';

const MCP_PROTOCOL_VERSION = '2025-06-18';

interface InitializedSession {
sessionId: string;
protocolVersion: string;
}

interface BrokenLink {
href: string;
resolvedTo: string | null;
reason: 'no-such-doc' | 'no-such-file' | 'unresolvable';
}

async function openMcpSession(port: number): Promise<InitializedSession> {
const init = await fetch(`http://127.0.0.1:${port}/mcp`, {
method: 'POST',
headers: { accept: 'application/json, text/event-stream', 'content-type': 'application/json' },
body: JSON.stringify({
jsonrpc: '2.0',
id: 1,
method: 'initialize',
params: {
protocolVersion: MCP_PROTOCOL_VERSION,
capabilities: {},
clientInfo: { name: 'Claude', version: '1.0.0' },
},
}),
});
expect(init.status).toBe(200);
const sessionId = init.headers.get('mcp-session-id');
expect(sessionId).toBeTruthy();
const initBody = (await init.json()) as { result?: { protocolVersion?: string } };
const protocolVersion = initBody.result?.protocolVersion ?? MCP_PROTOCOL_VERSION;

const initialized = await fetch(`http://127.0.0.1:${port}/mcp`, {
method: 'POST',
headers: {
accept: 'application/json, text/event-stream',
'content-type': 'application/json',
'mcp-session-id': sessionId as string,
'mcp-protocol-version': protocolVersion,
},
body: JSON.stringify({ jsonrpc: '2.0', method: 'notifications/initialized' }),
});
expect(initialized.status).toBe(202);
return { sessionId: sessionId as string, protocolVersion };
}

let nextId = 100;

async function callTool(
port: number,
session: InitializedSession,
name: string,
args: Record<string, unknown>,
cwd: string,
): Promise<{ isError?: boolean; structuredContent?: Record<string, unknown> }> {
nextId += 1;
const res = await fetch(`http://127.0.0.1:${port}/mcp`, {
method: 'POST',
headers: {
accept: 'application/json, text/event-stream',
'content-type': 'application/json',
'mcp-session-id': session.sessionId,
'mcp-protocol-version': session.protocolVersion,
},
body: JSON.stringify({
jsonrpc: '2.0',
id: nextId,
method: 'tools/call',
params: { name, arguments: { ...args, cwd } },
}),
});
expect(res.status).toBe(200);
const body = (await res.json()) as {
result?: { isError?: boolean; structuredContent?: Record<string, unknown> };
error?: unknown;
};
if (body.error) throw new Error(`tools/call error: ${JSON.stringify(body.error)}`);
return body.result ?? {};
}

function docResult(structured: Record<string, unknown> | undefined): Record<string, unknown> {
expect(structured).toBeDefined();
const doc = structured?.document as Record<string, unknown> | undefined;
expect(doc).toBeDefined();
return doc as Record<string, unknown>;
}

let server: TestServer;

beforeAll(async () => {
server = await createTestServer({ debounce: 50, maxDebounce: 200 });
}, HARNESS_BOOT_TIMEOUT_MS);

afterAll(async () => {
await server.cleanup();
});

test('write surfaces broken outbound links (doubling + escape-root + broken wiki) in the same response', async () => {
const session = await openMcpSession(server.port);
const folder = `wiki-${randomUUID().slice(0, 8)}`;
const docName = `${folder}/OVERVIEW`;
const content = [
'# Wiki Overview',
'',
`See [tasks](./${folder}/modules/tasks) for the task module.`,
'A bad [escape](../../../way-out.md) link.',
'And a [[Ghost Page]] wiki reference.',
'',
].join('\n');

const result = await callTool(
server.port,
session,
'write',
{ document: { path: docName, content, position: 'replace' } },
server.contentDir,
);
expect(result.isError ?? false).toBe(false);

const doc = docResult(result.structuredContent);
const broken = doc.brokenLinks as BrokenLink[];
expect(broken).toEqual([
{
href: `./${folder}/modules/tasks`,
resolvedTo: `${folder}/${folder}/modules/tasks`,
reason: 'no-such-doc',
},
{ href: '../../../way-out.md', resolvedTo: null, reason: 'unresolvable' },
{ href: '[[Ghost Page]]', resolvedTo: 'Ghost Page', reason: 'no-such-doc' },
]);

const stored = readFileSync(join(server.contentDir, `${docName}.md`), 'utf-8');
expect(stored).toContain(`[tasks](./${folder}/modules/tasks)`);
expect(stored).toContain('[escape](../../../way-out.md)');
expect(stored).toContain('[[Ghost Page]]');
});

test('a write whose links all resolve returns brokenLinks: [] (positive confirmation)', async () => {
const session = await openMcpSession(server.port);
const docName = `clean-${randomUUID().slice(0, 8)}`;
const content = [
'# Clean',
'',
`Back to [self](./${docName.split('/').pop()}.md).`,
'An [external](https://example.com) site and an [anchor](#clean).',
'',
].join('\n');

const result = await callTool(
server.port,
session,
'write',
{ document: { path: docName, content, position: 'replace' } },
server.contentDir,
);
expect(result.isError ?? false).toBe(false);
const doc = docResult(result.structuredContent);
expect(doc.brokenLinks).toEqual([]);
});

test('write validates links to any file on disk, not just docs (source-file depth)', async () => {
const session = await openMcpSession(server.port);
const uid = randomUUID().slice(0, 8);
const relFile = `src/probe-${uid}.py`;
mkdirSync(join(server.contentDir, 'src'), { recursive: true });
writeFileSync(join(server.contentDir, relFile), 'def probe(): ...\n');

const docName = `wiki-${uid}/modules/m`;
const content = [
'# Module',
'',
`Correct depth: [probe](../../${relFile}).`,
`Over-deep: [probe again](../../../${relFile}).`,
`Missing: [gone](../../src/missing-${uid}.py).`,
'',
].join('\n');

const result = await callTool(
server.port,
session,
'write',
{ document: { path: docName, content, position: 'replace' } },
server.contentDir,
);
expect(result.isError ?? false).toBe(false);
const broken = docResult(result.structuredContent).brokenLinks as BrokenLink[];
expect(broken).toEqual([
{ href: `../../../${relFile}`, resolvedTo: null, reason: 'unresolvable' },
{
href: `../../src/missing-${uid}.py`,
resolvedTo: `src/missing-${uid}.py`,
reason: 'no-such-file',
},
]);
});

test('edit (body find/replace) reports a broken link introduced by the edit', async () => {
const session = await openMcpSession(server.port);
const docName = `edited-${randomUUID().slice(0, 8)}`;
await callTool(
server.port,
session,
'write',
{ document: { path: docName, content: '# Edited\n\nPlaceholder.\n', position: 'replace' } },
server.contentDir,
);

const edited = await callTool(
server.port,
session,
'edit',
{
document: {
path: docName,
find: 'Placeholder.',
replace: 'See [gone](./does-not-exist.md).',
},
},
server.contentDir,
);
expect(edited.isError ?? false).toBe(false);
const doc = docResult(edited.structuredContent);
expect(doc.brokenLinks).toEqual([
{ href: './does-not-exist.md', resolvedTo: 'does-not-exist', reason: 'no-such-doc' },
]);
});

test('the HTTP /api/agent-write-md response carries brokenLinks directly', async () => {
const docName = `http-${randomUUID().slice(0, 8)}`;
const res = await fetch(`http://127.0.0.1:${server.port}/api/agent-write-md`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
docName,
position: 'replace',
markdown: 'Broken [ref](./nope.md) here.\n',
}),
});
expect(res.status).toBe(200);
const body = (await res.json()) as { brokenLinks?: BrokenLink[] };
expect(body.brokenLinks).toEqual([
{ href: './nope.md', resolvedTo: 'nope', reason: 'no-such-doc' },
]);
});

test('a link to a doc that actually exists is not flagged (admitted-set membership)', async () => {
const session = await openMcpSession(server.port);
const suffix = randomUUID().slice(0, 8);
const target = `guides-${suffix}/install`;
const sourceDoc = `guides-${suffix}/index`;

await callTool(
server.port,
session,
'write',
{ document: { path: target, content: '# Install\n\nSteps.\n', position: 'replace' } },
server.contentDir,
);
await awaitFileWatcherIndexed(server, target);

const result = await callTool(
server.port,
session,
'write',
{
document: {
path: sourceDoc,
content: `# Index\n\nSee [install](./install.md).\n`,
position: 'replace',
},
},
server.contentDir,
);
expect(result.isError ?? false).toBe(false);
const doc = docResult(result.structuredContent);
expect(doc.brokenLinks).toEqual([]);
});
12 changes: 10 additions & 2 deletions packages/app/tests/integration/mcp-move-real-roundtrip.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -135,11 +135,19 @@ describe('MCP move tool — real roundtrip against live OK server (QA-004 / QA-0
expect(result.isError).toBeUndefined();
const structured = result.structuredContent as {
ok: boolean;
renamed: Array<{ fromDocName: string; toDocName: string }>;
renamed: Array<{
fromDocName: string;
toDocName: string;
}>;
rewrittenDocs: Array<{ docName: string; rewrites: number }>;
};
expect(structured.ok).toBe(true);
expect(structured.renamed).toEqual([{ fromDocName: 'auth', toDocName: 'sso' }]);
expect(structured.renamed).toEqual([
{
fromDocName: 'auth',
toDocName: 'sso',
},
]);
const rewrittenNames = structured.rewrittenDocs.map((d) => d.docName);
expect(rewrittenNames).toContain('index');

Expand Down
6 changes: 6 additions & 0 deletions packages/core/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -554,8 +554,13 @@ export {
BacklinkEntrySchema,
type BacklinksSuccess,
BacklinksSuccessSchema,
BROKEN_LINK_REASONS,
type BranchInfoResponse,
BranchInfoResponseSchema,
type BrokenLink,
type BrokenLinkReason,
BrokenLinkSchema,
BrokenLinksSchema,
type CheckoutFailureReason,
CheckoutFailureReasonSchema,
type CheckoutRequest,
Expand Down Expand Up @@ -1083,6 +1088,7 @@ export {
type AnchorLinkTarget,
type AssetLinkTarget,
assertNeverLinkTarget,
buildAbsoluteMarkdownHref,
buildRelativeMarkdownHref,
type ClassifiedLinkTarget,
classifyMarkdownHref,
Expand Down
Loading