Summary
On /sync-gbrain, the code stage fails with Invalid source id because deriveCodeSourceId (in bin/gstack-gbrain-sync.ts) produces a slug that:
- Contains a literal "." character, which gbrain's sources add validator rejects. The error message says: "Must be 1-32 lowercase alnum chars with optional interior hyphens (e.g. wiki, yc-media)."
- Exceeds the 32-char limit for any non-trivial GitHub org/repo combination.
Memory and brain-sync stages still succeed, so the failure is non-fatal — but the per-repo code source never gets registered, which means gbrain code-def/code-refs/code-callers never work for that repo. The CLAUDE.md guidance block written by /sync-gbrain Step 4 ends up advertising tools that don't function against the cwd code corpus.
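To make the two failure conditions concrete, here is a minimal sketch of the rule as the error message words it. The regex and the isValidSourceId name are assumptions inferred from that message alone; gbrain's real validator may be implemented differently.

```typescript
// Hypothetical re-creation of the source-id rule, inferred only from the
// error text: 1-32 chars, lowercase alnum, hyphens allowed in the interior.
const SOURCE_ID_RE = /^[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?$/;

function isValidSourceId(id: string): boolean {
  return SOURCE_ID_RE.test(id);
}

console.log(isValidSourceId("wiki")); // true
console.log(isValidSourceId("yc-media")); // true
// The id from the repro: contains ".", "_", uppercase, and is over 32 chars.
console.log(isValidSourceId("gstack-code-github.com-EXAMPLE_ORG-example-repo-name")); // false
```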
Repro
Any GitHub HTTPS remote whose host/org/repo exceeds 20 chars (so gstack-code- + slug > 32) hits the length limit, and the dot in github.com triggers the rejection regardless of length. Example:
    $ git remote get-url origin
    https://github.com/EXAMPLE_ORG/example-repo-name.git
    $ /sync-gbrain
    [gbrain-sync] mode=incremental engine=unknown
    gstack-gbrain-sync (incremental):
    ERR code source registration failed: gbrain sources add gstack-code-github.com-EXAMPLE_ORG-example-repo-name failed:
    Invalid source id "gstack-code-github.com-EXAMPLE_ORG-example-repo-name". Must be 1-32 lowercase alnum chars with optional interior hyphens (e.g. "wiki", "yc-media").
    OK memory ingest pass complete
    OK brain-sync curated artifacts pushed
(.gbrain-sync-state.json records the same failure under last_stages[0].summary.)
Root cause
Two compounding issues in bin/gstack-gbrain-sync.ts:160-175:
    function deriveCodeSourceId(repoPath: string): string {
      const remote = canonicalizeRemote(originUrl());
      if (remote) {
        return `gstack-code-${remote.replace(/[\/\s]+/g, "-").replace(/-+/g, "-")}`;
      }
      // Fallback for repos without a remote.
      const base = repoPath.split("/").pop() || "repo";
      return `gstack-code-${base.toLowerCase().replace(/[^a-z0-9-]+/g, "-").replace(/-+/g, "-")}`;
    }
- The remote-path branch only replaces "/" and whitespace. canonicalizeRemote returns github.com/... with the dot intact, and the dot survives into the slug. The fallback branch uses [^a-z0-9-]+, which correctly strips dots, so the two branches disagree on what counts as a legal char.
- Neither branch enforces gbrain's 32-char limit. Even after fixing the regex, gstack-code-github-com-EXAMPLE_ORG-example-repo is 47 chars, which still fails. The doc comment at line 163 in fact shows this: it claims github.com/garrytan/gstack becomes gstack-code-github-com-garrytan-gstack, which is 38 chars and also over the limit.
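Both issues can be seen by running the remote branch's replace chain in isolation. This is a sketch, not the real call path: currentRemoteSlug is a hypothetical name, and the input string assumes canonicalizeRemote returns the host/org/repo form for the repro remote, as the error log suggests.

```typescript
// Same replace chain as the remote branch of deriveCodeSourceId, applied to
// an already-canonicalized remote (the input value is an assumption).
function currentRemoteSlug(canonicalRemote: string): string {
  return `gstack-code-${canonicalRemote.replace(/[\/\s]+/g, "-").replace(/-+/g, "-")}`;
}

const id = currentRemoteSlug("github.com/EXAMPLE_ORG/example-repo-name");
console.log(id); // gstack-code-github.com-EXAMPLE_ORG-example-repo-name
console.log(id.includes(".")); // true: the dot is never replaced
console.log(id.length); // 52, well over the 32-char limit
```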
Suggested fix
In deriveCodeSourceId:
- Use the same [^a-z0-9-]+ strip as the fallback branch in both code paths (so dots, underscores, etc. all become hyphens).
- After slugification, if gstack-code-${slug} exceeds 32 chars, truncate the slug and append a short hash of the full canonical remote (e.g. first 6 chars of a sha1) to keep IDs unique across orgs that share a repo basename. Reserve 12 chars for the gstack-code- prefix; that leaves 20 chars for ${slug-prefix}-${hash6}.
Sketch:
    import { createHash } from "node:crypto";

    function deriveCodeSourceId(repoPath: string): string {
      const remote = canonicalizeRemote(originUrl());
      // Slugify both branches identically: lowercase first (so uppercase org
      // names are kept rather than replaced), map every run of other chars to
      // a hyphen, collapse repeats, and trim leading/trailing hyphens.
      const raw = (remote || repoPath.split("/").pop() || "repo")
        .toLowerCase()
        .replace(/[^a-z0-9-]+/g, "-")
        .replace(/-+/g, "-")
        .replace(/^-|-$/g, "");
      const PREFIX = "gstack-code-";
      const MAX = 32 - PREFIX.length; // 20 chars left
      if (raw.length <= MAX) return PREFIX + raw;
      // Too long: keep a readable head plus 6 hex chars of a sha1 of the full
      // remote, so truncated IDs stay distinct.
      const hash = createHash("sha1").update(remote || repoPath).digest("hex").slice(0, 6);
      const head = raw.slice(0, MAX - 1 - hash.length); // leave room for "-${hash}"
      return PREFIX + head.replace(/-$/, "") + "-" + hash;
    }
Tests worth adding
- Long GitHub HTTPS remote → valid (≤32 chars, alnum+hyphens, no leading/trailing hyphen).
- Long GitHub SSH remote (git@github.com:org/repo.git) → matches HTTPS counterpart.
- Distinct orgs with same repo basename → distinct slugs (covered by the hash suffix).
- Empty origin (local-only repo) → falls back to basename.
- Update the doc comment example to a slug ≤32 chars so the example doesn't lie.
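The cases above could be exercised roughly as follows. This is a self-contained sketch under stated assumptions: canonicalizeRemoteStub is a hypothetical stand-in for the real canonicalizeRemote (it handles only the two URL shapes used here), deriveId takes the remote as a parameter instead of calling originUrl(), the validity regex is inferred from the error message, and EXAMPLE_LABS is an invented second org.

```typescript
import { createHash } from "node:crypto";

const PREFIX = "gstack-code-";
const MAX_ID = 32;
// Assumed validity rule, inferred from gbrain's error message.
const VALID = /^[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?$/;

// Hypothetical stand-in: returns "host/org/repo" for HTTPS and SSH URLs,
// or null when there is no usable remote.
function canonicalizeRemoteStub(url: string): string | null {
  const m = url.trim().match(/^(?:https?:\/\/|git@)([^\/:]+)[\/:](.+?)(?:\.git)?$/);
  return m ? `${m[1]}/${m[2]}` : null;
}

// Slug derivation with the suggested fix: shared strip, then truncate + hash.
function deriveId(remote: string | null, repoPath: string): string {
  const slug = (remote ?? (repoPath.split("/").pop() || "repo"))
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-")
    .replace(/-+/g, "-")
    .replace(/^-|-$/g, "");
  const budget = MAX_ID - PREFIX.length; // 20 chars for the slug
  if (slug.length <= budget) return PREFIX + slug;
  const hash = createHash("sha1").update(remote ?? repoPath).digest("hex").slice(0, 6);
  const head = slug.slice(0, budget - 1 - hash.length).replace(/-$/, "");
  return `${PREFIX}${head}-${hash}`;
}

const viaHttps = deriveId(canonicalizeRemoteStub("https://github.com/EXAMPLE_ORG/example-repo-name.git"), "/tmp/x");
const viaSsh = deriveId(canonicalizeRemoteStub("git@github.com:EXAMPLE_ORG/example-repo-name.git"), "/tmp/x");
const otherOrg = deriveId(canonicalizeRemoteStub("https://github.com/EXAMPLE_LABS/example-repo-name.git"), "/tmp/x");
const localOnly = deriveId(canonicalizeRemoteStub(""), "/home/me/My.Repo");

console.assert(VALID.test(viaHttps), "long HTTPS remote yields a valid id");
console.assert(viaSsh === viaHttps, "SSH remote matches its HTTPS counterpart");
console.assert(otherOrg !== viaHttps, "same basename, different org: distinct ids");
console.log(localOnly); // gstack-code-my-repo
```

The two org names share their first characters on purpose, so the truncated heads are identical and only the hash suffix keeps the IDs apart.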
Environment
- gstack: 1.26.4.0
- gbrain: 0.18.2
- macOS