Add Mastra Cloud MCP deployment with Gemini support #22

Merged
venikman merged 9 commits into main from feat/mastra-cloud-mcp-deploy
Apr 12, 2026

Conversation

@venikman
Owner

@venikman venikman commented Apr 12, 2026

Summary

  • Deploy FPF Memory MCP server to Mastra Cloud at https://fpf-memory.server.mastra.cloud
  • Add chat_completions API style to synthesizer for Google Gemini OpenAI-compatible endpoint
  • Move src/mastra.ts to src/mastra/index.ts (required by mastra build convention)
  • Export direct Mastra instance with serverless mcpOptions for stateless HTTP MCP
  • Add GEMINI_AI_API_KEY env var fallback in synthesizer
  • Gitignore .mastra-project.json and package-lock.json (deploy-only artifacts)

MCP Endpoints (deployed)

| Endpoint | Purpose |
| --- | --- |
| /api/mcp/fpf_memory/mcp | Streamable HTTP transport |
| /api/mcp/fpf_memory/sse | SSE transport |
| /api/mcp/fpf_memory/tools | Tool discovery (9 tools) |
| /api/mcp/v0/servers | Server listing |

Gemini Configuration

FPF_LOCAL_LLM_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai
FPF_LOCAL_LLM_MODEL=gemini-3.1-flash-lite
FPF_LOCAL_LLM_API_STYLE=chat_completions
GEMINI_AI_API_KEY=<your-key>
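
A minimal sketch of how these variables could turn into a chat_completions request, assuming the OpenAI-compatible route is the base URL plus /chat/completions with a Bearer token. The helper buildGeminiChatRequest and the ChatRequest shape are hypothetical names for illustration, not the synthesizer's actual code; the temperature of 0.1 mirrors the payload quoted later in the review.

```typescript
// Illustrative only: helper and type names are assumptions, not the PR's code.
type Env = Record<string, string | undefined>;

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: {
    model: string;
    temperature: number;
    messages: { role: "system" | "user"; content: string }[];
  };
}

function buildGeminiChatRequest(env: Env, prompt: string): ChatRequest {
  // Drop trailing slashes so the joined path has exactly one separator.
  const baseUrl = (env.FPF_LOCAL_LLM_BASE_URL ?? "").trim().replace(/\/+$/, "");
  const model = (env.FPF_LOCAL_LLM_MODEL ?? "").trim();
  if (!baseUrl || !model) {
    throw new Error("FPF_LOCAL_LLM_BASE_URL and FPF_LOCAL_LLM_MODEL must be set");
  }

  // Prefer the generic key; fall back to the Gemini-specific one.
  const apiKey = env.FPF_LOCAL_LLM_API_KEY?.trim() || env.GEMINI_AI_API_KEY?.trim();

  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }

  return {
    url: `${baseUrl}/chat/completions`,
    headers,
    body: { model, temperature: 0.1, messages: [{ role: "user", content: prompt }] },
  };
}
```

The returned object can be handed to fetch with method POST, the headers as given, and JSON.stringify(body) as the request body.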

Test plan

  • bun run check — TypeScript passes
  • bun run test — 36/36 tests pass
  • bun run build — bundles 3 entry points
  • npx mastra server deploy — deployed, health check passed
  • Dashboard shows project Ready at https://fpf-memory.server.mastra.cloud
  • Browser validation of /health, /api/mcp/v0/servers, /api/mcp/fpf_memory/tools

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added HTTP endpoint support as an alternative transport mechanism alongside existing stdio support
    • Extended LM Studio integration to support OpenAI-style chat completions API style
  • Tests

    • Added test coverage for chat completions API style functionality and health checks

Copilot AI review requested due to automatic review settings April 12, 2026 10:00
@coderabbitai

coderabbitai Bot commented Apr 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Added HTTP and SSE transport endpoints to the MCP manifest, extended LM Studio synthesizer and runtime types to support an OpenAI-style chat_completions apiStyle (payload, endpoints, parsing, health checks), added/exported a default Mastra runtime instance, updated imports, tests, and dev dependencies/gitignore.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Manifest & Transport Config: manifest.json | transport changed from "stdio" → ["stdio","http"]; added top-level http with path /api/mcp/fpf_memory/mcp and sse_path /api/mcp/fpf_memory/sse. |
| Package & Ignore: package.json, .gitignore | Added mastra ^1.5.0 to devDependencies; added .mastra-project.json and package-lock.json to .gitignore. |
| Mastra runtime surface: src/mastra/index.ts, src/server.ts, tests/mcp-server.test.ts | Adjusted import paths to parent ../...; createMastraRuntime init now includes server: { mcpOptions: { serverless: true } }; exported mastra = createMastraRuntime().mastra; server and test imports updated to ./mastra/index.js. |
| LM Studio synthesizer & types: src/mcp/tool-contracts.ts, src/runtime/lm-studio-synthesizer.ts, src/runtime/types.ts, tests/lm-studio-synthesizer.test.ts | Added chat_completions to enums/types; implemented OpenAI-style request payloads (model, messages, temperature), added ChatCompletionsPayload parsing (use choices[].message.content), updated endpoint resolution (/chat/completions, /v1/models), API key fallback to GEMINI_AI_API_KEY, and added tests for chat_completions. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant MCP_Server
  participant Synthesizer
  participant LLM_API
  Client->>MCP_Server: HTTP/SSE generation request
  MCP_Server->>Synthesizer: forward request (apiStyle: chat_completions)
  Synthesizer->>LLM_API: GET /v1/models (discover)
  LLM_API-->>Synthesizer: models list
  Synthesizer->>LLM_API: POST /v1/chat/completions (model, messages, temperature)
  LLM_API-->>Synthesizer: choices[0].message.content
  Synthesizer-->>MCP_Server: normalized generation result
  MCP_Server-->>Client: HTTP response / SSE event

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hops along the HTTP track,

messages stacked in tidy packs,
chat_completions calls the tune,
Mastra boots at morning noon —
a happy rabbit claps its paws.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main objective: deploying MCP to Mastra Cloud and adding Gemini support, which aligns with the primary changes across manifest, synthesizer, and mastra runtime configuration. |
| Description check | ✅ Passed | The PR description provides comprehensive coverage of what, why, and test validation with clear enumeration of changes and deployment endpoints, but the template's structured checklist sections are largely incomplete or absent. |

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.

Comment on lines +658 to +663
  if (apiStyle === 'chat_completions') {
    if (url.pathname === '/' || url.pathname === '') {
      return `${url.origin}/v1/models`;
    }
    return `${trimmed}/models`;
  }

🟡 buildModelsEndpoint produces invalid URL when chat_completions baseUrl includes /chat/completions path

When apiStyle is 'chat_completions' and the user-supplied baseUrl already includes the full /chat/completions path (e.g. http://localhost:1234/v1/chat/completions), buildModelsEndpoint returns http://localhost:1234/v1/chat/completions/models instead of the correct http://localhost:1234/v1/models.

The sibling function buildGenerationEndpoint correctly handles this case at src/runtime/lm-studio-synthesizer.ts:602-603 with if (trimmed.endsWith('/chat/completions')) { return trimmed; }, and the responses style has an equivalent guard in buildModelsEndpoint at line 674 (url.pathname.endsWith('/responses')), and the lmstudio_chat style handles /api/v1/chat at line 666. The new chat_completions branch (lines 658-663) is missing equivalent logic to strip /chat/completions and derive the models path. This causes health check model discovery to hit an incorrect endpoint.

Suggested change
- if (apiStyle === 'chat_completions') {
-   if (url.pathname === '/' || url.pathname === '') {
-     return `${url.origin}/v1/models`;
-   }
-   return `${trimmed}/models`;
- }
+ if (apiStyle === 'chat_completions') {
+   if (url.pathname.endsWith('/chat/completions')) {
+     return `${url.origin}${url.pathname.replace(/\/chat\/completions$/, '/models')}`;
+   }
+   if (url.pathname === '/' || url.pathname === '') {
+     return `${url.origin}/v1/models`;
+   }
+   return `${trimmed}/models`;
+ }
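
As a self-contained illustration of the normalization described in this comment, the sketch below covers only the chat_completions case. The function name chatCompletionsModelsEndpoint is hypothetical, and the sketch assumes the base URL carries no query string or fragment.

```typescript
// Hypothetical standalone version of the chat_completions branch; not the PR's code.
function chatCompletionsModelsEndpoint(baseUrl: string): string {
  const trimmed = baseUrl.trim().replace(/\/+$/, "");
  const url = new URL(trimmed);
  // URL normalizes an empty path to "/", so strip trailing slashes again.
  const pathname = url.pathname.replace(/\/+$/, "");

  // Full generation URL supplied: swap the suffix for the sibling /models path.
  if (pathname.endsWith("/chat/completions")) {
    return `${url.origin}${pathname.replace(/\/chat\/completions$/, "/models")}`;
  }
  // Bare origin: assume the conventional /v1 prefix.
  if (pathname === "") {
    return `${url.origin}/v1/models`;
  }
  // Any other base path (for example .../v1beta/openai): append /models.
  return `${trimmed}/models`;
}
```

With this shape, a base URL that already points at the generation endpoint and one that stops at the API root both resolve to the same models URL.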

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/runtime/lm-studio-synthesizer.ts (1)

295-300: ⚠️ Potential issue | 🔴 Critical

Guard the Gemini key fallback by target host.

Line 299 and Line 315 fall back to GEMINI_AI_API_KEY whenever FPF_LOCAL_LLM_API_KEY is unset. If baseUrl points at LM Studio or any other OpenAI-compatible backend, this forwards the Gemini bearer token to the wrong server. Restrict that fallback to Gemini’s OpenAI-compatible host, or require an explicit apiKey for non-Gemini targets.

🔒 Proposed fix
+function resolveApiKey(
+  explicitApiKey: string | undefined,
+  env: NodeJS.ProcessEnv,
+  baseUrl: string,
+  apiStyle: LmStudioApiStyle,
+): string | undefined {
+  const trimmedExplicit = explicitApiKey?.trim();
+  if (trimmedExplicit) {
+    return trimmedExplicit;
+  }
+
+  const configuredKey = env.FPF_LOCAL_LLM_API_KEY?.trim();
+  if (configuredKey) {
+    return configuredKey;
+  }
+
+  const url = safeUrl(baseUrl);
+  if (
+    apiStyle === 'chat_completions' &&
+    url?.hostname === 'generativelanguage.googleapis.com'
+  ) {
+    return env.GEMINI_AI_API_KEY?.trim();
+  }
+
+  return undefined;
+}
+
 export function createSynthesizerFromEnv(
   env: NodeJS.ProcessEnv = process.env,
   fetchImpl?: typeof fetch,
 ): LocalAnswerSynthesizer | undefined {
@@
   return new LmStudioSynthesizer({
     baseUrl,
     model,
     apiStyle,
-    apiKey: env.FPF_LOCAL_LLM_API_KEY?.trim() || env.GEMINI_AI_API_KEY?.trim(),
+    apiKey: resolveApiKey(undefined, env, baseUrl, apiStyle),
     timeoutMs: Number.isFinite(timeoutMs) ? timeoutMs : DEFAULT_LM_STUDIO_TIMEOUT_MS,
     fetchImpl,
     env,
   });
 }
@@
-  const apiKey = options.apiKey?.trim() || env.FPF_LOCAL_LLM_API_KEY?.trim() || env.GEMINI_AI_API_KEY?.trim();
+  const apiKey = resolveApiKey(options.apiKey, env, baseUrl, apiStyle);

Also applies to: 309-315
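
To make the effect of the host guard concrete, here is a standalone sketch adapted from the proposed fix. It assumes a plain record in place of NodeJS.ProcessEnv and a local hostnameOf helper in place of the file's safeUrl; both are illustrative substitutions, and the apiStyle check is omitted for brevity.

```typescript
// Sketch only: simplified from the proposed fix, with hypothetical helper names.
type Env = Record<string, string | undefined>;

const GEMINI_HOST = "generativelanguage.googleapis.com";

// Returns the hostname, or undefined when the base URL cannot be parsed.
function hostnameOf(baseUrl: string): string | undefined {
  try {
    return new URL(baseUrl.trim()).hostname;
  } catch {
    return undefined;
  }
}

function resolveApiKey(
  explicitApiKey: string | undefined,
  env: Env,
  baseUrl: string,
): string | undefined {
  // An explicit option or the generic env key always wins.
  const configured = explicitApiKey?.trim() || env.FPF_LOCAL_LLM_API_KEY?.trim();
  if (configured) {
    return configured;
  }
  // The Gemini key is released only to Gemini's own host.
  if (hostnameOf(baseUrl) === GEMINI_HOST) {
    return env.GEMINI_AI_API_KEY?.trim() || undefined;
  }
  return undefined;
}
```

Under this guard, a localhost LM Studio target with only GEMINI_AI_API_KEY set receives no Authorization token, while the Gemini endpoint still picks the key up automatically.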

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/runtime/lm-studio-synthesizer.ts` around lines 295 - 300, The code is
currently falling back to env.GEMINI_AI_API_KEY whenever
env.FPF_LOCAL_LLM_API_KEY is unset, which may leak a Gemini bearer token to
non-Gemini backends; update the LmStudioSynthesizer instantiation to only use
GEMINI_AI_API_KEY when the target host is Gemini (guard by inspecting baseUrl
for Gemini’s OpenAI-compatible host(s) or a gemini-specific apiStyle), otherwise
do not fall back and require an explicit apiKey (or leave undefined) and surface
a clear error; apply the same guarded fallback logic to the other occurrence in
the same file (the second LmStudioSynthesizer/config block).
🧹 Nitpick comments (1)
src/mastra/index.ts (1)

30-33: Avoid eager duplicate runtime initialization.

export const mastra = createMastraRuntime().mastra initializes one runtime at import time, and createHonoMastraRuntime() initializes another one for normal startup. Reusing a shared default runtime avoids duplicated setup work and side effects.

♻️ Proposed refactor
 export function createMastraRuntime(env: NodeJS.ProcessEnv = process.env) {
@@
   return {
     logger,
     observability,
     mastra,
   };
 }
 
+const defaultRuntime = createMastraRuntime();
+
 /**
  * Direct Mastra instance export required by `mastra build` / `mastra deploy`.
  */
-export const mastra = createMastraRuntime().mastra;
+export const mastra = defaultRuntime.mastra;
 
 export async function createHonoMastraRuntime(env: NodeJS.ProcessEnv = process.env) {
-  const runtime = createMastraRuntime(env);
+  const runtime = env === process.env ? defaultRuntime : createMastraRuntime(env);
   const app = new Hono<{
     Bindings: HonoBindings;
     Variables: HonoVariables;
   }>();
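
The create-once, reuse-by-default idea behind this refactor can be illustrated without Mastra at all. In the sketch below, createRuntime, defaultRuntime, processEnv, and runtimeFor are hypothetical stand-ins, and the counter exists only to show how many times the factory runs.

```typescript
// Stand-in types and factory; none of these names come from Mastra or the PR.
type Env = Record<string, string | undefined>;

interface Runtime {
  env: Env;
}

let factoryCalls = 0;

function createRuntime(env: Env): Runtime {
  factoryCalls += 1; // count constructions to make duplication visible
  return { env };
}

// Plays the role of process.env in this sketch.
const processEnv: Env = {};

// Built once at module load, like the exported default instance.
const defaultRuntime = createRuntime(processEnv);

// Reuse the default runtime unless the caller supplies a different env object.
function runtimeFor(env: Env = processEnv): Runtime {
  return env === processEnv ? defaultRuntime : createRuntime(env);
}
```

The identity check keeps tests free to pass a custom env and get an isolated runtime, while normal startup shares the instance created at import time.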
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mastra/index.ts` around lines 30 - 33, The code eagerly constructs a
runtime by calling createMastraRuntime() at module import, causing a second
runtime when createHonoMastraRuntime() runs; change to a single shared singleton
runtime: create and export a single default runtime instance (e.g.,
defaultMastraRuntime) and export its .mastra (export const mastra =
defaultMastraRuntime.mastra), then have createHonoMastraRuntime() reuse or
accept that default runtime instead of creating a new one so
createMastraRuntime() is only invoked once.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/runtime/lm-studio-synthesizer.ts`:
- Around line 658-663: buildModelsEndpoint currently maps an explicit
chat-completions baseUrl into ".../chat/completions/models", breaking model
discovery; update the apiStyle === 'chat_completions' branch in
buildModelsEndpoint so that if url.pathname equals '/chat/completions' (with
or without trailing slash) or is empty it returns `${url.origin}/v1/models` (the
same normalization used for '/responses'), otherwise return `${trimmed}/models`;
refer to the buildModelsEndpoint function and ensure runLmStudioHealthCheck
will receive the normalized `/v1/models` URL.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cbd85547-4450-4aed-8e84-8cdeb580299b

📥 Commits

Reviewing files that changed from the base of the PR and between 6887b98 and ff8fbcb.

⛔ Files ignored due to path filters (2)
  • bun.lock is excluded by !**/*.lock
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
  • .mastra-project.json
  • manifest.json
  • package.json
  • src/mastra/index.ts
  • src/mcp/tool-contracts.ts
  • src/runtime/lm-studio-synthesizer.ts
  • src/runtime/types.ts
  • src/server.ts
  • tests/mcp-server.test.ts

Comment thread src/runtime/lm-studio-synthesizer.ts
Copilot AI left a comment

Pull request overview

This PR adds Mastra Cloud deployment support for the FPF Memory MCP server and extends the local LLM synthesizer to support Google Gemini’s OpenAI-compatible chat_completions API style, while aligning the Mastra entrypoint/export conventions required for mastra build/deploy.

Changes:

  • Add chat_completions support to the LM Studio synthesizer (request/response parsing, endpoint building, env key fallback for Gemini).
  • Update Mastra runtime wiring to support serverless/stateless HTTP MCP deployments and adjust imports to the src/mastra/index.ts entrypoint.
  • Update MCP manifest to include HTTP transport configuration and add Mastra project metadata + dependency.

Reviewed changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/mcp-server.test.ts | Updates Mastra runtime import path to new entrypoint layout. |
| src/server.ts | Updates server to import Mastra runtime from src/mastra/index.ts. |
| src/runtime/types.ts | Extends runtime type unions to include chat_completions. |
| src/runtime/lm-studio-synthesizer.ts | Implements chat_completions API style, Gemini API key fallback, and endpoint/payload parsing updates. |
| src/mcp/tool-contracts.ts | Extends Zod schema enum to allow chat_completions. |
| src/mastra/index.ts | Adds serverless MCP options and exports a direct Mastra instance for Mastra build/deploy conventions. |
| package.json | Adds mastra CLI as a dev dependency. |
| manifest.json | Adds HTTP transport + paths for MCP endpoints. |
| bun.lock | Lockfile updates for the new mastra dependency tree. |
| .mastra-project.json | Adds Mastra Cloud project metadata for deployment. |

Comment on lines +658 to +662
  if (apiStyle === 'chat_completions') {
    if (url.pathname === '/' || url.pathname === '') {
      return `${url.origin}/v1/models`;
    }
    return `${trimmed}/models`;
Copilot AI Apr 12, 2026

buildGenerationEndpoint explicitly supports passing a baseUrl that already ends with /chat/completions, but buildModelsEndpoint doesn’t mirror that behavior. With apiStyle === 'chat_completions' and baseUrl set to a full generation endpoint, this will produce an invalid models URL like .../chat/completions/models. Consider stripping /chat/completions (similar to the /responses handling) when computing the models endpoint.

Comment on lines +561 to +565
  if (apiStyle === 'chat_completions') {
    return {
      model,
      temperature: 0.1,
      messages: [
Copilot AI Apr 12, 2026

New chat_completions behavior (request payload shape, endpoint construction, and response text extraction from choices[].message.content) is not covered by existing tests. Since this file already has dedicated tests, please add cases validating chat_completions for both endpoint building and parsing/extraction to prevent regressions across providers (e.g., Gemini OpenAI-compatible endpoints).
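
As a rough sketch of the extraction behavior such tests would pin down, the following assumes a minimal response shape with optional choices, message, and content fields. The interface and function names here are illustrative and may not match the definitions in lm-studio-synthesizer.ts.

```typescript
// Assumed minimal shape of an OpenAI-style chat completions response.
interface ChatCompletionsResponse {
  choices?: { message?: { content?: string | null } }[];
}

// Returns the first non-empty choices[].message.content, trimmed, or undefined.
function extractChatCompletionsText(payload: ChatCompletionsResponse): string | undefined {
  for (const choice of payload.choices ?? []) {
    const text = choice.message?.content?.trim();
    if (text) {
      return text;
    }
  }
  return undefined;
}
```

A test for the real implementation could feed it the same kinds of payloads: a normal single-choice response, an empty choices array, and a choice whose content is null or whitespace.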

Comment thread .mastra-project.json Outdated
Comment on lines +2 to +5
  "projectId": "52e5f16d-40b5-4f27-9896-5537b883c4b4",
  "projectName": "fpf-memory",
  "projectSlug": "fpf-memory",
  "organizationId": "org_O1KNSZHGZ4FXT065PGP9BSJEXT"
Copilot AI Apr 12, 2026

This file hard-codes a specific Mastra Cloud projectId / organizationId. If this is environment- or account-specific (similar to other deployment tool metadata files), committing it can unintentionally couple the repo to one cloud project and may expose internal identifiers. Consider documenting why it must be in-repo or ignoring it and configuring these values via deployment-time configuration instead.

Suggested change
-   "projectId": "52e5f16d-40b5-4f27-9896-5537b883c4b4",
-   "projectName": "fpf-memory",
-   "projectSlug": "fpf-memory",
-   "organizationId": "org_O1KNSZHGZ4FXT065PGP9BSJEXT"
+   "projectId": "REPLACE_WITH_MASTRA_PROJECT_ID",
+   "projectName": "fpf-memory",
+   "projectSlug": "fpf-memory",
+   "organizationId": "REPLACE_WITH_MASTRA_ORGANIZATION_ID"


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for the chat_completions API style within the LM Studio synthesizer, encompassing payload extraction, endpoint generation, and expanded environment variable support for API keys. Additionally, it configures the Mastra runtime for serverless MCP deployment and updates project metadata. Feedback was provided to simplify the text extraction logic in extractGenerationText to improve readability and reduce redundancy.

Comment thread src/runtime/lm-studio-synthesizer.ts Outdated
Comment on lines +487 to +493
      'output' in payload ? (payload.output ?? []) : [],
    );
    if (preferredMessage) {
      return preferredMessage;
    }

-   return findAnyGenerationText(payload.output ?? []);
+   return findAnyGenerationText('output' in payload ? (payload.output ?? []) : []);
medium

The logic for extracting generation text from the output property is repeated twice using ternary operators. This can be simplified by extracting the output array once, which also improves readability and makes the type narrowing clearer for TypeScript.

  const output = 'output' in payload ? (payload.output ?? []) : [];
  const preferredMessage = findStructuredMessageText(output);
  if (preferredMessage) {
    return preferredMessage;
  }

  return findAnyGenerationText(output);

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff8fbcb574

Comment thread src/runtime/lm-studio-synthesizer.ts Outdated
      model,
      apiStyle,
-     apiKey: env.FPF_LOCAL_LLM_API_KEY?.trim(),
+     apiKey: env.FPF_LOCAL_LLM_API_KEY?.trim() || env.GEMINI_AI_API_KEY?.trim(),

P1: Limit Gemini key fallback to Gemini-targeted requests

This fallback unconditionally reuses GEMINI_AI_API_KEY whenever FPF_LOCAL_LLM_API_KEY is unset, regardless of which baseUrl is configured. In mixed environments, requests to non-Gemini OpenAI-compatible hosts will carry the Gemini bearer token, leaking that credential to unrelated endpoints and potentially causing auth failures (401/403) on providers that expect a different token. Restrict this fallback to explicit Gemini-targeted configurations (for example, compatible API style + Google endpoint) or require opt-in.

    if (url.pathname === '/' || url.pathname === '') {
      return `${url.origin}/v1/models`;
    }
    return `${trimmed}/models`;

P2: Normalize chat-completions base URL before model discovery

For chat_completions, this branch appends /models to any non-root path, so a valid full generation URL like .../v1/chat/completions becomes .../v1/chat/completions/models. runLmStudioHealthCheck then probes an invalid models endpoint and can report discovery failure even when generation is correctly configured. Handle /chat/completions similarly to the /responses rewrite so model discovery targets .../v1/models.

@venikman
Owner Author

@codex[agent] review

@venikman
Owner Author

@claude[agent] review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12d55649cd

Comment on lines +658 to +662
  if (apiStyle === 'chat_completions') {
    if (url.pathname === '/' || url.pathname === '') {
      return `${url.origin}/v1/models`;
    }
    return `${trimmed}/models`;

P2: Normalize chat-completions base URL before model lookup

When apiStyle is chat_completions, this branch always returns ${trimmed}/models for non-root paths. If users provide a base URL that already includes /chat/completions (which buildGenerationEndpoint explicitly supports), runLmStudioHealthCheck will query .../chat/completions/models instead of the provider’s models endpoint (typically .../models). In that configuration, model discovery falsely fails even though generation works, so lm-check reports a broken setup.

@Codex
Contributor

Codex AI commented Apr 12, 2026

@codex[agent] review

Handled the review feedback: chat_completions models endpoint now strips full generation URLs and added coverage for payload parsing and endpoint building in tests (5c48d72). The Mastra project file is already removed/ignored (9d640ea).

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/runtime/lm-studio-synthesizer.ts (1)

291-300: ⚠️ Potential issue | 🟠 Major

Don't reuse GEMINI_AI_API_KEY for arbitrary backends.

With this fallback, any non-empty GEMINI_AI_API_KEY gets sent to whatever FPF_LOCAL_LLM_BASE_URL points at when FPF_LOCAL_LLM_API_KEY is unset. That leaks a provider-specific secret to localhost or other OpenAI-compatible hosts. Gate the fallback on a Gemini base URL, or require the explicit local key for non-Gemini backends.

🔐 Suggested fix
+function isGeminiOpenAiBaseUrl(baseUrl: string): boolean {
+  const url = safeUrl(baseUrl.trim());
+  return url?.hostname === 'generativelanguage.googleapis.com'
+    && /\/openai(?:$|\/)/.test(url.pathname);
+}
+
+function resolveApiKey(
+  baseUrl: string,
+  env: NodeJS.ProcessEnv,
+  explicitApiKey?: string,
+): string | undefined {
+  const configuredApiKey = explicitApiKey?.trim() || env.FPF_LOCAL_LLM_API_KEY?.trim();
+  if (configuredApiKey) {
+    return configuredApiKey;
+  }
+
+  return isGeminiOpenAiBaseUrl(baseUrl) ? env.GEMINI_AI_API_KEY?.trim() : undefined;
+}
+
 export function createSynthesizerFromEnv(
   env: NodeJS.ProcessEnv = process.env,
   fetchImpl?: typeof fetch,
 ): LocalAnswerSynthesizer | undefined {
@@
   return new LmStudioSynthesizer({
     baseUrl,
     model,
     apiStyle,
-    apiKey: env.FPF_LOCAL_LLM_API_KEY?.trim() || env.GEMINI_AI_API_KEY?.trim(),
+    apiKey: resolveApiKey(baseUrl, env),
     timeoutMs: Number.isFinite(timeoutMs) ? timeoutMs : DEFAULT_LM_STUDIO_TIMEOUT_MS,
     fetchImpl,
     env,
   });
 }
@@
-  const apiKey = options.apiKey?.trim() || env.FPF_LOCAL_LLM_API_KEY?.trim() || env.GEMINI_AI_API_KEY?.trim();
+  const apiKey = resolveApiKey(baseUrl, env, options.apiKey);

Also applies to: 313-315

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/runtime/lm-studio-synthesizer.ts` around lines 291 - 300, The current
fallback that sets apiKey to env.FPF_LOCAL_LLM_API_KEY?.trim() ||
env.GEMINI_AI_API_KEY?.trim() leaks a provider-specific secret to arbitrary
LmStudioSynthesizer backends; change the logic so that GEMINI_AI_API_KEY is only
used when the target is actually Gemini (e.g., when baseUrl equals the Gemini
default or when apiStyle/normalizeLmStudioApiStyle indicates Gemini), otherwise
require FPF_LOCAL_LLM_API_KEY and leave apiKey undefined; apply the same
guarded-fallback change to the other occurrence(s) around the
LmStudioSynthesizer construction at the later block (the repeated apiKey usage
referenced in the comment).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6689008f-5572-4528-8f75-de0efdf70223

📥 Commits

Reviewing files that changed from the base of the PR and between 9d640ea and 5c48d72.

📒 Files selected for processing (2)
  • src/runtime/lm-studio-synthesizer.ts
  • tests/lm-studio-synthesizer.test.ts

venikman and others added 5 commits April 12, 2026 09:34
- Move src/mastra.ts to src/mastra/index.ts for mastra build convention
- Export direct Mastra instance for mastra deploy
- Add serverless mcpOptions for stateless MCP over HTTP
- Add chat_completions API style to synthesizer for Google Gemini
  OpenAI-compatible endpoint support
- Add GEMINI_AI_API_KEY env var fallback in synthesizer
- Update manifest.json with HTTP transport endpoints
- Add mastra CLI v1.5.0 as devDependency
- Add .mastra-project.json for Mastra Cloud project linking

Deployed to https://fpf-memory.server.mastra.cloud
MCP endpoint: /api/mcp/fpf_memory/mcp

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contains environment-specific Mastra Cloud project/org IDs that
should not be committed. Generated by `mastra server deploy`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Project uses Bun as its package manager. The package-lock.json was
a temporary workaround for a bun install resolution hang during
mastra build — no longer needed since the deploy pipeline works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: venikman <22530205+venikman@users.noreply.github.com>
Bun is the project package manager, but mastra build's bun install
hangs on zod@4 resolution in the output directory. A local
package-lock.json forces npm as fallback during deploy only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@venikman venikman force-pushed the feat/mastra-cloud-mcp-deploy branch from a0afb1a to cfe983f on April 12, 2026 13:35

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/runtime/lm-studio-synthesizer.ts`:
- Around line 299-300: The code currently falls back to GEMINI_AI_API_KEY for
apiKey regardless of target host; change the apiKey selection so
GEMINI_AI_API_KEY is used only when the chosen baseUrl clearly targets Gemini
(check the baseUrl variable or FPF_LOCAL_LLM_BASE_URL for a Gemini/Google
hostname substring or known Gemini URL pattern) and otherwise avoid using the
Gemini secret; apply the same guarded logic to the other occurrence where apiKey
is set (the second apiKey selection in this file) so both places only use
GEMINI_AI_API_KEY when baseUrl indicates a Gemini endpoint.
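A minimal sketch of the guarded fallback described above. The helper names `isGeminiBaseUrl` and `resolveApiKey` are hypothetical and are not taken from the PR, and matching on the `googleapis.com` hostname is an assumption about what "clearly targets Gemini" should mean:

```typescript
// Hypothetical helpers for illustration; not the PR's actual code.
type Env = Record<string, string | undefined>;

// Assumption: a Gemini endpoint is any host that is googleapis.com or a subdomain of it.
function isGeminiBaseUrl(baseUrl: string): boolean {
  try {
    const host = new URL(baseUrl).hostname.toLowerCase();
    return host === 'googleapis.com' || host.endsWith('.googleapis.com');
  } catch {
    // An unparsable URL is never treated as a Gemini endpoint.
    return false;
  }
}

function resolveApiKey(env: Env, baseUrl: string): string | undefined {
  // The generic key always wins when it is set and non-empty.
  const explicit = env.FPF_LOCAL_LLM_API_KEY?.trim();
  if (explicit) {
    return explicit;
  }
  // The provider-specific key is only considered for Gemini hosts.
  if (isGeminiBaseUrl(baseUrl)) {
    return env.GEMINI_AI_API_KEY?.trim() || undefined;
  }
  return undefined;
}
```

Comparing the parsed hostname rather than searching the raw string for a substring avoids sending the key to a look-alike host such as `googleapis.com.evil.example`.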

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 97a04ee4-dc3d-413f-b673-ce640faa8baa

📥 Commits

Reviewing files that changed from the base of the PR and between a0afb1a and cfe983f.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • .gitignore
  • manifest.json
  • package.json
  • src/mastra/index.ts
  • src/mcp/tool-contracts.ts
  • src/runtime/lm-studio-synthesizer.ts
  • src/runtime/types.ts
  • src/server.ts
  • tests/lm-studio-synthesizer.test.ts
  • tests/mcp-server.test.ts
✅ Files skipped from review due to trivial changes (4)
  • package.json
  • .gitignore
  • tests/mcp-server.test.ts
  • src/server.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/mastra/index.ts
  • tests/lm-studio-synthesizer.test.ts
  • src/runtime/types.ts

Comment thread src/runtime/lm-studio-synthesizer.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cfe983fe3a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mastra/index.ts
/**
* Direct Mastra instance export required by `mastra build` / `mastra deploy`.
*/
export const mastra = createMastraRuntime().mastra;


P2: Defer Mastra runtime creation until it is needed

Creating mastra at module load (createMastraRuntime()) eagerly initializes the logger and observability stack even when callers only import createHonoMastraRuntime. In this codebase those initializers call resolveLogPath, which performs mkdirSync/openSync; in read-only build/deploy environments this can throw during import before the server starts, and it also bypasses any env object a caller intended to pass later. Keeping this export lazy (or isolating it to a deploy-only entrypoint) avoids import-time side effects and environment drift.
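One way to keep the export lazy is a memoized accessor, sketched below. `createRuntime` is a hypothetical stand-in for the real factory, and `getMastra` is an illustrative name. Whether `mastra build` accepts an accessor in place of a direct instance export is not confirmed here, so a deploy-only entrypoint may still be needed:

```typescript
// Hypothetical stand-in for the real runtime factory. The counter exists only
// to make the laziness observable in this sketch.
let creations = 0;

function createRuntime(): { mastra: { name: string } } {
  creations += 1;
  return { mastra: { name: 'fpf-memory' } };
}

let cached: { name: string } | undefined;

// Builds the instance on the first call and reuses it afterwards, so simply
// importing the module triggers no logger or filesystem setup.
// In a real module this function would be exported.
function getMastra(): { name: string } {
  if (cached === undefined) {
    cached = createRuntime().mastra;
  }
  return cached;
}
```

With this shape, callers that only import other factories from the module never pay for runtime construction.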

Public surface (deployed): ask_fpf, query_fpf_spec, get_fpf_index_status
Expert surface (local only): refresh, trace, inspect, read_doc, anchor, citations

Deployed server defaults to public. Set FPF_MCP_SURFACE=full for all tools.
Stdio transport always exposes all tools for local Codex/Claude use.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 475eff4077

Comment thread src/mastra/index.ts Outdated
export function createMastraRuntime(env: NodeJS.ProcessEnv = process.env) {
  const logger = getRuntimeLogger(env);
  const observability = getRuntimeObservability(env);
  const mcpServer = env.FPF_MCP_SURFACE === 'full' ? fpfMcpServer : fpfPublicMcpServer;


P2: Keep full MCP surface as default for Hono runtime

createMastraRuntime now selects fpfPublicMcpServer unless FPF_MCP_SURFACE is explicitly set to "full", which means the default HTTP server path (createHonoMastraRuntime() from src/server.ts) exposes only the 3 public tools. In normal local/server startup where that env var is unset, HTTP clients lose core debug/retrieval tools (refresh_fpf_index, trace_fpf_path, read_fpf_doc, etc.) even though the stdio path still exposes them, creating an unexpected capability regression. Make the restricted surface opt-in instead of the default.
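An opt-in restricted surface could be resolved as in the sketch below. `resolveMcpSurface` and `McpSurface` are hypothetical names, and the trimming and case folding are assumptions rather than behavior taken from the PR:

```typescript
// Hypothetical names for illustration; not the PR's actual code.
type McpSurface = 'full' | 'public';

function resolveMcpSurface(env: Record<string, string | undefined>): McpSurface {
  // Only an explicit "public" narrows the tool set. An unset or unrecognized
  // value falls back to the full surface.
  const requested = env.FPF_MCP_SURFACE?.trim().toLowerCase();
  return requested === 'public' ? 'public' : 'full';
}
```

The trade-off is that a misspelled value on a deployed server fails open to the full surface, so a deployment that must stay restricted may prefer to reject unknown values instead.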

venikman and others added 3 commits April 12, 2026 10:32
Manifest, README, AGENTS.md, docs/mcp-interface.md, and Codex config
now distinguish public tools (deployed HTTP) from expert tools
(local stdio only).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…action

- Only use GEMINI_AI_API_KEY when baseUrl points to *.googleapis.com,
  preventing credential leakage to non-Gemini endpoints
- Extract output array once in extractGenerationText instead of
  repeating the ternary check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Flips the default so local HTTP server exposes all 9 tools.
Deployed server sets FPF_MCP_SURFACE=public to restrict to 3 tools.
Addresses review feedback about local dev losing expert tools.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 228a1a6dc9

Comment on lines +605 to +606
if (trimmed.endsWith('/chat/completions')) {
  return trimmed;


P2: Handle query strings when matching chat-completions base URLs

buildGenerationEndpoint checks trimmed.endsWith('/chat/completions') on the full URL string, so a valid base URL that includes query params (for example Azure-style .../chat/completions?api-version=...) is not recognized as already complete. The function then appends another /chat/completions (and buildModelsEndpoint similarly appends /models after the query), yielding malformed endpoints and failing both generation and lm-check in that configuration. Fresh evidence: with https://example.openai.azure.com/.../chat/completions?api-version=2024-05-01-preview, the code produces ...?api-version=.../chat/completions and ...?api-version=.../models.
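A sketch of path-aware matching using the standard `URL` class, so the query string is preserved and never mistaken for part of the path. The function names `withChatCompletionsPath` and `withModelsPath` are hypothetical, and replacing a trailing `/chat/completions` with `/models` is an assumption about the intended behavior of `buildModelsEndpoint`:

```typescript
// Hypothetical helpers for illustration; not the PR's actual code.

// Returns the generation endpoint, appending /chat/completions to the path
// only when the path does not already end with it.
function withChatCompletionsPath(baseUrl: string): string {
  const url = new URL(baseUrl.trim());
  const path = url.pathname.replace(/\/+$/, '');
  url.pathname = path.endsWith('/chat/completions') ? path : `${path}/chat/completions`;
  return url.toString();
}

// Returns the models endpoint. Assumption: a trailing /chat/completions is
// swapped for /models rather than extended.
function withModelsPath(baseUrl: string): string {
  const url = new URL(baseUrl.trim());
  const path = url.pathname.replace(/\/+$/, '').replace(/\/chat\/completions$/, '');
  url.pathname = `${path}/models`;
  return url.toString();
}
```

Because only `pathname` is modified, parameters such as `api-version` stay at the end of the resulting URL.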

@venikman venikman merged commit afd7e9d into main Apr 12, 2026
1 check passed