feat(compass-collection): Create Mock Data Generator Prompt and Batching Logic in Compass - CLOUDP-381914 #7892
Ported from mms: MockDataSchemaGenerationPrompt.buildUserPrompt()
Ported from mms: MockDataSchemaGenerationPrompt.java
 * Splits a schema into smaller chunks for processing.
 * Ported from NaturalLanguageQueryGenerator.splitSchemaIntoChunks()
 */
export function splitSchemaIntoChunks(
Ported from mms: NaturalLanguageQueryGenerator.splitSchemaIntoChunks()
 * Merges multiple chunk responses into a single response.
 * Ported from NaturalLanguageQueryGenerator.mergeChunkResponses()
 */
export function mergeChunkResponses(
Ported from mms: NaturalLanguageQueryGenerator.mergeChunkResponses()
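For readers without the diff open, here is a minimal sketch of what merging chunk responses could look like. The `FieldMapping` and `ChunkResponse` shapes are assumptions made for illustration (the actual types live in `mock-data-generator/schema.ts` and may differ), and the real function may also deduplicate or validate entries.

```typescript
// Hypothetical response shape, for illustration only.
interface FieldMapping {
  fieldPath: string;
  fakerMethod: string;
}

interface ChunkResponse {
  fields: FieldMapping[];
}

// Concatenates the field mappings of every chunk, preserving chunk order.
function mergeChunkResponses(responses: ChunkResponse[]): ChunkResponse {
  const fields: FieldMapping[] = [];
  for (const response of responses) {
    for (const field of response.fields) {
      fields.push(field);
    }
  }
  return { fields };
}

// Usage: two chunk responses collapse into one.
const mergedResponse = mergeChunkResponses([
  { fields: [{ fieldPath: 'name', fakerMethod: 'person.fullName' }] },
  { fields: [{ fieldPath: 'email', fakerMethod: 'internet.email' }] },
]);
```

Keeping the merge a plain concatenation means the output order follows the order in which the schema was split.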
Pull request overview
This PR introduces a new mock-data-generator module in compass-generative-ai to support generating mock data schema mappings (faker.js field mappings) with prompt content, tool schema definitions, and schema batching utilities, while keeping existing exports working via re-exports.
Changes:
- Added Zod tool schema/types and prompt text for mock-data-schema generation in a new `mock-data-generator/` module.
- Implemented schema batching utilities (split/merge/size validation) and added unit tests for them.
- Updated `atlas-ai-service.ts` to consume the moved schema types/shapes from the new module and re-export them for compatibility.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/compass-generative-ai/src/mock-data-generator/schema.ts | Adds Zod tool schema + moved response types with compatibility aliases. |
| packages/compass-generative-ai/src/mock-data-generator/schema-batching.ts | Adds chunking/merging/limits helpers for large schemas. |
| packages/compass-generative-ai/src/mock-data-generator/schema-batching.spec.ts | Adds unit tests for batching utilities. |
| packages/compass-generative-ai/src/mock-data-generator/prompt.ts | Adds the system prompt used to generate faker mappings. |
| packages/compass-generative-ai/src/mock-data-generator/format-schema-for-prompt.ts | Adds user-prompt formatting for schema + validation rules. |
| packages/compass-generative-ai/src/mock-data-generator/index.ts | Exposes the new module’s public surface area. |
| packages/compass-generative-ai/src/atlas-ai-service.ts | Switches mock-data schema response shape/type to import from the new module and re-exports for compatibility. |
if (fieldsPerChunk <= 0) {
  throw new Error('fieldsPerChunk must be a positive integer');
}
We can use lodash's chunk to clean this up:
return chunk(Object.entries(rawSchema), fieldsPerChunk).map(
(chunk) => Object.fromEntries(chunk)
);
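As a dependency-free sketch of the same idea, the function below splits a schema into chunks of at most `fieldsPerChunk` top-level fields. It assumes `RawSchema` can be treated as a plain map from field name to definition (the real type may be stricter), it inlines the slicing that lodash's `chunk` would otherwise do, and it tightens the guard so that non-integer values are rejected too, which the error message already implies.

```typescript
// Assumption: RawSchema is treated as a plain field-name -> definition map.
type RawSchema = Record<string, unknown>;

function splitSchemaIntoChunks(
  rawSchema: RawSchema,
  fieldsPerChunk: number
): RawSchema[] {
  // Reject zero, negatives, fractions, and NaN.
  const isPositiveInteger =
    fieldsPerChunk > 0 && Math.floor(fieldsPerChunk) === fieldsPerChunk;
  if (!isPositiveInteger) {
    throw new Error('fieldsPerChunk must be a positive integer');
  }

  const fieldNames = Object.keys(rawSchema);
  const chunks: RawSchema[] = [];
  for (let start = 0; start < fieldNames.length; start += fieldsPerChunk) {
    // Copy one window of fields into its own schema object.
    const chunk: RawSchema = {};
    for (const name of fieldNames.slice(start, start + fieldsPerChunk)) {
      chunk[name] = rawSchema[name];
    }
    chunks.push(chunk);
  }
  return chunks;
}

// Usage: three fields, two per chunk, gives two chunks.
const exampleChunks = splitSchemaIntoChunks({ a: 1, b: 2, c: 3 }, 2);
```

An empty schema yields an empty array, so callers do not need a separate special case.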
${schemaJson}

${validationRulesPhrase}
Do we want to add validation rules to the prompt example as well?
The original prompt didn't include this, so how about we keep it the same for now? We tested the original prompt for accuracy, so I don't want to introduce too many changes at once. I will implement your other suggestions, though.
  documentSchema: RawSchema,
  validationRules?: Record<string, unknown> | null
): string {
  const schemaJson = JSON.stringify(documentSchema, null, 2);
You can use the toJSString function from mongodb-query-parser to achieve this (check utils/gen-ai-prompt).
Documents in the collection are described by the following schema:

${schemaJson}
Maybe wrap this in a code fence (so that it is wrapped in backticks).
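A rough sketch of what that suggestion could look like follows. The helper name `formatSchemaSection`, the `json` info string on the fence, and the simplified `RawSchema` type are all assumptions made for illustration; the PR's actual formatter also handles validation rules, which are omitted here.

```typescript
// Assumption: RawSchema is treated as a plain field-name -> definition map.
type RawSchema = Record<string, unknown>;

// Three backticks, written as escapes so this example contains no literal fence.
const FENCE = '\u0060\u0060\u0060';

// Hypothetical helper: renders the schema part of the user prompt with the
// serialized schema wrapped in a fenced block.
function formatSchemaSection(documentSchema: RawSchema): string {
  const schemaJson = JSON.stringify(documentSchema, null, 2);
  return [
    'Documents in the collection are described by the following schema:',
    '',
    FENCE + 'json',
    schemaJson,
    FENCE,
  ].join('\n');
}

// Usage: a one-field schema.
const schemaSection = formatSchemaSection({ name: { type: 'String' } });
```

Fencing the schema gives the model an unambiguous boundary between the instructions and the data.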
Description
Checklist
Motivation and Context
Part of the broader effort to migrate all Atlas AI calls to the EDU Knowledge Server.
Types of changes