new skill: monitor-metrics#32
Closed
naitik-mixpanel wants to merge 2 commits into
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add
monitor-metricsskillWhat
Adds a new skill,
monitor-metrics, for single-metric diagnosis against themixpanel-mcpconnector (Mixpanel US). It is deliberately scoped to one metric, one diagnosis, one attribution pass — not a portfolio sweep.The skill answers three distinct questions, each as its own command:
metric-anomaly— point-in-time anomaly detection (Z-score + IQR, time-bucketed) over 7-day hourly and 30-day daily series. Flags spikes, drops, and clusters; does not test for trend drift. (2 queries)metric-drift— trend-level drift detection (mean-shift + variance-ratio) over 60-day daily and 16-week weekly windows, with a built-in outlier-contamination check so it can run standalone. Owns shape classification (step / slope / oscillating). (2 queries)metric-rca— root-cause attribution that runs on top of an existing anomaly/drift diagnosis. Fans out across five segmentation branches (component decomposition, default-property breakdowns, distinct-id outliers, cohort comparison, calendar/market context), ranks findings by concentration × deviation, and appends them to the diagnosis board. Does not run cold.Shared skill-level scaffolding handles the parts every command depends on:
Step 0 — input validation (project + metric), with
Get-Business-Contextcalled once per session to resolve project nicknames/acronyms before falling back toGet-Projects.Step 1 — metric ingestion into a normalized "metric series" object. Path A (saved Mixpanel Metric, preferred) lifts the full definition via
Get-Metric; Path B rebuilds the query body fromGet-Query-Schemafor reports / dashboard tiles / prose.Step 1.5 — project profile resolution: cheap metadata-only filter validation (
List-Properties,Get-Property-Values) and an instrumentation health check (Get-Issues) that surfaces schema/null/type-drift issues into the verdict without aborting.Step 2 / Step 3 — diagnosis payload handoff (held in conversation memory, not disk), an opt-in board prompt, and automatic RCA append to the existing board.
Output contract is enforced across commands: compact verdict-first cards (headline → confidence → next step), always-on trend charts (annotations only when something is flagged), and explicit scope limits on every card.
Files added:
SKILL.md— routing, shared Steps 0–3, output contractcommands/metric-anomaly.mdcommands/metric-drift.mdcommands/metric-rca.mdWhy
CSAs and PMs repeatedly hit the same three questions when a metric moves — is this point weird, has the baseline shifted, and where is the movement coming from — but those are statistically different tests with different customer conversations attached (an anomaly is an incident, drift is a trend, RCA is the segmentation story). Conflating them produces muddy verdicts. Separating detection from attribution, and forcing RCA to consume a prior diagnosis rather than run cold, keeps each answer clean and the cost predictable (anomaly is the cheap default at 2 queries).
It also fills a clear gap between existing skills:
analyze-reportreads what a chart shows but doesn't chase causes;weekly-pulseandgtm-customer-intelligenceoperate at portfolio/adoption scope.monitor-metricsis the single-metric statistical-detection + RCA workflow that sits between them, with explicit handoffs noted in "When not to use this skill."Type
Skill review
/review-skillpassed (no blockers or majors)Testing
Sample Test case: https://claude.ai/share/f01d3424-437f-4e49-89b9-58b1accc6abd