210 changes: 210 additions & 0 deletions crm-query/SKILL.md
@@ -0,0 +1,210 @@
---
name: crm-query
description: Filter, list, and count HubSpot CRM records from the terminal, and discover portal schema (object types, properties, pipelines, owners). Use when the user asks to filter records by criteria, count records, look up what object types or properties exist, or resolve owner names to IDs. For aggregations (GROUP BY, SUM, AVG), time series, or cross-object filters in a single call, use crm-reports. For basic lookup by ID/email/domain/name, use crm-lookup. For deal pipeline snapshots and win/loss analysis, use sales-reporting. Website traffic analytics, marketing email metrics, and LLM-based content analysis have no CLI equivalent.
triggers:
- "filter contacts"
- "filter deals"
- "filter records"
- "count contacts"
- "count deals"
- "how many contacts"
- "how many deals"
- "how many records"
- "what object types"
- "list all properties"
- "object schema"
- "what properties exist"
- "find owner"
- "list owners"
- "resolve owner"
- "contacts with no deals"
- "companies with no activity"
---

## Foundations

Read [`bulk-operations/SKILL.md`](../bulk-operations/SKILL.md) first — JSONL piping, batch-read, pagination, and the safety flow for downstream writes live there. `hubspot <command> --help` is authoritative; trust it over this file if they conflict.

For basic lookup by ID, email, domain, or name fragment, see [`crm-lookup/SKILL.md`](../crm-lookup/SKILL.md). For aggregations (GROUP BY, SUM, AVG, time series, cross-object filters), see [`crm-reports/SKILL.md`](../crm-reports/SKILL.md). For deal pipeline snapshots and win/loss analysis, see [`sales-reporting/SKILL.md`](../sales-reporting/SKILL.md).

## Resources

| File | When to use |
|---|---|
| `resources/filter-operators.md` | Full filter syntax: operators, AND/OR rules, date windows, null/token patterns. |
| `resources/aggregation-patterns.md` | jq recipes for counting, summing, averaging, and time-series grouping. |

## What the CLI cannot do

Tell the user upfront if their request falls into one of these areas. The first is handled by a different skill; the rest have no CLI equivalent:

- **Aggregations, GROUP BY, time series, cross-object filters** — use `crm-reports` instead (`hubspot reports create "<sql>"`).
- **Website traffic analytics** (`web_analytics.*`) — no CLI equivalent; direct user to HubSpot Reports in the UI.
- **Marketing email metrics** (`EXT_EMAIL_*`, campaign sends/opens/clicks) — no CLI equivalent; direct user to HubSpot Email Analytics in the UI.
- **LLM analysis of call/email/deal content** — no CLI equivalent; available in HubSpot Breeze AI.

## 1. Discover CRM schema

Properties, pipeline stages, and custom object types are **portal-specific** — always discover at runtime rather than hardcoding.

```bash
# all object types in this portal (standard + custom)
hubspot objects types

# all properties for an object type
hubspot properties list --type contacts
hubspot properties list --type deals
hubspot properties list --type companies

# search for properties by name fragment (pipe to grep)
hubspot properties list --type contacts | grep -i "lifecycle"
hubspot properties list --type deals | grep -i "close"

# details for a specific property (label, type, options/enum values)
hubspot properties get --type contacts lifecyclestage

# association types available from an object
hubspot associations types --from-type contacts
hubspot associations types --from-type deals

# deal pipelines and their stages (IDs are portal-specific)
hubspot pipelines list --type deals --format jsonl
hubspot pipelines stages --type deals --pipeline default --format jsonl

# grab a stage ID by its label
QUALIFIED=$(hubspot pipelines stages --type deals --pipeline default --format jsonl \
| jq -r 'select(.label=="Qualified To Buy") | .id')
```

## 2. Search and filter records

`search` returns ≤ 100 records per call. Paginate (see bulk-operations) before counting or aggregating — a result of exactly 100 is almost always truncated.

```bash
# exact match on enum / string
hubspot objects search --type contacts \
--filter "lifecyclestage=marketingqualifiedlead" \
--properties email,firstname,lastname,hubspot_owner_id

# multiple AND conditions in one --filter
hubspot objects search --type deals \
--filter "hs_is_closed!=true AND dealstage=qualifiedtobuy" \
--properties dealname,amount,dealstage,closedate

# OR conditions — multiple --filter flags
hubspot objects search --type contacts \
--filter "lifecyclestage=lead" \
--filter "lifecyclestage=marketingqualifiedlead" \
--properties email,firstname,lifecyclestage

# date range — ISO dates (YYYY-MM-DD) work in comparisons
TODAY=$(date +%Y-%m-%d)
LAST_30=$(date -v-30d +%Y-%m-%d 2>/dev/null || date -d '30 days ago' +%Y-%m-%d)
hubspot objects search --type contacts \
--filter "createdate>=$LAST_30 AND createdate<=$TODAY" \
--properties email,firstname,createdate

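# null / missing property: '!email' = not set, 'email' = set
# (single quotes stop interactive shells from history-expanding the "!")
hubspot objects search --type contacts --filter '!email' --properties firstname,lastname
hubspot objects search --type contacts --filter 'email' --properties firstname,lastname,email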

# token match (whole words only — "acme" matches "Acme Corp" but not "AcmeTech")
hubspot objects search --type deals --filter "dealname~acme" --properties dealname,dealstage
```

See `resources/filter-operators.md` for the full operator list and more examples.
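Because `~` matches whole tokens only, a substring match has to happen client-side once the records are on disk. A minimal sketch, assuming records carry a `.properties.dealname` string as in the examples above; `substring_match` and the sample data are hypothetical:

```bash
# Sample records standing in for paginated search output (made-up data).
DEALS_FILE=$(mktemp)
cat > "$DEALS_FILE" <<'EOF'
{"id":"1","properties":{"dealname":"Acme Corp renewal"}}
{"id":"2","properties":{"dealname":"AcmeTech expansion"}}
{"id":"3","properties":{"dealname":"Globex pilot"}}
EOF

# Hypothetical helper: case-insensitive substring match on dealname.
# Records with a null dealname are treated as an empty string and skipped.
substring_match() {
  local needle="$1" file="$2"
  jq -c --arg needle "$needle" '
    select((.properties.dealname // "")
           | ascii_downcase
           | contains($needle | ascii_downcase))
  ' "$file"
}

# Matches both "Acme Corp renewal" and "AcmeTech expansion".
substring_match "acme" "$DEALS_FILE" | jq -r '.id'
```

This only sees records that were already fetched, so paginate the candidate set first if it may exceed one page.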

## 3. Count records

```bash
# quick count (≤100 — if the number is exactly 100, paginate for the real count)
hubspot objects search --type contacts --filter "lifecyclestage=lead" --properties email \
| jq -s 'length'

# accurate count for any number of records — paginate all, count lines
bash bulk-operations/resources/pagination-loop.sh contacts /tmp/leads.jsonl email \
'--filter' 'lifecyclestage=lead'
wc -l < /tmp/leads.jsonl
```

Never report a count of 100 without first paginating — `search` silently truncates.
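One way to make that rule hard to forget is to route quick counts through a helper that flags results at the page cap. This is a sketch under stated assumptions: `count_or_warn` and `PAGE_LIMIT` are hypothetical names, and the sample file simply simulates a 100-line search result.

```bash
PAGE_LIMIT=100  # per-call cap on `search`, as stated in this document

# Hypothetical helper: count JSONL lines and flag a result at the cap.
count_or_warn() {
  local file="$1" n
  n=$(wc -l < "$file" | tr -d ' ')
  if [ "$n" -ge "$PAGE_LIMIT" ]; then
    echo "$n (at the page cap; paginate before reporting)"
  else
    echo "$n"
  fi
}

# Simulate a single-page result of exactly 100 records.
SAMPLE=$(mktemp)
seq 1 100 | sed 's/.*/{"id":"&"}/' > "$SAMPLE"
count_or_warn "$SAMPLE"
```

A count below the cap is printed as a bare number, so the output can still be used in scripts when the result is known to be complete.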

## 4. Aggregate, group, and cross-object queries

For aggregations (COUNT, SUM, AVG, GROUP BY, time series) and cross-object filters, use `crm-reports`:

```bash
# count by dimension — use crm-reports
hubspot reports create \
"SELECT dealstage, COUNT(*), SUM(amount_in_home_currency) FROM DEAL GROUP BY dealstage" \
--intent "Deals by stage"

# cross-object filter — use crm-reports
hubspot reports create \
"SELECT dealname, amount_in_home_currency FROM DEAL WHERE COMPANY.industry = 'RETAIL'" \
--intent "Deals at retail companies"
```

See [`crm-reports/SKILL.md`](../crm-reports/SKILL.md) for the full command reference and SQL syntax.

**jq fallback for simple grouping** (when you already have records paginated locally):

All properties come back as **strings** — use `tonumber` for arithmetic. See `resources/aggregation-patterns.md` for the full cookbook.

```bash
jq -rs '
group_by(.properties.dealstage)
| map({stage: .[0].properties.dealstage, count: length})
| sort_by(-.count)[]
| "\(.stage)\t\(.count)"' /tmp/deals.jsonl \
| column -t -s$'\t'
```

## 5. Records with no association

`crm-reports` handles most cross-object queries, but checking for the **absence** of an association requires a client-side pass:

```bash
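# deals with no associated contact
bash bulk-operations/resources/pagination-loop.sh deals /tmp/deals.jsonl dealname,dealstage
jq -r '.id' /tmp/deals.jsonl \
| while IFS= read -r id; do
    # empty output = no linked contacts; errors are silenced, so a failed call also counts as 0
    count=$(hubspot associations list --from "deals:$id" --to contacts 2>/dev/null | wc -l | tr -d ' ')
    if [ "$count" -eq 0 ]; then
      # pass the ID via --arg rather than splicing it into the jq program
      jq -c --arg id "$id" 'select(.id == $id)' /tmp/deals.jsonl
    fi
  done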
```

> For large datasets (>500 records), this loop spawns one process per record and will be slow. Tell the user and paginate a filtered subset first to reduce the count.

## 6. Owner lookup and resolution

```bash
# dump owners once — reuse across queries
hubspot owners list --format jsonl > /tmp/owners.jsonl

# find owner ID by email (for use in --filter)
jq -r 'select(.email == "john@acme.com") | .id' /tmp/owners.jsonl

# find owner ID by name fragment
jq -r 'select((.firstName + " " + .lastName) | ascii_downcase | contains("john smith")) | .id' \
/tmp/owners.jsonl

# resolve owner IDs to names in a result set
hubspot objects search --type deals --filter "hs_is_closed!=true" \
--properties dealname,amount,hubspot_owner_id \
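| jq -c --slurpfile owners /tmp/owners.jsonl '
    # --slurpfile loads every record in the file into the $owners array;
    # bind the owner ID first, because "." changes inside the $owners[] pipe
    .properties.hubspot_owner_id as $oid
    | . + {owner_name: (
        first($owners[] | select((.id | tostring) == $oid) | .firstName + " " + .lastName)
        // "unknown"
      )}'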
```

## Known constraints

- `search` returns ≤ 100 records per page; paginate before counting or aggregating.
- Multiple `--filter` flags are OR'd; use `AND` inside a single flag for AND conditions.
- `~` matches whole tokens only; for substring matches, filter client-side with jq (`ascii_downcase | contains(...)`).
- Date properties accept ISO format (`YYYY-MM-DD`) in `--filter` comparisons.
- `hubspot owners list` returns CRM users only; there is no `teams` object — group by `hubspot_owner_id` client-side.
- Aggregations and cross-object filters: use `crm-reports` (see "What the CLI cannot do" above).
- Website analytics, marketing email metrics, and LLM content analysis have no CLI equivalent.
151 changes: 151 additions & 0 deletions crm-query/resources/aggregation-patterns.md
@@ -0,0 +1,151 @@
# Aggregation Patterns (jq)

Run these after collecting all matching records into a JSONL file with the pagination loop. A bare `search` caps at 100 rows — always paginate before aggregating.

```bash
# general form
bash bulk-operations/resources/pagination-loop.sh <type> /tmp/out.jsonl <props> [extra_flags...]
```

All property values in JSONL output are **strings**. Use `tonumber` for arithmetic and `// 0` / `// null` to handle nulls.
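To see why the conversion matters, the sketch below runs `add` with and without `tonumber` on a few made-up records. The sample data and the `sum_amounts` helper are hypothetical; only the `.properties.amount` string shape is taken from this document.

```bash
# Made-up records: amounts arrive as strings, and one is null.
AMOUNTS_FILE=$(mktemp)
cat > "$AMOUNTS_FILE" <<'EOF'
{"id":"1","properties":{"amount":"100.50"}}
{"id":"2","properties":{"amount":"200"}}
{"id":"3","properties":{"amount":null}}
EOF

# Without tonumber, add concatenates the strings: prints "100.50200"
jq -s '[.[].properties.amount | select(. != null)] | add' "$AMOUNTS_FILE"

# With tonumber the values are summed; "// 0" covers the case where
# every amount is null and add returns null.
sum_amounts() {
  jq -s '[.[].properties.amount | select(. != null) | tonumber] | add // 0' "$1"
}

sum_amounts "$AMOUNTS_FILE"   # prints 300.5
```

The string concatenation case fails silently, which is why every recipe below converts before calling `add`.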

---

## Count by dimension

```bash
bash bulk-operations/resources/pagination-loop.sh deals /tmp/deals.jsonl dealstage

jq -rs '
group_by(.properties.dealstage)
| map({stage: .[0].properties.dealstage, count: length})
| sort_by(-.count)[]
| "\(.stage)\t\(.count)"' /tmp/deals.jsonl \
| column -t -s$'\t'
```

## Count + sum by dimension

```bash
bash bulk-operations/resources/pagination-loop.sh deals /tmp/deals.jsonl dealstage,amount \
'--filter' 'hs_is_closed!=true'

jq -rs '
group_by(.properties.dealstage)
| map({
stage: .[0].properties.dealstage,
count: length,
total: ([.[].properties.amount | select(. != null) | tonumber] | add // 0 | round)
})
| sort_by(-.total)[]
| "\(.stage)\tcount:\(.count)\tvalue:$\(.total)"' /tmp/deals.jsonl \
| column -t -s$'\t'
```

## Average by dimension

```bash
jq -rs '
group_by(.properties.dealtype)
| map(
. as $group |
{
type: ($group[0].properties.dealtype // "unknown"),
count: ($group | length),
avg: (
[ $group[].properties.amount | select(. != null) | tonumber ]
| if length > 0 then (add / length | round) else 0 end
)
}
)
| .[]
| "\(.type)\tcount:\(.count)\tavg:$\(.avg)"' /tmp/deals.jsonl \
| column -t -s$'\t'
```

## Min / max

```bash
jq -rs '
map(select(.properties.amount != null))
| {
max: (max_by(.properties.amount | tonumber)
| {name: .properties.dealname, amount: (.properties.amount | tonumber)}),
min: (min_by(.properties.amount | tonumber)
| {name: .properties.dealname, amount: (.properties.amount | tonumber)})
}' /tmp/deals.jsonl
```

## Time series — group by month

```bash
bash bulk-operations/resources/pagination-loop.sh deals /tmp/won.jsonl amount,closedate \
'--filter' 'hs_is_closed_won=true'

jq -rs '
group_by(.properties.closedate[0:7])
| map({
month: .[0].properties.closedate[0:7],
count: length,
revenue: ([.[].properties.amount | select(. != null) | tonumber] | add // 0 | round)
})
| sort_by(.month)[]
| "\(.month)\tdeals:\(.count)\trevenue:$\(.revenue)"' /tmp/won.jsonl \
| column -t -s$'\t'
```

## Multi-dimension grouping

```bash
bash bulk-operations/resources/pagination-loop.sh deals /tmp/deals.jsonl \
dealstage,hubspot_owner_id,amount '--filter' 'hs_is_closed!=true'

jq -rs '
group_by([.properties.dealstage, .properties.hubspot_owner_id])
| map({
stage: .[0].properties.dealstage,
owner: .[0].properties.hubspot_owner_id,
count: length,
total: ([.[].properties.amount | select(. != null) | tonumber] | add // 0 | round)
})
| sort_by([.stage, -.total])[]
| "\(.stage)\towner:\(.owner)\tcount:\(.count)\tvalue:$\(.total)"' /tmp/deals.jsonl \
| column -t -s$'\t'
```

## Records with / without a property set

```bash
jq -rs '
{
with_phone: (map(select(.properties.phone != null and .properties.phone != "")) | length),
without_phone: (map(select(.properties.phone == null or .properties.phone == "")) | length)
}' /tmp/contacts.jsonl
```

## Resolve owner IDs to names

Dump owners once, then join by ID:

```bash
hubspot owners list --format jsonl > /tmp/owners.jsonl

# join when reading back results: build an ID -> name map once from the owners file
# (--slurpfile loads every owner record into the $owners array)
jq -c --slurpfile owners /tmp/owners.jsonl '
  ($owners | map({(.id | tostring): (.firstName + " " + .lastName)}) | add) as $names
  | . + {owner_name: ($names[.properties.hubspot_owner_id // ""] // "unknown")}
' /tmp/deals.jsonl
```

Or inline with `--slurpfile` when piping:

```bash
hubspot objects search --type deals --filter "hs_is_closed!=true" \
--properties dealname,hubspot_owner_id \
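| jq -c --slurpfile owners /tmp/owners.jsonl '
    # bind the owner ID first, because "." changes inside the $owners[] pipe
    .properties.hubspot_owner_id as $oid
    | . + {owner_name: (
        first($owners[] | select((.id | tostring) == $oid) | .firstName + " " + .lastName)
        // "unknown"
      )}'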
```