Server Integration Reference

This document describes the payload the SDK sends to your reporting endpoint, what every field means, and how to interpret hashed values.


Transport

The SDK POSTs a JSON batch to the endpoint you configure in ObserverSDK.init(). Two delivery mechanisms are tried in order:

  1. navigator.sendBeacon — non-blocking, fires on page unload, no response is read. The server must accept this and return 204 or 200.
  2. fetch with keepalive: true — used as a fallback when sendBeacon fails or is unavailable.

Headers

| Header | Value |
| --- | --- |
| Content-Type | application/json |
| Content-Encoding | gzip (when CompressionStream is available, i.e. in modern browsers) |

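When Content-Encoding: gzip is present, the body is a gzip-compressed JSON string; when it is absent, the body is raw JSON. The server should handle both. Request bodies are not covered by HTTP's usual response-compression negotiation, and many server frameworks do not decompress them automatically, so the endpoint may need to check the header and decompress the body itself.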
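
As a minimal sketch of handling both cases on the server, the function below checks the Content-Encoding header and decompresses when needed. The function name and the plain-dict representation of headers are assumptions for illustration; adapt them to whatever your web framework provides.

```python
import gzip
import json


def decode_batch_body(headers: dict[str, str], body: bytes) -> dict:
    """Return the parsed EventBatch from a raw POST body.

    `headers` is assumed to be a plain dict of request headers; names are
    matched case-insensitively.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    encoding = normalized.get("content-encoding", "").strip().lower()
    if encoding == "gzip":
        # Body is a gzip-compressed JSON string.
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))
```

Because sendBeacon does not read the response, any decoding error is best logged on the server side rather than reported back to the client.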


Batch envelope

Every POST body is a single EventBatch object:

{
  "events": [ ...QueueEntry[] ],
  "sentAt": 1718000000000,
  "sdkVersion": "0.1.0"
}

| Field | Type | Description |
| --- | --- | --- |
| events | QueueEntry[] | One entry per distinct endpoint shape observed since the last flush |
| sentAt | number | Unix timestamp (ms) when the batch was sent |
| sdkVersion | string | SDK version string, useful for schema compatibility checks |

QueueEntry

{
  "count": 3,
  "dedupeKey": "{\"method\":\"GET\",\"domain\":\"api.example.com\",\"path\":\"/users\",\"keys\":[\"page\"]}",
  "data": { ...CapturedRequest }
}

| Field | Type | Description |
| --- | --- | --- |
| count | number | How many requests matched this endpoint shape during the flush window |
| dedupeKey | string | Stable JSON string that uniquely identifies an endpoint shape (see below) |
| data | CapturedRequest | The representative request: the first one seen for this shape in the window |
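
The envelope and its entries can be given a typed representation on the server. The sketch below is one possible mapping; the class names mirror the documented types, while the snake_case attribute names and the parse_batch helper are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class QueueEntry:
    count: int
    dedupe_key: str        # still a JSON string at this point
    data: dict[str, Any]   # the representative CapturedRequest


@dataclass
class EventBatch:
    events: list[QueueEntry]
    sent_at: int           # Unix timestamp in milliseconds
    sdk_version: str


def parse_batch(payload: dict[str, Any]) -> EventBatch:
    """Convert a decoded JSON payload into typed objects.

    Raises KeyError if a documented field is missing.
    """
    events = [
        QueueEntry(count=e["count"], dedupe_key=e["dedupeKey"], data=e["data"])
        for e in payload["events"]
    ]
    return EventBatch(
        events=events,
        sent_at=payload["sentAt"],
        sdk_version=payload["sdkVersion"],
    )
```

Summing count over batch.events gives the total number of requests the batch represents, which is usually larger than the number of entries.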

dedupeKey structure

The dedupe key is a JSON string, not an object. Parse it if you need the fields:

{ "method": "POST", "domain": "api.example.com", "path": "/graphql", "keys": [], "op": "GetUsers" }

| Field | Always present | Description |
| --- | --- | --- |
| method | yes | HTTP method, uppercased |
| domain | yes | Hostname (plus :port for non-standard ports) |
| path | yes | URL pathname |
| keys | yes | Sorted array of query parameter key names (values are not included) |
| op | GraphQL only | operationName of the GraphQL operation |

Two requests with ?page=1 and ?page=2 share the same dedupe key (keys: ["page"]). Two requests to POST /graphql with different operationName values produce different dedupe keys because op differs.
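
A short sketch of parsing the key on the server follows. The helper names and the label format are hypothetical; only the field names come from the table above.

```python
import json


def parse_dedupe_key(dedupe_key: str) -> dict:
    """Decode the dedupeKey JSON string into a dict of its fields."""
    return json.loads(dedupe_key)


def endpoint_label(dedupe_key: str) -> str:
    """Build a short human-readable label, e.g. for logs or test names."""
    fields = parse_dedupe_key(dedupe_key)
    label = f'{fields["method"]} {fields["domain"]}{fields["path"]}'
    if fields["keys"]:
        label += "?" + "&".join(fields["keys"])
    if "op" in fields:  # only present for GraphQL requests
        label += f' [{fields["op"]}]'
    return label
```

With the QueueEntry example above, endpoint_label returns "GET api.example.com/users?page". For storage and lookups, the unparsed string itself is the simplest key, since it is already stable.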


CapturedRequest

{
  "method": "POST",
  "protocol": "https",
  "domain": "api.example.com",
  "path": "/graphql",
  "queryParams": { "version": "<64-char-hash>" },
  "requestHeaders": {
    "content-type": "<hash>",
    "authorization": "<hash> <hash>"
  },
  "requestBody": { ...HashedBody },
  "responseStatus": 200,
  "responseHeaders": {
    "content-type": "<hash>",
    "cache-control": "<hash> <hash> <hash>"
  },
  "responseBody": { ...HashedBody },
  "timestamp": 1718000000000,
  "duration": 142,
  "graphqlOperationName": "GetUsers"
}

| Field | Type | Description |
| --- | --- | --- |
| method | string | HTTP method, as sent (e.g. "GET", "POST") |
| protocol | string | "http" or "https" |
| domain | string | Hostname, with port if non-standard (e.g. "localhost:3000") |
| path | string | URL pathname (e.g. "/api/v2/users") |
| queryParams | Record<string, string> | Query parameters: keys preserved, values hashed as single tokens |
| requestHeaders | Record<string, string> | All request headers: keys lowercased, values space-split hashed |
| requestBody | HashedBody or null | Parsed and hashed request body (see below), null if no body |
| responseStatus | number | HTTP status code (e.g. 200, 404) |
| responseHeaders | Record<string, string> | All response headers: keys lowercased, values space-split hashed |
| responseBody | HashedBody or null | Parsed and hashed response body, null if no body |
| timestamp | number | Unix timestamp (ms) when the request was initiated |
| duration | number | Round-trip time in milliseconds |
| graphqlOperationName | string (optional) | Present only for GraphQL requests that included a named operationName |

Hashes — what they are and what you can do with them

All sensitive values are replaced with SHA-256 hashes encoded as 64-character lowercase hex strings.

"Bearer eyJhbGci..."  →  "a3f2c8...b4e1 d9f0a1...c2e3"

What you can do with a hash:

  • Detect change — if the hash for authorization on two requests is the same, the caller used the same token. If it differs, the token changed.
  • Correlate across sessions — the same input always produces the same hash (SHA-256 is deterministic), so you can correlate across browser sessions or users without seeing the raw value.
  • Count distinct values: COUNT(DISTINCT hash) on a query param tells you how many distinct values were used for that param across all captures.

What you cannot do:

  • Reverse a hash back to the original value — this is intentional.
  • Distinguish "was this header present with a value" from "was this header absent" — absent headers are not included in the record at all.
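
The operations listed above can be sketched in a few lines. This assumes the SDK hashes the UTF-8 bytes of each value with no salt or prefix, which this document does not state explicitly; the function names are hypothetical.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """SHA-256 of the UTF-8 bytes of `value`, as 64 lowercase hex chars.

    Assumption: the SDK hashes the UTF-8 encoding with no salt or prefix.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def matches_candidate(observed_hash: str, candidate: str) -> bool:
    """Check whether a captured hash corresponds to a value you already know."""
    return observed_hash == sha256_hex(candidate)


def count_distinct(hashes: list[str]) -> int:
    """Number of distinct underlying values in a list of hashes."""
    return len(set(hashes))
```

Comparing against a known candidate is not the same as reversing a hash: it only confirms a value you already hold.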

HashedBody variants

requestBody and responseBody are one of five shapes, identified by the type field.

json — standard JSON body

{
  "type": "json",
  "data": {
    "userId": "<64-char-hash>",
    "role": "<64-char-hash>",
    "active": true,
    "score": "<64-char-hash>"
  }
}

The JSON tree structure is preserved exactly — object keys, nesting depth, array lengths are all intact. Only leaf values are hashed:

  • Strings → hashed
  • Numbers → hashed (converted to string first, then hashed)
  • Booleans → kept as-is (true / false)
  • Null → kept as-is (null)

This means you can infer the shape of the payload (which fields exist, what the nesting looks like, whether a field is an array vs an object) without seeing any actual values.
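
One way to make that inference concrete is to walk the hashed tree and replace each leaf with a placeholder describing what is known about it. This is a sketch; the placeholder strings and the function name are assumptions chosen for illustration.

```python
import re
from typing import Any

# A hashed leaf is a 64-character lowercase hex string.
HASH_RE = re.compile(r"^[0-9a-f]{64}$")


def infer_shape(node: Any) -> Any:
    """Replace leaves with type placeholders, keeping the tree structure."""
    if isinstance(node, dict):
        return {key: infer_shape(value) for key, value in node.items()}
    if isinstance(node, list):
        return [infer_shape(item) for item in node]
    if isinstance(node, bool):
        return "boolean"
    if node is None:
        return "null"
    if isinstance(node, str) and HASH_RE.match(node):
        # Strings and numbers are both hashed, so the original
        # type cannot be recovered from the hash alone.
        return "string-or-number"
    return "unknown"
```

Applied to the data object of a json body, this yields a structure with the same keys, nesting, and array lengths, which can serve as a starting point for a schema.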


graphql — GraphQL JSON body (application/json with a query field)

{
  "type": "graphql",
  "operationName": "GetUsers",
  "data": {
    "query": "<64-char-hash>",
    "operationName": "<64-char-hash>",
    "variables": {
      "limit": "<64-char-hash>",
      "offset": "<64-char-hash>"
    }
  }
}

The operationName field at the top level of the body is lifted out and preserved as a plain string — it is not hashed. This is the key field for mapping distinct GraphQL operations. Everything else (query, variables values) is hashed via the same JSON tree walk as the json type.

operationName is undefined (field absent) when the request did not include an operation name.

The top-level graphqlOperationName field on CapturedRequest mirrors this value for convenience so you do not need to dig into requestBody.

This variant is also produced when the Content-Type is application/graphql, where the body is the raw query string. In that case operationName is always absent and data is a space-split hash of the raw body string.
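
A minimal sketch of reading the operation name on the server, preferring the top-level field and falling back to the body. The function name is hypothetical, and the fallback is defensive only, since the top-level field is described as mirroring the body value.

```python
from typing import Any, Optional


def graphql_operation_name(request: dict[str, Any]) -> Optional[str]:
    """Return the plain-text operation name of a CapturedRequest, if any."""
    name = request.get("graphqlOperationName")
    if name:
        return name
    body = request.get("requestBody")
    if isinstance(body, dict) and body.get("type") == "graphql":
        # Absent for unnamed operations and for application/graphql bodies.
        return body.get("operationName")
    return None
```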


form — URL-encoded form body (application/x-www-form-urlencoded)

{
  "type": "form",
  "data": {
    "username": "<64-char-hash>",
    "password": "<64-char-hash>",
    "remember_me": "<64-char-hash>"
  }
}

Field names are preserved. Each value is hashed as a single token (no space-splitting). This tells you which form fields were submitted without revealing any values.


text — plain text or unrecognised content type

{
  "type": "text",
  "data": "<hash1> <hash2> <hash3>"
}

The raw body string is split on spaces and each token is hashed independently. The number of space-separated tokens is preserved — "Bearer eyJ..." becomes two hashes, "plain-token" becomes one. This maintains structural information (e.g. recognising a two-part Authorization scheme) without revealing values.

Used for text/plain, text/xml, and any content type not otherwise recognised.


binary — binary or multipart body

{
  "type": "binary",
  "data": null
}

No content is captured. This is returned for multipart/form-data, application/octet-stream, application/pdf, and common image/audio/video MIME types. The presence of type: "binary" still tells you a body was sent and what content type category it was.


null body

When requestBody or responseBody is null (not the binary variant — actually null), it means no body was present at all (e.g. a GET request with no body, or a 204 No Content response).
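
The difference between a null body and the binary variant can be handled with a small classifier. This is a sketch; both helper names and the "none" label are assumptions.

```python
from typing import Any, Optional


def body_kind(body: Optional[dict[str, Any]]) -> str:
    """Classify a requestBody/responseBody value.

    Returns "none" when no body was present, otherwise the variant name
    ("json", "graphql", "form", "text" or "binary").
    """
    if body is None:
        return "none"
    return body["type"]


def has_inspectable_data(body: Optional[dict[str, Any]]) -> bool:
    """True when the variant carries hashed content (not null, not binary)."""
    return body is not None and body.get("data") is not None
```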


Header value hashing

All header keys are lowercased. Header values use space-split hashing: the value is split on spaces and each token is hashed individually. The hashes are then joined back with spaces.

"Authorization: Bearer eyJhbGci..."
→ requestHeaders["authorization"] = "<hash(Bearer)> <hash(eyJhbGci...)>"

This preserves the structural shape of the value. A two-token Bearer <jwt> scheme is distinguishable from a one-token API key. The same Bearer prefix always hashes to the same value, so you can detect that a request used bearer auth without seeing the token.
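
The scheme can be reproduced on the server to recognise known prefixes. The sketch below assumes unsalted SHA-256 over UTF-8 bytes and a split on the single space character; how the SDK treats consecutive spaces is not specified here, so treat that edge case as unknown. All function names are hypothetical.

```python
import hashlib


def sha256_hex(value: str) -> str:
    # Assumption: UTF-8 input, no salt.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def space_split_hash(value: str) -> str:
    """Hash each space-separated token and rejoin with single spaces."""
    return " ".join(sha256_hex(token) for token in value.split(" "))


def uses_bearer_auth(request_headers: dict[str, str]) -> bool:
    """True if the hashed authorization header has the shape 'Bearer <token>'."""
    tokens = request_headers.get("authorization", "").split(" ")
    return len(tokens) == 2 and tokens[0] == sha256_hex("Bearer")
```

The same comparison works for any other fixed token you expect to see, such as a content type in content-type or a directive in cache-control.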


Filtered traffic

The SDK does not capture:

  • Requests to the reporting endpoint itself (anti-recursion)
  • Requests to known analytics and monitoring domains (Google Analytics, Sentry, Segment, Datadog, Hotjar, and ~30 others)

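A request that does not appear as its own entry in the batch was either filtered out, occurred outside the flush window, or was folded into an existing entry with the same dedupeKey (only the first request per shape is kept as data; the others are reflected in count).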


Example: mapping endpoint shapes for test generation

A typical server workflow:

  1. Receive a batch and store each QueueEntry keyed by dedupeKey.
  2. For REST endpoints: the combination of method + domain + path + keys is a stable identifier for an endpoint shape across sessions.
  3. For GraphQL: use method + domain + path + op to track individual operations. The graphqlOperationName on CapturedRequest is the plain-text name you can use directly as a test identifier.
  4. Use count to understand call frequency within each flush window.
  5. Use the preserved JSON tree structure in requestBody.data and responseBody.data to infer the request/response schema — field names and nesting are intact even though values are hashed.
  6. Use responseStatus to understand the expected outcome of each operation.
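
The workflow above can be sketched as a small in-memory ingest step. The store layout (shape, total_count, statuses, example) and the function name are assumptions for illustration; a real server would likely persist this in a database.

```python
import json
from typing import Any


def ingest_batch(store: dict[str, dict[str, Any]], batch: dict[str, Any]) -> None:
    """Merge one EventBatch into `store`, keyed by dedupeKey."""
    for entry in batch["events"]:
        key = entry["dedupeKey"]
        request = entry["data"]
        record = store.setdefault(key, {
            "shape": json.loads(key),   # method, domain, path, keys, op
            "total_count": 0,
            "statuses": set(),
            "example": request,         # first representative request seen
        })
        # Accumulate per-window counts across batches.
        record["total_count"] += entry["count"]
        record["statuses"].add(request["responseStatus"])
```

Calling ingest_batch once per received batch builds up one record per endpoint shape, with the example request available for schema inference and the set of observed status codes available as expected outcomes.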