|
| 1 | +# Message Queue Contract |
| 2 | + |
| 3 | +How queue payloads are defined, located, and bound to topics across domains. |
| 4 | + |
| 5 | +## Problem |
| 6 | + |
| 7 | +Queue payloads are Go structs serialized with `encoding/json` (`submitqueue/entity`, `runway/entity`), so the wire shape is defined only by Go source. Three gaps: |
| 8 | + |
| 9 | +- **No language-neutral contract.** Some payloads cross a domain boundary — a client written in another language has nothing to compile or validate against. |
| 10 | +- **No topic-to-payload binding.** `consumer.TopicRegistry` maps a `TopicKey` to a backend, topic name, and subscription — but not to the payload schema. That knowledge lives implicitly in whichever controller (de)serializes. |
| 11 | +- **No audience distinction.** Nothing separates private wiring between our own services from a published cross-domain contract. |
| 12 | + |
| 13 | +## Decisions |
| 14 | + |
| 15 | +### Contract language: Protobuf |
| 16 | + |
| 17 | +Payloads are defined as **proto3 messages**. The `.proto` is the language-neutral authority, and the Go binding is generated from it. This is the same mechanism the RPC contracts in `api/` already use, so queue payloads and RPC payloads share one toolchain, one set of shared field types, and one mental model. Generation runs through the repo's existing hermetic `protoc` Bazel rule (`tool/proto`); message-only contracts (no service) skip the gRPC/YARPC plugins. |
| 18 | + |
| 19 | +The decisive property is that the binding is *generated*, not hand-authored. There is no separate Go struct that can drift from the contract, and therefore no drift test to keep them in sync — the only binding is the generated one. |
| 20 | + |
| 21 | +### Wire format: protobuf JSON |
| 22 | + |
| 23 | +Payloads stay JSON on the wire; messages are serialized with **protobuf JSON (`protojson`)**, not binary proto. The MySQL-backed queue keeps storing self-describing JSON, exactly as before — only the source of the (de)serialization changes from a hand-written `encoding/json` struct to a generated message. |
| 24 | + |
| 25 | +protojson has its own conventions, which the contract adopts deliberately: |
| 26 | + |
| 27 | +- **Field names are the proto names (snake_case).** Messages are serialized with `UseProtoNames` set, so `queue_name` stays `queue_name` rather than becoming protojson's default `queueName`. The wire matches the declared field names. |
| 28 | +- **Enums serialize as their value name in UPPER_SNAKE** (`REBASE`, `SQUASH_REBASE`, `MERGE`) — the proto-conventional spelling. |
| 29 | +- **64-bit integers serialize as strings.** This is a protojson rule for cross-language safety, since many JSON parsers hold numbers as doubles and lose integer precision above 2^53. It matters for any millisecond timestamp or count a future payload carries. |
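| 30 | +- **Unknown fields are ignored on read** and zero-valued fields are omitted on write, which gives additive evolution for free: a consumer skips fields it does not yet know. In Go, ignoring unknown fields is opt-in: `protojson.Unmarshal` rejects them unless `UnmarshalOptions.DiscardUnknown` is set, so the deserialization helper must set it. |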
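The conventions above describe a wire that any plain JSON reader can consume. The following is a minimal sketch of that reader's view, using only the Go standard library rather than `protojson` or a generated message. The `wireView` type and the `strategy` and `enqueued_at_ms` field names are hypothetical, chosen only to exercise each convention.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wireView is a hypothetical hand-written view of a payload, used here only
// to inspect the JSON shape. The real binding would be the generated message.
type wireView struct {
	QueueName    string `json:"queue_name"`            // proto field name, snake_case
	Strategy     string `json:"strategy"`              // enum travels as its value name
	EnqueuedAtMs int64  `json:"enqueued_at_ms,string"` // 64-bit integer travels as a JSON string
}

// decodeWire parses one wire value. encoding/json skips keys that have no
// matching field, which mirrors the additive-evolution behaviour.
func decodeWire(raw string) (wireView, error) {
	var v wireView
	err := json.Unmarshal([]byte(raw), &v)
	return v, err
}

func main() {
	// "added_later" stands in for a field a newer producer has introduced.
	raw := `{"queue_name":"main","strategy":"SQUASH_REBASE","enqueued_at_ms":"1700000000000","added_later":true}`
	v, err := decodeWire(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", v)
}
```

The `,string` tag is what lets the standard library read a quoted 64-bit integer without losing precision; a client in another language would need the equivalent step.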
| 31 | + |
| 32 | +### Location: audience decides |
| 33 | + |
| 34 | +A contract is **external** when something outside its owning domain depends on it — another domain's service, or a client written in another language. It is **internal** when only the owning domain's own services use it. The test is concrete: *does anything outside this domain compile or deserialize against it?* |
| 35 | + |
| 36 | +- **External** → `api/{domain}/messagequeue/`. The `api/` prefix is the published surface; outside code is expected to depend on it. |
| 37 | +- **Internal** → `{domain}/core/messagequeue/`, beside `core/topickey` and the controllers that use it. |
| 38 | + |
| 39 | +The `.proto`, its generated `protopb`, and any Go helpers co-locate in each home; only the home differs. |
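The location rule reduces to a single decision per contract. As a minimal sketch, assuming a hypothetical helper named `contractDir` (not part of the repo) that takes the domain name and the answer to the audience test:

```go
package main

import "fmt"

// contractDir returns the home directory for a domain's queue contract.
// external is the answer to the audience test: does anything outside the
// domain compile or deserialize against the contract?
func contractDir(domain string, external bool) string {
	if external {
		// Published surface: outside code is expected to depend on it.
		return "api/" + domain + "/messagequeue/"
	}
	// Private wiring, beside core/topickey and the controllers.
	return domain + "/core/messagequeue/"
}

func main() {
	fmt.Println(contractDir("submitqueue", true))
	fmt.Println(contractDir("runway", false))
}
```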
| 40 | + |
| 41 | +### Visibility |
| 42 | + |
| 43 | +Bazel [`visibility`](https://bazel.build/concepts/visibility) enforces the split: `{domain}/core/messagequeue/` targets are scoped to the owning domain, so depending on one from outside is a build error; `api/` targets are public. No metadata keyword — the directory carries the distinction. |
| 44 | + |
| 45 | +### Topic binding: the `topics` option |
| 46 | + |
| 47 | +Each payload message declares a **`topics`** option: the canonical wire topic names that carry it (a message may list several — one payload can serve a queue pair). It is the single source of truth; the reverse index (topic → message) is derivable, not authored. This names the **wire topic**, distinct from the internal `consumer.TopicKey`. |
| 48 | + |
| 49 | +`topics` is a custom proto option, an extension of `google.protobuf.MessageOptions` defined once in `api/base/messagequeue`, so the binding is part of the language-neutral contract (any proto consumer can read it from the descriptor) rather than out-of-band Go wiring. It is the proto-native equivalent of a JSON Schema `x-topics` keyword. Go reads it back by reflection; a test asserts that the wire topic registered for every `consumer.TopicKey` is carried by exactly one message and that no option names an unknown topic. |
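The derived reverse index and the consistency test can be sketched with plain maps. This is an illustration under stated assumptions, not the repo's implementation: the `declared` map stands in for what reflection over the message descriptors would yield, and `reverseIndex` and `checkBinding` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// reverseIndex derives topic -> message from the authored message -> topics
// declarations. It fails if two messages claim the same topic.
func reverseIndex(declared map[string][]string) (map[string]string, error) {
	// Visit messages in sorted order so the reported conflict is deterministic.
	names := make([]string, 0, len(declared))
	for name := range declared {
		names = append(names, name)
	}
	sort.Strings(names)

	index := make(map[string]string)
	for _, name := range names {
		for _, topic := range declared[name] {
			if owner, taken := index[topic]; taken {
				return nil, fmt.Errorf("topic %q is declared by both %s and %s", topic, owner, name)
			}
			index[topic] = name
		}
	}
	return index, nil
}

// checkBinding verifies that every known wire topic is carried by a message
// and that no declaration names a topic outside the known set.
func checkBinding(index map[string]string, known []string) error {
	knownSet := make(map[string]bool, len(known))
	for _, topic := range known {
		knownSet[topic] = true
		if _, ok := index[topic]; !ok {
			return fmt.Errorf("topic %q is not carried by any message", topic)
		}
	}
	for topic, name := range index {
		if !knownSet[topic] {
			return fmt.Errorf("%s names unknown topic %q", name, topic)
		}
	}
	return nil
}

func main() {
	declared := map[string][]string{
		"ExampleRequest": {"example-request"},
		"ExampleResult":  {"example-check-result", "example-merge-result"},
	}
	index, err := reverseIndex(declared)
	if err != nil {
		panic(err)
	}
	known := []string{"example-request", "example-check-result", "example-merge-result"}
	if err := checkBinding(index, known); err != nil {
		panic(err)
	}
	fmt.Println(index["example-merge-result"])
}
```

Because only the message-to-topics direction is authored, a duplicate claim surfaces when the index is built instead of as two hand-maintained tables disagreeing.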
| 50 | + |
| 51 | +### Go binding: the generated `protopb` |
| 52 | + |
| 53 | +The generated message types in `protopb` are the Go binding, sitting beside `proto/` exactly as for the RPC contracts. The contract package adds only thin helpers — `protojson` (de)serialization and the `topics` reflection lookup. Shared field types (`change.Change`, `mergestrategy.MergeStrategy`) are themselves shared protos under `api/base/{change,mergestrategy}/proto`, imported by every contract that needs them. |
| 54 | + |
| 55 | +## Example |
| 56 | + |
| 57 | +Two illustrative payloads. `ExampleRequest` is carried on a single topic; |
| 58 | +`ExampleResult` shows the list form — one payload that serves a queue pair, so |
| 59 | +it repeats the `topics` option once per wire topic: |
| 60 | + |
| 61 | +```proto |
| 62 | +syntax = "proto3"; |
| 63 | +
|
| 64 | +package uber.example.messagequeue; |
| 65 | +
|
| 66 | +import "api/base/messagequeue/proto/messagequeue.proto"; |
| 67 | +
|
| 68 | +message ExampleRequest { |
| 69 | + option (uber.base.messagequeue.topics) = "example-request"; |
| 70 | +
|
| 71 | + string id = 1; // Client-owned correlation id. |
| 72 | + string mode = 2; // "fast" or "thorough". |
| 73 | + repeated string items = 3; |
| 74 | +} |
| 75 | +
|
| 76 | +message ExampleResult { |
| 77 | + // One shape, two queues: the same result is published to the check-result |
| 78 | + // topic for a dry run and the merge-result topic for a committing run. |
| 79 | + option (uber.base.messagequeue.topics) = "example-check-result"; |
| 80 | + option (uber.base.messagequeue.topics) = "example-merge-result"; |
| 81 | +
|
| 82 | + string id = 1; // Echoes the request's correlation id. |
| 83 | + bool success = 2; |
| 84 | +} |
| 85 | +``` |
| 86 | + |
| 87 | +A conforming `ExampleRequest` wire value: |
| 88 | + |
| 89 | +```json |
| 90 | +{ "id": "req-42", "mode": "fast", "items": ["a", "b"] } |
| 91 | +``` |
| 92 | + |
| 93 | +## Rejected |
| 94 | + |
| 95 | +- **JSON Schema for payloads.** A hand-authored schema duplicates the message definition and needs a drift test to stay in sync with a hand-authored Go struct. Proto generates the Go binding from the one definition, so the duplication — and the test guarding it — disappears; the contract also shares the toolchain and shared types with the RPC surface. |
| 96 | +- **Binary proto / Avro on the wire.** Binary loses the self-describing JSON the MySQL-backed queue relies on, and decoding Avro's binary encoding needs the writer's schema, typically served by a schema registry, which we do not have. protojson keeps the wire as JSON while still generating the binding. |
| 97 | +- **One unified `api/` tree with audience as metadata.** Fine for inert schemas, but co-locating the generated binding pulls internal types into the published surface; a directory boundary matching audience is more honest. |