Commit 780c1dd

feat: add shared writing guides, skills, and docs infrastructure
1 parent 40ddb5b commit 780c1dd

25 files changed

Lines changed: 4270 additions & 244 deletions


Lines changed: 220 additions & 0 deletions
@@ -0,0 +1,220 @@
1+
---
2+
name: dify-docs-api-reference
3+
description: >
4+
Use when editing, auditing, or creating OpenAPI specs for the Dify
5+
documentation repo. Applies to files in en/api-reference/. Covers
6+
formatting rules, error code conventions, example standards,
7+
and operationId patterns.
8+
---
9+
10+
# Dify API Reference Documentation
11+
12+
## Before Starting
13+
14+
Read these shared guides:
15+
16+
1. `writing-guides/style-guide.md`
17+
2. `writing-guides/formatting-guide.md`
18+
3. `writing-guides/glossary.md`
19+
20+
**When auditing**, also load:
21+
- `references/audit-checklist.md`
22+
- `references/common-mistakes.md`
23+
24+
**When tracing code paths**, also load:
25+
- `references/codebase-paths.md`
26+
27+
## Reader Persona
28+
29+
Backend developers integrating Dify apps or knowledge bases into their own applications via REST APIs. Assume strong coding ability, familiarity with HTTP, authentication patterns, and JSON. Focus on precision: exact parameter types, required vs optional, error codes, and realistic examples. Don't explain what a REST API is.
30+
31+
## Code Fidelity (Non-Negotiable)
32+
33+
**Every detail in the spec MUST be verifiable against the codebase.** When the spec disagrees with the code, the spec is wrong.
34+
35+
### What must match the code exactly
36+
37+
- **Schema constraints**: `default`, `minimum`/`maximum`, `enum` must exactly match Pydantic `Field()` arguments. E.g., `le=101` -> `"maximum": 101` -- not 100.
38+
- **Required/optional**: `Field(default=<value>)` = optional; no default (or `Field(...)` with a literal Ellipsis) = required. `FetchUserArg(required=True)` = required.
39+
- **Error codes**: Only errors the endpoint actually raises. Trace `except` -> `raise` -> exception class -> `error_code` and `code` attributes. See [Error Responses](#error-responses).
40+
- **Response status codes**: Must match the code's `return ..., <status>` value.
41+
- **Response body fields**: Must match what the code actually returns. For streaming endpoints, verify the event type `enum` against actual events yielded by the task pipeline. Each event type must have a corresponding discriminator mapping entry.
42+
- **Error messages**: Must match the exception's `description` attribute or the string passed to werkzeug exceptions.
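
The constraint rule above can be sketched with the standard library alone. This is a minimal illustration, not Dify code: `field_args_to_schema` and its lookup table are hypothetical, and they cover only the `default`, `ge`, and `le` arguments.

```python
# Hypothetical helper (not part of Dify): maps Pydantic Field() keyword
# arguments to the JSON Schema keywords the OpenAPI spec should carry.
FIELD_TO_SCHEMA = {
    "default": "default",
    "ge": "minimum",
    "le": "maximum",
}


def field_args_to_schema(**field_args):
    """Transcribe Field() arguments verbatim; never 'correct' a value."""
    schema = {}
    for arg, value in field_args.items():
        key = FIELD_TO_SCHEMA.get(arg)
        if key is not None:
            schema[key] = value
    return schema


# Field(default=20, ge=1, le=101) in code: the spec keeps 101, not 100.
limit_schema = field_args_to_schema(default=20, ge=1, le=101)
```

If `le=101` looks like an off-by-one, the value is still transcribed as-is and flagged for a decision rather than silently changed.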
43+
44+
### How to verify
45+
46+
1. Read the controller method.
47+
2. For each parameter: find the Pydantic model or `request.args.get()`, note `Field()` arguments.
48+
3. **Trace string fields beyond the controller.** The controller may declare `str`, but the service layer may cast to `StrEnum`, `Literal`, or validate against a fixed list. Common patterns: `SomeEnum(value)` cast, `Literal["a", "b"]` downstream, explicit `if field not in ALLOWED_VALUES` checks. If any exist, the spec MUST have `enum`.
49+
4. For errors: trace `except` -> `raise` -> exception class -> `error_code` and `code` in `error.py`.
50+
5. For responses: check `return` statement. **Important:** Response converters (e.g., `convert_blocking_full_response`) may flatten, restructure, or inject fields not present in the Pydantic entity. Always read the converter.
51+
6. For service calls: read the service method to see what it returns or raises.
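
Step 3 is the easiest to miss, so here is a small sketch of the pattern it describes. All names (`ResponseMode`, `controller_payload`, `service_generate`, `spec_enum_for`) are hypothetical stand-ins, not taken from the Dify codebase.

```python
from enum import Enum


class ResponseMode(str, Enum):  # hypothetical enum, for illustration only
    BLOCKING = "blocking"
    STREAMING = "streaming"


def service_generate(response_mode: str) -> str:
    # The service layer casts the string to an enum; any other value
    # raises ValueError, so the field is an enum in practice.
    return ResponseMode(response_mode).value


def controller_payload(response_mode: str) -> str:
    # The controller annotation only says `str`, which hides the constraint.
    return service_generate(response_mode)


def spec_enum_for(enum_cls) -> list:
    """Values the spec's `enum` must list for a field cast to enum_cls."""
    return [member.value for member in enum_cls]
```

Reading only `controller_payload` would suggest a free-form string. The cast in `service_generate` is what obliges the spec to declare `enum`.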
52+
53+
### Flagging suspected code bugs
54+
55+
The code is the source of truth, but the **code itself may have bugs**. When you encounter something irregular:
56+
57+
1. **Flag it explicitly** -- do NOT silently document the suspected bug.
58+
2. **Show the evidence** -- quote the exact code line and explain why it looks wrong.
59+
3. **Ask the user for a decision** -- (a) document as-is, or (b) treat as upstream bug.
60+
4. **Never auto-correct** -- do not silently write the "correct" value when the code says otherwise.
61+
62+
Common code smells: off-by-one in `le`/`ge`, response body with 204, inconsistent error handling across similar endpoints, missing error handlers that sibling endpoints have, `required` mismatches.
63+
64+
### Professional judgment
65+
66+
You are a professional API documentation writer. Beyond code fidelity:
67+
- **Challenge questionable decisions** with reasoning.
68+
- **Suggest improvements** to API consistency or developer experience (clearly separated from required fixes).
69+
- **Question conflicting instructions** -- push back with evidence.
70+
71+
## Spec Structure
72+
73+
| Spec File | App Type | `AppMode` values | Key Endpoints |
74+
|-----------|----------|------------------|---------------|
75+
| `openapi_chat.json` | Chat & Agent | `CHAT`, `AGENT_CHAT` | `/chat-messages`, conversations |
76+
| `openapi_chatflow.json` | Chatflow | `ADVANCED_CHAT` | Same as chat, mode `advanced-chat` |
77+
| `openapi_workflow.json` | Workflow | `WORKFLOW` | `/workflows/run`, workflow logs |
78+
| `openapi_completion.json` | Completion | `COMPLETION` | `/completion-messages` |
79+
| `openapi_knowledge.json` | Knowledge | *(N/A)* | datasets, documents, segments, metadata |
80+
81+
Shared endpoints (file upload, audio, feedback, app info, parameters, meta, site, end-user) appear in chat/chatflow/workflow/completion specs.
82+
83+
### App-Type Scoping (Critical)
84+
85+
The codebase uses shared controllers and Pydantic models across app modes. The **documentation separates** these into per-app-type specs. You MUST filter through the app type lens:
86+
87+
1. **Shared Pydantic models** -- only include fields relevant to this spec's app type.
88+
2. **Shared error handlers** -- only include errors triggerable under this spec's app type.
89+
3. **Internal-only fields** (e.g., `retriever_from`) -- omit from all specs.
90+
91+
**How to determine relevance:** Check the controller's `AppMode` guard. For fields: "does this field have any effect in this mode?" For errors: "can this error be triggered in this mode?" When in doubt, trace through `AppGenerateService.generate()`.
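
One way to picture the filtering is a lookup from spec file to `AppMode` values (taken from the Spec Structure table) applied to a per-field relevance table. The relevance table must come from tracing the code; the field names `shared_field` and `workflow_only_field` below are invented for illustration.

```python
# AppMode values per spec, as listed in the Spec Structure table.
SPEC_MODES = {
    "openapi_chat.json": {"CHAT", "AGENT_CHAT"},
    "openapi_chatflow.json": {"ADVANCED_CHAT"},
    "openapi_workflow.json": {"WORKFLOW"},
    "openapi_completion.json": {"COMPLETION"},
}

# Internal-only fields are omitted from every spec.
INTERNAL_ONLY = {"retriever_from"}


def fields_for_spec(spec_file: str, field_modes: dict) -> list:
    """Keep fields that have an effect in at least one mode of this spec."""
    modes = SPEC_MODES[spec_file]
    return sorted(
        name
        for name, effective_modes in field_modes.items()
        if name not in INTERNAL_ONLY and effective_modes & modes
    )


# Hypothetical relevance table built by tracing a shared Pydantic model.
traced = {
    "shared_field": {"CHAT", "AGENT_CHAT", "ADVANCED_CHAT", "WORKFLOW", "COMPLETION"},
    "workflow_only_field": {"WORKFLOW"},
    "retriever_from": {"CHAT", "AGENT_CHAT"},
}
chat_fields = fields_for_spec("openapi_chat.json", traced)
```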
92+
93+
## Style Overrides
94+
95+
These rules are specific to API reference docs and override or extend the general style guide.
96+
97+
### Endpoint Summaries
98+
99+
Must start with an imperative verb. Title Case. Standard vocabulary:
100+
101+
| Verb | Method | When to use |
102+
|------|--------|-------------|
103+
| `Get` | GET | Single JSON resource by ID or fixed path |
104+
| `List` | GET | Collection (paginated array) |
105+
| `Download` | GET | Binary file content |
106+
| `Create` | POST | New persistent resource |
107+
| `Send` | POST | Message or request dispatch |
108+
| `Submit` | POST | Feedback or input on existing resource |
109+
| `Upload` | POST | File upload |
110+
| `Convert` | POST | Format transformation |
111+
| `Run` | POST | Execute workflow or process |
112+
| `Stop` | POST | Halt running task |
113+
| `Configure` | POST | Enable/disable setting |
114+
| `Rename` | POST | Rename existing resource |
115+
| `Update` | PUT/PATCH | Modify fields on existing resource |
116+
| `Delete` | DELETE | Remove resource |
117+
118+
**Do NOT use `Retrieve`** -- use `Get` or `List`. Verb-object order: `Upload File` not `File Upload`.
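
A rough lint for these summary rules might look like the sketch below. `summary_problems` is a hypothetical helper, and its Title Case check is deliberately naive (it flags any word starting with a lowercase letter).

```python
# Standard verb vocabulary from the table above.
APPROVED_VERBS = {
    "Get", "List", "Download", "Create", "Send", "Submit", "Upload",
    "Convert", "Run", "Stop", "Configure", "Rename", "Update", "Delete",
}


def summary_problems(summary: str) -> list:
    """Return rule violations for an endpoint summary (empty list = OK)."""
    problems = []
    words = summary.split()
    if not words or words[0] not in APPROVED_VERBS:
        problems.append("must start with an approved imperative verb")
    if any(word[0].islower() for word in words):
        problems.append("must be Title Case")
    return problems
```

With this check, `Upload File` passes, while `File Upload` and `Retrieve Conversation` are both rejected for their first word.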
119+
120+
### operationId Convention
121+
122+
Pattern: `{verb}{AppType}{Resource}`
123+
124+
| App Type | Prefix | Examples |
125+
|----------|--------|---------|
126+
| Chat | `Chat` | `createChatMessage`, `listChatConversations` |
127+
| Chatflow | `Chatflow` | `createChatflowMessage` |
128+
| Workflow | `Workflow` | `runWorkflow`, `getWorkflowLogs` |
129+
| Completion | `Completion` | `createCompletionMessage` |
130+
| Knowledge | *(none)* | `createDataset`, `listDocuments` |
131+
132+
**Legacy operationIds**: Do NOT rename existing ones. Changing operationIds is a breaking change for SDK users. Apply this convention to **new endpoints only**.
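
As a quick illustration of the pattern, the sketch below composes IDs from the prefixes in the table. `operation_id` is a hypothetical helper intended for new endpoints only.

```python
# Prefixes from the operationId table; Knowledge endpoints have none.
APP_PREFIX = {
    "chat": "Chat",
    "chatflow": "Chatflow",
    "workflow": "Workflow",
    "completion": "Completion",
    "knowledge": "",
}


def operation_id(verb: str, app_type: str, resource: str) -> str:
    """Compose {verb}{AppType}{Resource}; resource may be empty."""
    return verb.lower() + APP_PREFIX[app_type] + resource


new_id = operation_id("create", "chat", "Message")
```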
133+
134+
### Descriptions
135+
136+
- **User-centric**: Write for developers, not the codebase. Name by what developers want to accomplish (e.g., "Download" not "Preview" for an endpoint serving raw file bytes).
137+
- **Terminology consistency**: All user-facing text within a spec must use consistent terms. Code-derived names (paths, fields, schema names) stay as-is. Watch for: "segment" vs "chunk" (use "chunk"), "dataset" vs "knowledge base" (use "knowledge base").
138+
- **Descriptions must add value**: `"Session identifier."` is a label, not a description. Instead: `"The \`user\` identifier provided in API requests."`.
139+
- **Nullable/conditional fields**: Explain when present or `null`.
140+
141+
### Cross-API Links
142+
143+
When a description mentions another endpoint, add a markdown link.
144+
Pattern: `/api-reference/{category}/{endpoint-name}` (kebab-case from endpoint summary).
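
A small sketch of the slug derivation, assuming the kebab-case form is simply the lowercased summary with non-alphanumeric runs replaced by hyphens. `endpoint_link` and the `files` category are illustrative, so confirm the real path against the navigation config.

```python
import re


def endpoint_link(category: str, summary: str) -> str:
    """Build a cross-API link path from a category and endpoint summary."""
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    return f"/api-reference/{category}/{slug}"


link = endpoint_link("files", "Upload File")
```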
145+
146+
## Parameters
147+
148+
- Every parameter MUST have a `description`.
149+
- **Schema constraints must exactly match code.** Transcribe `Field()` arguments verbatim.
150+
- Do NOT set an `example` field on parameters.
151+
- **Do NOT repeat schema metadata in descriptions.** If `default: 20` is in schema, don't repeat in description.
152+
- **Do NOT repeat enum values in descriptions** unless explaining when to choose each value.
153+
- Mark `required` accurately based on code.
154+
- **Request fields**: Use `enum` for known value sets. Trace string fields through service layer for hidden enums.
155+
- **Response fields**: Do NOT use `enum`. Explain values in `description` instead (Mintlify renders a duplicate "Available options" list).
156+
- **Backtick all values** in descriptions: literal values, field names, code references.
157+
- **Space between numbers and units**: `100 s`, `15 MB` -- not `100s`, `15MB`.
158+
- **Descriptions must be specific**: `"Available options."` is not acceptable.
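
Several of these parameter rules are mechanical enough to lint. The sketch below checks three of them; `parameter_problems` and the unit list in `UNIT_GLUED` are hypothetical and far from exhaustive.

```python
import re

# Matches a number glued to a unit, e.g. "100s" or "15MB" (partial unit list).
UNIT_GLUED = re.compile(r"\b\d+(?:s|ms|MB|KB|GB)\b")


def parameter_problems(param: dict) -> list:
    """Return rule violations for one OpenAPI parameter object."""
    problems = []
    description = param.get("description", "")
    if not description:
        problems.append("missing description")
    if "example" in param:
        problems.append("parameters must not carry `example`")
    if UNIT_GLUED.search(description):
        problems.append("add a space between number and unit")
    return problems


flagged = parameter_problems(
    {"name": "timeout", "in": "query", "description": "Timeout of 100s.", "example": 5}
)
```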
159+
160+
## Responses
161+
162+
### Success Responses
163+
164+
Document only 200 or 201 as the primary success response. For multiple response modes (blocking/streaming), use markdown bullets in the 200 description.
165+
166+
**Every 200/201 JSON response MUST have at least one `examples` entry** with realistic values.
167+
168+
**Binary/file responses**: Use `content` with appropriate media type and `format: binary`. Use `audio/mpeg` for audio, `application/octet-stream` for generic files. Put details in response `description`, not endpoint description.
169+
170+
**Schema description duplication**: When using `$ref`, the schema definition MUST NOT have a top-level `description`. Mintlify renders both, causing duplication.
171+
172+
### Error Responses
173+
174+
Each endpoint MUST list its specific error codes, grouped by HTTP status.
175+
176+
#### Error Tracing Rules
177+
178+
1. **`BaseHTTPException` subclasses** (in `error.py`): Use `error_code` attribute as code name, `code` attribute as HTTP status.
179+
2. **Werkzeug built-in exceptions** (`BadRequest`, `NotFound`): Use generic codes -- `bad_request`, `not_found`. NOT the service-layer exception name.
180+
3. **Custom werkzeug `HTTPException` subclasses** (NOT `BaseHTTPException`): Global handler converts class name to snake_case via regex. E.g., `FilenameNotExistsError` -> `filename_not_exists_error`.
181+
4. **Fire-and-forget methods**: If a service method never raises, do NOT invent error responses.
182+
5. **No custom error handling**: If controller only uses `@validate_app_token` with no `try/except`, the only error is 401 (global auth). Do NOT add empty error sections.
183+
6. **Error messages**: Use the exact string from the exception's `description` attribute or werkzeug string argument.
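
Rule 3 can be illustrated with a short conversion function. The regex below is an assumption that reproduces the documented example; the actual expression in the global handler should be read from `external_api.py`.

```python
import re


def class_name_to_error_code(class_name: str) -> str:
    """Convert a CamelCase exception class name to snake_case.

    Assumed regex: insert an underscore before every capital letter
    except the first, then lowercase the result.
    """
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


code = class_name_to_error_code("FilenameNotExistsError")
```

This applies only to custom werkzeug `HTTPException` subclasses. For `BaseHTTPException` subclasses the `error_code` attribute wins, whatever the class is called.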
184+
185+
#### Error Format
186+
187+
- **No `$ref` schema** in error responses -- omit `"schema"` entirely.
188+
- **Description** lists error codes as markdown bullets with backticked names.
189+
- **Examples** required for every error response (provides Mintlify dropdown selector).
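
The three format rules can be checked against a response object as sketched below. `error_response_problems` is a hypothetical helper. The example body shape (`status`, `code`, `message`) and the message text are assumptions for illustration, and real messages must be copied from the code.

```python
def error_response_problems(response: dict) -> list:
    """Return format violations for one OpenAPI error response object."""
    problems = []
    for media in response.get("content", {}).values():
        if "schema" in media:
            problems.append("omit `schema` from error responses")
        if not media.get("examples"):
            problems.append("add at least one example per error response")
    if "`" not in response.get("description", ""):
        problems.append("description should list backticked error codes")
    return problems


# Illustrative 400 response; body shape and message are assumed.
bad_request = {
    "description": "- `invalid_param` : Invalid parameter value.",
    "content": {
        "application/json": {
            "examples": {
                "invalid_param": {
                    "summary": "invalid_param",
                    "value": {
                        "status": 400,
                        "code": "invalid_param",
                        "message": "Invalid parameter value.",
                    },
                }
            }
        }
    },
}
```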
190+
191+
## Schemas
192+
193+
- **Prefer inline** over `$ref` for simple objects.
194+
- Only use `$ref` for genuinely reused or complex schemas.
195+
- **Array items must define `properties`** -- no bare `"type": "object"`.
196+
- **`required` arrays on request schemas only** -- not response schemas.
197+
- **`oneOf` options**: Each must have a `title` property. Parent schema must NOT have `description`.
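
The array-items rule lends itself to a recursive check. The following is a sketch with a hypothetical helper, `bare_object_arrays`, that walks only `properties` and `items` and does not resolve `$ref`.

```python
def bare_object_arrays(schema: dict, path: str = "$") -> list:
    """Return paths of array items declared as a bare "type": "object"."""
    found = []
    if schema.get("type") == "array":
        items = schema.get("items", {})
        if items.get("type") == "object" and "properties" not in items:
            found.append(path + ".items")
        # Recurse so nested arrays inside the items are also checked.
        found += bare_object_arrays(items, path + ".items")
    for name, sub_schema in schema.get("properties", {}).items():
        found += bare_object_arrays(sub_schema, f"{path}.{name}")
    return found


response_schema = {
    "type": "object",
    "properties": {
        "data": {"type": "array", "items": {"type": "object"}},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}
offenders = bare_object_arrays(response_schema)
```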
198+
199+
## Examples
200+
201+
- **Realistic values only.** Real-looking UUIDs, timestamps, text, metadata.
202+
- **Verify example values against code.** Enum-like fields must use values the code actually returns.
203+
- Request and response examples must correspond.
204+
- **Titles**: `"summary": "Request Example"` (single) or `"summary": "Request Example-Streaming mode"` (multiple). Error examples: use error code as summary.
205+
206+
## Tag Naming
207+
208+
- **Plural** for countable resources: `Chats`, `Files`, `Conversations`.
209+
- **Singular** for uncountable nouns or abbreviations: `Feedback`, `TTS`.
210+
- Title Case.
211+
212+
## Endpoint Ordering
213+
214+
**CRUD lifecycle**: POST create -> GET list/detail -> PUT/PATCH update -> DELETE.
215+
216+
Exception: Tags without a create operation (e.g., Conversations). GET list comes first; non-create POST placed after GETs but before PUT/DELETE.
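
The ordering can be expressed as a sort key, sketched below. `ordering_key` is hypothetical, and the endpoint names are invented. It assumes a create operation is recognizable by a summary starting with `Create`, and it places every other POST after the GETs, which extends the stated exception to tags that do have a create operation.

```python
def ordering_key(method: str, summary: str) -> tuple:
    """Sort key: POST create, GET list, GET detail, other POST, PUT/PATCH, DELETE."""
    method = method.upper()
    if method == "POST" and summary.startswith("Create"):
        return (0, 0)
    if method == "GET":
        # List endpoints come before single-resource GETs.
        return (1, 0 if summary.startswith("List") else 1)
    if method == "POST":
        return (2, 0)
    if method in ("PUT", "PATCH"):
        return (3, 0)
    if method == "DELETE":
        return (4, 0)
    return (5, 0)


# Hypothetical endpoints for a tag without a create operation.
endpoints = [
    ("DELETE", "Delete Conversation"),
    ("POST", "Rename Conversation"),
    ("GET", "Get Conversation Variables"),
    ("GET", "List Conversations"),
]
ordered = sorted(endpoints, key=lambda e: ordering_key(*e))
```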
217+
218+
## Post-Writing Verification
219+
220+
After completing the document, invoke `dify-docs-reader-test` to verify it from the reader's perspective.
Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
1+
# Audit Checklist (Per Endpoint)
2+
3+
Use this checklist when auditing or reviewing an OpenAPI spec against the Dify codebase.
4+
5+
## Pre-Audit
6+
7+
1. **Identify the spec's app type**: Determine which `AppMode` values this spec covers (see SKILL.md Spec Structure table). All subsequent checks are filtered through this app-type scope.
8+
2. **Compare routes**: Check `api/controllers/service_api/__init__.py` for registered routes, then each controller file.
9+
10+
## Per-Endpoint Checks
11+
12+
3. **App-type scoping**: For shared controllers/models, only include fields, parameters, and errors relevant to this spec's app type. Trace code paths to confirm relevance.
13+
4. **Missing endpoints**: Present in code but not in spec.
14+
5. **Ghost endpoints**: Present in spec but not in code.
15+
6. **Request schemas**: Verify params, types, required/optional, defaults, enums against every `Field()` argument.
16+
7. **Hidden enums on request string fields**: For every `string` field without `enum`, trace through the service layer to check for `StrEnum` casts, `Literal` types, or validation against fixed lists. Do NOT trust the controller-level type annotation alone.
17+
8. **Response schemas**: Verify fields, types, status codes. Check `return ..., <status>` and read response converters (they may flatten or inject fields).
18+
9. **Error codes -- completeness**: All errors the endpoint raises are documented. Trace every `except` -> `raise` chain; read service methods to confirm they actually raise.
19+
10. **Error codes -- correctness**: No phantom codes. Remove errors the controller does not raise.
20+
11. **Error code names**: Must match `error_code` attribute (custom exceptions) or werkzeug generic name (`bad_request`, `not_found`). Never use Python class names or service exception names.
21+
12. **Error messages**: Must match the `description` attribute or string argument. Copy from code verbatim.
22+
13. **Example values**: Match actual code output (e.g., enum values returned by the code). No unresolved `{message}` placeholders.
23+
14. **operationId convention**: Follows `{verb}{AppType}{Resource}` pattern for new endpoints; legacy IDs left as-is.
24+
15. **Description quality**: Useful explanations, not just field-name labels.
25+
16. **200/201 responses have examples**: Every JSON success response must have at least one `examples` entry with realistic values.
26+
17. **No schema description duplication**: `$ref` response schemas must not have a top-level `description` (Mintlify shows both).
27+
18. **Binary responses**: Use `content` with `format: binary` schema; details in response `description`.
28+
19. **`oneOf` options have `title`**: Each option object needs a descriptive `title`. Parent schema has no `description`.
29+
20. **`required` arrays on request schemas only**: Not on response schemas.
30+
21. **`enum` on request schemas only**: Not on response schemas (Mintlify renders duplicate "Available options").
31+
22. **Response array items have `properties`**: No bare `"type": "object"` -- Mintlify renders `object[]` with no expandable fields.
32+
23. **Terminology consistency**: No synonym mixing within a tag (e.g., "segment" vs "chunk").
33+
24. **Values backticked, number-unit spacing correct**: All literal values backticked; space between numbers and units.
34+
25. **Endpoint ordering**: Follows CRUD lifecycle (POST create -> GET list/detail -> PUT/PATCH update -> DELETE).
35+
26. **Tag naming**: Plural for countable resources, singular for uncountable nouns/abbreviations, Title Case.
36+
37+
## Two-Agent Workflow
38+
39+
- **Agent 1 (Fixer)**: Audits the spec and applies fixes using this checklist and all rules from SKILL.md.
40+
- **Agent 2 (Reviewer)**: Reads the fixed spec and verifies compliance. Reports remaining issues WITHOUT making edits. If issues are found, the Fixer applies the fixes; optionally re-run the Reviewer afterwards.
41+
42+
Always validate JSON (`python -m json.tool`) after fixes.
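
When the check needs to run inside a script rather than from the shell, a rough equivalent using the same standard-library parser is sketched below. `json_error` is a hypothetical helper name.

```python
import json


def json_error(text: str):
    """Return a short error location if text is not valid JSON, else None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None
```

A trailing comma left behind after deleting the last property of an object is a typical failure this catches.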
43+
44+
## Cross-Spec Propagation
45+
46+
Shared endpoints (file upload, audio, feedback, app info, parameters, meta, site, end-user) appear in chat, chatflow, completion, and workflow specs. When a fix is applied to one spec, check all sibling specs for the same issue.
47+
48+
## Verification Rigor
49+
50+
**Every reported issue must be correct.** False positives erode trust and waste time.
51+
52+
1. **Trace the full path.** Don't stop at the controller. Follow errors through global handlers (`external_api.py`) and check whether service methods actually raise.
53+
2. **Check app-type relevance.** Don't flag `workflow_id` as missing from the chat spec.
54+
3. **Verify every claim has evidence.** You must have read the actual code line. No speculative claims.
55+
4. **Self-review before reporting.** Re-read each finding and ask:
56+
- "Did I read the actual code, or am I assuming?"
57+
- "Did I check global error handlers for bare `raise ValueError/Exception`?"
58+
- "Is this field/error relevant to THIS spec's app type?"
59+
- "Am I confusing the Python class name with the `error_code` attribute?"
60+
- "Did I check the service method body, or did I assume it raises?"
61+
5. **When uncertain, investigate further.** Report fewer verified issues rather than many unverified ones. Mark uncertain items as "unverified -- needs manual check."
62+
63+
### Common False-Positive Patterns
64+
65+
- Assuming bare `ValueError` is a 500 (global handler converts to 400 `invalid_param`)
66+
- Flagging shared-model fields as missing from a spec covering a different app type
67+
- Assuming a service method raises when it's actually fire-and-forget
68+
- Using the Python exception class name instead of the `error_code` attribute
69+
- Inventing errors for code paths that don't exist under the spec's app mode
70+
- Documenting an unreachable `except` clause (controller catches exception the service never raises for this endpoint)
71+
- Adding `enum` to a genuinely dynamic/provider-specific string field (e.g., `voice`, `embedding_model_name`)
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
# Codebase Paths
2+
3+
Mapping of concepts to file paths in the Dify codebase and docs repo.
4+
5+
## Dify Docs Repo
6+
7+
| What | Path |
8+
|------|------|
9+
| OpenAPI specs | `en/api-reference/openapi_*.json` |
10+
| Navigation config | `docs.json` |
11+
12+
## Dify Codebase
13+
14+
| What | Path |
15+
|------|------|
16+
| App controllers | `api/controllers/service_api/app/` |
17+
| Dataset controllers | `api/controllers/service_api/dataset/` |
18+
| App error definitions | `api/controllers/service_api/app/error.py` |
19+
| Dataset error definitions | `api/controllers/service_api/dataset/error.py` |
20+
| Auth/rate-limit wrapper | `api/controllers/service_api/wraps.py` |
21+
| Global error handlers | `api/libs/external_api.py` |
22+
| Route registration | `api/controllers/service_api/__init__.py` |
23+
24+
### Global Error Handlers
25+
26+
The handlers in `api/libs/external_api.py` are critical for error tracing:
27+
28+
- `ValueError` -> 400 `invalid_param`
29+
- `AppInvokeQuotaExceededError` -> 429 `too_many_requests`
30+
- Generic `Exception` -> 500
31+
32+
Always check these when tracing bare `raise ValueError(...)` or unhandled exceptions.
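
The handler list above amounts to an ordered lookup, sketched here. `AppInvokeQuotaExceededError` is redefined as a local stand-in (its real base class is not shown in this guide), and `documented_error` is a hypothetical helper.

```python
class AppInvokeQuotaExceededError(Exception):
    """Local stand-in for the real exception class."""


# Ordered from most to least specific; the guide names no code for the 500 case.
GLOBAL_HANDLERS = [
    (ValueError, 400, "invalid_param"),
    (AppInvokeQuotaExceededError, 429, "too_many_requests"),
    (Exception, 500, None),
]


def documented_error(exc: BaseException):
    """Return (status, error_code) for an exception no controller catches."""
    for exc_type, status, code in GLOBAL_HANDLERS:
        if isinstance(exc, exc_type):
            return status, code
    return None


bare_value_error = documented_error(ValueError("bad input"))
```

This is why a bare `raise ValueError(...)` in a service method should be documented as 400 `invalid_param`, not as a 500.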
33+
34+
### Error Code Sources
35+
36+
| Error Type | Source |
37+
|------------|--------|
38+
| App-level errors | `api/controllers/service_api/app/error.py` |
39+
| Knowledge errors | `api/controllers/service_api/dataset/error.py` |
40+
| Auth/rate-limit | `api/controllers/service_api/wraps.py` |
41+
| Global handlers | `api/libs/external_api.py` |
