220 changes: 220 additions & 0 deletions .claude/skills/dify-docs-api-reference/SKILL.md
@@ -0,0 +1,220 @@
---
name: dify-docs-api-reference
description: >
Use when editing, auditing, or creating OpenAPI specs for the Dify
documentation repo. Applies to files in en/api-reference/. Covers
formatting rules, error code conventions, example standards,
operationId patterns.
---

# Dify API Reference Documentation

## Before Starting

Read these shared guides:

1. `writing-guides/style-guide.md`
2. `writing-guides/formatting-guide.md`
3. `writing-guides/glossary.md`

**When auditing**, also load:
- `references/audit-checklist.md`
- `references/common-mistakes.md`

**When tracing code paths**, also load:
- `references/codebase-paths.md`

## Reader Persona

Backend developers integrating Dify apps or knowledge bases into their own applications via REST APIs. Assume strong coding ability, familiarity with HTTP, authentication patterns, and JSON. Focus on precision: exact parameter types, required vs optional, error codes, and realistic examples. Don't explain what a REST API is.

## Code Fidelity (Non-Negotiable)

**Every detail in the spec MUST be verifiable against the codebase.** When the spec disagrees with the code, the spec is wrong.

### What must match the code exactly

- **Schema constraints**: `default`, `minimum`/`maximum`, `enum` must exactly match Pydantic `Field()` arguments. E.g., `le=101` -> `"maximum": 101` -- not 100.
- **Required/optional**: `Field(default=<value>)` = optional; no default (or a literal `...` default, which Pydantic treats as required) = required; `FetchUserArg(required=True)` = required.
- **Error codes**: Only errors the endpoint actually raises. Trace `except` -> `raise` -> exception class -> `error_code` and `code` attributes. See [Error Responses](#error-responses).
- **Response status codes**: Must match the code's `return ..., <status>` value.
- **Response body fields**: Must match what the code actually returns. For streaming endpoints, verify the event type `enum` against actual events yielded by the task pipeline. Each event type must have a corresponding discriminator mapping entry.
- **Error messages**: Must match the exception's `description` attribute or the string passed to werkzeug exceptions.
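
As a rough illustration of the schema-constraint rule, the sketch below compares the keyword arguments of a `Field()` call with the constraints written in the spec. It uses only the standard library and takes the `Field()` arguments as a plain dict; the helper names (`expected_constraints`, `constraint_mismatches`) and the sample values other than `le=101` are assumptions made for this example, not part of the Dify codebase.

```python
# Pydantic Field() constraint names and the JSON Schema keywords they produce.
FIELD_TO_SCHEMA = {
    "default": "default",
    "ge": "minimum",
    "le": "maximum",
    "gt": "exclusiveMinimum",
    "lt": "exclusiveMaximum",
    "min_length": "minLength",
    "max_length": "maxLength",
}

_MISSING = object()  # distinguishes "keyword absent" from an explicit None


def expected_constraints(field_kwargs):
    """Translate Field() keyword arguments into JSON Schema keywords."""
    return {
        FIELD_TO_SCHEMA[name]: value
        for name, value in field_kwargs.items()
        if name in FIELD_TO_SCHEMA
    }


def constraint_mismatches(field_kwargs, spec_schema):
    """Return {keyword: (value_in_code, value_in_spec)} for every disagreement."""
    mismatches = {}
    for keyword, code_value in expected_constraints(field_kwargs).items():
        spec_value = spec_schema.get(keyword, _MISSING)
        if spec_value is _MISSING or spec_value != code_value:
            mismatches[keyword] = (
                code_value,
                None if spec_value is _MISSING else spec_value,
            )
    return mismatches


# Hypothetical parameter: the code says le=101, the spec says 100.
code_side = {"default": 20, "ge": 1, "le": 101}
spec_side = {"type": "integer", "default": 20, "minimum": 1, "maximum": 100}
print(constraint_mismatches(code_side, spec_side))
```

Under these assumptions the only reported mismatch is `maximum`, and the fix is to change the spec to `101`, not the other way around.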

### How to verify

1. Read the controller method.
2. For each parameter: find the Pydantic model or `request.args.get()`, note `Field()` arguments.
3. **Trace string fields beyond the controller.** The controller may declare `str`, but the service layer may cast to `StrEnum`, `Literal`, or validate against a fixed list. Common patterns: `SomeEnum(value)` cast, `Literal["a", "b"]` downstream, explicit `if field not in ALLOWED_VALUES` checks. If any exist, the spec MUST have `enum`.
4. For errors: trace `except` -> `raise` -> exception class -> `error_code` and `code` in `error.py`.
5. For responses: check `return` statement. **Important:** Response converters (e.g., `convert_blocking_full_response`) may flatten, restructure, or inject fields not present in the Pydantic entity. Always read the converter.
6. For service calls: read the service method to see what it returns or raises.
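
Step 3 is the easiest one to miss, so here is a minimal sketch of the pattern it describes. Every name in it (`ResponseMode`, `controller_payload`, `service_generate`, `spec_enum`) is a hypothetical stand-in rather than Dify code, and `class ResponseMode(str, Enum)` is used in place of `StrEnum` only so the snippet also runs on Python versions older than 3.11.

```python
from enum import Enum


class ResponseMode(str, Enum):
    """Hypothetical service-layer enum; stands in for a StrEnum."""

    BLOCKING = "blocking"
    STREAMING = "streaming"


def controller_payload(response_mode: str) -> dict:
    # Controller level: the annotation is a bare `str`, so no enum is visible here.
    return {"response_mode": response_mode}


def service_generate(payload: dict) -> ResponseMode:
    # Service level: the cast raises ValueError for anything outside the enum,
    # so the request field is effectively an enum and the spec must say so.
    return ResponseMode(payload["response_mode"])


def spec_enum(enum_cls) -> list:
    """Values to transcribe into the spec's `enum` for the request field."""
    return [member.value for member in enum_cls]


print(spec_enum(ResponseMode))
```

Reading only `controller_payload` would suggest a free-form string; the cast in `service_generate` is what makes `enum` mandatory in the spec.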

### Flagging suspected code bugs

The code is the source of truth, but the **code itself may have bugs**. When you encounter something irregular:

1. **Flag it explicitly** -- do NOT silently document the suspected bug.
2. **Show the evidence** -- quote the exact code line and explain why it looks wrong.
3. **Ask the user for a decision** -- (a) document as-is, or (b) treat as upstream bug.
4. **Never auto-correct** -- do not silently write the "correct" value when the code says otherwise.

Common code smells: off-by-one in `le`/`ge`, response body with 204, inconsistent error handling across similar endpoints, missing error handlers that sibling endpoints have, `required` mismatches.

### Professional judgment

You are a professional API documentation writer. Beyond code fidelity:
- **Challenge questionable decisions** with reasoning.
- **Suggest improvements** to API consistency or developer experience (clearly separated from required fixes).
- **Question conflicting instructions** -- push back with evidence.

## Spec Structure

| Spec File | App Type | `AppMode` values | Key Endpoints |
|-----------|----------|------------------|---------------|
| `openapi_chat.json` | Chat & Agent | `CHAT`, `AGENT_CHAT` | `/chat-messages`, conversations |
| `openapi_chatflow.json` | Chatflow | `ADVANCED_CHAT` | Same as chat, mode `advanced-chat` |
| `openapi_workflow.json` | Workflow | `WORKFLOW` | `/workflows/run`, workflow logs |
| `openapi_completion.json` | Completion | `COMPLETION` | `/completion-messages` |
| `openapi_knowledge.json` | Knowledge | *(N/A)* | datasets, documents, segments, metadata |

Shared endpoints (file upload, audio, feedback, app info, parameters, meta, site, end-user) appear in chat/chatflow/workflow/completion specs.

### App-Type Scoping (Critical)

The codebase uses shared controllers and Pydantic models across app modes. The **documentation separates** these into per-app-type specs. You MUST filter through the app type lens:

1. **Shared Pydantic models** -- only include fields relevant to this spec's app type.
2. **Shared error handlers** -- only include errors triggerable under this spec's app type.
3. **Internal-only fields** (e.g., `retriever_from`) -- omit from all specs.

**How to determine relevance:** Check the controller's `AppMode` guard. For fields: "does this field have any effect in this mode?" For errors: "can this error be triggered in this mode?" When in doubt, trace through `AppGenerateService.generate()`.
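
One way to keep the app-type filter explicit is to record, per field, the modes in which it has an effect and intersect that with the modes a spec covers. In the sketch below, `SPEC_MODES` is transcribed from the Spec Structure table, while `FIELD_MODES` is a made-up relevance table for illustration only; real relevance has to come from tracing the code, not from this example.

```python
# Spec file -> AppMode names, transcribed from the Spec Structure table.
SPEC_MODES = {
    "openapi_chat.json": {"CHAT", "AGENT_CHAT"},
    "openapi_chatflow.json": {"ADVANCED_CHAT"},
    "openapi_workflow.json": {"WORKFLOW"},
    "openapi_completion.json": {"COMPLETION"},
}

ALL_MODES = set().union(*SPEC_MODES.values())

# HYPOTHETICAL relevance table: which modes each shared-model field affects.
# These assignments are illustrative and have not been checked against the code.
FIELD_MODES = {
    "inputs": ALL_MODES,
    "conversation_id": {"CHAT", "AGENT_CHAT", "ADVANCED_CHAT"},
    "workflow_id": {"ADVANCED_CHAT", "WORKFLOW"},
    "retriever_from": set(),  # internal-only: omitted from every spec
}


def fields_for_spec(spec_file):
    """Return the shared-model fields that belong in the given spec."""
    modes = SPEC_MODES[spec_file]
    return sorted(
        name for name, field_modes in FIELD_MODES.items() if field_modes & modes
    )


print(fields_for_spec("openapi_chat.json"))
print(fields_for_spec("openapi_workflow.json"))
```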

## Style Overrides

These rules are specific to API reference docs and override or extend the general style guide.

### Endpoint Summaries

Must start with an imperative verb. Title Case. Standard vocabulary:

| Verb | Method | When to use |
|------|--------|-------------|
| `Get` | GET | Single JSON resource by ID or fixed path |
| `List` | GET | Collection (paginated array) |
| `Download` | GET | Binary file content |
| `Create` | POST | New persistent resource |
| `Send` | POST | Message or request dispatch |
| `Submit` | POST | Feedback or input on existing resource |
| `Upload` | POST | File upload |
| `Convert` | POST | Format transformation |
| `Run` | POST | Execute workflow or process |
| `Stop` | POST | Halt running task |
| `Configure` | POST | Enable/disable setting |
| `Rename` | POST | Rename existing resource |
| `Update` | PUT/PATCH | Modify fields on existing resource |
| `Delete` | DELETE | Remove resource |

**Do NOT use `Retrieve`** -- use `Get` or `List`. Verb-object order: `Upload File` not `File Upload`.
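
The vocabulary rule lends itself to a mechanical check. The following sketch only inspects the leading verb; the function name `summary_problems` is an assumption for this example, and it does not attempt to validate Title Case for the remaining words.

```python
# Standard verb vocabulary from the table above.
ALLOWED_VERBS = {
    "Get", "List", "Download", "Create", "Send", "Submit", "Upload",
    "Convert", "Run", "Stop", "Configure", "Rename", "Update", "Delete",
}


def summary_problems(summary):
    """Return a list of human-readable problems with an endpoint summary."""
    words = summary.split()
    if not words:
        return ["summary is empty"]
    first = words[0]
    if first == "Retrieve":
        return ["use `Get` or `List` instead of `Retrieve`"]
    if first not in ALLOWED_VERBS:
        return [f"`{first}` is not a standard verb; summaries are verb-object"]
    return []


for candidate in ("Upload File", "File Upload", "Retrieve Conversation"):
    print(candidate, "->", summary_problems(candidate))
```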

### operationId Convention

Pattern: `{verb}{AppType}{Resource}`

| App Type | Prefix | Examples |
|----------|--------|---------|
| Chat | `Chat` | `createChatMessage`, `listChatConversations` |
| Chatflow | `Chatflow` | `createChatflowMessage` |
| Workflow | `Workflow` | `runWorkflow`, `getWorkflowLogs` |
| Completion | `Completion` | `createCompletionMessage` |
| Knowledge | *(none)* | `createDataset`, `listDocuments` |

**Legacy operationIds**: Do NOT rename existing ones. Changing operationIds is a breaking change for SDK users. Apply this convention to **new endpoints only**.
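
For new endpoints, the pattern can be assembled mechanically. This is a small sketch, assuming the verb is lowercased and the resource is already written in PascalCase; `build_operation_id` is a hypothetical helper name.

```python
# App type -> operationId prefix, from the table above.
APP_PREFIX = {
    "chat": "Chat",
    "chatflow": "Chatflow",
    "workflow": "Workflow",
    "completion": "Completion",
    "knowledge": "",  # Knowledge endpoints carry no prefix
}


def build_operation_id(verb, app_type, resource=""):
    """Compose {verb}{AppType}{Resource} for a NEW endpoint."""
    return verb.lower() + APP_PREFIX[app_type] + resource


print(build_operation_id("create", "chat", "Message"))
print(build_operation_id("run", "workflow"))
print(build_operation_id("list", "knowledge", "Documents"))
```

Because existing operationIds must not be renamed, a helper like this is only useful when adding an endpoint, never for rewriting legacy IDs.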

### Descriptions

- **User-centric**: Write for developers, not the codebase. Name by what developers want to accomplish (e.g., "Download" not "Preview" for an endpoint serving raw file bytes).
- **Terminology consistency**: All user-facing text within a spec must use consistent terms. Code-derived names (paths, fields, schema names) stay as-is. Watch for: "segment" vs "chunk" (use "chunk"), "dataset" vs "knowledge base" (use "knowledge base").
- **Descriptions must add value**: `"Session identifier."` is a label, not a description. Instead: `"The \`user\` identifier provided in API requests."`.
- **Nullable/conditional fields**: Explain when present or `null`.

### Cross-API Links

When a description mentions another endpoint, add a markdown link.
Pattern: `/api-reference/{category}/{endpoint-name}` (kebab-case from endpoint summary).

## Parameters

- Every parameter MUST have a `description`.
- **Schema constraints must exactly match code.** Transcribe `Field()` arguments verbatim.
- Do NOT add an `example` field to parameters.
- **Do NOT repeat schema metadata in descriptions.** If `default: 20` is in schema, don't repeat in description.
- **Do NOT repeat enum values in descriptions** unless explaining when to choose each value.
- Mark `required` accurately based on code.
- **Request fields**: Use `enum` for known value sets. Trace string fields through service layer for hidden enums.
- **Response fields**: Do NOT use `enum`. Explain values in `description` instead (Mintlify renders a duplicate "Available options" list).
- **Backtick all values** in descriptions: literal values, field names, code references.
- **Space between numbers and units**: `100 s`, `15 MB` -- not `100s`, `15MB`.
- **Descriptions must be specific**: `"Available options."` is not acceptable.
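
A few of these parameter rules can be checked automatically. The sketch below covers three of them (missing `description`, stray `example`, and number-unit spacing); the function name, the sample parameters, and the short list of units in the regular expression are assumptions chosen for illustration.

```python
import re

# A digit immediately followed by a unit, e.g. "100s" or "15MB".
# The unit list is a small illustrative sample, not an exhaustive one.
UNIT_GLUED = re.compile(r"\d(?:ms|s|KB|MB|GB)\b")


def parameter_problems(param):
    """Return rule violations for one OpenAPI parameter object (a dict)."""
    problems = []
    description = param.get("description", "")
    if not description.strip():
        problems.append("missing `description`")
    if "example" in param:
        problems.append("parameters must not carry an `example` field")
    if UNIT_GLUED.search(description):
        problems.append("add a space between the number and its unit")
    return problems


bad = {"name": "file", "in": "query", "description": "Max size 15MB.", "example": "a.png"}
good = {"name": "file", "in": "query", "description": "Max size `15 MB`."}
print(parameter_problems(bad))
print(parameter_problems(good))
```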

## Responses

### Success Responses

Use only 200/201 as the primary response. For multiple response modes (blocking/streaming), use markdown bullets in the 200 description.

**Every 200/201 JSON response MUST have at least one `examples` entry** with realistic values.

**Binary/file responses**: Use `content` with appropriate media type and `format: binary`. Use `audio/mpeg` for audio, `application/octet-stream` for generic files. Put details in response `description`, not endpoint description.

**Schema description duplication**: When using `$ref`, the schema definition MUST NOT have a top-level `description`. Mintlify renders both, causing duplication.

### Error Responses

Each endpoint MUST list its specific error codes, grouped by HTTP status.

#### Error Tracing Rules

1. **`BaseHTTPException` subclasses** (in `error.py`): Use `error_code` attribute as code name, `code` attribute as HTTP status.
2. **Werkzeug built-in exceptions** (`BadRequest`, `NotFound`): Use generic codes -- `bad_request`, `not_found`. NOT the service-layer exception name.
3. **Custom werkzeug `HTTPException` subclasses** (NOT `BaseHTTPException`): Global handler converts class name to snake_case via regex. E.g., `FilenameNotExistsError` -> `filename_not_exists_error`.
4. **Fire-and-forget methods**: If a service method never raises, do NOT invent error responses.
5. **No custom error handling**: If controller only uses `@validate_app_token` with no `try/except`, the only error is 401 (global auth). Do NOT add empty error sections.
6. **Error messages**: Use the exact string from the exception's `description` attribute or werkzeug string argument.
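
Rule 3 can be reproduced with a one-line conversion. The regular expression below is an assumption: it is a common CamelCase-to-snake_case idiom that yields the documented result for `FilenameNotExistsError`, but the exact expression used by the global handler should be read from `external_api.py`, especially for class names containing acronyms.

```python
import re


def class_name_to_error_code(class_name):
    """Convert a CamelCase exception class name to snake_case.

    Assumed idiom: insert "_" before every capital letter except the first,
    then lowercase the result.
    """
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


print(class_name_to_error_code("FilenameNotExistsError"))
```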

#### Error Format

- **No `$ref` schema** in error responses -- omit `"schema"` entirely.
- **Description** lists error codes as markdown bullets with backticked names.
- **Examples** required for every error response (provides Mintlify dropdown selector).
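
Put together, an error response in this format might be generated as follows. This is a sketch under stated assumptions: the body shape (`status`, `code`, `message`) and both message strings are placeholders invented for the example, and real messages must be copied verbatim from the code as described in the tracing rules.

```python
import json


def build_error_response(status, errors):
    """Build one OpenAPI response object for a single HTTP status.

    `errors` is a list of (error_code, message) pairs that share `status`.
    """
    description = "\n".join(f"- `{code}`: {message}" for code, message in errors)
    examples = {
        code: {
            "summary": code,  # error code doubles as the example title
            # ASSUMED body shape; confirm against the real error payload.
            "value": {"status": status, "code": code, "message": message},
        }
        for code, message in errors
    }
    # No "schema" key: error responses carry examples only.
    return {
        "description": description,
        "content": {"application/json": {"examples": examples}},
    }


response_400 = build_error_response(
    400,
    [
        ("invalid_param", "Placeholder message copied from the code."),
        ("bad_request", "Another placeholder message."),
    ],
)
print(json.dumps(response_400, indent=2))
```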

## Schemas

- **Prefer inline** over `$ref` for simple objects.
- Only use `$ref` for genuinely reused or complex schemas.
- **Array items must define `properties`** -- no bare `"type": "object"`.
- **`required` arrays on request schemas only** -- not response schemas.
- **`oneOf` options**: Each must have a `title` property. Parent schema must NOT have `description`.
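
The array-items rule is easy to violate in deeply nested schemas, so a recursive scan can help. The sketch below walks `properties` and `items` only and reports paths where array items are a bare object; the function name and the sample schema are illustrative assumptions, and `oneOf`/`allOf` branches are deliberately not traversed.

```python
def bare_object_arrays(schema, path="$"):
    """Return paths of arrays whose items are `"type": "object"` without `properties`."""
    found = []
    if schema.get("type") == "array":
        items = schema.get("items", {})
        is_bare = (
            items.get("type") == "object"
            and "properties" not in items
            and "$ref" not in items
        )
        if is_bare:
            found.append(path)
        found.extend(bare_object_arrays(items, path + "[]"))
    for name, sub_schema in schema.get("properties", {}).items():
        found.extend(bare_object_arrays(sub_schema, f"{path}.{name}"))
    return found


sample = {
    "type": "object",
    "properties": {
        "data": {"type": "array", "items": {"type": "object"}},
        "tags": {
            "type": "array",
            "items": {"type": "object", "properties": {"name": {"type": "string"}}},
        },
    },
}
print(bare_object_arrays(sample))
```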

## Examples

- **Realistic values only.** Real-looking UUIDs, timestamps, text, metadata.
- **Verify example values against code.** Enum-like fields must use values the code actually returns.
- Request and response examples must correspond.
- **Titles**: `"summary": "Request Example"` (single) or `"summary": "Request Example-Streaming mode"` (multiple). Error examples: use error code as summary.

## Tag Naming

- **Plural** for countable resources: `Chats`, `Files`, `Conversations`.
- **Singular** for uncountable nouns or abbreviations: `Feedback`, `TTS`.
- Title Case.

## Endpoint Ordering

**CRUD lifecycle**: POST create -> GET list/detail -> PUT/PATCH update -> DELETE.

Exception: for tags without a create operation (e.g., Conversations), GET list comes first; non-create POST operations are placed after the GETs but before PUT/DELETE.
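
The ordering can be expressed as a sort key. In this sketch, create operations are recognized by a summary starting with `Create`, list operations by `List`, and every other POST is ranked after the GETs; applying that last rule to all tags, not only to tags without a create operation, is an assumption of the example. The sample summaries are illustrative.

```python
def order_key(operation):
    """Rank an operation for CRUD-lifecycle ordering within a tag."""
    method = operation["method"].upper()
    summary = operation["summary"]
    if method == "POST":
        # Create leads the tag; other POSTs sit between the GETs and PUT/PATCH.
        return 0 if summary.startswith("Create ") else 3
    if method == "GET":
        return 1 if summary.startswith("List ") else 2
    if method in ("PUT", "PATCH"):
        return 4
    if method == "DELETE":
        return 5
    raise ValueError(f"unexpected method: {method}")


def order_endpoints(operations):
    # sorted() is stable, so operations with equal rank keep their given order.
    return sorted(operations, key=order_key)


conversations = [
    {"method": "DELETE", "summary": "Delete Conversation"},
    {"method": "POST", "summary": "Rename Conversation"},
    {"method": "GET", "summary": "Get Conversation"},
    {"method": "GET", "summary": "List Conversations"},
]
for op in order_endpoints(conversations):
    print(op["method"], op["summary"])
```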

## Post-Writing Verification

After completing the document, invoke `dify-docs-reader-test` to verify it from the reader's perspective.
@@ -0,0 +1,71 @@
# Audit Checklist (Per Endpoint)

Use this checklist when auditing or reviewing an OpenAPI spec against the Dify codebase.

## Pre-Audit

1. **Identify the spec's app type**: Determine which `AppMode` values this spec covers (see SKILL.md Spec Structure table). All subsequent checks are filtered through this app-type scope.
2. **Compare routes**: Check `api/controllers/service_api/__init__.py` for registered routes, then each controller file.

## Per-Endpoint Checks

3. **App-type scoping**: For shared controllers/models, only include fields, parameters, and errors relevant to this spec's app type. Trace code paths to confirm relevance.
4. **Missing endpoints**: Present in code but not in spec.
5. **Ghost endpoints**: Present in spec but not in code.
6. **Request schemas**: Verify params, types, required/optional, defaults, enums against every `Field()` argument.
7. **Hidden enums on request string fields**: For every `string` field without `enum`, trace through the service layer to check for `StrEnum` casts, `Literal` types, or validation against fixed lists. Do NOT trust the controller-level type annotation alone.
8. **Response schemas**: Verify fields, types, status codes. Check `return ..., <status>` and read response converters (they may flatten or inject fields).
9. **Error codes -- completeness**: All errors the endpoint raises are documented. Trace every `except` -> `raise` chain; read service methods to confirm they actually raise.
10. **Error codes -- correctness**: No phantom codes. Remove errors the controller does not raise.
11. **Error code names**: Must match `error_code` attribute (custom exceptions) or werkzeug generic name (`bad_request`, `not_found`). Never use Python class names or service exception names.
12. **Error messages**: Must match the `description` attribute or string argument. Copy from code verbatim.
13. **Example values**: Match actual code output (e.g., enum values returned by the code). No unresolved `{message}` placeholders.
14. **operationId convention**: Follows `{verb}{AppType}{Resource}` pattern for new endpoints; legacy IDs left as-is.
15. **Description quality**: Useful explanations, not just field-name labels.
16. **200/201 responses have examples**: Every JSON success response must have at least one `examples` entry with realistic values.
17. **No schema description duplication**: `$ref` response schemas must not have a top-level `description` (Mintlify shows both).
18. **Binary responses**: Use `content` with `format: binary` schema; details in response `description`.
19. **`oneOf` options have `title`**: Each option object needs a descriptive `title`. Parent schema has no `description`.
20. **`required` arrays on request schemas only**: Not on response schemas.
21. **`enum` on request schemas only**: Not on response schemas (Mintlify renders duplicate "Available options").
22. **Response array items have `properties`**: No bare `"type": "object"` -- Mintlify renders `object[]` with no expandable fields.
23. **Terminology consistency**: No synonym mixing within a tag (e.g., "segment" vs "chunk").
24. **Values backticked, number-unit spacing correct**: All literal values backticked; space between numbers and units.
25. **Endpoint ordering**: Follows CRUD lifecycle (POST create -> GET list/detail -> PUT/PATCH update -> DELETE).
26. **Tag naming**: Plural for countable resources, singular for uncountable nouns/abbreviations, Title Case.

## Two-Agent Workflow

- **Agent 1 (Fixer)**: Audits the spec and applies fixes using this checklist and all rules from SKILL.md.
- **Agent 2 (Reviewer)**: Reads the fixed spec and verifies compliance. Reports remaining issues WITHOUT making edits. If issues are found, the Fixer applies the fixes; optionally re-run the Reviewer afterwards.

Always validate JSON (`python -m json.tool`) after fixes.
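
To validate every spec in one pass instead of running `python -m json.tool` per file, a short script along these lines may be convenient. The function name is hypothetical, and the glob pattern follows the `openapi_*.json` naming used by the spec files.

```python
import json
from pathlib import Path


def invalid_specs(spec_dir):
    """Return {file name: parse error} for every spec that is not valid JSON."""
    failures = {}
    for path in sorted(Path(spec_dir).glob("openapi_*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            failures[path.name] = f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return failures


if __name__ == "__main__":
    # Directory taken from the skill description; adjust to the local checkout.
    for name, error in invalid_specs("en/api-reference").items():
        print(f"{name}: {error}")
```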

## Cross-Spec Propagation

Shared endpoints (file upload, audio, feedback, app info, parameters, meta, site, end-user) appear in chat, chatflow, completion, and workflow specs. When a fix is applied to one spec, check all sibling specs for the same issue.

## Verification Rigor

**Every reported issue must be correct.** False positives erode trust and waste time.

1. **Trace the full path.** Don't stop at the controller. Follow errors through global handlers (`external_api.py`), check whether service methods actually raise.
2. **Check app-type relevance.** Don't flag `workflow_id` as missing from the chat spec.
3. **Verify every claim has evidence.** You must have read the actual code line. No speculative claims.
4. **Self-review before reporting.** Re-read each finding and ask:
   - "Did I read the actual code, or am I assuming?"
   - "Did I check global error handlers for bare `raise ValueError/Exception`?"
   - "Is this field/error relevant to THIS spec's app type?"
   - "Am I confusing the Python class name with the `error_code` attribute?"
   - "Did I check the service method body, or did I assume it raises?"
5. **When uncertain, investigate further.** Report fewer verified issues rather than many unverified ones. Mark uncertain items as "unverified -- needs manual check."

### Common False-Positive Patterns

- Assuming bare `ValueError` is a 500 (global handler converts to 400 `invalid_param`)
- Flagging shared-model fields as missing from a spec covering a different app type
- Assuming a service method raises when it's actually fire-and-forget
- Using the Python exception class name instead of the `error_code` attribute
- Inventing errors for code paths that don't exist under the spec's app mode
- Documenting an unreachable `except` clause (controller catches exception the service never raises for this endpoint)
- Adding `enum` to a genuinely dynamic/provider-specific string field (e.g., `voice`, `embedding_model_name`)
@@ -0,0 +1,41 @@
# Codebase Paths

Mapping of concepts to file paths in the Dify codebase and docs repo.

## Dify Docs Repo

| What | Path |
|------|------|
| OpenAPI specs | `en/api-reference/openapi_*.json` |
| Navigation config | `docs.json` |

## Dify Codebase

| What | Path |
|------|------|
| App controllers | `api/controllers/service_api/app/` |
| Dataset controllers | `api/controllers/service_api/dataset/` |
| App error definitions | `api/controllers/service_api/app/error.py` |
| Dataset error definitions | `api/controllers/service_api/dataset/error.py` |
| Auth/rate-limit wrapper | `api/controllers/service_api/wraps.py` |
| Global error handlers | `api/libs/external_api.py` |
| Route registration | `api/controllers/service_api/__init__.py` |

### Global Error Handlers

The handlers in `api/libs/external_api.py` are critical for error tracing:

- `ValueError` -> 400 `invalid_param`
- `AppInvokeQuotaExceededError` -> 429 `too_many_requests`
- Generic `Exception` -> 500

Always check these when tracing bare `raise ValueError(...)` or unhandled exceptions.
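
The three mappings above can be modelled as a small lookup when reasoning about an unhandled exception. This is a sketch, not the handler itself: `AppInvokeQuotaExceededError` is redefined locally as a stand-in, the matching order is an assumption, and the error code for the generic 500 case is left as `None` because it is not stated here.

```python
class AppInvokeQuotaExceededError(Exception):
    """Local stand-in for the real exception class; defined only for this sketch."""


# (exception type, HTTP status, error code), checked in order.
GLOBAL_HANDLERS = [
    (ValueError, 400, "invalid_param"),
    (AppInvokeQuotaExceededError, 429, "too_many_requests"),
    (Exception, 500, None),  # code for the generic case is not listed above
]


def resolve_global_handler(exc):
    """Return the (status, error_code) a global handler would produce for `exc`."""
    for exc_type, status, code in GLOBAL_HANDLERS:
        if isinstance(exc, exc_type):
            return status, code
    return 500, None  # not reached for Exception subclasses; kept as a fallback


print(resolve_global_handler(ValueError("bad limit")))
print(resolve_global_handler(AppInvokeQuotaExceededError()))
print(resolve_global_handler(KeyError("missing")))
```

The first line of output is the case that most often causes false positives: a bare `raise ValueError(...)` surfaces as 400 `invalid_param`, not as a 500.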

### Error Code Sources

| Error Type | Source |
|------------|--------|
| App-level errors | `api/controllers/service_api/app/error.py` |
| Knowledge errors | `api/controllers/service_api/dataset/error.py` |
| Auth/rate-limit | `api/controllers/service_api/wraps.py` |
| Global handlers | `api/libs/external_api.py` |