moreh-dev · seongsu-dev · Feb 20, 2026
@@ -57,45 +57,44 @@ The commit message should be structured as follows:
 
 ### E2E Test
 
-- **Version scope**:
-  - E2E tests cover only `vX.Y.Z` (release) and `vX.Y.Z-rc.N` (release candidate) version formats.
-  - Other version formats (e.g. dev builds, custom tags) are out of scope and should not be tested in E2E.
-
-- **Do not test resource specifications**:
-  - Do not validate individual fields of the YAML file declaring the resource (resource spec).
-  - Instead, create the resource and verify that its status reaches the expected state.
-
-- **Assume fully controlled cluster**:
-  - Do not check if components are already installed.
-  - Assume the cluster is fully controlled by the test and installed components are safe to overwrite or delete.
-
-- **Test suite layout**:
-  - Split tests by purpose under `test/e2e`, for example `test/e2e/performance` and `test/e2e/quality`.
-  - In each directory, define shared Ginkgo configuration (labels, timeouts, common hooks) in `suite_test.go`, and keep scenarios in separate `*_test.go` files.
-  - Shared configuration values must come from the `test/utils/settings` package instead of hard-coded constants in test files.
-
-- **Environment variable management**:
-  - Manage all E2E environment variables centrally in `test/e2e/envs/env_vars.go`.
-  - When a new environment variable is required:
-    - Add it to the `envVars` slice with default value, description, category, and type.
-    - Expose it via public variables (for example `TestModel`, `HFToken`) and access it only through those variables.
-    - Do not call `os.Getenv` directly in test code.
-  - Keep the documentation consistent: changes must pass the `validateEnvVars()` check.
-
-- **Resource templates and settings**:
-  - Manage Kubernetes resource specifications for Gateway, InferenceService, Jobs, and similar resources as Go templates (`.yaml.tmpl`) under `test/config/**`.
-  - Tests must read template paths and default values from constants in `test/utils/settings/constants.go`.
-  - When adding a new benchmark or performance test Job:
-    - Add the template file under an appropriate `test/config/<domain>` subdirectory.
-    - Define the corresponding path and default parameters in the `settings` package.
-
-- **Utility reuse**:
-  - Implement all cluster manipulation logic (namespace creation, Gateway create/delete, Heimdall install/uninstall, InferenceService(Template) create/delete, etc.) in the `test/utils` package and call only those helpers from tests.
-  - Follow this pattern for scenario flow:
-    - `BeforeAll`: create namespace → install Gateway → install Heimdall → create InferenceServiceTemplates → create InferenceServices → wait until they are Ready.
-    - `AfterAll`: if `envs.SkipCleanup` is `false`, clean up the above resources in reverse order.
-    - `It(...)`: render the Job template → create the Job with `kubectl create -f -` → wait for completion with `kubectl wait` → collect logs and perform domain-specific assertions.
-
-- **Makefile and workflow integration**:
-  - Provide separate Make targets per test purpose (for example `e2e-performance`, `e2e-quality`) so that CI can run them independently.
-  - GitHub Actions and other workflows should invoke these targets directly, and new test categories should follow the same pattern when adding additional targets and workflows.
+See [`test/AGENTS.md`](test/AGENTS.md).
+
+## Agent Self-Improvement
+
+After completing any non-trivial task, evaluate whether the work involved:
+- A recurring pattern that will likely appear again in future tasks, or
+- A mistake that was corrected through user feedback, or
+- A design decision that required deliberate reasoning to reach the right answer.
+
+If any of the above applies, **record it in the most relevant `AGENTS.md`** before closing the task — this file for general patterns, [`test/AGENTS.md`](test/AGENTS.md) for test-specific patterns, [`deploy/helm/AGENTS.md`](deploy/helm/AGENTS.md) for Helm chart patterns, and [`website/AGENTS.md`](website/AGENTS.md) for documentation patterns. Entries should be concise, actionable, and placed under the most relevant existing section. If no section fits, create one.
+
+The goal is to make every repeated task faster and every repeated mistake impossible.
+
+### Creating Sub-directory AGENTS.md Files
+
+When a directory accumulates enough domain-specific rules to warrant separation, create a dedicated `AGENTS.md` in that directory. Follow this checklist:
+
+1. **Create `AGENTS.md`** in the target directory with a header that links back to this root file:
+   ```markdown
+   # <Domain> — Agent Rules
+
+   Rules specific to the `<dir>/` directory. General contribution guidelines are in the root [`AGENTS.md`](/AGENTS.md).
+   ```
+
+2. **Create a `CLAUDE.md` symlink** pointing to `AGENTS.md` in the same directory. Cursor reads `CLAUDE.md` as context; the symlink ensures both tools see the same content:
+   ```shell
+   cd <dir> && ln -s AGENTS.md CLAUDE.md
+   ```
+
+3. **Move the relevant sections** from the root `AGENTS.md` (or parent `AGENTS.md`) into the new file. Replace the moved content in the parent with a one-line reference:
+   ```markdown
+   ### E2E Test
+
+   See [`test/AGENTS.md`](test/AGENTS.md).
+   ```
+
+4. **Update the Agent Self-Improvement section** in the parent to mention the new file as a recording target.
+
+## Helm Charts
+
+See [`deploy/helm/AGENTS.md`](deploy/helm/AGENTS.md) for design principles and chart development rules.
@@ -19,7 +19,11 @@ help: ## Display this help.
 
 .PHONY: helm-lint
 helm-lint: ## Lint Helm charts.
-	@helm lint ./deploy/helm/*
+	@for chart in ./deploy/helm/*; do \
+	  if [ -d "$$chart" ] && [ -f "$$chart/Chart.yaml" ]; then \
+	    helm lint "$$chart" || exit 1; \
+	  fi; \
+	done
 
 .PHONY: helm-docs
 helm-docs: ## Generate Helm chart documentation.

@@ -0,0 +1,130 @@
+# Helm Charts — Agent Rules
+
+Rules specific to the `deploy/helm/` directory. General contribution guidelines are in the root [`AGENTS.md`](/AGENTS.md).
+
+## Design Principles
+
+### Minimum Necessary Complexity
+
+- **Do not add configuration options, fields, or abstractions for hypothetical future use cases.** Only add what the current task concretely requires.
+- Before introducing a new value field, ask: "Is there a real, current use case that cannot be handled without it?" If the answer is no, omit the field and handle the edge case through documentation instead.
+- Example: when considering whether to add a `minio.externalHost` field to support cross-namespace MinIO, the right answer was to document that users can point `loki.storage.s3.endpoint` to the external host directly — no new field needed.
+
+### Documentation over Code for Edge Cases
+
+- When a behavior difference only arises in a non-default, edge-case configuration, prefer documenting the workaround over adding a dedicated code path or configuration key.
+- Reserve code changes for cases where the default path is broken or the workaround is genuinely error-prone.
+
+### Reject Designs Before They Are Built
+
+- If an initial design is heading in the wrong direction (e.g., standalone prerequisites instead of sub-chart dependencies, `enabled: false` defaults, nested config instead of top-level sections), raise the issue and redesign before writing code. Retrofitting a wrong structure costs more than redesigning it up front.
+
+## Helm Chart Development
+
+### Sub-chart Integration
+
+- **All infrastructure components belong as sub-chart dependencies** of `moai-inference-framework`. Do not design them as standalone prerequisites that users install separately.
+- **Enablement convention**: Every sub-chart dependency must have both a `condition:` entry in `Chart.yaml` AND `enabled: true` in the default `values.yaml`. Setting `enabled: false` as the default breaks the "install everything in one chart" philosophy. Follow the same pattern as existing components (`keda`, `lws`, `odin`, etc.).
+
+  ```yaml
+  # Chart.yaml — always add condition: and use the official repository
+  - name: vector
+    version: 0.39.0
+    repository: https://helm.vector.dev
+    condition: vector.enabled
+
+  # values.yaml — always default to true
+  vector:
+    enabled: true
+  ```
+
+- **Official repositories**: Always use the chart's official upstream repository, not a mirror.
+  - loki: `https://grafana.github.io/helm-charts`
+  - vector: `https://helm.vector.dev`
+  - minio: `https://charts.min.io`
+
+### Dynamic Service Name References
+
+- **Do not use `fullnameOverride`** to fix service names. Instead, build references using `.Release.Name` so that names are always consistent with whatever release name the user chooses.
+
+  ```yaml
+  # templates/grafana/datasource-loki.yaml
+  url: http://{{ .Release.Name }}-loki-gateway.{{ include "common.names.namespace" . }}.svc.cluster.local
+
+  # templates/loki/credentials.yaml
+  BUCKET_HOST: {{ printf "%s-minio" .Release.Name | quote }}
+  ```
+
+- In sub-chart `customConfig` values rendered through `tpl`, use `{{ .Release.Name }}` directly — it is evaluated by the sub-chart's `tpl` call and resolves to the parent release name.
+
+  ```yaml
+  # values.yaml (vector customConfig) — .Release.Name evaluated by tpl
+  endpoint: "http://{{ .Release.Name }}-loki-gateway"
+  ```
+
+### Separation of Concerns in values.yaml
+
+- **Large infrastructure components must be top-level sections**, not nested under their consumers. For example, MinIO configuration belongs at `minio:`, not at `loki.minio:`. This allows MinIO to be independently enabled/disabled and reused by other components in the future.
+
+### MinIO Provisioning Pattern
+
+- Use the `minio/minio` chart (`https://charts.min.io`), not the bitnami chart.
+- Create buckets, users, and policies directly via the chart's top-level `buckets`, `users`, and `policies` fields (not under a `provisioning` key).
+- Create a **dedicated user per consuming service** with a policy scoped to only its bucket — do not use root credentials for service-to-service access.
+
+  ```yaml
+  minio:
+    policies:
+      - name: loki
+        statements:
+          - resources: ["arn:aws:s3:::loki/*"]
+            effect: Allow
+            actions: ["s3:*"]
+    users:
+      - accessKey: loki
+        secretKey: "loki123!"
+        policy: loki
+    buckets:
+      - name: loki
+  ```
+
+- Templates that read MinIO credentials must reference the `users` array directly:
+
+  ```yaml
+  # credentials.yaml
+  stringData:
+    AWS_ACCESS_KEY_ID:     {{ (index .Values.minio.users 0).accessKey | quote }}
+    AWS_SECRET_ACCESS_KEY: {{ (index .Values.minio.users 0).secretKey | quote }}
+  ```
+
+### Helm `tpl` Passthrough — Vector Label Syntax
+
+- The vector chart renders `customConfig` through Helm's `tpl` function (`{{ tpl (toYaml .Values.customConfig) . | indent 4 }}`). This means any `{{ }}` expression in `customConfig` is evaluated as a Go template at render time.
+- To pass **Vector's own field-template syntax** (`{{ field }}`) through `tpl` without evaluation, wrap it in a Go raw string literal (backticks) inside a template action:
+
+  ```yaml
+  # values.yaml — correct
+  labels:
+    namespace: "{{`{{ namespace }}`}}"
+
+  # values.yaml — WRONG: tpl evaluates {{ namespace }} as a Go template function
+  labels:
+    namespace: "{{ namespace }}"
+  ```
+
+- **Before using `customConfig` with any sub-chart, always verify whether the chart applies `tpl` to it** by running `helm pull <chart> --version <ver> --untar` and inspecting the ConfigMap template.
+
+### YAML Anchors
+
+- **Do not use YAML anchors at the root level of `values.yaml`** (e.g., `_defaults: &defaults`). Helm may flag unknown root-level keys (for example, when a `values.schema.json` forbids additional properties) and emit warnings or errors. Instead, duplicate shared configuration explicitly for each component.
+
+### MIF Pod Label Keys
+
+When filtering or labeling logs, metrics, or other signals by MIF-specific pod attributes, use these label keys:
+
+| Concept           | Label key                    | Example value       |
+| :---------------- | :--------------------------- | :------------------ |
+| Pool              | `mif.moreh.io/pool`          | `heimdall`          |
+| Role              | `mif.moreh.io/role`          | `prefill`, `decode` |
+| App name          | `app.kubernetes.io/name`     | `vllm`              |
+| Inference service | `app.kubernetes.io/instance` | `llama-3-2-1b`      |
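The `tpl` passthrough rule and the `.Release.Name` rule in the file above can be tried outside Helm with Go's `text/template`, the package Helm's templating is built on. The sketch below is only an approximation of a chart's `tpl` call: the `render` helper, the `customConfig` template name, and the `mif` release name are assumptions made for illustration, and Helm's extra functions (such as `quote`) are not registered.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render approximates a chart's `tpl` call: it parses a values string as a
// Go template and executes it against a release context.
func render(src string, ctx map[string]any) (string, error) {
	t, err := template.New("customConfig").Parse(src)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := t.Execute(&out, ctx); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Hypothetical release context; "mif" is an illustrative release name.
	ctx := map[string]any{"Release": map[string]any{"Name": "mif"}}

	// Escaped: the backtick raw string makes the template emit the braces
	// literally, while .Release.Name is still substituted.
	escaped := "endpoint: http://{{ .Release.Name }}-loki-gateway\n" +
		"namespace: \"{{`{{ namespace }}`}}\""
	out, err := render(escaped, ctx)
	fmt.Println(out, err)

	// Unescaped: parsing fails because `namespace` is not a template function.
	_, err = render("namespace: \"{{ namespace }}\"", ctx)
	fmt.Println("unescaped fails:", err != nil)
}
```

The first render should keep Vector's `{{ namespace }}` placeholder intact while filling in the release name. The second should report a parse error, which is the failure mode the "WRONG" example in the rules describes.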
@@ -0,0 +1 @@
+AGENTS.md
@@ -23,5 +23,14 @@ dependencies:
 - name: node-feature-discovery
   repository: oci://registry.k8s.io/nfd/charts
   version: 0.18.3
-digest: sha256:d7f75e788dca4192775595637ec123afa390e09eebcaef9e9c0e40ff46c23e23
-generated: "2026-02-19T16:14:15.495286+09:00"
+- name: minio
+  repository: https://charts.min.io
+  version: 5.4.0
+- name: loki
+  repository: https://grafana.github.io/helm-charts
+  version: 6.30.0
+- name: vector
+  repository: https://helm.vector.dev
+  version: 0.39.0
+digest: sha256:85af11696c630ed9ac9ef85a7a18c8b821187a76e949e535019fc5b91d929ee8
+generated: "2026-02-20T18:51:13.630416372+09:00"
@@ -42,3 +42,15 @@ dependencies:
     version: 0.18.3
     repository: oci://registry.k8s.io/nfd/charts
     condition: nfd.enabled
+  - name: minio
+    version: 5.4.0
+    repository: https://charts.min.io
+    condition: minio.enabled
+  - name: loki
+    version: 6.30.0
+    repository: https://grafana.github.io/helm-charts
+    condition: loki.enabled
+  - name: vector
+    version: 0.39.0
+    repository: https://helm.vector.dev
+    condition: vector.enabled