diff --git a/AGENTS.md b/AGENTS.md
index 62f3e98..35cb411 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -57,45 +57,44 @@ The commit message should be structured as follows:
 
 ### E2E Test
 
-- **Version scope**:
-  - E2E tests cover only `vX.Y.Z` (release) and `vX.Y.Z-rc.N` (release candidate) version formats.
-  - Other version formats (e.g. dev builds, custom tags) are out of scope and should not be tested in E2E.
-
-- **Do not test resource specifications**:
-  - Do not validate individual fields of the YAML file declaring the resource (resource spec).
-  - Instead, create the resource and verify that its status reaches the expected state.
-
-- **Assume fully controlled cluster**:
-  - Do not check if components are already installed.
-  - Assume the cluster is fully controlled by the test and installed components are safe to overwrite or delete.
-
-- **Test suite layout**:
-  - Split tests by purpose under `test/e2e`, for example `test/e2e/performance` and `test/e2e/quality`.
-  - In each directory, define shared Ginkgo configuration (labels, timeouts, common hooks) in `suite_test.go`, and keep scenarios in separate `*_test.go` files.
-  - Shared configuration values must come from the `test/utils/settings` package instead of hard-coded constants in test files.
-
-- **Environment variable management**:
-  - Manage all E2E environment variables centrally in `test/e2e/envs/env_vars.go`.
-  - When a new environment variable is required:
-    - Add it to the `envVars` slice with default value, description, category, and type.
-    - Expose it via public variables (for example `TestModel`, `HFToken`) and access it only through those variables.
-  - Do not call `os.Getenv` directly in test code.
-  - Keep the documentation consistent: changes must pass the `validateEnvVars()` check.
-
-- **Resource templates and settings**:
-  - Manage Kubernetes resource specifications for Gateway, InferenceService, Jobs, and similar resources as Go templates (`.yaml.tmpl`) under `test/config/**`.
-  - Tests must read template paths and default values from constants in `test/utils/settings/constants.go`.
-  - When adding a new benchmark or performance test Job:
-    - Add the template file under an appropriate `test/config/` subdirectory.
-    - Define the corresponding path and default parameters in the `settings` package.
-
-- **Utility reuse**:
-  - Implement all cluster manipulation logic (namespace creation, Gateway create/delete, Heimdall install/uninstall, InferenceService(Template) create/delete, etc.) in the `test/utils` package and call only those helpers from tests.
-  - Follow this pattern for scenario flow:
-    - `BeforeAll`: create namespace → install Gateway → install Heimdall → create InferenceServiceTemplates → create InferenceServices → wait until they are Ready.
-    - `AfterAll`: if `envs.SkipCleanup` is `false`, clean up the above resources in reverse order.
-    - `It(...)`: render the Job template → create the Job with `kubectl create -f -` → wait for completion with `kubectl wait` → collect logs and perform domain-specific assertions.
-
-- **Makefile and workflow integration**:
-  - Provide separate Make targets per test purpose (for example `e2e-performance`, `e2e-quality`) so that CI can run them independently.
-  - GitHub Actions and other workflows should invoke these targets directly, and new test categories should follow the same pattern when adding additional targets and workflows.
+See [`test/AGENTS.md`](test/AGENTS.md).
+
+## Agent Self-Improvement
+
+After completing any non-trivial task, evaluate whether the work involved:
+- A recurring pattern that will likely appear again in future tasks, or
+- A mistake that was corrected through user feedback, or
+- A design decision that required deliberate reasoning to reach the right answer.
+
+If any of the above applies, **record it in the most relevant `AGENTS.md`** before closing the task — this file for general patterns, [`test/AGENTS.md`](test/AGENTS.md) for test-specific patterns, [`deploy/helm/AGENTS.md`](deploy/helm/AGENTS.md) for Helm chart patterns, and [`website/AGENTS.md`](website/AGENTS.md) for documentation patterns. Entries should be concise, actionable, and placed under the most relevant existing section. If no section fits, create one.
+
+The goal is to make every repeated task faster and every repeated mistake impossible.
+
+### Creating Sub-directory AGENTS.md Files
+
+When a directory accumulates enough domain-specific rules to warrant separation, create a dedicated `AGENTS.md` in that directory. Follow this checklist:
+
+1. **Create `AGENTS.md`** in the target directory with a header that links back to this root file:
+   ```markdown
+   # — Agent Rules
+
+   Rules specific to the `/` directory. General contribution guidelines are in the root [`AGENTS.md`](/AGENTS.md).
+   ```
+
+2. **Create a `CLAUDE.md` symlink** pointing to `AGENTS.md` in the same directory. Cursor reads `CLAUDE.md` as context; the symlink ensures both tools see the same content:
+   ```shell
+   cd && ln -s AGENTS.md CLAUDE.md
+   ```
+
+3. **Move the relevant sections** from the root `AGENTS.md` (or parent `AGENTS.md`) into the new file. Replace the moved content in the parent with a one-line reference:
+   ```markdown
+   ### E2E Test
+
+   See [`test/AGENTS.md`](test/AGENTS.md).
+   ```
+
+4. **Update the Agent Self-Improvement section** in the parent to mention the new file as a recording target.
+
+## Helm Charts
+
+See [`deploy/helm/AGENTS.md`](deploy/helm/AGENTS.md) for design principles and chart development rules.
diff --git a/Makefile b/Makefile
index 14e9cbd..dcd9ef2 100644
--- a/Makefile
+++ b/Makefile
@@ -19,7 +19,11 @@ help: ## Display this help.
 
 .PHONY: helm-lint
 helm-lint: ## Lint Helm charts.
-	@helm lint ./deploy/helm/*
+	@for chart in ./deploy/helm/*; do \
+		if [ -d "$$chart" ] && [ -f "$$chart/Chart.yaml" ]; then \
+			helm lint "$$chart"; \
+		fi; \
+	done
 
 .PHONY: helm-docs
 helm-docs: ## Generate Helm chart documentation.
diff --git a/deploy/helm/AGENTS.md b/deploy/helm/AGENTS.md
new file mode 100644
index 0000000..e0c386f
--- /dev/null
+++ b/deploy/helm/AGENTS.md
@@ -0,0 +1,130 @@
+# Helm Charts — Agent Rules
+
+Rules specific to the `deploy/helm/` directory. General contribution guidelines are in the root [`AGENTS.md`](/AGENTS.md).
+
+## Design Principles
+
+### Minimum Necessary Complexity
+
+- **Do not add configuration options, fields, or abstractions for hypothetical future use cases.** Only add what the current task concretely requires.
+- Before introducing a new value field, ask: "Is there a real, current use case that cannot be handled without it?" If the answer is no, omit the field and handle the edge case through documentation instead.
+- Example: when considering whether to add a `minio.externalHost` field to support cross-namespace MinIO, the right answer was to document that users can point `loki.storage.s3.endpoint` to the external host directly — no new field needed.
+
+### Documentation over Code for Edge Cases
+
+- When a behavior difference only arises in a non-default, edge-case configuration, prefer documenting the workaround over adding a dedicated code path or configuration key.
+- Reserve code changes for cases where the default path is broken or the workaround is genuinely error-prone.
+
+### Reject Designs Before They Are Built
+
+- If an initial design is heading in the wrong direction (e.g., standalone prerequisites instead of sub-chart dependencies, `enabled: false` defaults, nested config instead of top-level sections), raise the issue and redesign before writing code. Retrofitting a wrong structure is always more costly.
+
+## Helm Chart Development
+
+### Sub-chart Integration
+
+- **All infrastructure components belong as sub-chart dependencies** of `moai-inference-framework`. Do not design them as standalone prerequisites that users install separately.
+- **Enablement convention**: Every sub-chart dependency must have both a `condition:` entry in `Chart.yaml` AND `enabled: true` in the default `values.yaml`. Setting `enabled: false` as the default breaks the "install everything in one chart" philosophy. Follow the same pattern as existing components (`keda`, `lws`, `odin`, etc.).
+
+  ```yaml
+  # Chart.yaml — always add condition: and use the official repository
+  - name: vector
+    version: 0.39.0
+    repository: https://helm.vector.dev
+    condition: vector.enabled
+
+  # values.yaml — always default to true
+  vector:
+    enabled: true
+  ```
+
+- **Official repositories**: Always use the chart's official upstream repository, not a mirror.
+  - loki: `https://grafana.github.io/helm-charts`
+  - vector: `https://helm.vector.dev`
+  - minio: `https://charts.min.io`
+
+### Dynamic Service Name References
+
+- **Do not use `fullnameOverride`** to fix service names. Instead, build references using `.Release.Name` so that names are always consistent with whatever release name the user chooses.
+
+  ```yaml
+  # templates/grafana/datasource-loki.yaml
+  url: http://{{ .Release.Name }}-loki-gateway.{{ include "common.names.namespace" . }}.svc.cluster.local
+
+  # templates/loki/credentials.yaml
+  BUCKET_HOST: {{ printf "%s-minio" .Release.Name | quote }}
+  ```
+
+- In sub-chart `customConfig` values rendered through `tpl`, use `{{ .Release.Name }}` directly — it is evaluated by the sub-chart's `tpl` call and resolves to the parent release name.
+
+  ```yaml
+  # values.yaml (vector customConfig) — .Release.Name evaluated by tpl
+  endpoint: "http://{{ .Release.Name }}-loki-gateway"
+  ```
+
+### Separation of Concerns in values.yaml
+
+- **Large infrastructure components must be top-level sections**, not nested under their consumers. For example, MinIO configuration belongs at `minio:`, not at `loki.minio:`. This allows MinIO to be independently enabled/disabled and reused by other components in the future.
+
+### MinIO Provisioning Pattern
+
+- Use the `minio/minio` chart (`https://charts.min.io`), not the bitnami chart.
+- Create buckets, users, and policies directly via the chart's top-level `buckets`, `users`, and `policies` fields (not under a `provisioning` key).
+- Create a **dedicated user per consuming service** with a policy scoped to only its bucket — do not use root credentials for service-to-service access.
+
+  ```yaml
+  minio:
+    policies:
+      - name: loki
+        statements:
+          - resources: ["arn:aws:s3:::loki/*"]
+            effect: Allow
+            actions: ["s3:*"]
+    users:
+      - accessKey: loki
+        secretKey: "loki123!"
+        policy: loki
+    buckets:
+      - name: loki
+  ```
+
+- Templates that read MinIO credentials must reference the `users` array directly:
+
+  ```yaml
+  # credentials.yaml
+  stringData:
+    AWS_ACCESS_KEY_ID: {{ (index .Values.minio.users 0).accessKey | quote }}
+    AWS_SECRET_ACCESS_KEY: {{ (index .Values.minio.users 0).secretKey | quote }}
+  ```
+
+### Helm `tpl` Passthrough — Vector Label Syntax
+
+- The vector chart renders `customConfig` through Helm's `tpl` function (`{{ tpl (toYaml .Values.customConfig) . | indent 4 }}`). This means any `{{ }}` expression in `customConfig` is evaluated as a Go template at render time.
+- To pass **Vector's own field-template syntax** (`{{ field }}`) through `tpl` without evaluation, use Go raw string literals:
+
+  ```yaml
+  # values.yaml — correct
+  labels:
+    namespace: "{{`{{ namespace }}`}}"
+
+  # values.yaml — WRONG: tpl evaluates {{ namespace }} as a Go template function
+  labels:
+    namespace: "{{ namespace }}"
+  ```
+
+- **Before using `customConfig` with any sub-chart, always verify whether the chart applies `tpl` to it** by running `helm pull --version --untar` and inspecting the ConfigMap template.
+
+### YAML Anchors
+
+- **Do not use YAML anchors at the root level of `values.yaml`** (e.g., `_defaults: &defaults`). Helm treats unknown root-level keys as invalid and may emit warnings or errors. Instead, duplicate shared configuration explicitly for each component.
+
+### MIF Pod Label Keys
+
+When filtering or labeling logs, metrics, or other signals by MIF-specific pod attributes, use these label keys:
+
+| Concept           | Label key                    | Example value       |
+| :---------------- | :--------------------------- | :------------------ |
+| Pool              | `mif.moreh.io/pool`          | `heimdall`          |
+| Role              | `mif.moreh.io/role`          | `prefill`, `decode` |
+| App name          | `app.kubernetes.io/name`     | `vllm`              |
+| Inference service | `app.kubernetes.io/instance` | `llama-3-2-1b`      |
diff --git a/deploy/helm/CLAUDE.md b/deploy/helm/CLAUDE.md
new file mode 120000
index 0000000..47dc3e3
--- /dev/null
+++ b/deploy/helm/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/deploy/helm/moai-inference-framework/Chart.lock b/deploy/helm/moai-inference-framework/Chart.lock
index cb00edb..a899860 100644
--- a/deploy/helm/moai-inference-framework/Chart.lock
+++ b/deploy/helm/moai-inference-framework/Chart.lock
@@ -23,5 +23,14 @@ dependencies:
 - name: node-feature-discovery
   repository: oci://registry.k8s.io/nfd/charts
   version: 0.18.3
-digest: sha256:d7f75e788dca4192775595637ec123afa390e09eebcaef9e9c0e40ff46c23e23
-generated: "2026-02-19T16:14:15.495286+09:00"
+- name: minio
+  repository: https://charts.min.io
+  version: 5.4.0
+- name: loki
+  repository: https://grafana.github.io/helm-charts
+  version: 6.30.0
+- name: vector
+  repository: https://helm.vector.dev
+  version: 0.39.0
+digest: sha256:85af11696c630ed9ac9ef85a7a18c8b821187a76e949e535019fc5b91d929ee8
+generated: "2026-02-20T18:51:13.630416372+09:00"
diff --git a/deploy/helm/moai-inference-framework/Chart.yaml b/deploy/helm/moai-inference-framework/Chart.yaml
index 43e39f0..538d7b0 100644
--- a/deploy/helm/moai-inference-framework/Chart.yaml
+++ b/deploy/helm/moai-inference-framework/Chart.yaml
@@ -42,3 +42,15 @@ dependencies:
     version: 0.18.3
     repository: oci://registry.k8s.io/nfd/charts
     condition: nfd.enabled
+  - name: minio
+    version: 5.4.0
+    repository: https://charts.min.io
+    condition: minio.enabled
+  - name: loki
+    version: 6.30.0
+    repository: https://grafana.github.io/helm-charts
+    condition: loki.enabled
+  - name: vector
+    version: 0.39.0
+    repository: https://helm.vector.dev
+    condition: vector.enabled
diff --git a/deploy/helm/moai-inference-framework/README.md b/deploy/helm/moai-inference-framework/README.md
index 8abc6f9..bebc015 100644
--- a/deploy/helm/moai-inference-framework/README.md
+++ b/deploy/helm/moai-inference-framework/README.md
@@ -17,7 +17,10 @@ Moreh Inference Framework
 
 | Repository | Name | Version |
 |------------|------|---------|
+| https://charts.min.io | minio | 5.4.0 |
+| https://grafana.github.io/helm-charts | loki | 6.30.0 |
 | https://helm.mittwald.de | replicator(kubernetes-replicator) | 2.12.2 |
+| https://helm.vector.dev | vector | 0.39.0 |
 | https://kedacore.github.io/charts | keda | 2.18.0 |
 | https://moreh-dev.github.io/helm-charts | odin | v0.6.0 |
 | https://moreh-dev.github.io/helm-charts | odin-crd | v0.6.0 |
@@ -52,7 +55,74 @@ Moreh Inference Framework
 | global | object | `{"imagePullSecrets":[]}` | global values are shared across all sub-charts if the value's key matches. |
 | global.imagePullSecrets | list | `[]` | Image pull secrets. |
 | keda.enabled | bool | `true` | Enable kedacore/keda. Set to false if already deployed. |
+| loki.backend.extraArgs[0] | string | `"-config.expand-env=true"` |  |
+| loki.backend.extraEnvFrom[0].secretRef.name | string | `"loki-bucket"` |  |
+| loki.backend.extraEnvFrom[1].configMapRef.name | string | `"loki-bucket"` |  |
+| loki.backend.nodeSelector."node-role.kubernetes.io/control-plane" | string | `""` |  |
+| loki.backend.persistence.volumeClaimsEnabled | bool | `false` |  |
+| loki.backend.replicas | int | `1` |  |
+| loki.enabled | bool | `true` | Enable grafana/loki. |
+| loki.gateway.extraArgs[0] | string | `"-config.expand-env=true"` |  |
+| loki.gateway.extraEnvFrom[0].secretRef.name | string | `"loki-bucket"` |  |
+| loki.gateway.extraEnvFrom[1].configMapRef.name | string | `"loki-bucket"` |  |
+| loki.gateway.nodeSelector."node-role.kubernetes.io/control-plane" | string | `""` |  |
+| loki.gateway.replicas | int | `1` |  |
+| loki.loki.auth_enabled | bool | `false` |  |
+| loki.loki.commonConfig.replication_factor | int | `1` |  |
+| loki.loki.image.tag | string | `"3.5.1"` |  |
+| loki.loki.schemaConfig.configs[0].from | string | `"2024-06-24"` |  |
+| loki.loki.schemaConfig.configs[0].index.period | string | `"24h"` |  |
+| loki.loki.schemaConfig.configs[0].index.prefix | string | `"loki_index_"` |  |
+| loki.loki.schemaConfig.configs[0].object_store | string | `"s3"` |  |
+| loki.loki.schemaConfig.configs[0].schema | string | `"v13"` |  |
+| loki.loki.schemaConfig.configs[0].store | string | `"tsdb"` |  |
+| loki.loki.storage.bucketNames.admin | string | `"loki"` |  |
+| loki.loki.storage.bucketNames.chunks | string | `"loki"` |  |
+| loki.loki.storage.bucketNames.ruler | string | `"loki"` |  |
+| loki.loki.storage.s3.accessKeyId | string | `"${AWS_ACCESS_KEY_ID}"` |  |
+| loki.loki.storage.s3.endpoint | string | `"http://${BUCKET_HOST}:${BUCKET_PORT}"` |  |
+| loki.loki.storage.s3.region | string | `"${BUCKET_REGION}"` |  |
+| loki.loki.storage.s3.s3ForcePathStyle | bool | `true` |  |
+| loki.loki.storage.s3.secretAccessKey | string | `"${AWS_SECRET_ACCESS_KEY}"` |  |
+| loki.loki.storage_config.tsdb_shipper.active_index_directory | string | `"/var/loki/tsdb-index"` |  |
+| loki.loki.storage_config.tsdb_shipper.cache_location | string | `"/var/loki/tsdb-cache"` |  |
+| loki.loki.storage_config.tsdb_shipper.cache_ttl | string | `"168h"` |  |
+| loki.loki.structuredConfig.compactor.delete_request_store | string | `"s3"` |  |
+| loki.loki.structuredConfig.compactor.retention_enabled | bool | `true` |  |
+| loki.loki.structuredConfig.limits_config.ingestion_burst_size_mb | int | `60` |  |
+| loki.loki.structuredConfig.limits_config.ingestion_rate_mb | int | `30` |  |
+| loki.loki.structuredConfig.limits_config.max_entries_limit_per_query | int | `50000` |  |
+| loki.loki.structuredConfig.limits_config.max_query_series | int | `10000` |  |
+| loki.loki.structuredConfig.limits_config.per_stream_rate_limit | string | `"30MB"` |  |
+| loki.loki.structuredConfig.limits_config.per_stream_rate_limit_burst | string | `"60MB"` |  |
+| loki.loki.structuredConfig.limits_config.retention_period | string | `"2160h"` |  |
+| loki.loki.structuredConfig.limits_config.split_queries_by_interval | string | `"24h"` |  |
+| loki.read.extraArgs[0] | string | `"-config.expand-env=true"` |  |
+| loki.read.extraEnvFrom[0].secretRef.name | string | `"loki-bucket"` |  |
+| loki.read.extraEnvFrom[1].configMapRef.name | string | `"loki-bucket"` |  |
+| loki.read.nodeSelector."node-role.kubernetes.io/control-plane" | string | `""` |  |
+| loki.read.replicas | int | `1` |  |
+| loki.write.extraArgs[0] | string | `"-config.expand-env=true"` |  |
+| loki.write.extraEnvFrom[0].secretRef.name | string | `"loki-bucket"` |  |
+| loki.write.extraEnvFrom[1].configMapRef.name | string | `"loki-bucket"` |  |
+| loki.write.nodeSelector."node-role.kubernetes.io/control-plane" | string | `""` |  |
+| loki.write.persistence.volumeClaimsEnabled | bool | `false` |  |
+| loki.write.replicas | int | `1` |  |
 | lws.enabled | bool | `true` | Enable kubernetes-sigs/lws. Set to false if already deployed. |
+| minio.buckets[0].name | string | `"loki"` |  |
+| minio.enabled | bool | `true` | Enable minio/minio as the S3-compatible object storage backend for Loki. Set to false if MinIO is already deployed; in that case, configure loki storage to point to the existing MinIO service. |
+| minio.mode | string | `"standalone"` |  |
+| minio.persistence.enabled | bool | `false` |  |
+| minio.policies[0].name | string | `"loki"` |  |
+| minio.policies[0].statements[0].actions[0] | string | `"s3:*"` |  |
+| minio.policies[0].statements[0].effect | string | `"Allow"` |  |
+| minio.policies[0].statements[0].resources[0] | string | `"arn:aws:s3:::loki/*"` |  |
+| minio.resources.requests.memory | string | `"2Gi"` |  |
+| minio.rootPassword | string | `"minio123!"` | MinIO root password. Override with a strong password in production. |
+| minio.rootUser | string | `"minio"` | MinIO root user. |
+| minio.users[0].accessKey | string | `"loki"` |  |
+| minio.users[0].policy | string | `"loki"` |  |
+| minio.users[0].secretKey | string | `"loki123!"` | Password for the loki MinIO user. Override with a strong password in production. |
 | nameOverride | string | `""` | Chart name override. |
 | namespaceOverride | string | `""` | Namespace override. |
 | nfd.enabled | bool | `true` | Enable kubernetes-sigs/node-feature-discovery. Set to false if already deployed. |
@@ -81,6 +151,40 @@ Moreh Inference Framework
 | prometheus-stack.thanosRuler.enabled | bool | `false` |  |
 | prometheus-stack.windowsMonitoring.enabled | bool | `false` |  |
 | replicator.enabled | bool | `true` | Enable mittwald/kubernetes-replicator. Set to false if already deployed. |
+| vector.customConfig.api.address | string | `"0.0.0.0:8686"` |  |
+| vector.customConfig.api.enabled | bool | `true` |  |
+| vector.customConfig.data_dir | string | `"/vector-data"` |  |
+| vector.customConfig.sinks.loki.encoding.codec | string | `"json"` |  |
+| vector.customConfig.sinks.loki.endpoint | string | `"http://{{ .Release.Name }}-loki-gateway"` |  |
+| vector.customConfig.sinks.loki.inputs[0] | string | `"mif_log_transform"` |  |
+| vector.customConfig.sinks.loki.labels.app | string | `"{{`{{ app }}`}}"` |  |
+| vector.customConfig.sinks.loki.labels.inference_service | string | `"{{`{{ inference_service }}`}}"` |  |
+| vector.customConfig.sinks.loki.labels.level | string | `"{{`{{ level }}`}}"` |  |
+| vector.customConfig.sinks.loki.labels.namespace | string | `"{{`{{ namespace }}`}}"` |  |
+| vector.customConfig.sinks.loki.labels.node_name | string | `"{{`{{ node_name }}`}}"` |  |
+| vector.customConfig.sinks.loki.labels.pool_name | string | `"{{`{{ pool_name }}`}}"` |  |
+| vector.customConfig.sinks.loki.labels.role | string | `"{{`{{ role }}`}}"` |  |
+| vector.customConfig.sinks.loki.type | string | `"loki"` |  |
+| vector.customConfig.sources.mif_logs.extra_label_selector | string | `"mif.moreh.io/log.collect=true"` |  |
+| vector.customConfig.sources.mif_logs.type | string | `"kubernetes_logs"` |  |
+| vector.customConfig.transforms.mif_log_transform.inputs[0] | string | `"mif_logs"` |  |
+| vector.customConfig.transforms.mif_log_transform.source | string | `".namespace = .kubernetes.pod_namespace\n.node_name = \"$VECTOR_SELF_NODE_NAME\"\n.app = get(.kubernetes.pod_labels, [\"app.kubernetes.io/name\"]) ?? \"\"\n.inference_service = get(.kubernetes.pod_labels, [\"app.kubernetes.io/instance\"]) ?? \"\"\n.pool_name = get(.kubernetes.pod_labels, [\"mif.moreh.io/pool\"]) ?? \"\"\n.role = get(.kubernetes.pod_labels, [\"mif.moreh.io/role\"]) ?? \"\"\n\nlog_format = get(.kubernetes.pod_labels, [\"mif.moreh.io/log.format\"]) ?? \"\"\n\nif log_format == \"json\" {\n  structured, err = parse_json(.message)\n  if err == null {\n    . = merge!(., structured)\n    msg, err = get(., [\"msg\"])\n    if err == null {\n      .message = msg\n      del(.msg)\n    }\n    time, err = get(., [\"time\"])\n    if err == null {\n      .timestamp = time\n      del(.time)\n    }\n  }\n}\n\ndel(.file)\ndel(.source_type)\ndel(.stream)\ndel(.kubernetes)\n"` |  |
+| vector.customConfig.transforms.mif_log_transform.type | string | `"remap"` |  |
+| vector.enabled | bool | `true` | Enable vector/vector as a DaemonSet log collector. |
+| vector.role | string | `"Agent"` |  |
+| vector.tolerations[0].effect | string | `"NoExecute"` |  |
+| vector.tolerations[0].key | string | `"node.kubernetes.io/unschedulable"` |  |
+| vector.tolerations[0].operator | string | `"Exists"` |  |
+| vector.tolerations[0].tolerationSeconds | int | `5` |  |
+| vector.tolerations[1].effect | string | `"NoSchedule"` |  |
+| vector.tolerations[1].key | string | `"node-role.kubernetes.io/compute"` |  |
+| vector.tolerations[1].operator | string | `"Equal"` |  |
+| vector.tolerations[1].value | string | `"true"` |  |
+| vector.tolerations[2].effect | string | `"NoSchedule"` |  |
+| vector.tolerations[2].key | string | `"amd.com/gpu"` |  |
+| vector.tolerations[2].operator | string | `"Exists"` |  |
+| vector.updateStrategy.rollingUpdate.maxUnavailable | int | `10` |  |
+| vector.updateStrategy.type | string | `"RollingUpdate"` |  |
 
 ----------------------------------------------
 Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)
diff --git a/deploy/helm/moai-inference-framework/charts/loki-6.30.0.tgz b/deploy/helm/moai-inference-framework/charts/loki-6.30.0.tgz
new file mode 100644
index 0000000..b5e7322
Binary files /dev/null and b/deploy/helm/moai-inference-framework/charts/loki-6.30.0.tgz differ
diff --git a/deploy/helm/moai-inference-framework/charts/minio-5.4.0.tgz b/deploy/helm/moai-inference-framework/charts/minio-5.4.0.tgz
new file mode 100644
index 0000000..22f8d73
Binary files /dev/null and b/deploy/helm/moai-inference-framework/charts/minio-5.4.0.tgz differ
diff --git a/deploy/helm/moai-inference-framework/charts/vector-0.39.0.tgz b/deploy/helm/moai-inference-framework/charts/vector-0.39.0.tgz
new file mode 100644
index 0000000..9c71305
Binary files /dev/null and b/deploy/helm/moai-inference-framework/charts/vector-0.39.0.tgz differ
diff --git a/deploy/helm/moai-inference-framework/templates/grafana/datasource-loki.yaml b/deploy/helm/moai-inference-framework/templates/grafana/datasource-loki.yaml
new file mode 100644
index 0000000..87852f5
--- /dev/null
+++ b/deploy/helm/moai-inference-framework/templates/grafana/datasource-loki.yaml
@@ -0,0 +1,26 @@
+{{- $ps := index .Values "prometheus-stack" }}
+{{- if and .Values.loki.enabled $ps.enabled $ps.grafana.enabled }}
+---
+# Grafana datasource ConfigMap for Loki.
+# The grafana-sidecar discovers ConfigMaps labelled grafana_datasource=1 and
+# provisions them as datasources automatically.
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ include "common.names.name" . }}-datasource-loki
+  namespace: {{ include "common.names.namespace" . }}
+  labels:
+    grafana_datasource: "1"
+    {{- include "mif.labels" . | nindent 4 }}
+data:
+  loki-datasource.yaml: |
+    apiVersion: 1
+    datasources:
+      - name: Loki
+        type: loki
+        access: proxy
+        url: http://{{ .Release.Name }}-loki-gateway.{{ include "common.names.namespace" . }}.svc.cluster.local
+        isDefault: false
+        jsonData:
+          maxLines: 5000
+{{- end }}
diff --git a/deploy/helm/moai-inference-framework/templates/loki/credentials.yaml b/deploy/helm/moai-inference-framework/templates/loki/credentials.yaml
new file mode 100644
index 0000000..a026a04
--- /dev/null
+++ b/deploy/helm/moai-inference-framework/templates/loki/credentials.yaml
@@ -0,0 +1,29 @@
+{{- if .Values.loki.enabled }}
+---
+# ConfigMap consumed by Loki components via extraEnvFrom.
+# The values are referenced in loki.storage.s3 as ${BUCKET_HOST} etc.,
+# which Loki resolves at startup because each component runs with -config.expand-env=true.
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: loki-bucket
+  namespace: {{ include "common.names.namespace" . }}
+  labels:
+    {{- include "mif.labels" . | nindent 4 }}
+data:
+  BUCKET_HOST: {{ printf "%s-minio" .Release.Name | quote }}
+  BUCKET_PORT: "9000"
+  BUCKET_REGION: ""
+  BUCKET_NAME: "loki"
+---
+apiVersion: v1
+kind: Secret
+metadata:
+  name: loki-bucket
+  namespace: {{ include "common.names.namespace" . }}
+  labels:
+    {{- include "mif.labels" . | nindent 4 }}
+stringData:
+  AWS_ACCESS_KEY_ID: {{ (index .Values.minio.users 0).accessKey | quote }}
+  AWS_SECRET_ACCESS_KEY: {{ (index .Values.minio.users 0).secretKey | quote }}
+{{- end }}
diff --git a/deploy/helm/moai-inference-framework/values.yaml b/deploy/helm/moai-inference-framework/values.yaml
index d9c19e3..59ded47 100644
--- a/deploy/helm/moai-inference-framework/values.yaml
+++ b/deploy/helm/moai-inference-framework/values.yaml
@@ -86,7 +86,7 @@ replicator:
 nfd:
   # -- Enable kubernetes-sigs/node-feature-discovery. Set to false if already deployed.
   enabled: true
-  
+
   worker:
     # -- NFD Worker Tolerations to allow NFD workers to deploy to GPU nodes
     tolerations:
@@ -104,6 +104,245 @@ nfd:
         operator: "Exists"
         effect: "NoSchedule"
 
+minio:
+  # -- Enable minio/minio as the S3-compatible object storage backend for Loki.
+  # Set to false if MinIO is already deployed; in that case, configure loki storage
+  # to point to the existing MinIO service.
+  enabled: true
+
+  mode: standalone
+
+  # -- MinIO root user.
+  rootUser: minio
+  # -- MinIO root password. Override with a strong password in production.
+  rootPassword: "minio123!"
+
+  persistence:
+    enabled: false
+
+  # S3 policy granting the loki user access to the loki bucket only.
+  policies:
+    - name: loki
+      statements:
+        - resources:
+            - "arn:aws:s3:::loki/*"
+          effect: Allow
+          actions:
+            - "s3:*"
+
+  # Dedicated loki user with restricted S3 access (not the root user).
+  # AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the loki-bucket Secret are
+  # set from this user's credentials.
+  users:
+    - accessKey: loki
+      # -- Password for the loki MinIO user. Override with a strong password in production.
+      secretKey: "loki123!"
+      policy: loki
+
+  buckets:
+    - name: loki
+
+  resources:
+    requests:
+      memory: 2Gi
+
+loki:
+  # -- Enable grafana/loki.
+  enabled: true
+
+  loki:
+    auth_enabled: false
+    commonConfig:
+      replication_factor: 1
+    image:
+      tag: "3.5.1"
+    schemaConfig:
+      configs:
+        - from: "2024-06-24"
+          store: tsdb
+          object_store: s3
+          schema: v13
+          index:
+            prefix: loki_index_
+            period: 24h
+    storage:
+      bucketNames:
+        chunks: loki
+        ruler: loki
+        admin: loki
+      s3:
+        s3ForcePathStyle: true
+        # ${...} references are resolved by Loki at startup via -config.expand-env=true.
+        endpoint: "http://${BUCKET_HOST}:${BUCKET_PORT}"
+        region: "${BUCKET_REGION}"
+        accessKeyId: "${AWS_ACCESS_KEY_ID}"
+        secretAccessKey: "${AWS_SECRET_ACCESS_KEY}"
+    storage_config:
+      tsdb_shipper:
+        active_index_directory: /var/loki/tsdb-index
+        cache_location: /var/loki/tsdb-cache
+        cache_ttl: 168h
+    structuredConfig:
+      compactor:
+        retention_enabled: true
+        delete_request_store: s3
+      limits_config:
+        max_entries_limit_per_query: 50000
+        split_queries_by_interval: 24h
+        ingestion_rate_mb: 30
+        ingestion_burst_size_mb: 60
+        per_stream_rate_limit: 30MB
+        per_stream_rate_limit_burst: 60MB
+        max_query_series: 10000
+        retention_period: 2160h # 90 days
+
+  # Each component runs with -config.expand-env=true so that ${BUCKET_HOST} etc.
+  # in the Loki config are expanded from the loki-bucket ConfigMap and Secret.
+  gateway:
+    replicas: 1
+    extraArgs:
+      - -config.expand-env=true
+    extraEnvFrom:
+      - secretRef:
+          name: loki-bucket
+      - configMapRef:
+          name: loki-bucket
+    nodeSelector:
+      node-role.kubernetes.io/control-plane: ""
+
+  read:
+    replicas: 1
+    extraArgs:
+      - -config.expand-env=true
+    extraEnvFrom:
+      - secretRef:
+          name: loki-bucket
+      - configMapRef:
+          name: loki-bucket
+    nodeSelector:
+      node-role.kubernetes.io/control-plane: ""
+
+  write:
+    replicas: 1
+    extraArgs:
+      - -config.expand-env=true
+    extraEnvFrom:
+      - secretRef:
+          name: loki-bucket
+      - configMapRef:
+          name: loki-bucket
+    nodeSelector:
+      node-role.kubernetes.io/control-plane: ""
+    persistence:
+      volumeClaimsEnabled: false
+
+  backend:
+    replicas: 1
+    extraArgs:
+      - -config.expand-env=true
+    extraEnvFrom:
+      - secretRef:
+          name: loki-bucket
+      - configMapRef:
+          name: loki-bucket
+    nodeSelector:
+      node-role.kubernetes.io/control-plane: ""
+    persistence:
+      volumeClaimsEnabled: false
+
+vector:
+  # -- Enable vector/vector as a DaemonSet log collector.
+  enabled: true
+
+  role: Agent
+
+  tolerations:
+    - key: node.kubernetes.io/unschedulable
+      operator: Exists
+      effect: NoExecute
+      tolerationSeconds: 5
+    - key: node-role.kubernetes.io/compute
+      operator: Equal
+      value: "true"
+      effect: NoSchedule
+    - key: amd.com/gpu
+      operator: Exists
+      effect: NoSchedule
+
+  updateStrategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxUnavailable: 10
+
+  customConfig:
+    data_dir: /vector-data
+    api:
+      enabled: true
+      address: "0.0.0.0:8686"
+    sources:
+      mif_logs:
+        type: kubernetes_logs
+        # Only collect logs from pods that explicitly opt in to MIF log collection.
+        extra_label_selector: "mif.moreh.io/log.collect=true"
+    transforms:
+      mif_log_transform:
+        type: remap
+        inputs:
+          - mif_logs
+        source: |
+          .namespace = .kubernetes.pod_namespace
+          .node_name = "$VECTOR_SELF_NODE_NAME"
+          .app = get(.kubernetes.pod_labels, ["app.kubernetes.io/name"]) ?? ""
+          .inference_service = get(.kubernetes.pod_labels, ["app.kubernetes.io/instance"]) ?? ""
+          .pool_name = get(.kubernetes.pod_labels, ["mif.moreh.io/pool"]) ?? ""
+          .role = get(.kubernetes.pod_labels, ["mif.moreh.io/role"]) ?? ""
+
+          log_format = get(.kubernetes.pod_labels, ["mif.moreh.io/log.format"]) ?? ""
+
+          if log_format == "json" {
+            structured, err = parse_json(.message)
+            if err == null {
+              . = merge!(., structured)
+              msg, err = get(., ["msg"])
+              if err == null {
+                .message = msg
+                del(.msg)
+              }
+              time, err = get(., ["time"])
+              if err == null {
+                .timestamp = time
+                del(.time)
+              }
+            }
+          }
+
+          del(.file)
+          del(.source_type)
+          del(.stream)
+          del(.kubernetes)
+    sinks:
+      loki:
+        type: loki
+        # {{ .Release.Name }} is evaluated by the vector chart's tpl rendering,
+        # producing the correct release-prefixed Loki gateway service name.
+        endpoint: "http://{{ .Release.Name }}-loki-gateway"
+        inputs:
+          - mif_log_transform
+        encoding:
+          codec: json
+        labels:
+          # {{`{{ field }}`}} is Go raw-string escaping required because the
+          # Vector chart renders customConfig through Helm's tpl function.
+          # tpl sees {{`{{ namespace }}`}} → evaluates raw string → outputs {{ namespace }}
+          # which is Vector's own field-template syntax.
+          namespace: "{{`{{ namespace }}`}}"
+          inference_service: "{{`{{ inference_service }}`}}"
+          pool_name: "{{`{{ pool_name }}`}}"
+          role: "{{`{{ role }}`}}"
+          node_name: "{{`{{ node_name }}`}}"
+          app: "{{`{{ app }}`}}"
+          level: "{{`{{ level }}`}}"
+
 ecrTokenRefresher:
   # -- Enable ECR token refresher.
   enabled: true
diff --git a/test/AGENTS.md b/test/AGENTS.md
new file mode 100644
index 0000000..1bcb608
--- /dev/null
+++ b/test/AGENTS.md
@@ -0,0 +1,48 @@
+# Test — Agent Rules
+
+Rules specific to the `test/` directory. General contribution guidelines are in the root [`AGENTS.md`](/AGENTS.md).
+
+## E2E Test
+
+- **Version scope**:
+  - E2E tests cover only `vX.Y.Z` (release) and `vX.Y.Z-rc.N` (release candidate) version formats.
+ - Other version formats (e.g. dev builds, custom tags) are out of scope and should not be tested in E2E. + +- **Do not test resource specifications**: + - Do not validate individual fields of the YAML file declaring the resource (resource spec). + - Instead, create the resource and verify that its status reaches the expected state. + +- **Assume fully controlled cluster**: + - Do not check if components are already installed. + - Assume the cluster is fully controlled by the test and installed components are safe to overwrite or delete. + +- **Test suite layout**: + - Split tests by purpose under `test/e2e`, for example `test/e2e/performance` and `test/e2e/quality`. + - In each directory, define shared Ginkgo configuration (labels, timeouts, common hooks) in `suite_test.go`, and keep scenarios in separate `*_test.go` files. + - Shared configuration values must come from the `test/utils/settings` package instead of hard-coded constants in test files. + +- **Environment variable management**: + - Manage all E2E environment variables centrally in `test/e2e/envs/env_vars.go`. + - When a new environment variable is required: + - Add it to the `envVars` slice with default value, description, category, and type. + - Expose it via public variables (for example `TestModel`, `HFToken`) and access it only through those variables. + - Do not call `os.Getenv` directly in test code. + - Keep the documentation consistent: changes must pass the `validateEnvVars()` check. + +- **Resource templates and settings**: + - Manage Kubernetes resource specifications for Gateway, InferenceService, Jobs, and similar resources as Go templates (`.yaml.tmpl`) under `test/config/**`. + - Tests must read template paths and default values from constants in `test/utils/settings/constants.go`. + - When adding a new benchmark or performance test Job: + - Add the template file under an appropriate `test/config/` subdirectory. 
+ - Define the corresponding path and default parameters in the `settings` package. + +- **Utility reuse**: + - Implement all cluster manipulation logic (namespace creation, Gateway create/delete, Heimdall install/uninstall, InferenceService(Template) create/delete, etc.) in the `test/utils` package and call only those helpers from tests. + - Follow this pattern for scenario flow: + - `BeforeAll`: create namespace → install Gateway → install Heimdall → create InferenceServiceTemplates → create InferenceServices → wait until they are Ready. + - `AfterAll`: if `envs.SkipCleanup` is `false`, clean up the above resources in reverse order. + - `It(...)`: render the Job template → create the Job with `kubectl create -f -` → wait for completion with `kubectl wait` → collect logs and perform domain-specific assertions. + +- **Makefile and workflow integration**: + - Provide separate Make targets per test purpose (for example `e2e-performance`, `e2e-quality`) so that CI can run them independently. + - GitHub Actions and other workflows should invoke these targets directly, and new test categories should follow the same pattern when adding additional targets and workflows. diff --git a/test/CLAUDE.md b/test/CLAUDE.md new file mode 120000 index 0000000..47dc3e3 --- /dev/null +++ b/test/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/website/AGENTS.md b/website/AGENTS.md index 4135368..f6655d4 100644 --- a/website/AGENTS.md +++ b/website/AGENTS.md @@ -1,6 +1,6 @@ # Website (Docusaurus) — Agent Rules -This file defines rules for contributors and automation agents working in `website/`. +Rules specific to the `website/` directory. General contribution guidelines are in the root [`AGENTS.md`](/AGENTS.md). ## 1. 
Structure & Metadata @@ -30,8 +30,8 @@ This file defines rules for contributors and automation agents working in `websi https://docusaurus.io/docs/markdown-features/tabs ```mdx -import Tabs from '@theme/Tabs'; -import TabItem from '@theme/TabItem'; +import Tabs from "@theme/Tabs"; +import TabItem from "@theme/TabItem"; @@ -57,6 +57,13 @@ image: - title: ` ``` title=""` - highlight: ` ```<language> {1,4-6}` +- **Expected output**: Blocks showing command output must always specify both a language type and a title on the same opening fence. Use `shell` for terminal output: + ````mdx + ```shell title="Expected output (one pod per node, all `Running`)" + NAME READY STATUS RESTARTS AGE + vector-xxxxx 1/1 Running 0 2m + ``` + ```` - **Variables**: - Format as `<variableName>` (camelCase, no quotes). - Highlight lines containing variables in code blocks (e.g., ` ```yaml {2} `). @@ -96,3 +103,9 @@ This is a warning ## 4. Content Guidelines - **Inference Deployment**: When documenting deployment of inference services (e.g., vLLM, SGLang), instructions MUST use the `InferenceService` resource with a preset. + +- **No duplicate installation steps**: Operation or feature docs must not repeat the values file example or `helm upgrade` command that already appears in `getting-started/prerequisites.mdx`. Instead, link directly to the relevant section: + ```mdx + See [Prerequisites](../getting-started/prerequisites.mdx#moai-inference-framework) for the required values and install command. + ``` + Duplication causes the two pages to diverge whenever the chart version or values change. 
diff --git a/website/CLAUDE.md b/website/CLAUDE.md new file mode 120000 index 0000000..47dc3e3 --- /dev/null +++ b/website/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/website/docs/getting-started/prerequisites.mdx b/website/docs/getting-started/prerequisites.mdx index 67293a5..f1f36da 100644 --- a/website/docs/getting-started/prerequisites.mdx +++ b/website/docs/getting-started/prerequisites.mdx @@ -4,8 +4,8 @@ sidebar_label: Prerequisites sidebar_position: 1 --- -import Tabs from '@theme/Tabs'; -import TabItem from '@theme/TabItem'; +import Tabs from "@theme/Tabs"; +import TabItem from "@theme/TabItem"; This document introduces the prerequisites for the MoAI Inference Framework and provides instructions on how to install them. @@ -74,7 +74,9 @@ ecrTokenRefresher: secretAccessKey: <AWS_SECRET_ACCESS_KEY> ``` -In addition, if dependencies such as keda, kube-prometheus-stack, lws are already installed in your cluster, you should skip their installation by setting the corresponding values to `false` in the `moai-inference-framework-values.yaml` file. Refer to [moai-inference-framework README](https://github.com/moreh-dev/mif/tree/main/deploy/helm/moai-inference-framework) to see the full list of dependencies. +:::info +If dependencies such as `keda`, `kube-prometheus-stack`, or `lws` are already installed in your cluster, skip their installation by setting the corresponding values to `false`. Refer to [moai-inference-framework README](https://github.com/moreh-dev/mif/tree/main/deploy/helm/moai-inference-framework) for the full list of dependencies. 
+::: Then, deploy the `moai-inference-framework` chart using the following command: @@ -136,7 +138,7 @@ deviceConfig: spec: driver: enable: true - version: '6.4.3' + version: "6.4.3" blacklist: true image: <registry>/amdgpu-driver imageRegistrySecret: @@ -279,7 +281,7 @@ spec: app.kubernetes.io/instance: rdma-shared-device-plugin updateStrategy: rollingUpdate: - maxUnavailable: '30%' + maxUnavailable: "30%" template: metadata: labels: @@ -415,8 +417,8 @@ Create a `istiod-values.yaml` file and install the Istio control plane. ```yaml title="istiod-values.yaml" pilot: env: - PILOT_ENABLE_ALPHA_GATEWAY_API: 'true' - ENABLE_GATEWAY_API_INFERENCE_EXTENSION: 'true' + PILOT_ENABLE_ALPHA_GATEWAY_API: "true" + ENABLE_GATEWAY_API_INFERENCE_EXTENSION: "true" ``` ```shell diff --git a/website/docs/operations/log-collection.mdx b/website/docs/operations/log-collection.mdx new file mode 100644 index 0000000..f3a5bfb --- /dev/null +++ b/website/docs/operations/log-collection.mdx @@ -0,0 +1,263 @@ +--- +title: Log Collection (Loki + Vector) +sidebar_label: Log Collection +sidebar_position: 3 +--- + +This document explains how to enable centralized log collection for the MoAI Inference Framework using [Loki](https://grafana.com/oss/loki/) (log aggregation) and [Vector](https://vector.dev/) (log collection agent). 
+ +## Overview + +``` +Inference Service Pods + │ (container logs) + ▼ + Vector DaemonSet ← runs on every node + │ transforms + labels logs + │ (namespace, inference_service, pool_name, …) + ▼ + Loki Gateway ← stores logs in MinIO (S3) + │ + ▼ + Grafana (LogQL) ← search and visualize logs +``` + +### Labels available for log search + +| Label | Source | Example value | +| :------------------ | :--------------------------------------------------------------------------------- | :---------------------- | +| `namespace` | `kubernetes.pod_namespace` | `default` | +| `inference_service` | pod label `app.kubernetes.io/instance` | `llama-3-2-1b` | +| `pool_name` | pod label `mif.moreh.io/pool` | `heimdall` | +| `role` | pod label `mif.moreh.io/role` | `prefill`, `decode` | +| `app` | pod label `app.kubernetes.io/name` | `vllm` | +| `node_name` | `VECTOR_SELF_NODE_NAME` env var (injected by Vector) | `gpu-node-01` | +| `level` | parsed from JSON log field `level` (pods with `mif.moreh.io/log.format=json` only) | `info`, `warn`, `error` | + +--- + +## Prerequisites + +- The `moai-inference-framework` Helm chart installed (or being installed). + +:::info +MinIO, Loki, and Vector are all **enabled by default** in the `moai-inference-framework` chart. No additional configuration is required to get started. +::: + +--- + +## Installation + +Log collection is installed as part of the `moai-inference-framework` Helm chart. See [Prerequisites](../getting-started/prerequisites.mdx#moai-inference-framework) for the required values and install command. + +--- + +## Verifying the installation + +Check that all Loki components are running. 
+ +```shell +kubectl get pods -n mif -l app.kubernetes.io/name=loki +``` + +```shell title="Expected output (all pods Running)" +NAME READY STATUS RESTARTS AGE +loki-backend-0 1/1 Running 0 2m +loki-gateway-xxxxxxxxx-xxxxx 1/1 Running 0 2m +loki-read-xxxxxxxxx-xxxxx 1/1 Running 0 2m +loki-write-0 1/1 Running 0 2m +``` + +Check that Vector is running on all nodes. + +```shell +kubectl get pods -n mif -l app.kubernetes.io/name=vector +``` + +```shell title="Expected output (one pod per node, all Running)" +NAME READY STATUS RESTARTS AGE +vector-xxxxx 1/1 Running 0 2m +vector-yyyyy 1/1 Running 0 2m +``` + +Check Vector logs to confirm it is shipping to Loki without errors. + +```shell +kubectl logs -n mif -l app.kubernetes.io/name=vector --tail=50 +``` + +--- + +## Enabling log collection for a pod + +Vector collects logs only from pods that explicitly opt in. Two pod labels control this behavior. + +### Opt-in label + +Add the `mif.moreh.io/log.collect=true` label to a pod to include its logs in Vector's collection. Pods without this label are ignored entirely. + +```yaml +metadata: + labels: + mif.moreh.io/log.collect: "true" +``` + +### Log format label + +Add the `mif.moreh.io/log.format=json` label to enable structured JSON log parsing for a pod. When set, Vector parses each log line as JSON and promotes the following fields: + +| JSON field | Mapped to | +| :--------- | :-------------------- | +| `msg` | `message` | +| `time` | `timestamp` | +| `level` | `level` (Loki label) | +| others | merged into the event | + +Without this label, the log line is forwarded as-is without any JSON parsing. + +```yaml +metadata: + labels: + mif.moreh.io/log.collect: "true" + mif.moreh.io/log.format: "json" +``` + +:::info +The `level` Loki label is only populated for pods with `mif.moreh.io/log.format=json`. For plain-text pods, `level` remains empty. +::: + +--- + +## Searching logs in Grafana + +Open Grafana → **Explore** → select the **Loki** datasource. 
+ +### By namespace + +```promql +{namespace="default"} +``` + +### By inference service name + +```promql +{inference_service="llama-3-2-1b"} +``` + +### By pool name + +```promql +{pool_name="heimdall"} +``` + +### By role + +```promql +{role="decode"} +``` + +### Combined filter + +```promql +{namespace="default", inference_service="llama-3-2-1b", role="prefill"} |= "error" +``` + +### Filter by log level + +```promql +{namespace="default", level="error"} +``` + +--- + +## Architecture details + +### Loki + +| Property | Value | +| :---------------- | :--------------------------------------------- | +| Helm chart | `moreh/loki` v6.30.0 (upstream grafana/loki) | +| App version | 3.5.1 | +| Storage backend | S3 (MinIO), TSDB index | +| Retention | 90 days (2160 h) | +| Ingestion limit | 30 MB/s, 60 MB burst | +| Max entries/query | 50 000 | +| Deployment | Distributed (gateway / read / write / backend) | + +### Vector + +| Property | Value | +| :---------- | :------------------------------------------------------------------------ | +| Helm chart | `vector/vector` v0.39.0 | +| Deployment | DaemonSet (Agent mode, one pod per node) | +| Log source | Pods labelled `mif.moreh.io/log.collect=true` (`kubernetes_logs`) | +| Log format | JSON parsing applied only to pods labelled `mif.moreh.io/log.format=json` | +| Tolerations | unschedulable, compute, `amd.com/gpu` | + +### MinIO + +| Property | Value | +| :--------------- | :----------------------------------------------------------- | +| Helm chart | `minio/minio` v5.4.0 | +| Mode | Standalone | +| Bucket | `loki` (created via post-install Job on startup) | +| Loki credentials | Dedicated `loki` user with S3 policy scoped to `loki` bucket | +| Resources | 1 CPU / 2 Gi memory (requests), 1.5 CPU / 3 Gi (limits) | +| Persistence | emptyDir (ephemeral by default) | +| Deployment | Single pod | + +### Component naming + +Service names are derived from the Helm release name. 
With the default release name `mif`: + +| Service | Name (same-namespace access) | +| :----------- | :--------------------------- | +| MinIO | `mif-minio` | +| Loki gateway | `mif-loki-gateway` | +| Loki read | `mif-loki-read` | +| Loki write | `mif-loki-write` | + +Vector connects to Loki using the release-prefixed service name since all components are co-located in the same namespace. + +--- + +## Using an external MinIO + +If MinIO is already deployed outside this chart, set `minio.enabled: false`. The `loki-bucket` ConfigMap and Secret are still generated from `minio.users[0].*` values regardless of whether the sub-chart is enabled. The `BUCKET_HOST` is derived from the Helm release name (`<release>-minio`). + +**Same namespace** — if the existing MinIO service name matches `<release>-minio`, no additional configuration is needed. Otherwise, set `minio.fullnameOverride` to override the name used as `BUCKET_HOST`: + +```yaml title="moai-inference-framework-values.yaml" {3,5-6} +minio: + enabled: false + fullnameOverride: <existing-minio-service-name> + users: + - accessKey: <lokiAccessKey> + secretKey: <lokiSecretKey> + policy: loki +``` + +**Different namespace** — set `minio.fullnameOverride` to the FQDN so that Loki can resolve it cross-namespace: + +```yaml title="moai-inference-framework-values.yaml" {5-6} +minio: + enabled: false + fullnameOverride: minio.minio.svc.cluster.local + users: + - accessKey: <lokiAccessKey> + secretKey: <lokiSecretKey> + policy: loki +``` + +--- + +## Disabling log collection + +```yaml title="moai-inference-framework-values.yaml" +minio: + enabled: false +loki: + enabled: false +vector: + enabled: false +``` diff --git a/website/docusaurus.config.ts b/website/docusaurus.config.ts index 477c0cb..e2f43fe 100644 --- a/website/docusaurus.config.ts +++ b/website/docusaurus.config.ts @@ -82,7 +82,7 @@ const config: Config = { ], }, prism: { - additionalLanguages: ['bash', 'toml', 'yaml'], + additionalLanguages: ['bash', 'toml', 
'yaml', 'promql'], theme: themes.nightOwlLight, darkTheme: themes.vsDark, }, diff --git a/website/package-lock.json b/website/package-lock.json index fa9aa1f..e032ba0 100644 --- a/website/package-lock.json +++ b/website/package-lock.json @@ -278,6 +278,7 @@ "resolved": "https://registry.npmjs.org/@algolia/client-search/-/client-search-5.48.0.tgz", "integrity": "sha512-RB9bKgYTVUiOcEb5bOcZ169jiiVW811dCsJoLT19DcbbFmU4QaK0ghSTssij35QBQ3SCOitXOUrHcGgNVwS7sQ==", "license": "MIT", + "peer": true, "dependencies": { "@algolia/client-common": "5.48.0", "@algolia/requester-browser-xhr": "5.48.0", @@ -444,6 +445,7 @@ "resolved": "https://registry.npmjs.org/@babel/core/-/core-7.29.0.tgz", "integrity": "sha512-CGOfOJqWjg2qW/Mb6zNsDm+u5vFQ8DxXfbM09z69p5Z6+mE1ikP2jUXw+j42Pf1XTYED2Rni5f95npYeuwMDQA==", "license": "MIT", + "peer": true, "dependencies": { "@babel/code-frame": "^7.29.0", "@babel/generator": "^7.29.0", @@ -2479,6 +2481,7 @@ } ], "license": "MIT", + "peer": true, "engines": { "node": ">=18" }, @@ -2501,6 +2504,7 @@ } ], "license": "MIT", + "peer": true, "engines": { "node": ">=18" } @@ -2610,6 +2614,7 @@ "resolved": "https://registry.npmjs.org/postcss-selector-parser/-/postcss-selector-parser-7.1.1.tgz", "integrity": "sha512-orRsuYpJVw8LdAwqqLykBj9ecS5/cRHlI5+nvTo8LcCKmzDmqVORXtOIYEEQuL9D4BxtA1lm5isAqzQZCoQ6Eg==", "license": "MIT", + "peer": true, "dependencies": { "cssesc": "^3.0.0", "util-deprecate": "^1.0.2" @@ -3031,6 +3036,7 @@ "resolved": "https://registry.npmjs.org/postcss-selector-parser/-/postcss-selector-parser-7.1.1.tgz", "integrity": "sha512-orRsuYpJVw8LdAwqqLykBj9ecS5/cRHlI5+nvTo8LcCKmzDmqVORXtOIYEEQuL9D4BxtA1lm5isAqzQZCoQ6Eg==", "license": "MIT", + "peer": true, "dependencies": { "cssesc": "^3.0.0", "util-deprecate": "^1.0.2" @@ -3830,6 +3836,7 @@ "resolved": "https://registry.npmjs.org/@docusaurus/core/-/core-3.9.2.tgz", "integrity": "sha512-HbjwKeC+pHUFBfLMNzuSjqFE/58+rLVKmOU3lxQrpsxLBOGosYco/Q0GduBb0/jEMRiyEqjNT/01rRdOMWq5pw==", "license": "MIT", + 
"peer": true, "dependencies": { "@docusaurus/babel": "3.9.2", "@docusaurus/bundler": "3.9.2", @@ -4011,6 +4018,7 @@ "resolved": "https://registry.npmjs.org/@docusaurus/plugin-content-docs/-/plugin-content-docs-3.9.2.tgz", "integrity": "sha512-C5wZsGuKTY8jEYsqdxhhFOe1ZDjH0uIYJ9T/jebHwkyxqnr4wW0jTkB72OMqNjsoQRcb0JN3PcSeTwFlVgzCZg==", "license": "MIT", + "peer": true, "dependencies": { "@docusaurus/core": "3.9.2", "@docusaurus/logger": "3.9.2", @@ -5047,6 +5055,7 @@ "resolved": "https://registry.npmjs.org/@mdx-js/react/-/react-3.1.1.tgz", "integrity": "sha512-f++rKLQgUVYDAtECQ6fn/is15GkEH9+nZPM3MS0RcxVqoTfawHvDlSCH7JbMhAM6uJ32v3eXLvLmLvjGu7PTQw==", "license": "MIT", + "peer": true, "dependencies": { "@types/mdx": "^2.0.0" }, @@ -5519,6 +5528,7 @@ "resolved": "https://registry.npmjs.org/@svgr/core/-/core-8.1.0.tgz", "integrity": "sha512-8QqtOQT5ACVlmsvKOJNEaWmRPmcojMOzCz4Hs2BGG/toAp/K38LcsMRyLp349glq5AzJbCEeimEoxaX6v/fLrA==", "license": "MIT", + "peer": true, "dependencies": { "@babel/core": "^7.21.3", "@svgr/babel-preset": "8.1.0", @@ -6140,6 +6150,7 @@ "resolved": "https://registry.npmjs.org/@types/react/-/react-19.2.13.tgz", "integrity": "sha512-KkiJeU6VbYbUOp5ITMIc7kBfqlYkKA5KhEHVrGMmUUMt7NeaZg65ojdPk+FtNrBAOXNVM5QM72jnADjM+XVRAQ==", "license": "MIT", + "peer": true, "dependencies": { "csstype": "^3.2.2" } @@ -6488,6 +6499,7 @@ "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.15.0.tgz", "integrity": "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg==", "license": "MIT", + "peer": true, "bin": { "acorn": "bin/acorn" }, @@ -6555,6 +6567,7 @@ "resolved": "https://registry.npmjs.org/ajv/-/ajv-8.17.1.tgz", "integrity": "sha512-B/gBuNg5SiMTrPkC+A2+cW0RszwxYmn6VYxB/inlBStS5nx6xHIt/ehKRhIMhqusl7a8LjQoZnjCs5vhwxOQ1g==", "license": "MIT", + "peer": true, "dependencies": { "fast-deep-equal": "^3.1.3", "fast-uri": "^3.0.1", @@ -6600,6 +6613,7 @@ "resolved": 
"https://registry.npmjs.org/algoliasearch/-/algoliasearch-5.48.0.tgz", "integrity": "sha512-aD8EQC6KEman6/S79FtPdQmB7D4af/etcRL/KwiKFKgAE62iU8c5PeEQvpvIcBPurC3O/4Lj78nOl7ZcoazqSw==", "license": "MIT", + "peer": true, "dependencies": { "@algolia/abtesting": "1.14.0", "@algolia/client-abtesting": "5.48.0", @@ -7076,6 +7090,7 @@ } ], "license": "MIT", + "peer": true, "dependencies": { "baseline-browser-mapping": "^2.9.0", "caniuse-lite": "^1.0.30001759", @@ -8088,6 +8103,7 @@ "resolved": "https://registry.npmjs.org/postcss-selector-parser/-/postcss-selector-parser-7.1.1.tgz", "integrity": "sha512-orRsuYpJVw8LdAwqqLykBj9ecS5/cRHlI5+nvTo8LcCKmzDmqVORXtOIYEEQuL9D4BxtA1lm5isAqzQZCoQ6Eg==", "license": "MIT", + "peer": true, "dependencies": { "cssesc": "^3.0.0", "util-deprecate": "^1.0.2" @@ -8407,6 +8423,7 @@ "resolved": "https://registry.npmjs.org/cytoscape/-/cytoscape-3.33.1.tgz", "integrity": "sha512-iJc4TwyANnOGR1OmWhsS9ayRS3s+XQ185FmuHObThD+5AeJCakAAbWv8KimMTt08xCCLNgneQwFp+JRJOr9qGQ==", "license": "MIT", + "peer": true, "engines": { "node": ">=0.10" } @@ -8828,6 +8845,7 @@ "resolved": "https://registry.npmjs.org/d3-selection/-/d3-selection-3.0.0.tgz", "integrity": "sha512-fmTRWbNMmsmWq6xJV8D19U/gw/bwrHfNXxrIN+HfZgnzqTHp9jOmKMhsTUjXOJnZOdZY9Q28y4yebKzqDKlxlQ==", "license": "ISC", + "peer": true, "engines": { "node": ">=12" } @@ -10019,6 +10037,7 @@ "resolved": "https://registry.npmjs.org/ajv/-/ajv-6.12.6.tgz", "integrity": "sha512-j3fVLgvTo527anyYyJOGTYJbG+vnnQYvE0m5mmkc1TK+nxAppkCLMIL0aZ4dblVCNoGShhm+kzE4ZUykBoMg4g==", "license": "MIT", + "peer": true, "dependencies": { "fast-deep-equal": "^3.1.1", "fast-json-stable-stringify": "^2.0.0", @@ -14602,6 +14621,7 @@ "resolved": "https://registry.npmjs.org/ajv/-/ajv-6.12.6.tgz", "integrity": "sha512-j3fVLgvTo527anyYyJOGTYJbG+vnnQYvE0m5mmkc1TK+nxAppkCLMIL0aZ4dblVCNoGShhm+kzE4ZUykBoMg4g==", "license": "MIT", + "peer": true, "dependencies": { "fast-deep-equal": "^3.1.1", "fast-json-stable-stringify": "^2.0.0", @@ -15190,6 
+15210,7 @@ } ], "license": "MIT", + "peer": true, "dependencies": { "nanoid": "^3.3.11", "picocolors": "^1.1.1", @@ -16093,6 +16114,7 @@ "resolved": "https://registry.npmjs.org/postcss-selector-parser/-/postcss-selector-parser-7.1.1.tgz", "integrity": "sha512-orRsuYpJVw8LdAwqqLykBj9ecS5/cRHlI5+nvTo8LcCKmzDmqVORXtOIYEEQuL9D4BxtA1lm5isAqzQZCoQ6Eg==", "license": "MIT", + "peer": true, "dependencies": { "cssesc": "^3.0.0", "util-deprecate": "^1.0.2" @@ -16918,6 +16940,7 @@ "resolved": "https://registry.npmjs.org/react/-/react-18.3.1.tgz", "integrity": "sha512-wS+hAgJShR0KhEvPJArfuPVN1+Hz1t0Y6n5jLrGQbkb4urgPE/0Rve+1kMB1v/oWgHgm4WIcV+i7F2pTVj+2iQ==", "license": "MIT", + "peer": true, "dependencies": { "loose-envify": "^1.1.0" }, @@ -16930,6 +16953,7 @@ "resolved": "https://registry.npmjs.org/react-dom/-/react-dom-18.3.1.tgz", "integrity": "sha512-5m4nQKp+rZRb09LNH59GM4BxTh9251/ylbKIbpe7TpGxfJ+9kv6BLkLBXIjjspbgbnIBNqlI23tRnTWT0snUIw==", "license": "MIT", + "peer": true, "dependencies": { "loose-envify": "^1.1.0", "scheduler": "^0.23.2" @@ -16986,6 +17010,7 @@ "resolved": "https://registry.npmjs.org/@docusaurus/react-loadable/-/react-loadable-6.0.0.tgz", "integrity": "sha512-YMMxTUQV/QFSnbgrP3tjDzLHRg7vsbMn8e9HAa8o/1iXoiomo48b7sk/kkmWEuWNDPJVlKSJRB6Y2fHqdJk+SQ==", "license": "MIT", + "peer": true, "dependencies": { "@types/react": "*" }, @@ -17014,6 +17039,7 @@ "resolved": "https://registry.npmjs.org/react-router/-/react-router-5.3.4.tgz", "integrity": "sha512-Ys9K+ppnJah3QuaRiLxk+jDWOR1MekYQrlytiXxC1RyfbdsZkS5pvKAzCCr031xHixZwpnsYNT5xysdFHQaYsA==", "license": "MIT", + "peer": true, "dependencies": { "@babel/runtime": "^7.12.13", "history": "^4.9.0", @@ -18826,7 +18852,8 @@ "version": "2.8.1", "resolved": "https://registry.npmjs.org/tslib/-/tslib-2.8.1.tgz", "integrity": "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w==", - "license": "0BSD" + "license": "0BSD", + "peer": true }, "node_modules/tsyringe": { "version": 
"4.10.0", @@ -18907,6 +18934,7 @@ "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==", "devOptional": true, "license": "Apache-2.0", + "peer": true, "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" @@ -19262,6 +19290,7 @@ "resolved": "https://registry.npmjs.org/ajv/-/ajv-6.12.6.tgz", "integrity": "sha512-j3fVLgvTo527anyYyJOGTYJbG+vnnQYvE0m5mmkc1TK+nxAppkCLMIL0aZ4dblVCNoGShhm+kzE4ZUykBoMg4g==", "license": "MIT", + "peer": true, "dependencies": { "fast-deep-equal": "^3.1.1", "fast-json-stable-stringify": "^2.0.0", @@ -19509,6 +19538,7 @@ "resolved": "https://registry.npmjs.org/webpack/-/webpack-5.105.1.tgz", "integrity": "sha512-Gdj3X74CLJJ8zy4URmK42W7wTZUJrqL+z8nyGEr4dTN0kb3nVs+ZvjbTOqRYPD7qX4tUmwyHL9Q9K6T1seW6Yw==", "license": "MIT", + "peer": true, "dependencies": { "@types/eslint-scope": "^3.7.7", "@types/estree": "^1.0.8",