Commit e0edd64

docs(helm): Add helm docs for medcat-trainer
1 parent a5db933 commit e0edd64

4 files changed

Lines changed: 217 additions & 32 deletions

.pre-commit-config.yaml

Lines changed: 6 additions & 1 deletion
@@ -6,4 +6,9 @@ repos:
         files: helm-charts/medcat-service-helm/.*
         args:
           # Make the tool search for charts only under the `example-charts` directory
-          - --chart-search-root=helm-charts/medcat-service-helm
+          - --chart-search-root=helm-charts/medcat-service-helm
+      - id: helm-docs-container
+        files: helm-charts/medcat-trainer-helm/.*
+        args:
+          # Make the tool search for charts only under the `example-charts` directory
+          - --chart-search-root=helm-charts/medcat-trainer-helm
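
A note on the `files:` entry in the new hook: as far as I understand pre-commit, this value is a regular expression searched for within each changed path, so the second hook should only fire when something under `helm-charts/medcat-trainer-helm/` changes. The sketch below approximates that filter with `grep -E`; the pattern is copied from the diff, while the function name `matches_hook` and the sample paths are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Pattern copied from the new hook's `files:` entry in .pre-commit-config.yaml.
pattern='helm-charts/medcat-trainer-helm/.*'

# Hypothetical helper: succeeds when the path would be passed to the hook.
# grep -E searches anywhere in the line, which approximates a regex search.
matches_hook() {
  printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Sample paths are assumptions, not taken from the commit.
for path in \
  helm-charts/medcat-trainer-helm/values.yaml \
  helm-charts/medcat-trainer-helm/Chart.yaml \
  helm-charts/medcat-service-helm/values.yaml \
  README.md
do
  if matches_hook "$path"; then
    echo "run hook:  $path"
  else
    echo "skip hook: $path"
  fi
done
```

Keeping one hook per chart, each with its own `files:` filter and `--chart-search-root`, means a change to one chart does not regenerate the other chart's README.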

helm-charts/medcat-trainer-helm/README.md

Lines changed: 119 additions & 7 deletions
@@ -9,7 +9,6 @@ By default the chart will:
 - Run a SOLR and Zookeeper cluster for the Concept DB
 - Run a Postgres database for persistence
 
-
 ## Installation
 
 ```sh
@@ -20,12 +19,11 @@ helm install my-medcat-trainer oci://registry-1.docker.io/cogstacksystems/medcat
 
 See these values for common configurations to change:
 
-| Setting |description |
-| -------- | -------- |
-| `env` | Environment variables as defined in the [MedCAT Trainer docs](https://docs.cogstack.org/projects/medcat-trainer/en/latest/installation.html). |
-|`medcatConfig`|MedCAT config file as described [here](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/medcat/config/config.py)|
-| `env.CSRF_TRUSTED_ORIGINS` | The Host and Port to access the application on |
-
+| Setting | description |
+| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
+| `env` | Environment variables as defined in the [MedCAT Trainer docs](https://docs.cogstack.org/projects/medcat-trainer/en/latest/installation.html). |
+| `medcatConfig` | MedCAT config file as described [here](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/medcat/config/config.py) |
+| `env.CSRF_TRUSTED_ORIGINS` | The Host and Port to access the application on |
 
 ### Use Sqlite instead of Postgres
 
@@ -41,9 +39,123 @@ postgresql:
 ```
 
 ## Missing features
+
 These features are not yet existing but to be added in future:
+
 - Use a pre existing postgres db
 - Use a pre existing SOLR instance
 - Migrate from supervisord to standalone deployment for background tasks for better scaling
 - Support SOLR authentication from medcat trainer
 - Support passing DB OPTIONS to medcat trainer for use in cloud environments
+
+## Requirements
+
+| Repository | Name | Version |
+|------------|------|---------|
+| oci://registry-1.docker.io/bitnamicharts | postgresql | 16.7.27 |
+| oci://registry-1.docker.io/bitnamicharts | solr | 9.6.10 |
+
+## Values
+
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| affinity | object | `{}` | |
+| autoscaling.enabled | bool | `false` | |
+| autoscaling.maxReplicas | int | `100` | |
+| autoscaling.minReplicas | int | `1` | |
+| autoscaling.targetCPUUtilizationPercentage | int | `80` | |
+| env | object | `{"CSRF_TRUSTED_ORIGINS":"http://localhost:8080","DB_ENGINE":"postgresql","DB_PORT":"5432","DEBUG":"1","EMAIL_HOST":"mail.cogstack.org","EMAIL_PASS":"to-be-changed","EMAIL_PORT":"465","EMAIL_USER":"example@cogstack.org","ENV":"non-prod","LOAD_NUM_DOC_PAGES":"10","MAX_DATASET_SIZE":"10000","MAX_MEDCAT_MODELS":"2","OPENBLAS_NUM_THREADS":"1","RESUBMIT_ALL_ON_STARTUP":"0","UNIQUE_DOC_NAMES_IN_DATASETS":"True","VITE_USE_OIDC":"0"}` | Add any environment variables here that should be set in the medcat-trainer container |
+| env.CSRF_TRUSTED_ORIGINS | string | `"http://localhost:8080"` | This sets the CSRF trusted origins for the medcat-trainer container. Change to allow access from other domains |
+| envValueFrom | object | `{"K8S_NODE_NAME":{"fieldRef":{"fieldPath":"spec.nodeName"}},"K8S_POD_NAME":{"fieldRef":{"fieldPath":"metadata.name"}},"K8S_POD_NAMESPACE":{"fieldRef":{"fieldPath":"metadata.namespace"}},"K8S_POD_UID":{"fieldRef":{"fieldPath":"metadata.uid"}}}` | Allow setting env values from field/configmap/secret references @default -- Adds K8s downward API values for tracing |
+| fullnameOverride | string | `""` | |
+| hostAliases | list | `[]` | Host aliases for the pod |
+| image.pullPolicy | string | `"IfNotPresent"` | This sets the pull policy for images. |
+| image.repository | string | `"cogstacksystems/medcat-trainer"` | Image repository for the MedCAT service container |
+| imagePullSecrets | list | `[]` | This is for the secrets for pulling an image from a private repository more information can be found here: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ |
+| ingress.annotations | object | `{}` | |
+| ingress.className | string | `""` | |
+| ingress.enabled | bool | `false` | |
+| ingress.hosts[0].host | string | `"chart-example.local"` | |
+| ingress.hosts[0].paths[0].path | string | `"/"` | |
+| ingress.hosts[0].paths[0].pathType | string | `"ImplementationSpecific"` | |
+| ingress.tls | list | `[]` | |
+| livenessProbe.failureThreshold | int | `30` | |
+| livenessProbe.httpGet.path | string | `"/api/health/live/?format=json"` | |
+| livenessProbe.httpGet.port | string | `"api"` | |
+| medcatConfig | string | Default config for MedCAT Trainer | MedCAT config as described here: [MedCAT config](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/medcat/config/config.py) |
+| nameOverride | string | `""` | This is to override the chart name. |
+| nginx.livenessProbe.httpGet.path | string | `"/nginx/health/live"` | |
+| nginx.livenessProbe.httpGet.port | string | `"http"` | |
+| nginx.readinessProbe.httpGet.path | string | `"/nginx/health/live"` | |
+| nginx.readinessProbe.httpGet.port | string | `"http"` | |
+| nginxImage | object | `{"pullPolicy":"IfNotPresent","repository":"nginx","tag":"1.29.1"}` | This sets the container image for the nginx server more information can be found here: https://kubernetes.io/docs/concepts/containers/images/ |
+| nginxImage.pullPolicy | string | `"IfNotPresent"` | This sets the pull policy for images. |
+| nginxImage.repository | string | `"nginx"` | Image repository for the nginx server |
+| nginxImage.tag | string | `"1.29.1"` | This sets the image tag for the nginx server |
+| nginxUpdateStrategy.type | string | `"RollingUpdate"` | |
+| nodeSelector | object | `{}` | |
+| persistence.media.size | string | `"8Gi"` | |
+| persistence.sqlite.backupDbSize | string | `"300Mi"` | |
+| persistence.sqlite.size | string | `"100Mi"` | |
+| persistence.static.size | string | `"100Mi"` | |
+| persistence.storageClassName | string | `""` | |
+| podAnnotations | object | `{}` | This is for setting Kubernetes Annotations to a Pod. For more information checkout: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
+| podLabels | object | `{}` | This is for setting Kubernetes Labels to a Pod. For more information checkout: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ |
+| podSecurityContext | object | `{}` | |
+| postgresql.auth.database | string | `"postgres"` | |
+| postgresql.auth.password | string | `"postgres"` | |
+| postgresql.auth.username | string | `"postgres"` | |
+| postgresql.enabled | bool | `true` | |
+| postgresql.image.repository | string | `"bitnamilegacy/postgresql"` | |
+| postgresql.image.tag | string | `"17.6.0-debian-12-r4"` | |
+| postgresql.primary.persistence.size | string | `"500Mi"` | |
+| provisioning.config | object | Config to load example project from github | Provisioning Config Yaml contents. Can be templated See https://docs.cogstack.org/projects/medcat-trainer/en/latest/provisioning/ |
+| provisioning.enabled | bool | `false` | Set to true to enable provisioning of projects and models on startup.. |
+| provisioning.existingConfigMap | object | `{}` | Optional: Reference an existing configmap for the provisioning config. |
+| readinessProbe.httpGet.path | string | `"/api/health/ready/?format=json"` | |
+| readinessProbe.httpGet.port | string | `"api"` | |
+| replicaCount | int | `1` | This will set the replicaset count more information can be found here: https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/ |
+| resources | object | `{}` | Resources for the pod. More information can be found here: https://kubernetes.io/docs/concepts/containers/ Recommendation for a minimal production setup is { requests: { cpu: 2, memory: 2Gi }, limits: { cpu: null <unset>, memory: 4Gi } } |
+| runtimeClassName | string | `""` | Runtime class name for the pod (e.g., "nvidia" for GPU workloads) |
+| securityContext | object | `{}` | |
+| service.apiPort | int | `8000` | |
+| service.port | int | `8001` | This sets the ports more information can be found here: https://kubernetes.io/docs/concepts/services-networking/service/#field-spec-ports |
+| service.type | string | `"ClusterIP"` | This sets the service type more information can be found here: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types |
+| serviceAccount.annotations | object | `{}` | Annotations to add to the service account |
+| serviceAccount.automount | bool | `true` | Automatically mount a ServiceAccount's API credentials? |
+| serviceAccount.create | bool | `true` | Specifies whether a service account should be created |
+| serviceAccount.name | string | `""` | The name of the service account to use. If not set and create is true, a name is generated using the fullname template |
+| solr.auth.enabled | bool | `false` | |
+| solr.collectionReplicas | int | `1` | |
+| solr.collectionShards | int | `1` | |
+| solr.image.repository | string | `"bitnamilegacy/solr"` | |
+| solr.image.tag | string | `"9.9.0-debian-12-r1"` | |
+| solr.persistence.size | string | `"1Gi"` | |
+| solr.podLabels."app.kubernetes.io/component" | string | `"solr"` | |
+| solr.podLabels."app.kubernetes.io/part-of" | string | `"cogstack"` | |
+| solr.replicaCount | int | `1` | |
+| solr.zookeeper.image.repository | string | `"bitnamilegacy/zookeeper"` | |
+| solr.zookeeper.image.tag | string | `"3.9.3-debian-12-r22"` | |
+| solr.zookeeper.persistence.size | string | `"1Gi"` | |
+| solr.zookeeper.replicaCount | int | `1` | |
+| startupProbe.failureThreshold | int | `30` | |
+| startupProbe.httpGet.path | string | `"/api/health/startup/?format=json"` | |
+| startupProbe.httpGet.port | string | `"api"` | |
+| startupProbe.initialDelaySeconds | int | `15` | |
+| startupProbe.periodSeconds | int | `10` | |
+| tolerations | list | `[]` | |
+| tracing.disabledInstrumentations | string | `"psycopg,sqlite3"` | |
+| tracing.experimentalResourceDetectors | string | `"containerid,os"` | |
+| tracing.otlp.enabled | bool | `false` | |
+| tracing.otlp.grpc.enabled | bool | `false` | |
+| tracing.otlp.grpc.endpoint | string | `"http://unused:4317"` | |
+| tracing.otlp.http.enabled | bool | `false` | |
+| tracing.otlp.http.endpoint | string | `"http://unused:4318"` | |
+| tracing.resourceAttributes | object | Adds semantic k8s attributes for tracing | Resource attributes to add to the traces. Can be templated |
+| tracing.serviceName | string | `"medcat-trainer"` | |
+| updateStrategy.type | string | `"RollingUpdate"` | Used for Kubernetes deployment .spec.strategy.type. Allowed values are "Recreate" or "RollingUpdate". |
+| volumeMounts | list | `[]` | Additional volumeMounts on the output Deployment definition. |
+| volumes | list | `[]` | Additional volumes on the output Deployment definition. |
+
+----------------------------------------------
+Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)
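
To make the generated Values table more concrete, here is a minimal sketch of an override file built from two entries in it: `env.CSRF_TRUSTED_ORIGINS` and the "minimal production setup" recommendation in the `resources` description. The hostname `trainer.example.org` and the file name `values-prod-sketch.yaml` are assumptions for illustration. The script only writes the file and prints the install command from the README; it does not call `helm` itself.

```shell
#!/bin/sh
set -eu

# Hypothetical file name for the override values.
values_file="values-prod-sketch.yaml"

cat > "$values_file" <<'EOF'
env:
  # Hypothetical origin; replace with the scheme, host and port users browse to.
  CSRF_TRUSTED_ORIGINS: "https://trainer.example.org"

# Taken from the recommendation in the `resources` row of the Values table.
# The CPU limit is deliberately left unset, as that row suggests.
resources:
  requests:
    cpu: 2
    memory: 2Gi
  limits:
    memory: 4Gi
EOF

# Print the command rather than run it, so the sketch works without a cluster.
echo "helm install my-medcat-trainer oci://registry-1.docker.io/cogstacksystems/medcat-trainer-helm -f $values_file"
```

Because Helm merges maps from a values file into the chart defaults, setting only `CSRF_TRUSTED_ORIGINS` under `env` should leave the other default `env` entries in place.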
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+# MedCAT Trainer Helm Chart
+
+This Helm chart deploys MedCAT Trainer and infrastructure to a Kubernetes cluster.
+
+By default the chart will:
+
+- Run MedCAT Trainer Django server
+- Run NGINX for static site hosting and routing
+- Run a SOLR and Zookeeper cluster for the Concept DB
+- Run a Postgres database for persistence
+
+## Installation
+
+```sh
+helm install my-medcat-trainer oci://registry-1.docker.io/cogstacksystems/medcat-trainer-helm
+```
+
+## Configuration
+
+See these values for common configurations to change:
+
+| Setting | description |
+| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
+| `env` | Environment variables as defined in the [MedCAT Trainer docs](https://docs.cogstack.org/projects/medcat-trainer/en/latest/installation.html). |
+| `medcatConfig` | MedCAT config file as described [here](https://github.com/CogStack/cogstack-nlp/blob/main/medcat-v2/medcat/config/config.py) |
+| `env.CSRF_TRUSTED_ORIGINS` | The Host and Port to access the application on |
+
+### Use Sqlite instead of Postgres
+
+Sqlite can be used for smaller single instance deployments
+
+Set these values:
+
+```yaml
+DB_ENGINE: "sqlite3"
+
+postgresql:
+  enabled: false
+```
+
+## Missing features
+
+These features are not yet existing but to be added in future:
+
+- Use a pre existing postgres db
+- Use a pre existing SOLR instance
+- Migrate from supervisord to standalone deployment for background tasks for better scaling
+- Support SOLR authentication from medcat trainer
+- Support passing DB OPTIONS to medcat trainer for use in cloud environments
+
+{{ template "chart.requirementsSection" . }}
+
+{{ template "chart.valuesSection" . }}
+
+{{ template "helm-docs.versionFooter" . }}
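
One observation on the "Use Sqlite instead of Postgres" section: its snippet shows `DB_ENGINE` at the top level, while the generated Values table lists `DB_ENGINE` inside the default `env` object. The sketch below therefore assumes the setting belongs under `env`; that placement is my reading of the table, not something the commit states. The file name `values-sqlite.yaml` is hypothetical, and the script only writes the file and prints the install command.

```shell
#!/bin/sh
set -eu

# Hypothetical file name for the SQLite override values.
values_file="values-sqlite.yaml"

cat > "$values_file" <<'EOF'
# Assumption: DB_ENGINE is nested under `env`, matching the chart's default
# `env` object in the Values table rather than the README snippet's layout.
env:
  DB_ENGINE: "sqlite3"

# Disable the bundled Postgres subchart, as the README section instructs.
postgresql:
  enabled: false
EOF

# Print the command rather than run it, so the sketch works without a cluster.
echo "helm install my-medcat-trainer oci://registry-1.docker.io/cogstacksystems/medcat-trainer-helm -f $values_file"
```

If the chart does read `DB_ENGINE` from `env`, the README snippet may be worth updating to show the `env:` parent key explicitly.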
