Merged
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -28,7 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Structured JSON logging with Logstash encoder and trace/span IDs
- Prometheus metrics via Spring Boot Actuator
- Swagger/OpenAPI documentation (SpringDoc)
- Full observability stack via Docker Compose (Grafana, Prometheus, Tempo, Loki, Promtail)
- Full observability stack via Docker Compose (Grafana, Prometheus, Tempo, Loki)
- Comprehensive integration tests with Testcontainers (Keycloak, PostgreSQL)
- Negative testing with WireMock (network failures, timeouts, error responses)
- Phone uniqueness validation with conflict handling
1 change: 0 additions & 1 deletion KODA.md
@@ -51,7 +51,6 @@ The backend does not store client credentials for login/refresh/logout — the c
├── prometheus.yml — Prometheus configuration
├── tempo.yaml — Tempo configuration
├── loki-config.yaml — Loki configuration
├── promtail-config.yaml — Promtail configuration
├── otel.yaml — OpenTelemetry Collector configuration
├── src/main/java/lt/satsyuk/ — Source code
│ ├── api/ — Controllers, DTOs
43 changes: 40 additions & 3 deletions README.md
@@ -769,9 +769,46 @@ http://localhost:8081/actuator/prometheus
http://localhost:8081/actuator/health
```

**Tracing**: OpenTelemetry traces are exported to OTLP endpoint (configure in `application.properties`)

**Logging**: Structured JSON logs with trace/span IDs via Logstash encoder
**OTLP-first Observability (recommended)**

- traces: `Spring Boot -> OTLP -> OTel Collector -> Tempo`
- logs: `Spring Boot -> OTLP -> OTel Collector -> Loki`
- metrics: `Prometheus` pulls `/actuator/prometheus`

```mermaid
flowchart LR
subgraph App[Spring Boot jwt-demo]
A1[HTTP metrics\nActuator /prometheus]
A2[Traces OTLP\nmanagement.opentelemetry.tracing.export.otlp.endpoint]
A3[Logs OTLP\nmanagement.opentelemetry.logging.export.otlp.endpoint]
end

subgraph Infra[Observability Infra]
C[OTel Collector]
T[Tempo]
L[Loki]
P[Prometheus]
G[Grafana]
end

A1 -->|pull /actuator/prometheus| P
A2 -->|OTLP traces| C
A3 -->|OTLP logs| C
C -->|traces| T
C -->|logs| L

P --> G
T --> G
L --> G
```
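
The metrics leg of the flow above (Prometheus pulling `/actuator/prometheus`) could correspond to a scrape job along the following lines in `prometheus.yml`. This is a minimal sketch, not the repository's actual file: the job name `jwt-demo` mirrors the `job="jwt-demo"` label the dashboards filter on, and port `8081` is taken from the actuator URLs above, while the target host name `jwt-demo` and the 15s interval are assumptions.

```yaml
# Hypothetical scrape job for the Spring Boot app (sketch, adjust to your setup)
scrape_configs:
  - job_name: jwt-demo                  # becomes the job="jwt-demo" label used in dashboard queries
    metrics_path: /actuator/prometheus  # Actuator endpoint exposing Micrometer metrics
    scrape_interval: 15s                # assumed value
    static_configs:
      - targets: ["jwt-demo:8081"]      # assumed host name; 8081 is the app port
```

Keeping metrics on the pull path while traces and logs go through OTLP is why OTLP metric export is disabled in the recommended properties: the same series would otherwise be ingested twice.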

**Recommended properties**

- `management.opentelemetry.tracing.export.otlp.endpoint=${MANAGEMENT_OTLP_TRACING_ENDPOINT:http://localhost:4318/v1/traces}`
- `management.tracing.export.otlp.enabled=true`
- `management.opentelemetry.logging.export.otlp.endpoint=${MANAGEMENT_OTLP_LOGGING_ENDPOINT:http://localhost:4318/v1/logs}`
- `management.logging.export.otlp.enabled=true`
- `management.otlp.metrics.export.enabled=false` (avoid duplicate metric ingestion with Prometheus scrape)

---

16 changes: 2 additions & 14 deletions docker-compose.yaml
@@ -76,6 +76,8 @@ services:
KEYCLOAK_RESOURCE_CLIENT_SECRET: ${KEYCLOAK_RESOURCE_CLIENT_SECRET}

MANAGEMENT_OTLP_TRACING_ENDPOINT: http://otel-collector:4318/v1/traces
MANAGEMENT_OTLP_LOGGING_ENDPOINT: http://otel-collector:4318/v1/logs
OTEL_RESOURCE_ATTRIBUTES: service.name=jwt-demo,service.namespace=jwt-demo,deployment.environment=local
depends_on:
- keycloak
- postgres-app
@@ -122,20 +124,6 @@ services:
- loki-data:/loki
- loki-wal:/wal

# ------------------------------------------------------------
# PROMTAIL (LOG SHIPPER)
# ------------------------------------------------------------
promtail:
image: grafana/promtail:2.8.2
container_name: promtail
volumes:
- /var/log:/var/log:ro
- /var/lib/docker/containers:/var/lib/docker/containers:ro
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./promtail-config.yaml:/etc/promtail/promtail-config.yaml
command: -config.file=/etc/promtail/promtail-config.yaml
depends_on:
- loki

# ------------------------------------------------------------
# PROMETHEUS (METRICS)
19 changes: 15 additions & 4 deletions grafana/provisioning/dashboards/app-metrics.json
@@ -659,7 +659,7 @@
},
"editorMode": "code",
"exemplar": true,
"expr": "histogram_quantile(.99,sum(rate(http_server_requests_seconds_bucket{job=\"$job\", uri!=\"/actuator/prometheus\"}[1m])) by(uri, le))",
"expr": "histogram_quantile(.99,sum(rate(http_server_requests_seconds_bucket{job=\"$job\", uri!=\"/actuator/prometheus\"}[1m])) by(uri, le)) or avg by(uri) (http_server_requests_seconds{job=\"$job\", uri!=\"/actuator/prometheus\", quantile=\"0.99\"})",
"interval": "",
"legendFormat": "{{uri}}",
"range": true,
@@ -755,7 +755,7 @@
},
"editorMode": "code",
"exemplar": true,
"expr": "histogram_quantile(.95,sum(rate(http_server_requests_seconds_bucket{job=\"$job\", uri!=\"/actuator/prometheus\"}[1m])) by(uri, le))",
"expr": "histogram_quantile(.95,sum(rate(http_server_requests_seconds_bucket{job=\"$job\", uri!=\"/actuator/prometheus\"}[1m])) by(uri, le)) or avg by(uri) (http_server_requests_seconds{job=\"$job\", uri!=\"/actuator/prometheus\", quantile=\"0.95\"})",
"interval": "",
"legendFormat": "{{uri}}",
"range": true,
@@ -992,7 +992,7 @@
"uid": "loki"
},
"editorMode": "code",
"expr": "sum by(level) (rate({job=\"jwt-demo\"} |= \"$log_keyword\" | json | level != \"\" [1m]))",
"expr": "sum by(level) (rate({service_name=\"jwt-demo\",level=~\".+\"} |= \"$log_keyword\" != \"/actuator/prometheus\" [1m]) or rate({job=~\".*jwt-demo.*\",level=~\".+\"} |= \"$log_keyword\" != \"/actuator/prometheus\" [1m]))",
"legendFormat": "{{level}}",
"queryType": "range",
"refId": "A"
@@ -1031,10 +1031,21 @@
"uid": "loki"
},
"editorMode": "code",
"expr": "{job=\"jwt-demo\"} |= \"$log_keyword\" | json | line_format \"{{ index . \\\"@timestamp\\\" }}\\t{{.logger_name}}\\t{{.level}}\\ttrace_id={{.traceId}}\\tspan_id={{.spanId}}\\t{{.message}}\"",
"expr": "{service_name=\"jwt-demo\"} |= \"$log_keyword\" != \"/actuator/prometheus\" | json | line_format \"{{.severity}}\\t{{.body}}\"",
"hide": false,
"queryType": "range",
"refId": "A"
},
{
"datasource": {
"type": "loki",
"uid": "loki"
},
"editorMode": "code",
"expr": "{job=~\".*jwt-demo.*\",service_name!~\".+\"} |= \"$log_keyword\" != \"/actuator/prometheus\" | json | line_format \"{{.severity}}\\t{{.body}}\"",
"hide": false,
"queryType": "range",
"refId": "B"
}
],
"title": "Log of All Spring Boot Apps",
26 changes: 13 additions & 13 deletions grafana/provisioning/dashboards/application-metrics-dashboard.json
@@ -437,7 +437,7 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "histogram_quantile(0.95, sum by(uri, le) (rate(http_server_requests_seconds_bucket{job=\"jwt-demo\",uri!=\"/actuator/prometheus\"}[5m]))) * 1000",
"expr": "(histogram_quantile(0.95, sum by(uri, le) (rate(http_server_requests_seconds_bucket{job=\"jwt-demo\",uri!=\"/actuator/prometheus\"}[5m]))) * 1000) or (avg by(uri) (http_server_requests_seconds{job=\"jwt-demo\",uri!=\"/actuator/prometheus\",quantile=\"0.95\"}) * 1000)",
"legendFormat": "{{uri}}",
"refId": "A"
}
@@ -2011,7 +2011,7 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "sum by(cache) (rate(cache_gets_total{job=\"jwt-demo\",result=\"hit\"}[5m])) / sum by(cache) (rate(cache_gets_total{job=\"jwt-demo\"}[5m]))",
"expr": "(sum by(cache) (rate(cache_gets_total{job=\"jwt-demo\",result=\"hit\"}[5m])) / clamp_min(sum by(cache) (rate(cache_gets_total{job=\"jwt-demo\"}[5m])), 1e-9)) or on() vector(0)",
"legendFormat": "{{cache}}",
"refId": "A"
}
@@ -2098,7 +2098,7 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "rate(cache_evictions_total{job=\"jwt-demo\"}[5m])",
"expr": "rate(cache_evictions_total{job=\"jwt-demo\"}[5m]) or on() vector(0)",
"legendFormat": "{{cache}}",
"refId": "A"
}
@@ -2184,7 +2184,7 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "bucket4j_summary_consumed_total{job=\"jwt-demo\"}",
"expr": "bucket4j_summary_available_tokens{job=\"jwt-demo\"} or bucket4j_summary_consumed_total{job=\"jwt-demo\"}",
"legendFormat": "{{id}}",
"refId": "A"
}
@@ -2271,7 +2271,7 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "rate(bucket4j_summary_rejected_total{job=\"jwt-demo\"}[5m])",
"expr": "rate(bucket4j_summary_rejected_total{job=\"jwt-demo\"}[5m]) or on() vector(0)",
"legendFormat": "{{id}}",
"refId": "A"
}
@@ -2358,7 +2358,7 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "sum by(uri) (rate(http_client_requests_seconds_count{job=\"jwt-demo\"}[5m]))",
"expr": "sum by(uri) (rate(http_client_requests_seconds_count{job=\"jwt-demo\"}[5m])) or on() vector(0)",
"legendFormat": "{{uri}}",
"refId": "A"
}
@@ -2445,7 +2445,7 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "histogram_quantile(0.95, sum by(uri, le) (rate(http_client_requests_seconds_bucket{job=\"jwt-demo\"}[5m]))) * 1000",
"expr": "(histogram_quantile(0.95, sum by(uri, le) (rate(http_client_requests_seconds_bucket{job=\"jwt-demo\"}[5m]))) * 1000) or (avg by(uri) (http_client_requests_seconds{job=\"jwt-demo\",quantile=\"0.95\"}) * 1000) or on() vector(0)",
"legendFormat": "{{uri}}",
"refId": "A"
}
@@ -2532,12 +2532,12 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "rate(container_cpu_usage_seconds_total{container=~\"jwt-demo.*\"}[5m])",
"legendFormat": "{{container}}",
"expr": "process_cpu_usage{job=\"jwt-demo\"}",
"legendFormat": "process_cpu_usage",
"refId": "A"
}
],
"title": "Container CPU Usage",
"title": "Process CPU Usage",
"type": "timeseries"
},
{
@@ -2619,12 +2619,12 @@
"type": "prometheus",
"uid": "prometheus"
},
"expr": "container_memory_working_set_bytes{container=~\"jwt-demo.*\"}",
"legendFormat": "{{container}}",
"expr": "sum(jvm_memory_used_bytes{job=\"jwt-demo\",area=\"heap\"})",
"legendFormat": "jvm_heap_used",
"refId": "A"
}
],
"title": "Container Memory Working Set",
"title": "JVM Heap Used",
"type": "timeseries"
}
],
8 changes: 6 additions & 2 deletions grafana/provisioning/dashboards/logs-dashboard.json
@@ -30,11 +30,15 @@
},
"targets": [
{
"expr": "{job=\"jwt-demo\"}",
"expr": "{service_name=\"jwt-demo\"}",
"refId": "A"
},
{
"expr": "{job=~\".*jwt-demo.*\",service_name!~\".+\"}",
"refId": "B"
}
],
"title": "Recent logs (jwt-demo)",
"title": "Recent logs (jwt-demo via OTLP)",
"type": "logs"
}
],
12 changes: 6 additions & 6 deletions grafana/provisioning/dashboards/traces-dashboard.json
@@ -78,7 +78,7 @@
"limit": 20,
"queryType": "traceql",
"refId": "A",
"query": "{}"
"query": "{ span.http.route != \"/actuator/prometheus\" && name !~ \".*actuator/prometheus.*\" }"
}
],
"title": "Traces by Service",
@@ -138,7 +138,7 @@
"limit": 20,
"queryType": "traceql",
"refId": "A",
"query": "{}"
"query": "{ name != \"http get /actuator/prometheus\" }"
}
],
"title": "Total Traces",
@@ -226,7 +226,7 @@
"limit": 20,
"queryType": "traceql",
"refId": "A",
"query": "{}"
"query": "{ name != \"http get /actuator/prometheus\" }"
}
],
"title": "Request Duration (P95) by Service",
@@ -313,7 +313,7 @@
"limit": 20,
"queryType": "traceql",
"refId": "A",
"query": "{}"
"query": "{ name != \"http get /actuator/prometheus\" }"
}
],
"title": "Request Rate by Service",
@@ -391,7 +391,7 @@
"limit": 20,
"queryType": "traceql",
"refId": "A",
"query": "{ status = error }"
"query": "{ status = error && name != \"http get /actuator/prometheus\" }"
}
],
"title": "Error Traces",
@@ -503,7 +503,7 @@
"limit": 50,
"queryType": "traceql",
"refId": "A",
"query": "{}"
"query": "{ name != \"http get /actuator/prometheus\" }"
}
],
"title": "Recent Traces",
2 changes: 1 addition & 1 deletion grafana/provisioning/datasources/loki.yaml
@@ -13,4 +13,4 @@ datasources:
derivedFields:
- name: trace
datasourceUid: tempo
matcher: "trace"
matcherRegex: '"traceId":"([a-f0-9]{32})"'
4 changes: 2 additions & 2 deletions grafana/provisioning/datasources/tempo.yaml
@@ -12,8 +12,8 @@ datasources:
httpMethod: GET
tracesToLogs:
datasourceUid: 'loki'
tags: ['job', 'instance', 'pod', 'namespace']
mappedTags: [{ key: 'service.name', value: 'service' }]
tags: ['service.name']
mappedTags: [{ key: 'service.name', value: 'service_name' }]
serviceMap:
datasourceUid: 'Tempo'
nodeGraph:
29 changes: 28 additions & 1 deletion otel.yaml
@@ -11,9 +11,36 @@ exporters:
endpoint: tempo:4317
tls:
insecure: true
loki:
endpoint: http://loki:3100/loki/api/v1/push

processors:
memory_limiter:
check_interval: 1s
limit_mib: 256
spike_limit_mib: 64
batch:
timeout: 2s
send_batch_size: 1024
resource/logs:
attributes:
- action: upsert
key: service.name
value: jwt-demo
- action: upsert
key: loki.resource.labels
value: service.name,service.namespace,service.instance.id
- action: upsert
key: loki.attribute.labels
value: level,logger_name,traceId,spanId

service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp]
processors: [memory_limiter, batch]
exporters: [otlp]
logs:
receivers: [otlp]
processors: [memory_limiter, resource/logs, batch]
exporters: [loki]