Phase 11 — Advanced Observability

Complete visibility into logs, metrics, traces, and alerts.


Three Pillars and Alerting

Metrics → Prometheus + Grafana
Logs → Loki + Grafana
Traces → Jaeger
Alerts → Alertmanager

Loki — Log Aggregation

Install Loki stack:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki-stack grafana/loki-stack \
  --namespace monitoring \
  --set grafana.enabled=false \
  --set promtail.enabled=true

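Promtail runs on every node, tails pod logs, and ships them to Loki. Because grafana.enabled=false skips the bundled Grafana, add Loki as a data source in your existing Grafana instance, then query the logs from the Explore view using LogQL. For example, this query returns lines containing "error" from pods in the monitoring namespace:

{namespace="monitoring"} |= "error"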


Alertmanager

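Prometheus evaluates alert rules; Alertmanager groups, deduplicates, and routes the alerts that fire. Define rules for critical conditions in a Prometheus rule file (set the job label to match your node-exporter scrape job):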

groups:
  - name: cluster
    rules:
      - alert: NodeDown
        expr: up{job="node"} == 0
        for: 5m
        annotations:
          summary: "Node {{ $labels.instance }} is down"
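
The NodeDown rule only decides when an alert fires; where the notification goes is set in the Alertmanager configuration. A minimal sketch, assuming a single webhook receiver (the receiver name ops-webhook and its URL are hypothetical placeholders, not part of this setup):

```yaml
# alertmanager.yml (sketch): route every alert to one receiver.
route:
  receiver: ops-webhook            # hypothetical receiver name
  group_by: ['alertname', 'instance']
  group_wait: 30s                  # wait before sending the first notification for a new group
  group_interval: 5m               # wait before notifying about new alerts added to a group
  repeat_interval: 4h              # re-send while the alert is still firing

receivers:
  - name: ops-webhook
    webhook_configs:
      # Hypothetical endpoint; replace with your own notification service.
      - url: http://alert-webhook.monitoring.svc:8080/alerts
```

Grouping by alertname and instance means each down node produces its own notification rather than one combined message for the whole cluster.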

Jaeger — Distributed Tracing

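The operator needs cert-manager in the cluster for its admission webhooks. Install it into the observability namespace:

kubectl create namespace observability
kubectl apply -f https://github.com/jaegertracing/jaeger-operator/releases/latest/download/jaeger-operator.yaml -n observability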
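
The operator by itself does not run Jaeger; it watches for Jaeger custom resources and deploys an instance for each one. A minimal sketch of such a resource, assuming the operator is already running (the instance name simplest and the observability namespace are assumptions; use whichever namespace your operator watches):

```yaml
# jaeger.yaml (sketch): ask the operator for one Jaeger instance.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simplest              # hypothetical instance name
  namespace: observability    # assumed namespace
```

With no strategy set, the operator uses its allInOne default, which keeps traces in memory and suits testing rather than production. Apply the file with kubectl apply -f jaeger.yaml.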

Instrument your apps with the OpenTelemetry SDKs and export their spans to the Jaeger collector over OTLP.
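
Most OpenTelemetry SDKs read their exporter settings from standard environment variables, so the collector address can be set in the pod spec rather than in code. A sketch of a Deployment fragment, assuming a Jaeger instance named simplest in the observability namespace whose collector Service exposes OTLP gRPC on port 4317 (the app name, image, and Service address are hypothetical):

```yaml
# Fragment of a Deployment; only the tracing-related fields are shown.
spec:
  template:
    spec:
      containers:
        - name: my-app                                 # hypothetical container name
          image: registry.example.com/my-app:1.0.0     # hypothetical image
          env:
            - name: OTEL_SERVICE_NAME                  # service name shown in the Jaeger UI
              value: my-app
            - name: OTEL_TRACES_EXPORTER
              value: otlp
            - name: OTEL_EXPORTER_OTLP_PROTOCOL
              value: grpc
            - name: OTEL_EXPORTER_OTLP_ENDPOINT        # assumed collector Service address
              value: http://simplest-collector.observability.svc:4317
```

The app still has to include an OpenTelemetry SDK or auto-instrumentation agent; these variables only tell it where to send spans.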


Full Observability Dashboard (Grafana)

| Panel             | Data Source                     |
|-------------------|---------------------------------|
| Node CPU/RAM/Disk | Prometheus (node-exporter)      |
| Pod health        | Prometheus (kube-state-metrics) |
| Application logs  | Loki                            |
| Request traces    | Jaeger                          |
| Active alerts     | Alertmanager                    |
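
Each panel depends on its data source being registered in Grafana. One way to do that is a data source provisioning file, sketched here under the assumption that all four backends are reachable as in-cluster Services (every URL is a hypothetical placeholder; substitute the Service names from your own install):

```yaml
# Grafana data source provisioning file (sketch).
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-server.monitoring.svc:9090     # hypothetical Service
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-stack.monitoring.svc:3100            # hypothetical Service
  - name: Jaeger
    type: jaeger
    access: proxy
    url: http://simplest-query.observability.svc:16686    # hypothetical Service
  - name: Alertmanager
    type: alertmanager
    access: proxy
    url: http://alertmanager.monitoring.svc:9093          # hypothetical Service
    jsonData:
      implementation: prometheus   # tells Grafana this is a Prometheus Alertmanager
```

Grafana loads files like this from its provisioning/datasources directory at startup, so the data sources exist before any dashboard is imported.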

Done When

✔ Logs flowing into Loki
✔ Alerts firing on node failures
✔ Traces visible for app requests
✔ Single Grafana dashboard shows everything