Phase 11 — Advanced Observability

Complete visibility into logs, metrics, traces, and alerts.

Three Pillars

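Metrics  → Prometheus + Grafana
Logs     → Loki + Grafana
Traces   → Jaeger

Alerting sits on top of the three pillars: Prometheus evaluates the alert rules, and Alertmanager routes the resulting notifications.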

Loki — Log Aggregation

Install Loki stack:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki-stack grafana/loki-stack \
  --namespace monitoring \
  --set grafana.enabled=false \
  --set promtail.enabled=true

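Promtail runs on every node and ships pod logs to Loki. Because the chart's bundled Grafana is disabled here, add Loki as a data source in your existing Grafana, then query the logs with LogQL.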
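As a rough sketch of what a LogQL query looks like outside Grafana, the snippet below wraps a call to Loki's HTTP API in a small helper. It assumes Loki has been port-forwarded to localhost:3100 (the service name loki-stack follows from the release name above but is worth verifying), and the namespace="default" selector is only an example.

```shell
# Assumption: Loki is reachable locally, e.g. after
#   kubectl -n monitoring port-forward svc/loki-stack 3100:3100
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# LogQL: log lines containing "error" from pods in the (example) namespace "default"
QUERY='{namespace="default"} |= "error"'

# loki_query QUERY [LIMIT]: run a range query against Loki and print the JSON reply.
loki_query() {
  # -G sends the --data-urlencode pairs as a URL query string
  curl -sS -G "${LOKI_URL}/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=${2:-20}"
}

# Usage (needs a reachable Loki):
#   loki_query "$QUERY" 50
```

The same query string can be pasted into Grafana's Explore view once the Loki data source exists.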

Alertmanager

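Define alert rules for critical conditions. Prometheus evaluates the rules; Alertmanager groups the firing alerts and routes them to receivers: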

groups:
  - name: cluster
    rules:
      - alert: NodeDown
        expr: up{job="node"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.instance }} is down"
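
The rule above only decides when an alert fires; where the notification goes is configured in Alertmanager itself. A minimal sketch of such a routing config, written to a local file, might look like this. The receiver name and webhook URL are hypothetical placeholders, and the timing values are illustrative rather than recommendations.

```shell
# Write a minimal Alertmanager config to ./alertmanager.yml.
# Assumptions: the receiver name and the webhook URL are placeholders.
cat > alertmanager.yml <<'EOF'
route:
  receiver: default                     # fallback receiver for every alert
  group_by: ['alertname', 'instance']   # one notification per alert/instance pair
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: default
    webhook_configs:
      - url: http://alert-webhook.example.internal/hook
EOF
```

If amtool is installed, `amtool check-config alertmanager.yml` validates the file before it is loaded. How the file reaches Alertmanager depends on how Alertmanager was installed in the earlier phases.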

Jaeger — Distributed Tracing

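Install the Jaeger Operator (recent operator releases require cert-manager to be installed in the cluster first):

kubectl apply -f https://github.com/jaegertracing/jaeger-operator/releases/latest/download/jaeger-operator.yaml

The operator only manages Jaeger; it does not deploy an instance until you create a Jaeger custom resource. Once an instance is running, instrument your apps with the OpenTelemetry SDKs and export their spans to it.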
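
The operator needs a Jaeger custom resource before any Jaeger instance exists. A minimal sketch of one is below, relying on the operator's default all-in-one strategy, which suits a test setup more than production. The instance name jaeger, the namespace observability, and the app name my-app are assumptions.

```shell
# Write a minimal Jaeger custom resource to ./jaeger.yaml.
cat > jaeger.yaml <<'EOF'
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
EOF

# Apply once the operator is running (namespace is an assumption):
#   kubectl apply -n observability -f jaeger.yaml
#
# Apps instrumented with an OpenTelemetry SDK can then be pointed at the
# collector through the standard OTLP environment variables, for example
# (assuming the collector accepts OTLP over gRPC on port 4317):
#   OTEL_SERVICE_NAME=my-app
#   OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger-collector.observability.svc:4317
```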

Full Observability Dashboard (Grafana)

Panel               Data Source
Node CPU/RAM/Disk   Prometheus (node-exporter)
Pod health          Prometheus (kube-state-metrics)
Application logs    Loki
Request traces      Jaeger
Active alerts       Alertmanager
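
For the panels above to work, Grafana needs a data source for each backend. The sketch below writes a Grafana data source provisioning file for the three backends added in this phase; it assumes a Prometheus data source already exists. All service URLs are guesses based on the names used in this phase and should be checked against `kubectl get svc`.

```shell
# Write a Grafana data source provisioning file to ./grafana-datasources.yaml.
# Assumption: every URL below is a placeholder for the real in-cluster service.
cat > grafana-datasources.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-stack.monitoring.svc:3100
  - name: Jaeger
    type: jaeger
    access: proxy
    url: http://jaeger-query.observability.svc:16686
  - name: Alertmanager
    type: alertmanager
    access: proxy
    url: http://alertmanager.monitoring.svc:9093
    jsonData:
      implementation: prometheus
EOF
```

How the file is mounted into Grafana depends on how Grafana was installed; the same data sources can also be added by hand in the Grafana UI.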

Done When

✔ Logs flowing into Loki
✔ Alerts firing on node failures
✔ Traces visible for app requests
✔ Single Grafana dashboard shows everything
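
A quick smoke test for the checklist above can be sketched as a small shell helper. It only confirms that each backend answers on its HTTP endpoint; it does not prove that logs, alerts, or traces are actually flowing. The local URLs assume a kubectl port-forward for each service.

```shell
# Assumptions: each service is reachable locally, e.g. via kubectl port-forward.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"
AM_URL="${AM_URL:-http://localhost:9093}"
JAEGER_URL="${JAEGER_URL:-http://localhost:16686}"

# check DESCRIPTION COMMAND...: run the command quietly and report PASS or FAIL.
check() {
  desc="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $desc"
  else
    echo "FAIL  $desc"
    return 1
  fi
}

# Probe each backend; returns non-zero if any probe failed.
verify_observability() {
  failed=0
  check "Loki ready"           curl -fsS "$LOKI_URL/ready"   || failed=1
  check "Alertmanager healthy" curl -fsS "$AM_URL/-/healthy" || failed=1
  check "Jaeger UI reachable"  curl -fsS "$JAEGER_URL/"      || failed=1
  return "$failed"
}

# Usage (with the port-forwards in place):
#   verify_observability
```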
