
Phase 17 — Temporal — Code-Based Workflow Orchestration

Temporal is a workflow orchestration engine for developers. You write workflows in real code (Python, Go, TypeScript, Java) — Temporal handles retries, timeouts, state persistence, and failure recovery automatically. Think of it as a durable, distributed function runner.


n8n vs Temporal

| | n8n | Temporal |
|---|---|---|
| Interface | Visual (no-code / low-code) | Code (Python, Go, TypeScript) |
| Audience | Ops, non-developers | Developers |
| Complexity | Simple automations | Complex long-running workflows |
| Retries | Basic | Sophisticated (exponential backoff, deadlines) |
| State | Limited | Full — survives restarts and crashes |
| Best for | API integrations, notifications | Business logic, sagas, data pipelines |

Use both: n8n for simple event-driven automations, Temporal for complex business workflows.


Key Concepts

Workflow → A durable function that can run for minutes, hours, or months
Activity → A single step (call API, send email, run script)
Worker → A process that executes workflows and activities
Signal → Send data into a running workflow
Query → Read state from a running workflow

Deploy Temporal in k3s

helm repo add temporalio https://charts.temporal.io

helm install temporal temporalio/temporal \
  --namespace temporal \
  --create-namespace \
  --set server.replicaCount=1 \
  --set cassandra.config.cluster_size=1 \
  --set elasticsearch.enabled=false \
  --set prometheus.enabled=false \
  --set grafana.enabled=false

Access the Temporal UI:

kubectl port-forward -n temporal svc/temporal-web 8088:8088

Open: http://localhost:8088


Example Workflow — Node Provisioning Pipeline

This is the kind of workflow Temporal excels at — long-running, multi-step, with retries:

# workflows/provision_node.py
# (in production, keep activity code in a separate module from workflow code)
import asyncio
import subprocess
from datetime import timedelta

import httpx
from temporalio import activity, workflow
from temporalio.common import RetryPolicy

MAAS_URL = "http://10.0.0.1:5240/MAAS/api/2.0"


@activity.defn
async def trigger_maas_deploy(node_name: str) -> str:
    # Call the MAAS API to deploy the node
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"{MAAS_URL}/machines/",
            data={"hostname": node_name, "distro_series": "noble"},
        )
        resp.raise_for_status()
        return resp.json()["system_id"]


@activity.defn
async def wait_for_deployment(system_id: str) -> bool:
    # Poll MAAS every 30 s until the node reports Deployed (30 minutes max)
    async with httpx.AsyncClient() as client:
        for _ in range(60):
            resp = await client.get(f"{MAAS_URL}/machines/{system_id}/")
            if resp.json()["status_name"] == "Deployed":
                return True
            await asyncio.sleep(30)
    return False


@activity.defn
async def run_ansible(node_ip: str) -> None:
    subprocess.run(
        ["ansible-playbook", "-i", f"{node_ip},", "playbooks/configure-nodes.yml"],
        check=True,
    )


@workflow.defn
class ProvisionNodeWorkflow:
    @workflow.run
    async def run(self, node_name: str, node_ip: str) -> str:
        system_id = await workflow.execute_activity(
            trigger_maas_deploy,
            node_name,
            start_to_close_timeout=timedelta(minutes=5),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )

        deployed = await workflow.execute_activity(
            wait_for_deployment,
            system_id,
            start_to_close_timeout=timedelta(minutes=35),
        )
        if not deployed:
            return f"Node {node_name} deployment timed out"

        await workflow.execute_activity(
            run_ansible,
            node_ip,
            start_to_close_timeout=timedelta(minutes=10),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )

        return f"Node {node_name} provisioned and configured"
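
To actually run this, a worker process registers the workflow and its activities, and a client starts an execution. A sketch, assuming the file above is importable as `workflows/provision_node` and the Temporal frontend is reachable on localhost:7233 (for example via `kubectl port-forward -n temporal svc/temporal-frontend 7233:7233`); the task queue and workflow id are illustrative:

```python
# worker.py — hypothetical runner for the workflow above
import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

from workflows.provision_node import (  # assumed module path
    ProvisionNodeWorkflow,
    run_ansible,
    trigger_maas_deploy,
    wait_for_deployment,
)


async def main() -> None:
    client = await Client.connect("localhost:7233")
    async with Worker(
        client,
        task_queue="provisioning",
        workflows=[ProvisionNodeWorkflow],
        activities=[trigger_maas_deploy, wait_for_deployment, run_ansible],
    ):
        # Start a workflow on the same task queue and block until it completes
        result = await client.execute_workflow(
            ProvisionNodeWorkflow.run,
            args=["node4", "10.0.0.14"],
            id="provision-node4",
            task_queue="provisioning",
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
```

In practice the worker runs long-lived (e.g. as a Deployment in the cluster) and clients start workflows separately; they are combined here only to keep the sketch in one file.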

Example Workflow — CI/CD Saga

# workflows/deploy.py — the activities (build_docker_image, scan_image_harbor,
# argocd_sync, argocd_rollback, run_integration_tests, notify_slack) are
# assumed to be defined and registered elsewhere
from datetime import timedelta

from temporalio import workflow

ACT_TIMEOUT = timedelta(minutes=15)  # every activity call requires a timeout


@workflow.defn
class DeployWorkflow:
    @workflow.run
    async def run(self, app: str, version: str) -> str:
        # Build image
        await workflow.execute_activity(
            build_docker_image, args=[app, version],
            start_to_close_timeout=ACT_TIMEOUT)

        # Scan for vulnerabilities
        scan_ok = await workflow.execute_activity(
            scan_image_harbor, args=[app, version],
            start_to_close_timeout=ACT_TIMEOUT)
        if not scan_ok:
            await workflow.execute_activity(
                notify_slack, f"❌ {app}:{version} failed security scan",
                start_to_close_timeout=ACT_TIMEOUT)
            return "aborted"

        # Deploy to staging
        await workflow.execute_activity(
            argocd_sync, args=[app, "staging"],
            start_to_close_timeout=ACT_TIMEOUT)

        # Run integration tests
        tests_ok = await workflow.execute_activity(
            run_integration_tests, app,
            start_to_close_timeout=ACT_TIMEOUT)
        if not tests_ok:
            await workflow.execute_activity(
                argocd_rollback, args=[app, "staging"],
                start_to_close_timeout=ACT_TIMEOUT)
            return "rolled back"

        # Promote to production
        await workflow.execute_activity(
            argocd_sync, args=[app, "production"],
            start_to_close_timeout=ACT_TIMEOUT)
        await workflow.execute_activity(
            notify_slack, f"✅ {app}:{version} deployed to production",
            start_to_close_timeout=ACT_TIMEOUT)
        return "deployed"

If any step fails, Temporal retries it. If the worker crashes mid-workflow, Temporal resumes from where it left off. This is impossible to replicate reliably with bash scripts or CI pipelines alone.
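
The retry schedule behind that claim can be sketched in a few lines. Temporal's default RetryPolicy starts with a 1 s interval, multiplies it by a backoff coefficient of 2.0 on each attempt, and caps the interval at 100× the initial value:

```python
def retry_delays(attempts: int, initial: float = 1.0,
                 coefficient: float = 2.0, max_interval: float = 100.0) -> list[float]:
    """Delay in seconds before each retry attempt, per Temporal's default policy."""
    return [min(initial * coefficient ** n, max_interval) for n in range(attempts)]


print(retry_delays(8))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 100.0]
```

By default there is no maximum attempt count, so a failing activity keeps retrying at the capped interval until its timeout expires; the examples above override this with `maximum_attempts=3` where indefinite retries would be wasteful.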


Use Cases on This Cluster

✔ Node provisioning pipeline (MAAS → Ansible → Kubernetes join)
✔ Full CI/CD saga (build → scan → stage → test → promote)
✔ ML training pipeline (data prep → train → evaluate → register)
✔ Scheduled cleanup (old images, expired tokens, stale namespaces)
✔ Multi-step incident response (detect → diagnose → remediate → report)

Done When

✔ Temporal server Running in temporal namespace
✔ Temporal UI accessible
✔ First workflow registered and executed
✔ Worker running and processing activities
✔ Workflow visible in UI with event history
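
Most of the checklist can be verified from the command line. A sketch, assuming the temporal CLI is installed locally and the frontend service name matches the Helm chart's default (the workflow id is a placeholder):

```shell
# Server pods Running in the temporal namespace?
kubectl get pods -n temporal

# Reach the frontend locally (service name assumed from the Helm chart)
kubectl port-forward -n temporal svc/temporal-frontend 7233:7233 &

# List recent workflow executions, then inspect one run's event history
temporal workflow list --address localhost:7233
temporal workflow show --address localhost:7233 --workflow-id <your-workflow-id>
```

The same event history shown by `temporal workflow show` is what the web UI at port 8088 renders graphically.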