DevOps Engineer Roadmap · + optional SRE & Platform Engineering paths

Browse the full curriculum below — 17 modules · 131 topics · 10-14 weeks

START

Personalized setup

Choose your experience level and goals before beginning.

PRODUCTION READY

Capstone Project

What you'll build by the end

You'll build the deployment platform for DeployBot — a sample Node.js + PostgreSQL application. Starting from a blank terminal, you'll containerize the app, set up automated CI/CD with GitHub Actions, provision a full AWS environment with Terraform (VPC, EKS, RDS, ECR), package everything into Helm charts, wire up Argo CD for GitOps delivery, and build a Prometheus + Grafana + OpenTelemetry observability stack with real alerting. By the end, pushing to main triggers a fully automated build → test → deploy → observe pipeline — the same workflow used at companies like Spotify and Shopify.

Full Curriculum

Module 1

Every tool in this roadmap runs on Linux. Get fluent with the shell, file system, processes, networking, and scripting — the skills that separate people who use DevOps tools from people who understand them.

Project: Spin up an Ubuntu VM, configure SSH key-based login, create a deploy user with sudo access, and write a bash script that checks disk usage, memory, and running services — scheduled with cron to log every 5 minutes.
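The project asks for a bash script, but the checks themselves are easy to prototype first. Here is a minimal Python sketch of the disk and memory checks, assuming a Linux host with /proc/meminfo. The function names and the log format are illustrative, not part of the course material, and the running-services check is left out.

```python
import shutil
import time


def disk_used_percent(path: str = "/") -> float:
    """Return the percentage of space used on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def mem_available_percent(meminfo_text: str) -> float:
    """Parse /proc/meminfo-style text; return MemAvailable as a % of MemTotal."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])  # first field is the value in kB
    return 100.0 * values["MemAvailable"] / values["MemTotal"]


def log_line(disk_pct: float, mem_pct: float) -> str:
    """Format one timestamped line suitable for appending to a log file."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    return f"{stamp} disk_used={disk_pct:.1f}% mem_available={mem_pct:.1f}%"


def main() -> None:
    """Read live values (Linux only) and print a single log line."""
    with open("/proc/meminfo") as f:
        mem_pct = mem_available_percent(f.read())
    print(log_line(disk_used_percent("/"), mem_pct))
```

To log every 5 minutes, a crontab entry would use the schedule `*/5 * * * *` and append the script's output to a log file of your choosing.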

Module 2

Every pipeline, every GitOps tool, every IaC repo assumes you understand Git deeply. Get past 'git pull, git push' to actually controlling history and collaborating safely.

Project: Initialize the DeployBot repo, set up branch protection on main, configure CODEOWNERS, and run through a realistic PR workflow including a feature branch, conflict resolution, and a clean rebase.

Module 3

Build real pipelines that lint, test, build images, and trigger deploys — not toy examples. By the end you'll know why every step is there.

Project: Build a GitHub Actions pipeline for DeployBot: on every push, run lint and tests in parallel, build a Docker image tagged with the git SHA, push it to ECR using short-lived AWS credentials obtained via OIDC, and post a Slack notification with the result. Add a manual approval gate for prod.
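Tagging by git SHA means every image maps back to exactly one commit. As a rough sketch, this is how the full image reference for ECR is put together. The account ID, region, and repo name in the usage note are placeholders, and truncating the SHA to 12 characters is an assumption (many teams use the full 40).

```python
import re


def ecr_image_ref(account_id: str, region: str, repo: str, git_sha: str) -> str:
    """Build an ECR image reference tagged with a (shortened) git commit SHA."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_sha):
        raise ValueError(f"not a git SHA: {git_sha!r}")
    tag = git_sha[:12]  # assumption: a 12-character short SHA is unique enough
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"
```

For example, `ecr_image_ref("123456789012", "us-east-1", "deploybot", sha)` yields a reference you could pass to `docker build -t` and `docker push`.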

Module 4

Containers are the unit of deployment in modern infra. Learn to build small, secure, reproducible images — and what Docker is actually doing under the hood.

Project: Containerize DeployBot with a multi-stage Dockerfile producing a distroless image (~50MB). Add Docker Compose with the app, Postgres, and Redis for local dev. Run Trivy to scan for vulnerabilities.

Module 5

Before deploying to Kubernetes, you need infrastructure. Understand the AWS building blocks — networking, identity, storage — that everything sits on.

Project: Manually set up DeployBot's staging environment in AWS: VPC with public/private subnets across two AZs, NAT gateway, security groups, ECR repo, and an S3 bucket for Terraform state. Document every step — you'll automate it next.

Module 6

Clicking through the AWS console doesn't scale. Define infra as code, version it, review it in PRs, and apply it safely with terraform plan.

Project: Rewrite the AWS environment you built manually as Terraform modules (VPC, ECR), and add a module for the EKS cluster you will deploy to next. Use remote state in S3 with DynamoDB locking. Keep separate tfvars files for staging and prod. Run terraform plan as a PR check in CI.

Module 7

Deploy DeployBot to the EKS cluster you provisioned. Learn how Kubernetes schedules, networks, scales, and self-heals containers — and the resources that make it all work.

Project: Deploy DeployBot to EKS with a Deployment, Service, and Ingress with TLS via cert-manager. Use ConfigMaps and Secrets. Add an HPA on CPU. Verify a rolling update completes with zero downtime.
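It helps to know what the HPA is computing when it scales on CPU. The sketch below follows the scaling formula from the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric), with the default 10% tolerance. The function name and the min/max values in the examples are illustrative, and the real controller applies extra rules (stabilization windows, pods that are not ready) that are not modeled here.

```python
import math


def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float, min_replicas: int,
                     max_replicas: int, tolerance: float = 0.1) -> int:
    """Approximate the HPA's replica calculation for a CPU utilization target."""
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

With 4 replicas averaging 90% CPU against a 60% target, this returns 6. At 63% it stays at 4, because the ratio is within tolerance.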

Module 8

A pipeline that deploys without visibility is a liability. Build the metrics, dashboards, and alerts that let you sleep at night and actually debug problems when they happen.

Project: Deploy Prometheus + Grafana via Helm. Expose a /metrics endpoint from DeployBot. Build a Grafana dashboard with the RED method (rate, errors, duration). Configure alerts for an error rate above 1% and p95 latency above 500 ms, routed to Slack via Alertmanager.
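To make the two alert thresholds concrete, here is a small Python sketch that evaluates them over one window of request data. It is only an illustration of the arithmetic: in the project these conditions would be Prometheus alerting rules, and Prometheus estimates p95 from histogram buckets rather than from raw samples. The alert names are made up.

```python
def error_rate(total_requests: int, error_requests: int) -> float:
    """Fraction of requests in the window that failed."""
    return error_requests / total_requests if total_requests else 0.0


def p95(latencies_ms: list) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def firing_alerts(total_requests: int, error_requests: int, latencies_ms: list,
                  max_error_rate: float = 0.01, max_p95_ms: float = 500.0) -> list:
    """Return the names of the alerts whose thresholds are exceeded."""
    alerts = []
    if error_rate(total_requests, error_requests) > max_error_rate:
        alerts.append("HighErrorRate")  # hypothetical alert name
    if latencies_ms and p95(latencies_ms) > max_p95_ms:
        alerts.append("HighLatencyP95")  # hypothetical alert name
    return alerts
```

Note that exactly 1% errors does not fire: the condition is strictly greater than the threshold, matching ">1%".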

Module 9

Raw YAML doesn't scale past one environment. Learn to template, version, and share manifests so staging and prod use the same source with different values.

Project: Convert DeployBot's manifests into a Helm chart. Use templates with conditionals for staging vs prod (replicas, limits, hostname). Add Postgres as a chart dependency. Compare with a Kustomize-based approach.
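The "same source, different values" idea comes down to layering an environment's overrides on top of shared defaults. Helm does this by deep-merging values files, which can be approximated with a recursive dict merge. This is a sketch of the concept, not Helm's code, and the keys and hostnames in the usage note are examples.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict: `override` layered on `base`, merging nested maps."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested maps
        else:
            merged[key] = value  # scalars and lists are replaced outright
    return merged
```

Given defaults of `{"replicas": 1, "resources": {"limits": {"cpu": "250m", "memory": "256Mi"}}}` and a prod override of `{"replicas": 3, "resources": {"limits": {"cpu": "1"}}}`, prod ends up with 3 replicas, the higher CPU limit, and the inherited memory limit.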

Module 10

Stop running kubectl apply from your laptop. Argo CD watches your Git repo and reconciles your cluster to match — if it drifts, it self-heals. This is how mature teams ship.

Project: Install Argo CD on EKS. Create Applications for DeployBot staging and prod pointing at different branches. Auto-sync with self-heal on staging; manual sync with approval on prod. Wire up Argo CD Image Updater so new images deploy without manifest edits.
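Reconciliation is easier to reason about once you see it as a diff between desired state (Git) and live state (the cluster). The following is a conceptual Python sketch of that comparison, not how Argo CD is implemented. Resources are modeled as plain dicts keyed by name, and `plan_sync` is a made-up function.

```python
def plan_sync(desired: dict, live: dict, prune: bool = True) -> list:
    """Return (action, name) pairs needed to make `live` match `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))  # drift: live differs from Git
    if prune:
        for name in sorted(live):
            if name not in desired:
                actions.append(("delete", name))  # not in Git any more
    return actions
```

With auto-sync and self-heal, a controller would apply these actions as soon as the plan is non-empty. With manual sync, the plan is only shown until someone approves it.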

Module 11

Modern attacks target the build pipeline, not the running app. Learn to sign images, generate SBOMs, manage secrets without shoving them in env vars, and enforce policy at the cluster door.

Project: Sign DeployBot images with Cosign in CI, generate a CycloneDX SBOM, and enforce signed-images-only on the cluster with a Kyverno policy. Replace inline secrets with External Secrets Operator + AWS Secrets Manager.

Optional Path: SRE

SLOs, incident response & deep observability

The modules below are optional — only follow this branch if you want to specialize in SRE. Skip ahead if not.

Optional · SRE

SRE turns reliability into a number you can negotiate. Learn to set SLIs and SLOs that mean something, run on an error budget, and know when to slow down feature work.

Project: Define SLIs and SLOs for DeployBot's two main user journeys, set a 28-day error budget, and configure multi-window burn-rate alerts in Prometheus that page only on real budget threats.
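The arithmetic behind an error budget and a burn-rate alert is short enough to work through by hand. The Python sketch below assumes a 99.9% availability SLO and a paging policy of "2% of the budget consumed in 1 hour". Both numbers are example assumptions, not values from the course. In the project, the same conditions would be written as Prometheus alerting rules.

```python
PERIOD_DAYS = 28  # the error budget window from the project brief


def error_budget_minutes(slo: float, period_days: int = PERIOD_DAYS) -> float:
    """Minutes of full outage the SLO allows over the period."""
    return (1.0 - slo) * period_days * 24 * 60


def burn_rate(error_ratio: float, slo: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring."""
    return error_ratio / (1.0 - slo)


def burn_rate_threshold(budget_fraction: float, window_hours: float,
                        period_days: int = PERIOD_DAYS) -> float:
    """Burn rate that consumes `budget_fraction` of the budget in `window_hours`."""
    return budget_fraction * (period_days * 24) / window_hours


def should_page(long_window_ratio: float, short_window_ratio: float,
                slo: float, threshold: float) -> bool:
    """Multi-window check: page only if both windows exceed the threshold."""
    return (burn_rate(long_window_ratio, slo) > threshold
            and burn_rate(short_window_ratio, slo) > threshold)
```

Under these assumptions the budget is about 40.3 minutes per 28 days, and the paging threshold is a burn rate of 13.44. Requiring the short window to agree keeps the alert from firing long after a spike has already ended.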

Optional · SRE

Metrics tell you something is wrong. Traces and logs tell you what and why. Build the modern observability stack with OpenTelemetry, distributed tracing, and structured logs that correlate across the three pillars.

Project: Instrument DeployBot with OpenTelemetry SDKs for traces and metrics. Ship traces to Tempo and logs to Loki. Build a Grafana dashboard that lets you click from a slow trace to its logs to its metrics — all linked by trace ID.

Optional · SRE

Incidents will happen. Mature teams have a playbook for them. Learn the roles, communication patterns, and post-incident process that turn outages into systemic improvements.

Project: Build DeployBot's incident playbook: severity matrix, on-call rotation in PagerDuty, runbook for each top alert, and a blameless postmortem template. Then run a tabletop exercise simulating a database outage.

Optional Path: Platform Engineering

Internal platforms, golden paths & DevEx

The modules below are optional — only follow this branch if you want to specialize in Platform Engineering. Skip ahead if not.

Optional · Platform Engineering

Platform Engineering is DevOps with a product mindset. Build paved roads that other engineers actually want to use — with Backstage, service catalogs, and golden paths from day one.

Project: Stand up Backstage for your org. Onboard DeployBot via a Software Template that generates the repo, CI pipeline, infra, and Argo CD manifests in one click. Build a Tech Radar and a Service Catalog with on-call info.

Optional · Platform Engineering

The platform's job is to remove tickets. Build self-service infra with Backstage Software Templates, Crossplane, and multi-tenant Kubernetes — so any engineer can spin up a fully wired-up service in minutes.

Project: Add a Backstage Software Template that creates a new microservice end-to-end: GitHub repo, CI pipeline, Kubernetes namespace, Crossplane-provisioned RDS, ingress, dashboards, and on-call rotation.

Optional · Platform Engineering

If you can't measure DevEx, you can't improve it. Learn the frameworks (DORA, SPACE) and the tactics (inner loop, docs as product) that mature platform teams use to prove their value to leadership.

Project: Instrument DeployBot's pipeline to emit DORA metrics (deploy frequency, lead time, change failure rate, MTTR) into a Grafana dashboard. Run a quarterly platform NPS survey. Define one inner-loop improvement and ship it.
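Before building the dashboard, it can help to pin down how each of the four DORA metrics is calculated from raw deploy events. This is a minimal Python sketch under an assumed record shape. The `Deploy` fields and the choice of median for lead time are illustrative, and your pipeline's events may look different.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    committed_at: datetime            # when the change was committed
    deployed_at: datetime             # when it reached production
    failed: bool = False              # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deploys: list, period_days: int) -> dict:
    """Compute the four DORA metrics over one reporting period."""
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    restores = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / len(deploys) if deploys else 0.0,
        "mean_time_to_restore": (sum(restores, timedelta()) / len(restores)
                                 if restores else None),
    }
```

Each value could then be exported as a metric and charted in Grafana alongside the rest of DeployBot's dashboards.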

Copyright © 2026 OpenLume. All rights reserved.