⚙️ DevOps / Platform Engineer Roadmap

Learn to ship software the way the best teams do. You'll go from Linux basics to running a Kubernetes cluster on AWS — building the entire CI/CD pipeline, infrastructure-as-code, and monitoring stack yourself, one layer at a time.

9 modules · 39 topics · 10-14 weeks · 8-10 hours/week · Adapts to your background

START

Personalized setup

Choose your experience level and goals before beginning.

Module 1

Linux: The Foundation Under Everything

The Shell: Navigation, Pipes, Redirection & the Commands You'll Use Daily

Users, Groups & File Permissions: Who Can Do What

Processes & Systemd: How Linux Runs (and Restarts) Your Software

Networking Fundamentals: Ports, DNS, Firewalls & What Happens When You curl

Shell Scripting: Variables, Loops, Exit Codes & Writing Scripts That Don't Break

Hands-on project

Module 2

CI/CD Pipelines with GitHub Actions

Workflow Anatomy: Triggers, Jobs, Steps & the YAML You'll Write a Lot

Building a Real CI Pipeline: Lint, Test & Fail Fast

Secrets, Environment Variables & OIDC: No More Hardcoded Credentials

Reusable Workflows & Custom Actions: Don't Repeat Yourself Across Repos

Hands-on project

Module 3

Docker: From "Works on My Machine" to Portable Artifacts

How Containers Work: Namespaces, Cgroups & Why It's Not a VM

Dockerfiles That Don't Suck: Layer Caching, Multi-Stage Builds & .dockerignore

Image Security: Non-Root Users, Distroless Bases & Vulnerability Scanning

Docker Compose: Multi-Service Local Dev That Mirrors Production

Hands-on project

Module 4

AWS Core Services: Your Production Environment

VPC Design: Subnets, Route Tables, NAT Gateways & Why Networking Matters

IAM: Roles, Policies, Trust Relationships & the Principle of Least Privilege

ECR & S3: Where Your Images and State Files Live

EKS Overview: How AWS Runs Kubernetes (So You Don't Have To)

Hands-on project

Module 5

Terraform: Infrastructure That Fits in a Git Repo

Terraform 101: Providers, Resources, State & the Plan/Apply Loop

Variables, Locals, Outputs & Data Sources: Making Config Flexible

Modules: Reusable Infrastructure You'd Actually Share With Your Team

Remote State & Locking: Why terraform.tfstate Should Never Be Local

Terraform in CI: Plan on PR, Apply on Merge, Drift Detection on Schedule

Hands-on project

Module 6

Kubernetes: Deploying & Running Your Application

Kubernetes Architecture: What Each Component Does & How Scheduling Works

Pods, Deployments & ReplicaSets: The Core Resource Model

Services & Ingress: Routing Traffic Into Your Cluster

ConfigMaps, Secrets & Resource Limits: Configuring Apps for Production

Rolling Updates, Readiness Probes & Zero-Downtime Deploys

Hands-on project

Module 7

Helm: Packaging Kubernetes for Real Teams

Why Helm Exists: The Problem with Managing Raw YAML at Scale

Chart Anatomy: Templates, Values, Helpers & the _helpers.tpl Pattern

Templating in Practice: Conditionals, Loops & Per-Environment Overrides

Chart Dependencies & Hooks: Compose Charts and Run Migrations Safely

Hands-on project

Module 8

Argo CD: GitOps-Driven Delivery

GitOps Principles: Why the Git Repo Is the Source of Truth

Argo CD Setup: Applications, Projects & Repository Connections

Sync Policies: Auto-Sync, Self-Heal, Prune & Manual Gates

Image Updater & Notifications: Close the Loop from CI to CD

Hands-on project

Module 9

Monitoring & Observability: Know Before Your Users Do

The Three Pillars: Metrics, Logs & Traces — and When You Need Each

Prometheus: Scraping, PromQL & the Queries That Actually Matter

Grafana Dashboards: The RED Method & Building Views Your Team Will Use

Alerting Done Right: Alertmanager, Routing & Writing Alerts That Don't Cry Wolf

Hands-on project

PRODUCTION READY

Capstone Project

What you'll build by the end

You'll build the deployment platform for DeployBot — a sample Node.js + PostgreSQL application. Starting from a blank terminal, you'll containerize the app, set up automated CI/CD with GitHub Actions, provision a full AWS environment with Terraform (VPC, EKS, RDS, ECR), package everything into Helm charts, wire up Argo CD for GitOps delivery, and build a Prometheus + Grafana monitoring stack with real alerting. By the end, pushing to main triggers a fully automated build → test → deploy → monitor pipeline — the same workflow used at companies like Spotify and Shopify.

Full Curriculum

9 modules · 39 topics · 10-14 weeks

Module 1

Linux: The Foundation Under Everything

Every tool in this roadmap runs on Linux. Get confident with the command line, file system, networking, and shell scripting — the skills that separate people who use DevOps tools from people who understand them.

1. The Shell: Navigation, Pipes, Redirection & the Commands You'll Use Daily
2. Users, Groups & File Permissions: Who Can Do What
3. Processes & Systemd: How Linux Runs (and Restarts) Your Software
4. Networking Fundamentals: Ports, DNS, Firewalls & What Happens When You curl
5. Shell Scripting: Variables, Loops, Exit Codes & Writing Scripts That Don't Break

Project: Spin up an Ubuntu instance, configure SSH key-based login, create a deploy user with sudo access, write a bash script that checks disk usage, memory, and running services — then schedule it with cron to run every 5 minutes and log output to a file.
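
The project itself asks for a bash script. Purely to illustrate the check-and-exit-code logic behind it, here is a minimal Python sketch of the disk usage part. The 80% threshold, the function names, and the log line format are assumptions made for this example, not part of the curriculum.

```python
import shutil
from datetime import datetime, timezone


def used_percent(total: int, used: int) -> float:
    """Return disk usage as a percentage of total capacity."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def check_disk(path: str = "/", threshold: float = 80.0) -> int:
    """Print one log line for `path`; return 0 if below threshold, else 1.

    The return value is meant to be used as a process exit code,
    so a caller (or cron wrapper) can tell success from failure.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = used_percent(usage.total, usage.used)
    status = "OK" if pct < threshold else "WARN"
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    print(f"{stamp} disk {path} {pct:.1f}% used [{status}]")
    return 0 if status == "OK" else 1
```

A script that calls `check_disk()` and passes the result to `sys.exit()` could then be scheduled with a cron entry along the lines of `*/5 * * * * /usr/bin/python3 /opt/healthcheck.py >> /var/log/healthcheck.log 2>&1`, where both paths are hypothetical.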

Module 2

CI/CD Pipelines with GitHub Actions

Automate everything that happens between a git push and a running deployment. You'll build real pipelines that lint, test, build images, and trigger deployments — not toy examples.

1. Workflow Anatomy: Triggers, Jobs, Steps & the YAML You'll Write a Lot
2. Building a Real CI Pipeline: Lint, Test & Fail Fast
3. Secrets, Environment Variables & OIDC: No More Hardcoded Credentials
4. Reusable Workflows & Custom Actions: Don't Repeat Yourself Across Repos

Project: Build a GitHub Actions pipeline for DeployBot: on every push, run linting and tests in parallel, build a Docker image with a git SHA tag, push it to Amazon ECR, and post a Slack notification on success or failure. Add a manual approval gate for production deploys.
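
To make the "git SHA tag" step concrete, the sketch below assembles the image reference a pipeline would push to ECR. The account ID, region, repository name, and the choice of a 7-character short SHA are placeholder assumptions; in the real workflow these values would come from the pipeline's environment.

```python
def ecr_image_uri(account_id: str, region: str, repo: str,
                  git_sha: str, short: int = 7) -> str:
    """Build an ECR image reference tagged with a shortened commit SHA."""
    sha = git_sha.strip().lower()
    if len(sha) < short or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError(f"not a usable commit SHA: {git_sha!r}")
    # ECR registries follow <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{sha[:short]}"


# Placeholder values for illustration only.
uri = ecr_image_uri(
    account_id="123456789012",
    region="us-east-1",
    repo="deploybot",
    git_sha="3f2a9c1e8b7d6a5f4e3d2c1b0a9f8e7d6c5b4a39",
)
print(uri)
```

Tagging by commit rather than `latest` means every running image can be traced back to the exact source revision that produced it.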

Module 3

Docker: From "Works on My Machine" to Portable Artifacts

Containers are the unit of deployment in modern infrastructure. Learn to build small, secure, reproducible images — and understand what Docker is actually doing under the hood.

1. How Containers Work: Namespaces, Cgroups & Why It's Not a VM
2. Dockerfiles That Don't Suck: Layer Caching, Multi-Stage Builds & .dockerignore
3. Image Security: Non-Root Users, Distroless Bases & Vulnerability Scanning
4. Docker Compose: Multi-Service Local Dev That Mirrors Production

Project: Containerize DeployBot: write a multi-stage Dockerfile that builds the Node.js app in one stage and runs it in a distroless image (~50MB). Set up Docker Compose with the app, PostgreSQL, and Redis for local development. Run Trivy to scan the image for vulnerabilities.

Module 4

AWS Core Services: Your Production Environment

Before you can deploy to Kubernetes, you need infrastructure. Understand the AWS building blocks — networking, compute, storage, and IAM — that everything else sits on top of.

1. VPC Design: Subnets, Route Tables, NAT Gateways & Why Networking Matters
2. IAM: Roles, Policies, Trust Relationships & the Principle of Least Privilege
3. ECR & S3: Where Your Images and State Files Live
4. EKS Overview: How AWS Runs Kubernetes (So You Don't Have To)

Project: Manually set up the DeployBot staging environment in AWS: create a VPC with public and private subnets across two AZs, configure a NAT gateway, set up security groups, create an ECR repository for container images, and create an S3 bucket for Terraform state. Document every step — you'll automate it all with Terraform next.
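
Before creating the subnets, it helps to work out the address plan on paper. The sketch below carves a VPC range into one public and one private subnet per availability zone using Python's standard `ipaddress` module. The `10.0.0.0/16` range, the `/20` subnet size, and the AZ names are assumptions chosen for illustration.

```python
import ipaddress
from typing import Dict, List


def plan_subnets(vpc_cidr: str, azs: List[str], new_prefix: int = 20) -> Dict[str, str]:
    """Assign one public and one private CIDR block per availability zone."""
    network = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-sized, non-overlapping blocks in address order.
    blocks = network.subnets(new_prefix=new_prefix)
    plan: Dict[str, str] = {}
    for tier in ("public", "private"):
        for az in azs:
            plan[f"{tier}-{az}"] = str(next(blocks))
    return plan


plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
for name, cidr in plan.items():
    print(f"{name:20} {cidr}")
# public-us-east-1a    10.0.0.0/20
# public-us-east-1b    10.0.16.0/20
# private-us-east-1a   10.0.32.0/20
# private-us-east-1b   10.0.48.0/20
```

Only four of the sixteen available `/20` blocks are used here, which leaves room to add more subnets later without renumbering.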

Module 5

Terraform: Infrastructure That Fits in a Git Repo

Clicking through the AWS console doesn't scale. Learn to define your entire infrastructure as code — version it, review it in PRs, and apply it safely with terraform plan.

1. Terraform 101: Providers, Resources, State & the Plan/Apply Loop
2. Variables, Locals, Outputs & Data Sources: Making Config Flexible
3. Modules: Reusable Infrastructure You'd Actually Share With Your Team
4. Remote State & Locking: Why terraform.tfstate Should Never Be Local
5. Terraform in CI: Plan on PR, Apply on Merge, Drift Detection on Schedule

Project: Rewrite everything you built manually in Module 4 as Terraform: create reusable modules for VPC, EKS, and ECR. Use remote state in S3 with DynamoDB locking. Add separate tfvars files for staging and production environments. Run terraform plan in your CI pipeline as a PR check.

Module 6

Kubernetes: Deploying & Running Your Application

Deploy DeployBot to the EKS cluster you provisioned. Learn how Kubernetes schedules, networks, scales, and self-heals your containers — and the resource types that make it all work.

1. Kubernetes Architecture: What Each Component Does & How Scheduling Works
2. Pods, Deployments & ReplicaSets: The Core Resource Model
3. Services & Ingress: Routing Traffic Into Your Cluster
4. ConfigMaps, Secrets & Resource Limits: Configuring Apps for Production
5. Rolling Updates, Readiness Probes & Zero-Downtime Deploys

Project: Deploy DeployBot to your EKS cluster: create a Deployment with resource requests/limits, a Service, and an Ingress with TLS via cert-manager. Use ConfigMaps for app config, Secrets for database credentials, and a HorizontalPodAutoscaler that scales based on CPU. Verify a rolling update completes with zero downtime.
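
The HorizontalPodAutoscaler's core scaling rule is documented as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, with no change when the ratio is within a tolerance (0.1 by default). The sketch below reproduces just that arithmetic. It is a simplification: the real controller also accounts for pod readiness, missing metrics, and stabilization windows, and the replica bounds used here are example values.

```python
import math


def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Apply the basic HPA scaling formula, clamped to the replica bounds."""
    ratio = current_cpu_pct / target_cpu_pct
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, current_cpu_pct=90, target_cpu_pct=60))   # 6
print(desired_replicas(4, current_cpu_pct=63, target_cpu_pct=60))   # 4 (within tolerance)
print(desired_replicas(8, current_cpu_pct=150, target_cpu_pct=60))  # 10 (capped at max)
```

Because utilization is measured against each container's CPU request, the autoscaler only behaves sensibly when the Deployment sets resource requests.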

Module 7

Helm: Packaging Kubernetes for Real Teams

Raw YAML doesn't scale past one environment. Helm lets you template, version, and share Kubernetes manifests — so staging and production use the same chart with different values.

1. Why Helm Exists: The Problem with Managing Raw YAML at Scale
2. Chart Anatomy: Templates, Values, Helpers & the _helpers.tpl Pattern
3. Templating in Practice: Conditionals, Loops & Per-Environment Overrides
4. Chart Dependencies & Hooks: Compose Charts and Run Migrations Safely

Project: Convert all DeployBot Kubernetes manifests into a Helm chart. Use templates with conditionals for staging vs. production (e.g., replica count, resource limits, ingress hostname). Add chart dependencies for PostgreSQL using the Bitnami subchart. Run helm template to verify the output and helm test to validate the deployed release.
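
Per-environment overrides work because Helm layers value files on top of the chart's defaults: maps are merged key by key, and scalars and lists from the later file replace the earlier ones. The sketch below models that layering in Python. The value names and hostnames are hypothetical, and Helm's handling of `null` (which removes a key) is deliberately not modeled.

```python
import copy
from typing import Any, Dict


def merge_values(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `override` into a copy of `base`; override wins."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)  # merge nested maps
        else:
            merged[key] = copy.deepcopy(value)  # scalars and lists are replaced
    return merged


# Hypothetical chart defaults (values.yaml) and a production override file.
defaults = {
    "replicaCount": 1,
    "resources": {"limits": {"cpu": "250m", "memory": "256Mi"}},
    "ingress": {"enabled": True, "host": "staging.deploybot.example.com"},
}
production = {
    "replicaCount": 3,
    "resources": {"limits": {"cpu": "1"}},
    "ingress": {"host": "deploybot.example.com"},
}

values = merge_values(defaults, production)
print(values)
```

Note that the production file only lists what differs: the memory limit and `ingress.enabled` carry over from the defaults untouched.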

Module 8

Argo CD: GitOps-Driven Delivery

Stop running kubectl apply from your laptop. Argo CD watches your Git repo and automatically reconciles your cluster to match — if it drifts, it self-heals. This is how mature teams ship.

1. GitOps Principles: Why the Git Repo Is the Source of Truth
2. Argo CD Setup: Applications, Projects & Repository Connections
3. Sync Policies: Auto-Sync, Self-Heal, Prune & Manual Gates
4. Image Updater & Notifications: Close the Loop from CI to CD

Project: Install Argo CD on your EKS cluster. Create Application resources for DeployBot's staging and production environments pointing at different branches. Configure auto-sync with self-heal and pruning on staging, manual sync with approval on production. Set up an automated image updater so pushing a new image tag triggers a deployment without changing any manifests.
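
The idea behind auto-sync, self-heal, and prune can be pictured as a comparison between the desired state in Git and the live state in the cluster. The sketch below is a toy model of that comparison, not Argo CD's actual diffing logic, which is considerably more involved. Resources are represented as plain dictionaries, and the resource names are made up.

```python
from typing import Dict, List, Tuple


def plan_sync(desired: Dict[str, dict], live: Dict[str, dict],
              prune: bool = False) -> List[Tuple[str, str]]:
    """Return the (action, resource) steps needed to make `live` match `desired`."""
    actions: List[Tuple[str, str]] = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            # Drift: the cluster was changed outside Git, so revert it.
            actions.append(("update", name))
    if prune:
        # Only with pruning enabled are resources missing from Git deleted.
        for name in sorted(live):
            if name not in desired:
                actions.append(("delete", name))
    return actions


desired = {
    "deployment/deploybot": {"replicas": 3},
    "service/deploybot": {"port": 80},
}
live = {
    "deployment/deploybot": {"replicas": 5},  # someone scaled it by hand
    "configmap/old-settings": {},             # no longer defined in Git
}

print(plan_sync(desired, live, prune=True))
# [('update', 'deployment/deploybot'), ('create', 'service/deploybot'), ('delete', 'configmap/old-settings')]
```

Running the same comparison with `prune=False` leaves the stale ConfigMap in place, which is why pruning is a separate, opt-in setting.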

Module 9

Monitoring & Observability: Know Before Your Users Do

A pipeline that deploys without visibility is a liability. Build the monitoring, logging, and alerting stack that lets you sleep at night — and actually debug problems when they happen.

1. The Three Pillars: Metrics, Logs & Traces, and When You Need Each
2. Prometheus: Scraping, PromQL & the Queries That Actually Matter
3. Grafana Dashboards: The RED Method & Building Views Your Team Will Use
4. Alerting Done Right: Alertmanager, Routing & Writing Alerts That Don't Cry Wolf

Project: Deploy the Prometheus + Grafana stack via Helm to your cluster. Instrument DeployBot with a /metrics endpoint, create a Grafana dashboard with request rate, error rate, latency (RED method), and pod resource usage. Configure alerting rules: alert on >1% error rate, >500ms p95 latency, and pod restarts. Route alerts to a Slack channel via Alertmanager.
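
To see what the error-rate and latency thresholds above mean numerically, the sketch below evaluates them against raw request data. It is only an illustration of the arithmetic: in the project these conditions would be written as Prometheus alerting rules, which estimate percentiles from histogram buckets rather than from individual samples. The alert names are hypothetical, and the pod-restart alert is left out.

```python
from typing import List, Sequence


def error_rate(total_requests: int, failed_requests: int) -> float:
    """Fraction of requests that failed (0.0 when there was no traffic)."""
    return failed_requests / total_requests if total_requests else 0.0


def p95(latencies_ms: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def firing_alerts(total: int, failed: int, latencies_ms: Sequence[float]) -> List[str]:
    """Return the names of the alerts whose thresholds are exceeded."""
    alerts: List[str] = []
    if error_rate(total, failed) > 0.01:  # more than 1% errors
        alerts.append("HighErrorRate")
    if p95(latencies_ms) > 500:           # p95 latency above 500 ms
        alerts.append("HighLatencyP95")
    return alerts


# 2% errors, but only 5% of requests are slow: p95 stays at 100 ms.
print(firing_alerts(1000, 20, [100] * 95 + [900] * 5))   # ['HighErrorRate']
# 0.5% errors, but 10% of requests are slow: p95 is 900 ms.
print(firing_alerts(1000, 5, [100] * 90 + [900] * 10))   # ['HighLatencyP95']
```

The second case shows why percentiles matter: the average latency there is only 180 ms, yet one request in ten takes 900 ms.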
