Kubernetes 1.31 Essentials: Pods, Deployments, Services & Scaling

Kubernetes Architecture: The 10,000-Foot View

Kubernetes (K8s) is a container orchestration platform. It schedules containers across a cluster of machines, handles networking between them, scales them up/down based on load, restarts them when they crash, and rolls out updates without downtime.

The architecture: the Control Plane (API Server, etcd, Scheduler, Controller Manager) manages cluster state. Worker Nodes run your containers in Pods. You interact with the API Server using kubectl or YAML manifests.

Everything in Kubernetes is a "resource" with a declarative YAML specification. You declare the desired state ("I want 3 replicas of this container"), and Kubernetes continuously works to make the actual state match the desired state.
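
As a rough illustration of desired versus actual state, here is a trimmed sketch of what the API Server might return for a Deployment while it recovers from a crashed Pod. The name ink-api and the counts are illustrative; the spec and status field names are the standard ones for apps/v1 Deployments.

```yaml
# Trimmed, illustrative view of a Deployment object (not a manifest to apply).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ink-api            # illustrative name
spec:
  replicas: 3              # desired state: what you declared
status:
  replicas: 3              # Pods that currently exist for this Deployment
  readyReplicas: 2         # actual state: only 2 are passing readiness checks
  unavailableReplicas: 1   # the gap the controllers are working to close
```

Controllers keep acting until status matches spec; you describe the end state, not the steps to reach it.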

Key Takeaways

Declarative: you describe desired state, K8s makes it happen.

Control Plane: API Server (entry point), etcd (state store), Scheduler (placement), Controller Manager (runs the reconciliation loops).

Worker Nodes: run Pods (containers). kubelet manages Pods on each node.

Reconciliation loop: K8s constantly compares desired vs actual state.

Pods, Deployments & ReplicaSets

A Pod is the smallest deployable unit — one or more containers that share networking (same IP), storage, and lifecycle. In practice, most Pods run a single container. Multi-container Pods are used for sidecar patterns (log collectors, service mesh proxies).
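
To show what "share networking and storage" can look like in a sidecar setup, here is a minimal sketch of a two-container Pod. The Pod name, the busybox image, and the log path /var/log/app/app.log are assumptions made for illustration; the API image is the one used elsewhere in this article.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-with-log-sidecar        # hypothetical name
spec:
  volumes:
    - name: logs
      emptyDir: {}                  # scratch volume that lives as long as the Pod
  containers:
    - name: api
      image: inkandhorizon/api:v2.1.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app   # assumed directory the app writes logs to
    - name: log-tailer              # sidecar: streams the app's log file to stdout
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /var/log/app/app.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
```

Both containers are scheduled onto the same node, share one IP address, and can reach each other over localhost. In a real setup this Pod spec would normally sit inside a Deployment's template rather than be applied on its own.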

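A Deployment manages ReplicaSets, which in turn manage Pods. It handles: creating N replicas, rolling updates (replacing Pods incrementally, at a pace set by maxSurge and maxUnavailable), rollbacks (reverting to a previous revision), and self-healing (replacing Pods that are deleted or lost, while the kubelet restarts crashed containers in place).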

You almost never create Pods directly. Instead, you create a Deployment, which creates a ReplicaSet, which creates Pods. This three-level hierarchy enables seamless updates and rollbacks.

Snippet

# deployment.yaml — Production-ready deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ink-api
  labels:
    app: ink-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ink-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max 1 extra Pod during update
      maxUnavailable: 0   # Never reduce below desired count
  template:
    metadata:
      labels:
        app: ink-api
    spec:
      containers:
        - name: api
          image: inkandhorizon/api:v2.1.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: ink-secrets
                  key: database-url
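
To make the Deployment, ReplicaSet, Pod hierarchy concrete, this is roughly what the metadata of one Pod created from the Deployment above looks like. The hash and suffix in the names are made-up placeholders (Kubernetes generates them), and the uid field is omitted.

```yaml
# Illustrative excerpt of a generated Pod (read-only view, not a manifest to apply).
apiVersion: v1
kind: Pod
metadata:
  name: ink-api-5d8c7f9b6d-x2k4q     # <deployment>-<template hash>-<random suffix>, placeholder values
  labels:
    app: ink-api
    pod-template-hash: 5d8c7f9b6d    # added by the Deployment controller
  ownerReferences:
    - apiVersion: apps/v1
      kind: ReplicaSet               # the Pod is owned by a ReplicaSet...
      name: ink-api-5d8c7f9b6d       # ...which is in turn owned by the Deployment
      controller: true
      blockOwnerDeletion: true
```

A rolling update creates a new ReplicaSet with a different template hash and scales the old one down, which is why a rollback can scale the old ReplicaSet back up.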

Key Takeaways

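maxUnavailable: 0 keeps the number of available Pods at or above the desired count during a rolling update; zero downtime also depends on an accurate readinessProbe and graceful shutdown.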

readinessProbe: traffic only routes to Pods that pass health checks.

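livenessProbe: the kubelet restarts containers that become unresponsive (deadlocks).

resources.requests: the CPU/memory the scheduler reserves for the container. limits: the enforced maximum (CPU is throttled; exceeding the memory limit gets the container OOM-killed).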

Secrets: never hardcode credentials — use Kubernetes Secrets or Vault.
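
The Deployment above reads DATABASE_URL from a Secret named ink-secrets. A minimal sketch of that Secret might look like this; the connection string is a placeholder, and in practice you would create the object from a secret manager or CI pipeline rather than commit it to Git.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ink-secrets
type: Opaque
stringData:      # written as plain text; the API Server stores it base64-encoded under data
  database-url: "postgres://app_user:CHANGE_ME@db.example.internal:5432/ink"   # placeholder value
```

Keep in mind that Secrets are only base64-encoded by default; encryption at rest has to be enabled on the cluster separately.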

Services & Networking

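Each Pod gets its own IP, and that IP changes whenever the Pod is replaced. Services provide a stable DNS name and virtual IP that route traffic to ready Pods. The three Service types you will use most are ClusterIP (internal traffic), NodePort (expose a port on every node), and LoadBalancer (provision a cloud provider load balancer); a fourth, ExternalName, maps a Service to an external DNS name.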
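
As a sketch of an externally reachable Service, the manifest below exposes the ink-api Pods through a cloud load balancer. The Service name ink-api-public is hypothetical, and the example assumes Pods labeled app: ink-api listening on port 3000 in a cluster whose provider supports LoadBalancer Services.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ink-api-public     # hypothetical name
spec:
  type: LoadBalancer       # by default also allocates a ClusterIP and a NodePort underneath
  selector:
    app: ink-api           # routes to Pods with this label
  ports:
    - port: 80             # port exposed by the load balancer
      targetPort: 3000     # containerPort on the Pods
```

Switching type to NodePort would skip the cloud load balancer and only open a port on every node.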

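For production ingress (routing external HTTP/HTTPS traffic to Services), the Kubernetes project recommends the Gateway API, the successor to the Ingress resource. Ingress still works, but its API is frozen and new features land in the Gateway API. Gateway API resources are installed separately as CRDs and need a Gateway controller running in the cluster.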

Snippet

# Service: Stable endpoint for Pods
apiVersion: v1
kind: Service
metadata:
  name: ink-api-service
spec:
  selector:
    app: ink-api   # Routes to Pods with this label
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP    # Internal only (default)

---
# Gateway API: Modern HTTP routing (successor to Ingress)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: ink-routes
spec:
  parentRefs:
    - name: main-gateway
  hostnames:
    - "inkandhorizon.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: ink-api-service
          port: 80
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: ink-web-service
          port: 80

HPA v2: Horizontal Pod Autoscaling

HPA v2 automatically scales the number of Pod replicas based on metrics: CPU utilization, memory usage, custom metrics (request rate, queue depth), or external metrics (cloud provider metrics).

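The key parameters are minReplicas (floor), maxReplicas (ceiling), and the target utilization. By default the HPA controller evaluates metrics every 15 seconds and computes desired replicas = ceil(current replicas × current metric / target metric); for example, 4 Pods averaging 90% CPU against a 70% target yields ceil(4 × 90 / 70) = 6 Pods. To prevent oscillation, scale-down uses a 5-minute stabilization window by default: the controller acts on the highest recommendation from the last 5 minutes, so replicas drop only after load has stayed low for that long.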

Snippet

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ink-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ink-api
  minReplicas: 2
  maxReplicas: 10
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 50        # Scale up max 50% at a time
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300  # Act on the highest recommendation from the last 5 min
      policies:
        - type: Pods
          value: 1          # Scale down 1 Pod at a time
          periodSeconds: 60
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70    # Target: 70% of requested CPU, averaged across Pods
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80    # Target: 80% of requested memory, averaged across Pods

Key Takeaways

HPA v2 supports multiple metrics simultaneously.

behavior section controls scale-up/down speed to prevent oscillation.

minReplicas: 2 keeps one Pod serving if the other fails, provided the two do not share a node.

Scale up fast (50%/min), scale down slow (1 Pod/min) is the safe pattern.

Resource metrics (CPU, memory) require metrics-server in the cluster; custom and external metrics need a metrics adapter.
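
The Snippet above scales on CPU and memory only. For the custom metrics mentioned earlier, a variant of the same HPA might look like the sketch below. The metric name http_requests_per_second and the target of 100 are hypothetical, and the example assumes a custom metrics adapter (for instance one backed by Prometheus) already exposes that metric per Pod.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ink-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ink-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second   # hypothetical metric served by a custom metrics adapter
        target:
          type: AverageValue
          averageValue: "100"              # aim for about 100 requests/s per Pod
```

Request rate often tracks user-facing load more directly than CPU, but it only works once the metrics pipeline is in place, so treat this as a variant rather than a drop-in default.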

Pod Disruption Budgets & High Availability

Pod Disruption Budgets (PDBs) protect your application during voluntary disruptions such as node drains for maintenance and cluster upgrades. They limit how many Pods the eviction API may take down at once, so a minimum number stay available. They do not protect against involuntary disruptions such as node crashes, and a spot instance reclaim is covered only to the extent that the node is drained before it disappears.

Without a PDB, a node drain can evict ALL your Pods on that node simultaneously. With a PDB saying "keep at least 2 Pods running," the eviction API refuses any eviction that would drop the app below 2 available Pods, so kubectl drain waits and retries until replacement Pods are ready elsewhere.

Snippet

# PDB: Always keep at least 2 replicas available
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ink-api-pdb
spec:
  minAvailable: 2   # Alternative: maxUnavailable: 1 (equivalent at 3 replicas)
  selector:
    matchLabels:
      app: ink-api

---
# Topology Spread: Distribute Pods across nodes/zones
# (fragment: merge into the ink-api Deployment's Pod template)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ink-api
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: ink-api
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: ink-api

Key Takeaways

PDB: prevents voluntary disruptions from taking down too many Pods.

minAvailable: 2 = always keep at least 2 healthy Pods; run more than 2 replicas, or the budget allows no evictions and drains stall.

topologySpreadConstraints: distribute Pods across nodes and zones.

Cross-zone spread = can survive an entire availability zone failure (the zone constraint above is best-effort because it uses ScheduleAnyway).

Without PDBs, cluster upgrades can cause outages.
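
One interaction is worth sketching: the HPA earlier allows the ink-api Deployment to shrink to 2 replicas, and a PDB with minAvailable: 2 then permits zero evictions, which stalls node drains. A variant that expresses the budget as maxUnavailable stays valid at any replica count. This is one reasonable option under that assumption, not the only one.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ink-api-pdb
spec:
  maxUnavailable: 1      # at most 1 Pod may be evicted at a time, whatever the replica count
  selector:
    matchLabels:
      app: ink-api
```

With 2 replicas this still leaves 1 Pod serving during a drain; with 10 replicas it drains conservatively, one Pod at a time.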

Key Takeaways

Kubernetes is the "operating system of the cloud" — it schedules, scales, heals, and updates your containers across a cluster. The core concepts are: Pods (compute), Services (networking), Deployments (lifecycle), HPA (scaling), and PDBs (availability).

For system design interviews: explain the Pod → ReplicaSet → Deployment hierarchy, describe rolling updates with zero downtime, discuss HPA scaling policies, and demonstrate how PDBs + topology spread constraints achieve high availability.

In 2026, prefer the Gateway API over Ingress for new HTTP routing; Ingress still works but is feature-frozen. Use HPA v2 with behavior controls for stable autoscaling. Always set resource requests/limits and health probes on every container.

Key Takeaways

Pod = smallest unit. Deployment = manages Pods with rolling updates.

Service = stable DNS for Pods. Gateway API = modern HTTP routing.

HPA v2 = autoscale on CPU, memory, or custom metrics.

PDB = protect availability during node maintenance.

Resources: requests (reserved at scheduling) vs limits (enforced maximum). Always set both.

Health probes: readiness (traffic routing) + liveness (container restarts).

Article Author

Ashutosh

Lead Developer


