Kubernetes Deployment YAML Guide for Production-Ready Apps

Chafik Belhaoues

Deployment YAML looks simple until you start deploying in production. Get the probes wrong, forget resource limits, or choose a poor update strategy, and instead of a stable service you get CrashLoopBackOff. In this guide, we'll break down how to write Kubernetes deployment YAML so that applications run reliably and update without downtime.

What Is a Kubernetes Deployment YAML?

A Kubernetes deployment YAML file is a declarative configuration file that describes the desired state of an application in a Kubernetes cluster. It is the central building block for container management and answers three key questions:

  • Which container images to run
  • How many replicas to keep running
  • How to perform updates and rollbacks

Kubernetes constantly compares the current state of the cluster with the description in the deployment file. If a pod goes down, the controller automatically creates a new one. If three replicas are specified and two are running, the third will be launched without an engineer's involvement. This declarative approach is the basis for deployment strategies: instead of step-by-step instructions, you describe the desired result, and Kubernetes determines how to achieve it.
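To make the desired-state model concrete, here is a minimal Deployment manifest. The web-api name, image tag, and port are illustrative placeholders, not a real service:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
  labels:
    app: web-api
spec:
  replicas: 3                  # Kubernetes keeps exactly three pods running
  selector:
    matchLabels:
      app: web-api             # finds "its" pods by this label
  template:
    metadata:
      labels:
        app: web-api           # must match the selector above
    spec:
      containers:
        - name: web-api
          image: web-api:v1.4.2
          ports:
            - containerPort: 8080
```

Apply it with kubectl apply -f deployment.yaml, and the controller continuously reconciles the cluster toward this description.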

Anatomy of a Kubernetes Deployment File

Each Kubernetes deployment file consists of several mandatory blocks. Understanding their structure is the basis for a competent Kubernetes deployment strategy, helping you avoid common deployment errors.

Metadata and Labels

The metadata section contains the deployment name, namespace, and a set of labels with annotations. Labels are key-value pairs that are used to organize, filter, and select resources across the cluster. For example, the labels app: web-api and env: production allow you to find all production pods for a specific service quickly.
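A metadata block for the hypothetical web-api service above might look like this (namespace, labels, and annotation values are illustrative):

```yaml
metadata:
  name: web-api
  namespace: production
  labels:
    app: web-api
    env: production
  annotations:
    team: platform   # free-form metadata; annotations are not used for selection
```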

Spec and Replicas

The spec block defines the desired state of the deployment:

  • replicas - the number of pods that Kubernetes should maintain
  • selector - a matching rule by which the deployment finds "its" pods
  • matchLabels - links the deployment to pods via labels

Kubernetes continuously monitors the cluster to ensure that the number of running pods matches the value specified in replicas.
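These three fields fit together as follows (label values are illustrative):

```yaml
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-api   # must be identical to template.metadata.labels
```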

Pod Template and Containers

The template section describes the pod template - the actual workload that Kubernetes will schedule and run:

  • image - container image and its tag
  • ports - open container ports
  • env - environment variables
  • resources.requests and resources.limits - requested and maximum allowable CPU and memory resources
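A pod template combining all four fields could look like this sketch (the image, variable, and resource values are placeholders to size against your own workload):

```yaml
template:
  metadata:
    labels:
      app: web-api
  spec:
    containers:
      - name: web-api
        image: web-api:v1.4.2       # specific tag, never :latest
        ports:
          - containerPort: 8080
        env:
          - name: LOG_LEVEL
            value: "info"
        resources:
          requests:                  # what the scheduler reserves
            cpu: 100m
            memory: 128Mi
          limits:                    # hard ceiling for the container
            cpu: 500m
            memory: 256Mi
```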

Health Checks and Probes

According to the Kubernetes deployment documentation, health checks are critical for production reliability:

  • Liveness probe - determines whether the container is alive. If it fails, Kubernetes restarts the pod.
  • Readiness probe - determines whether the container is ready to accept traffic. Until it passes, the pod is removed from the Service's endpoints and receives no traffic.
  • Startup probe - gives the container extra time to start up, useful for heavy applications.

Without probes, Kubernetes cannot automatically detect and fix failures, which makes the application fragile.
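All three probes can be declared on a container like this; the /healthz and /ready paths are an assumed convention for the example app, and the thresholds should be tuned to real startup times:

```yaml
containers:
  - name: web-api
    image: web-api:v1.4.2
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30    # up to 30 * 10s = 5 min to start
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10       # restart the container if this fails
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5        # gate traffic on this check
```

Liveness and readiness checks only begin once the startup probe has succeeded, which keeps slow-starting containers from being killed prematurely.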

Kubernetes Deployment Strategies Explained

Choosing the right update strategy depends on application requirements and acceptable risk levels. Let's look at the main deployment strategies.

Rolling Update

The default strategy: Kubernetes gradually replaces old pods with new ones. Key parameters:

  • maxSurge - how many additional pods can be created beyond the desired number
  • maxUnavailable - how many pods can be unavailable at the same time

Rolling update is ideal for most production workloads that require zero downtime. The update is smooth: new pods pass the readiness probe before the old ones are terminated.
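A strict zero-downtime configuration of these parameters might look like this (the exact values are a judgment call per service):

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod during the update
      maxUnavailable: 0    # never drop below the desired replica count
```

With maxUnavailable: 0, old pods are terminated only after their replacements pass the readiness probe.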

Recreate

The Recreate type of Kubernetes update strategy works differently: all existing pods are stopped before new ones are created. This results in a brief downtime, but is suitable for applications that:

  • cannot run in two versions simultaneously
  • require exclusive access to a resource (e.g., a database with migrations)
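Switching to this behavior is a one-line strategy change:

```yaml
spec:
  strategy:
    type: Recreate   # all old pods are stopped before new ones start
```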

Blue-Green and Canary (Advanced)

Blue-green and canary are advanced Kubernetes deployment types that are not built into the standard Deployment resource directly, but are implemented through additional tools:

  • Blue-green - two complete environments run in parallel; traffic switches instantly from the "blue" to the "green" version.
  • Canary - the new version receives a small percentage of traffic; if there are no errors, the share gradually increases.

These strategies are implemented through Argo Rollouts, Flagger, or service mesh (Istio, Linkerd). They are justified for critical services where the cost of failure is high. To visualize such architectures, it is convenient to use Brainboard, which displays infrastructure dependencies and helps plan complex deployments.

Best Practices for Production-Ready Deployment YAML

A reliable deployment file for production must comply with several recommendations:

  • Always set resource requests and limits. Without them, the scheduler cannot correctly distribute pods across nodes, and a single container can exhaust a node's resources.
  • Use specific image tags. The latest tag is unpredictable - pin an explicit version: app:v1.4.2.
  • Define health probes for each container. Liveness and readiness probes are a mandatory minimum for production.
  • Configure PodDisruptionBudget. A PDB ensures that a minimum number of pods remain available during node maintenance and other voluntary disruptions.
  • Add meaningful labels and annotations. The team, env, and version labels simplify monitoring and debugging.
  • Separate environments using namespaces. Separate namespaces for dev, staging, and production to prevent accidental changes.
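For the PodDisruptionBudget recommendation above, a minimal manifest for the example web-api service could look like this (minAvailable is an assumption; choose it based on how many replicas your service needs to stay healthy):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
  namespace: production
spec:
  minAvailable: 2          # keep at least two pods up during node drains
  selector:
    matchLabels:
      app: web-api         # same labels as the Deployment's pods
```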

For teams managing multiple deployment files, Brainboard provides a centralized view of the entire infrastructure and helps enforce configuration standards.

Common Mistakes and Troubleshooting

When working with Kubernetes deployment YAML, teams regularly encounter typical problems:

  • ImagePullBackOff. Incorrect image tag or missing credentials for a private registry. Check the image name and make sure imagePullSecrets is configured.
  • CrashLoopBackOff. The container constantly crashes and restarts. Reasons include incorrect probes, missing environment variables, or errors in the entrypoint. Check the logs: kubectl logs <pod-name>.
  • Failed rollout due to lack of resources. If the cluster has no free resources for new pods, the rollout stalls with pods stuck in Pending. Check ResourceQuota and available node resources.
  • Selector mismatch. The labels in spec.selector.matchLabels must match the labels in template.metadata.labels. A mismatch will cause an error when creating a deployment.
  • YAML syntax errors. An extra space or incorrect indentation breaks the entire file. Use linters (kubeval, yamllint) to check before applying.
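For the selector-mismatch case in particular, a correct pairing looks like this; both label sets are shown together so the relationship is explicit:

```yaml
spec:
  selector:
    matchLabels:
      app: web-api        # the selector's labels...
  template:
    metadata:
      labels:
        app: web-api      # ...must also appear on the pod template
```

The API server rejects a Deployment where the selector does not match the template labels, so this is caught at apply time rather than at runtime.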

The Brainboard tool reduces the risk of such errors by allowing you to design infrastructure visually and automatically generate correct configurations.

FAQ

1. What is the difference between a Deployment and a Pod in Kubernetes? 

A Pod is the smallest unit of execution, consisting of one or more containers. Deployment is a higher-level controller that manages pods: it maintains a specified number of replicas, performs updates, and rolls back.

2. How do I roll back a failed Kubernetes deployment? 

With the command kubectl rollout undo deployment/<name>. Kubernetes will revert to the previous ReplicaSet version. To roll back to a specific revision, use the flag --to-revision=N.

3. Can I use JSON instead of YAML for deployment files? 

Yes, Kubernetes accepts both JSON and YAML. However, YAML is preferred: it is more compact, supports comments, and is easier to read.

4. How do I update a running deployment without downtime? 

Use the Rolling Update strategy (default). Update the image with the command kubectl set image deployment/<name> <container>=<image>:<tag> - Kubernetes will smoothly replace pods without downtime.

5. What is the default deployment strategy in Kubernetes? 

The default is RollingUpdate with the parameters maxSurge: 25% and maxUnavailable: 25%. This ensures a gradual update without complete application downtime.