
Argo Rollouts vs Flagger Progressive Delivery

By SumGuy 11 min read

Argo Rollouts vs Flagger — Progressive delivery on Kubernetes (pick the right forklift)

Progressive delivery is the engineering equivalent of easing a forklift into a crowded warehouse aisle: you don’t slam the throttle and pray the boxes rearrange themselves politely. It’s deploying changes gradually, shifting a little traffic, checking metrics, and only then committing to the full switch — so your 2 AM pager stays mercifully quiet.

Below: a pragmatic, slightly snarky face-off between Argo Rollouts and Flagger. Both get the job done, but they drive different rigs. Code examples, metric queries, and real-world tradeoffs included.

What is progressive delivery? (quick hook)

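Progressive delivery = staged, observable, reversible releases. Instead of “deploy-and-hope,” you shift a small share of traffic to the new version, check the metrics, and only then promote it or roll it back.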

It’s like test driving a car on a neighborhood street before taking it onto the interstate — fewer witnesses to your mistakes.

Argo Rollouts: The Replacement Model

Control model: Rollout CRD replaces Deployment

Argo Rollouts is opinionated: you replace your Deployment with a Rollout custom resource. The Rollout becomes the authoritative workload object: replicas, strategy (canary / blue-green), and traffic routing all live on it. That “replacement” model gives tight control, but it means you must migrate resources to the Rollout type.

Pros: single source of truth, first-class rollout semantics, tight UI and CLI. Cons: you no longer operate a Deployment, which can confuse some GitOps flows if your pipeline expects Deployments.
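If rewriting every manifest sounds painful, there is a softer path worth knowing about: a Rollout can point at an existing Deployment through workloadRef instead of embedding its own pod template. A minimal sketch, assuming a Deployment named canary-demo already exists in the same namespace (the name and the step values here are illustrative, not from a real setup):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: canary-demo
  namespace: default
spec:
  replicas: 4
  selector:
    matchLabels:
      app: canary-demo
  # Reuse the pod template of an existing Deployment (assumed to exist)
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: canary-demo
  strategy:
    canary:
      steps:
        - setWeight: 20
        - pause: {duration: 1m}
```

The Rollout still owns the release process here, so you would typically scale the referenced Deployment down to zero once the Rollout is healthy to avoid running both sets of pods.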

Example YAML

Blue–green (preview + stable service):

rollout-bluegreen.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: bluegreen-demo
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: bluegreen-demo
  template:
    metadata:
      labels:
        app: bluegreen-demo
    spec:
      containers:
        - name: web
          image: ghcr.io/stefanprodan/podinfo:6.0.1
          ports:
            - containerPort: 9898
  strategy:
    blueGreen:
      activeService: bluegreen-demo
      previewService: bluegreen-demo-preview
      autoPromotionEnabled: false # manual promotion for verification
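The blue-green Rollout above only names its active and preview Services; it does not create them. A sketch of what those two Services could look like, assuming you expose podinfo on Service port 80 (the port number is an assumption; the Service names match the Rollout):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: bluegreen-demo
  namespace: default
spec:
  selector:
    app: bluegreen-demo
  ports:
    - name: http
      port: 80          # assumed Service port
      targetPort: 9898  # podinfo container port from the Rollout
---
apiVersion: v1
kind: Service
metadata:
  name: bluegreen-demo-preview
  namespace: default
spec:
  selector:
    app: bluegreen-demo
  ports:
    - name: http
      port: 80
      targetPort: 9898
```

Both Services start with the same selector. The Rollouts controller then adds a rollouts-pod-template-hash entry to each selector so the active Service points at the current ReplicaSet and the preview Service at the new one.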

Canary with Istio traffic shifting and analysis steps:

rollout-canary.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: canary-demo
  namespace: default
spec:
  replicas: 4
  selector:
    matchLabels:
      app: canary-demo
  template:
    metadata:
      labels:
        app: canary-demo
    spec:
      containers:
        - name: web
          image: ghcr.io/stefanprodan/podinfo:6.0.1
          ports:
            - containerPort: 9898
  strategy:
    canary:
      # Istio host-level splitting needs both Services; these names are placeholders
      canaryService: canary-demo-canary
      stableService: canary-demo-stable
      steps:
        - setWeight: 20
        - pause: {duration: 1m}
        - analysis:
            templates:
              - templateName: success-rate
        - setWeight: 50
        - pause: {duration: 2m}
        - analysis:
            templates:
              - templateName: success-rate
        - setWeight: 100
      trafficRouting:
        istio:
          virtualService:
            name: canary-demo-vs
            routes:
              - http # name of the HTTP route inside the VirtualService
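The canary Rollout refers to a VirtualService called canary-demo-vs with a route named http, and Argo edits the weights on that route as the steps advance. A sketch of a matching VirtualService, assuming two Services named canary-demo-stable and canary-demo-canary that the Rollout's stableService and canaryService fields would point at (those Service names, the gateway, and the hostname are all hypothetical):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: canary-demo-vs
  namespace: default
spec:
  gateways:
    - canary-demo-gateway        # hypothetical Istio Gateway
  hosts:
    - canary-demo.example.com    # hypothetical external hostname
  http:
    - name: http                 # must match the route name listed in the Rollout
      route:
        - destination:
            host: canary-demo-stable   # assumed stable Service
          weight: 100
        - destination:
            host: canary-demo-canary   # assumed canary Service
          weight: 0
```

You set the starting weights to 100/0 once; after that the controller rewrites them at each setWeight step, so a GitOps tool should be told to ignore drift on those fields.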

Notes:

Traffic shifting providers (Istio, NGINX, etc.)

Argo Rollouts supports traffic shifting through a set of providers: Istio, NGINX Ingress, AWS ALB, SMI implementations, and Gateway API (through a traffic-router plugin), plus others such as Ambassador and Traefik. That makes it flexible whether you’re on a service mesh or using plain Ingress.

If your infra is Istio-first, Argo’s native Istio integration is very smooth. If you’re on another mesh or an Ingress controller, check that your provider supports weight-based routing (NGINX and ALB do).

AnalysisTemplate & AnalysisRun

Argo Rollouts ships with AnalysisTemplate (a reusable definition of checks) and AnalysisRun (one execution of a template). You author reusable AnalysisTemplates that run Prometheus queries, web (HTTP) checks, or Kubernetes Jobs for arbitrary scripts.

Example AnalysisTemplate that checks request success-rate via Prometheus:

analysis-template.yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
  namespace: default
spec:
  metrics:
    - name: request-success-rate
      interval: 1m
      count: 3
      # the Prometheus provider returns a vector here, so index the first sample
      successCondition: result[0] >= 99
      failureCondition: result[0] < 99
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090
          query: |
            sum(rate(istio_requests_total{destination_service=~"canary-demo.*",response_code!~"5.."}[5m]))
            / sum(rate(istio_requests_total{destination_service=~"canary-demo.*"}[5m])) * 100

Argo creates AnalysisRuns automatically when a Rollout reaches an analysis step. A failed AnalysisRun aborts the rollout and sends traffic back to the stable version; an inconclusive one pauses the rollout until a human decides.
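The "reusable" part of AnalysisTemplates comes from arguments: a template can declare args and the Rollout step fills them in. A sketch of that pattern, assuming a template named success-rate-by-service and an argument called service-name (both names are made up for illustration):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate-by-service   # hypothetical template name
  namespace: default
spec:
  args:
    - name: service-name          # supplied by each Rollout
  metrics:
    - name: request-success-rate
      interval: 1m
      count: 3
      successCondition: result[0] >= 99
      failureLimit: 1             # tolerate one failed measurement
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090
          query: |
            sum(rate(istio_requests_total{destination_service=~"{{args.service-name}}.*",response_code!~"5.."}[5m]))
            / sum(rate(istio_requests_total{destination_service=~"{{args.service-name}}.*"}[5m])) * 100
---
# Fragment of a Rollout: spec.strategy.canary.steps
steps:
  - analysis:
      templates:
        - templateName: success-rate-by-service
      args:
        - name: service-name
          value: canary-demo
```

One template can then cover every service that emits the same Istio metrics, and only the value passed in the step changes per Rollout.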

Promotion & rollback behavior

Argo Rollouts dashboard UI win

Argo Rollouts includes a slick web UI, started with kubectl argo rollouts dashboard, and a CLI view (kubectl argo rollouts get rollout <name> -n <namespace>) that visualize steps, weights, AnalysisRuns, and ReplicaSet history. It’s a strong operator UX: clicky, visual, and the kind of thing you want when a human is deciding if a release is safe.

Flagger: The Sidecar Model

Control model: Canary CRD watches Deployment

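Flagger follows a different philosophy: it doesn’t replace your Deployment. Instead, Flagger watches your existing Deployment and orchestrates traffic shifting, analysis, and promotion externally. The Canary CRD points to a Deployment (targetRef) and the Flagger controller performs the progressive delivery. Under the hood it creates a <name>-primary copy of the workload that serves live traffic, and keeps your original Deployment scaled to zero between releases.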

Pros: non-invasive (keeps Deployment), works well with Flux/GitOps patterns where Deployments are expected. Cons: model split (Deployment vs Canary) can feel like a companion app that occasionally does weird things.

Example Canary CRD YAML

This is a typical Flagger Canary that uses Istio traffic shifting and an inline Prometheus metric:

canary.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  service:
    port: 9898
  provider: istio # spec.provider is a plain string
  analysis:
    interval: 1m
    threshold: 10 # failed checks tolerated before rollback
    maxWeight: 50
    stepWeight: 5
    metrics:
      - name: request-success-rate
        thresholdRange: # a min/max object, not a list
          min: 99
          max: 100
        interval: 1m
        # inline queries still work but are deprecated in favor of templateRef
        query: |
          sum(rate(istio_requests_total{destination_service=~"podinfo.*",response_code!~"5.."}[5m]))
          / sum(rate(istio_requests_total{destination_service=~"podinfo.*"}[5m])) * 100

Key points:

Traffic provider support (Istio, Linkerd, NGINX, Gateway API, App Mesh)

Flagger supports a wide set of providers: Istio, Linkerd, NGINX Ingress Controller, Gateway API, Contour, AWS App Mesh, and more. It talks to the data plane through per-provider routers and updates virtual services, ingresses, or routes to change weights.

This makes Flagger a good choice if you already have Deployments and want an external controller to manage canaries without switching resource types.

MetricTemplate in Flagger

Flagger lets you define MetricTemplate CRs that centralize Prometheus (or other) queries. Reuse them across canaries. A MetricTemplate decouples query details from Canary objects.

Example MetricTemplate:

metric-template.yaml
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: request-success-rate
  namespace: flagger
spec:
  provider:
    type: prometheus
    address: http://prometheus.monitoring.svc:9090
  # Flagger fills in {{ namespace }}, {{ target }}, and {{ interval }} per Canary
  query: |
    sum(rate(istio_requests_total{destination_workload_namespace="{{ namespace }}",destination_workload="{{ target }}",response_code!~"5.."}[{{ interval }}]))
    / sum(rate(istio_requests_total{destination_workload_namespace="{{ namespace }}",destination_workload="{{ target }}"}[{{ interval }}])) * 100

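Note: Flagger substitutes template variables such as {{ namespace }}, {{ target }}, and {{ interval }} when it runs the query. The MetricTemplate is then referenced from the Canary analysis.metrics block through templateRef, and the thresholdRange is set there rather than on the template.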
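To make the reference concrete, here is a sketch of the analysis section of a Canary that points at the request-success-rate MetricTemplate in the flagger namespace. The interval, threshold, and weight values are carried over from the earlier Canary example and are illustrative only:

```yaml
# Fragment of a Canary: spec.analysis
analysis:
  interval: 1m
  threshold: 10
  maxWeight: 50
  stepWeight: 5
  metrics:
    - name: request-success-rate
      templateRef:
        name: request-success-rate
        namespace: flagger   # templates can live in a different namespace than the Canary
      thresholdRange:
        min: 99              # pass only if the query returns 99 or higher
      interval: 1m
```

Keeping the threshold on the Canary means two services can share one query while holding themselves to different standards.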

Prometheus metric queries

Examples you’ll actually paste into templates:

Error rate (example for Istio):

sum(rate(istio_requests_total{destination_service=~"podinfo.*",response_code=~"5.."}[5m]))
/ sum(rate(istio_requests_total{destination_service=~"podinfo.*"}[5m])) * 100

Request success rate (inverse of the error rate, as a 0–1 ratio rather than a percentage):

1 - sum(rate(istio_requests_total{destination_service=~"podinfo.*",response_code=~"5.."}[5m]))
/ sum(rate(istio_requests_total{destination_service=~"podinfo.*"}[5m]))

Latency P95 for service (Istio reports request duration in milliseconds):

histogram_quantile(0.95, sum(rate(istio_request_duration_milliseconds_bucket{destination_service=~"podinfo.*"}[5m])) by (le))

Always test queries in Prometheus or Grafana before wiring to analysis; a bad query = false positives at 3 AM.

Manual vs automatic promotion

Practically: Flagger is built to run automatically in most setups (automatic promotion by default), but you can add gates: manual approvals in your GitOps flow, or confirm-rollout and confirm-promotion webhooks that hold a release until someone signs off.
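One way to get a human in the loop is Flagger's gating webhooks. The sketch below assumes the Flagger load tester is deployed in the test namespace and that its gate endpoint is reachable at the URL shown; the webhook names and the URL are assumptions to adapt, not values from this setup:

```yaml
# Fragment of a Canary: spec.analysis
analysis:
  webhooks:
    - name: rollout-gate         # hypothetical name
      type: confirm-rollout      # checked before the canary starts receiving traffic
      url: http://flagger-loadtester.test/gate/check
    - name: promotion-gate       # hypothetical name
      type: confirm-promotion    # checked before the canary is promoted to primary
      url: http://flagger-loadtester.test/gate/check
```

Flagger waits at each gate until the endpoint answers with a success status, so the "approval" can be anything that flips that response: a chat bot, a pipeline step, or a person with curl.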

Rollback story

Flagger will automatically roll back when an analysis fails: it sets the canary weight back to 0, routes traffic back to the stable revision, and marks the Canary as failed. It also exposes status conditions for automation or alerting.
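If you want that failure to reach a human instead of sitting quietly in a status field, Flagger can send alerts through an AlertProvider. A sketch assuming a Slack webhook stored in a Secret named on-call-url (the provider name, channel, and Secret are all hypothetical):

```yaml
apiVersion: flagger.app/v1beta1
kind: AlertProvider
metadata:
  name: on-call                  # hypothetical provider name
  namespace: flagger
spec:
  type: slack
  channel: on-call-alerts        # hypothetical channel
  username: flagger
  secretRef:
    name: on-call-url            # assumed Secret holding the webhook address
---
# Fragment of a Canary: spec.analysis
analysis:
  alerts:
    - name: on-call Slack
      severity: error            # only notify on failed analysis and rollback
      providerRef:
        name: on-call
        namespace: flagger
```

With severity set to error, routine promotions stay silent and only rollbacks make noise, which is usually what an on-call channel wants.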

Rollback in Flagger is battle-tested — many teams use Flagger as an automated safety net in production clusters.

Key Differences Table/Matrix

| Area | Argo Rollouts | Flagger |
| --- | --- | --- |
| Control model | Replaces Deployment with Rollout CRD (single source of truth) | Watches existing Deployment with Canary CRD (non-invasive) |
| UI / UX | Strong dashboard + CLI plugin (Argo Rollouts UI); Argo wins | No native visual dashboard; relies on logs/Prometheus/Grafana |
| Integration | First-class with Argo CD, kubectl plugin | Built to play with Flux and conventional Deployments; works with Argo CD too |
| Metric sources | AnalysisTemplate supports Prometheus, webhooks, Datadog via providers | MetricTemplate supports Prometheus (and other providers via adapters) |
| Traffic providers | Istio, NGINX, SMI, Gateway API, ALB, etc. | Istio, Linkerd, NGINX Ingress, Gateway API, App Mesh, etc. |
| Complexity | Slightly higher because you change resource type; richer UI | Lower friction for teams that want to keep Deployments, easier GitOps compatibility |
| Best for | Teams that want tight control, visual ops, Argo CD shop | Teams that want minimal intrusion, Flux/Deployment-first workflows |

Decision Framework

Do you actually need this, or is rolling update fine?

Ask the real questions:

If you’re running a small internal cron job, a simple rolling update is fine. If you’re deploying high-traffic frontends, APIs, payment or auth services — progressive delivery is worth the extra configuration. The rule of thumb: if a failed deploy costs more than a coffee or an apology email, add a progressive gate.
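For comparison, here is roughly what "a simple rolling update" buys you with nothing but a Deployment. This is a sketch using the same podinfo image as the earlier examples; the surge settings and the /readyz probe path are assumptions to tune for your own service:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
  namespace: default
spec:
  replicas: 4
  selector:
    matchLabels:
      app: podinfo
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # add one new pod at a time
      maxUnavailable: 0   # never drop below the desired replica count
  template:
    metadata:
      labels:
        app: podinfo
    spec:
      containers:
        - name: web
          image: ghcr.io/stefanprodan/podinfo:6.0.1
          ports:
            - containerPort: 9898
          readinessProbe:          # the only "gate" a rolling update has
            httpGet:
              path: /readyz        # assumed readiness endpoint
              port: 9898
```

The readiness probe only answers "did the pod start?", not "are users getting errors?". That gap between the two questions is what the analysis steps in Argo Rollouts and Flagger are there to close.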

Which fits your team?

Other factors:

Real commands (handy)

# Argo Rollouts CLI: inspect rollout
kubectl argo rollouts get rollout -n default canary-demo
# Open the dashboard
kubectl argo rollouts dashboard -n default
# Flagger: check canary status
kubectl -n test get canary podinfo
kubectl -n test describe canary podinfo

Conclusion (SumGuy voice — stop guessing, pick one)

If you’re running a fleet of critical services and want a polished cockpit with step-by-step controls, Argo Rollouts is the nicer driving experience. Swap in the Rollout CRD, hook up AnalysisTemplates, point it at Istio or Gateway API, and the dashboard makes you feel like a capable airline pilot. It’s less like a forklift and more like a precision crane.

If you like your Deployments and want a “set-it-and-forget-it” safety rail that watches your existing objects, Flagger is the faithful mechanic riding shotgun. It’s less intrusive, works with Flux, and will quietly nudge traffic back when metrics go sideways.

Pick Argo Rollouts if your team values UI, tight control, and Argo CD integration. Pick Flagger if you want minimal churn, Deployment-first workflows, and a controller that tucks neatly into an existing mesh or ingress setup.

Either way: add good Prometheus queries, test your analysis in a staging environment, and don’t let the fancy automation lull you into skipping canary observability. Progressive delivery is a seatbelt, not a guarantee — but it’s a damn good seatbelt for production.

Now go pick a forklift, back slowly, and keep the lights on.

