The High Availability (HA) installation is the recommended deployment method for production environments. It provides redundancy, resiliency, and the ability to handle component failures without service disruption.
Argo CD is largely stateless. All data is persisted as Kubernetes objects in etcd. Redis is only used as a disposable cache and can be safely rebuilt without service disruption.
## Prerequisites

- Kubernetes cluster (version 1.27+)
- Minimum 3 worker nodes (required by the pod anti-affinity rules)
- kubectl CLI configured with cluster-admin access
- IPv4 networking (IPv6-only clusters are not supported)

The HA installation requires at least three distinct nodes because pod anti-affinity rules prevent multiple replicas of the same component from being scheduled onto the same node.
## Installation

### Create the namespace

Create a dedicated namespace for Argo CD:

```shell
kubectl create namespace argocd
```

### Apply the HA manifest

Install Argo CD using the HA manifest:

```shell
kubectl apply -n argocd --server-side --force-conflicts \
  -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/install.yaml
```

For a specific version:

```shell
kubectl apply -n argocd --server-side --force-conflicts \
  -f https://raw.githubusercontent.com/argoproj/argo-cd/v2.14.0/manifests/ha/install.yaml
```
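If you manage the installation with Kustomize instead of applying the manifest directly, the HA manifest can be referenced as a remote resource so the version is pinned in Git. A minimal sketch (the `v2.14.0` pin is an example; substitute the version you target):

```yaml
# kustomization.yaml — sketch of pinning the HA install with Kustomize
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd
resources:
  - https://raw.githubusercontent.com/argoproj/argo-cd/v2.14.0/manifests/ha/install.yaml
```

Apply it with `kubectl apply -k <directory>`; upgrades then become a one-line version bump reviewed in Git.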
### Verify the deployment

Check that all pods are running with multiple replicas:

```shell
kubectl get pods -n argocd
```

Expected output (with multiple replicas):

```
NAME                                   READY   STATUS
argocd-application-controller-0        1/1     Running
argocd-applicationset-controller-xxx   1/1     Running
argocd-dex-server-xxx                  1/1     Running
argocd-notifications-controller-xxx    1/1     Running
argocd-redis-ha-haproxy-xxx            1/1     Running
argocd-redis-ha-haproxy-yyy            1/1     Running
argocd-redis-ha-haproxy-zzz            1/1     Running
argocd-redis-ha-server-0               2/2     Running
argocd-redis-ha-server-1               2/2     Running
argocd-redis-ha-server-2               2/2     Running
argocd-repo-server-xxx                 1/1     Running
argocd-repo-server-yyy                 1/1     Running
argocd-server-xxx                      1/1     Running
argocd-server-yyy                      1/1     Running
```
## HA Architecture
### API Server (argocd-server)

- Type: Deployment
- Replicas: 2+ (configurable)
- Purpose: Stateless API and UI server
- Scaling: Can be scaled horizontally for load distribution

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-server
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: argocd-server
          env:
            - name: ARGOCD_API_SERVER_REPLICAS
              value: "3"
```
### Repository Server (argocd-repo-server)

- Type: Deployment
- Replicas: 2+ (configurable)
- Purpose: Handles manifest generation
- Scaling: Scale based on repository count and manifest generation load
### Application Controller (argocd-application-controller)

- Type: StatefulSet
- Replicas: 1 (can be sharded for large deployments)
- Purpose: Reconciles application state
- Sharding: Enable when managing 1000+ applications or multiple clusters
### Redis High Availability Setup

The HA installation includes Redis Sentinel for automatic failover.

**Redis StatefulSet (argocd-redis-ha-server)**

- Replicas: 3
- Purpose: Redis cache cluster
- Includes: Redis Sentinel for leader election

**HAProxy Deployment (argocd-redis-ha-haproxy)**

- Replicas: 3
- Purpose: Load balancer for the Redis instances
- Routes: Connections to the current Redis master

```yaml
# Redis HA StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-redis-ha-server
spec:
  replicas: 3
  serviceName: argocd-redis-ha
  template:
    spec:
      containers:
        - name: redis
          image: redis:7.0.15-alpine
        - name: sentinel
          image: redis:7.0.15-alpine
```
### Pod Anti-Affinity Rules

The HA manifests include anti-affinity rules to distribute pods across nodes:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: argocd-server
        topologyKey: kubernetes.io/hostname
```

This ensures:

- No single node failure takes down all replicas of a component
- Better resource distribution
- Improved resilience

This requires at least 3 nodes. If you have fewer nodes, you'll need to adjust or remove the anti-affinity rules.
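On clusters with fewer than three nodes, one option is to soften the rule from required to preferred, so the scheduler still spreads replicas where possible but can co-locate them when necessary. A sketch of a strategic-merge patch for `argocd-server` (not part of the official manifests; repeat for the other components and verify field paths against your manifest version):

```yaml
# Patch sketch: replace "required" anti-affinity with a "preferred" rule
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution: []
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: argocd-server
                topologyKey: kubernetes.io/hostname
```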
## HA vs Standard Installation

| Component | Standard | High Availability |
| --- | --- | --- |
| argocd-server | 1 replica | 2+ replicas |
| argocd-repo-server | 1 replica | 2+ replicas |
| argocd-application-controller | 1 replica | 1 replica (shardable) |
| Redis | Single deployment | Redis HA (3 replicas + Sentinel) |
| HAProxy | Not included | 3 replicas |
| Anti-affinity | No | Yes (requires 3+ nodes) |
| Suitable for | Development/Testing | Production |
## Scaling Strategies

### Scaling the API Server

Increase replicas to handle more concurrent users:

```shell
kubectl scale deployment argocd-server -n argocd --replicas=3
```

Then update the ARGOCD_API_SERVER_REPLICAS environment variable to match:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-server
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: argocd-server
          env:
            - name: ARGOCD_API_SERVER_REPLICAS
              value: "3"
```

The ARGOCD_API_SERVER_REPLICAS variable divides the limit on concurrent login requests across the replicas.
### Scaling the Repository Server

Increase replicas to handle more manifest generation load:

```shell
kubectl scale deployment argocd-repo-server -n argocd --replicas=3
```

Tuning parameters:

- `--parallelismlimit`: Maximum concurrent manifest generations (default: 20)
- `--repo-cache-expiration`: Cache duration (default: 24h)
- `ARGOCD_EXEC_TIMEOUT`: Command execution timeout (default: 90s)

```yaml
containers:
  - name: argocd-repo-server
    command:
      - argocd-repo-server
    args:
      - --parallelismlimit=50
      - --repo-cache-expiration=1h
    env:
      - name: ARGOCD_EXEC_TIMEOUT
        value: "2m"
```
### Sharding the Application Controller

For managing 1000+ applications, enable controller sharding:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
spec:
  replicas: 2
  template:
    spec:
      containers:
        - name: argocd-application-controller
          env:
            - name: ARGOCD_CONTROLLER_REPLICAS
              value: "2"
          args:
            - --status-processors=50
            - --operation-processors=25
            - --sharding-method=consistent-hashing
```

Sharding methods:

- `legacy`: UID-based distribution (non-uniform)
- `round-robin`: Equal distribution across shards (alpha)
- `consistent-hashing`: Bounded-load algorithm (alpha)

The round-robin and consistent-hashing algorithms are experimental. Test thoroughly before using them in production.
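When sharding is enabled, a cluster can also be pinned to a specific shard via the optional `shard` field in its cluster secret, which overrides the automatic distribution. A hypothetical example (the cluster name and server URL are placeholders):

```yaml
# Cluster secret sketch: pin this cluster to shard 1
apiVersion: v1
kind: Secret
metadata:
  name: prod-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
stringData:
  name: prod
  server: https://prod.example.com:6443
  shard: "1"
```

Manual pinning can be useful for keeping an unusually large cluster on its own shard; for most setups the automatic distribution is sufficient.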
## Namespace-Level HA Installation

For an HA installation without cluster-admin privileges:

```shell
# Install CRDs first
kubectl apply --server-side --force-conflicts \
  -k https://github.com/argoproj/argo-cd/manifests/crds?ref=stable

# Install HA namespace-scoped resources
kubectl create namespace argocd
kubectl apply -n argocd --server-side --force-conflicts \
  -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/namespace-install.yaml
```
## Performance Tuning

### Application Controller

Processor configuration:

```yaml
containers:
  - name: argocd-application-controller
    args:
      # For ~1000 applications
      - --status-processors=50
      - --operation-processors=25
      - --repo-server-timeout-seconds=120
    env:
      - name: ARGOCD_RECONCILIATION_TIMEOUT
        value: "180s"
      - name: ARGOCD_RECONCILIATION_JITTER
        value: "60"
```
Resource requests/limits:

```yaml
resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi
```
### Repository Server

Parallelism and caching:

```yaml
containers:
  - name: argocd-repo-server
    args:
      - --parallelismlimit=50
      - --repo-cache-expiration=1h
    env:
      - name: ARGOCD_EXEC_TIMEOUT
        value: "2m30s"
      - name: ARGOCD_GIT_ATTEMPTS_COUNT
        value: "3"
      - name: TMPDIR
        value: "/tmp"
    volumeMounts:
      - name: tmp
        mountPath: /tmp
volumes:
  - name: tmp
    emptyDir:
      sizeLimit: 10Gi
```
### Redis HA

Redis resource tuning:

```yaml
# Redis StatefulSet resources
containers:
  - name: redis
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi
  - name: sentinel
    resources:
      requests:
        cpu: 50m
        memory: 64Mi
      limits:
        cpu: 100m
        memory: 128Mi
```
## Monitoring and Observability

### Metrics Endpoints

All components expose Prometheus metrics:

- argocd-server: `:8083/metrics`
- argocd-repo-server: `:8084/metrics`
- argocd-application-controller: `:8082/metrics`
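If the Prometheus Operator is running in the cluster, these endpoints can be scraped with ServiceMonitor resources. A sketch for the application controller metrics service (the service label and port name follow the default manifests, but verify them against your installation with `kubectl get svc -n argocd --show-labels`):

```yaml
# ServiceMonitor sketch for the application controller metrics endpoint
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-application-controller-metrics
  namespace: argocd
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-metrics
  endpoints:
    - port: metrics
```

Similar ServiceMonitors would be needed for the server and repo-server metrics services.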
### Key Metrics

```
# Application reconciliation duration
argocd_app_reconcile

# Git requests total
argocd_git_request_total

# Kubernetes API requests per application
argocd_app_k8s_request_total
```
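These metrics can feed alerting rules. A sketch of a PrometheusRule (assumes the Prometheus Operator CRDs and the standard labels on the `argocd_app_info` metric; tune the threshold and duration to your environment):

```yaml
# PrometheusRule sketch: warn when an application stays out of sync
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-alerts
  namespace: argocd
spec:
  groups:
    - name: argocd
      rules:
        - alert: ArgoCDAppNotSynced
          expr: argocd_app_info{sync_status!="Synced"} == 1
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "Argo CD application {{ $labels.name }} has been out of sync for over an hour"
```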
### Enable Profiling (Optional)

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
data:
  controller.profile.enabled: "true"
  reposerver.profile.enabled: "true"
  server.profile.enabled: "true"
```

Access the profiling endpoints:

```shell
kubectl port-forward svc/argocd-metrics 8082:8082 -n argocd
go tool pprof http://localhost:8082/debug/pprof/heap
```
## Upgrading

To upgrade the HA installation:

```shell
kubectl apply -n argocd --server-side --force-conflicts \
  -f https://raw.githubusercontent.com/argoproj/argo-cd/<version>/manifests/ha/install.yaml
```

Always review the upgrade notes before upgrading, and take backups of critical data.
## Disaster Recovery

### Backup Strategy

Since Argo CD stores all state in Kubernetes objects:

```shell
# Back up all Argo CD resources
kubectl get applications,applicationsets,appprojects -n argocd -o yaml > argocd-backup.yaml

# Back up configuration
kubectl get configmaps,secrets -n argocd -o yaml > argocd-config-backup.yaml
```
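The export above can be automated with a CronJob. A heavily simplified sketch (the `argocd-backup` ServiceAccount, the `argocd-backups` PVC, and the kubectl image are all assumptions you would replace; the ServiceAccount needs read access to these resource types in the argocd namespace):

```yaml
# CronJob sketch: nightly export of Argo CD resources to a PVC
apiVersion: batch/v1
kind: CronJob
metadata:
  name: argocd-backup
  namespace: argocd
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: argocd-backup
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: bitnami/kubectl:latest
              command: ["/bin/sh", "-c"]
              args:
                - kubectl get applications,applicationsets,appprojects -n argocd -o yaml
                  > /backups/argocd-$(date +%F).yaml
              volumeMounts:
                - name: backups
                  mountPath: /backups
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: argocd-backups
```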
### Restore Strategy

```shell
# Restore resources
kubectl apply -f argocd-backup.yaml
kubectl apply -f argocd-config-backup.yaml
```

Redis is only a cache: even if its data is lost, Argo CD rebuilds the cache automatically.
## Troubleshooting

### Pods stuck in Pending (anti-affinity issues)

If you have fewer than 3 nodes, you'll need to adjust the anti-affinity rules:

```shell
kubectl patch deployment argocd-server -n argocd --type json \
  -p='[{"op": "remove", "path": "/spec/template/spec/affinity"}]'
```

Or create a kustomization that removes the anti-affinity rules.
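Such a kustomization might look like the sketch below. The two patch targets shown are examples; add one entry per HA component whose scheduling you need to relax:

```yaml
# kustomization.yaml sketch: strip anti-affinity from selected components
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd
resources:
  - https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/install.yaml
patches:
  - target:
      kind: Deployment
      name: argocd-server
    patch: |-
      - op: remove
        path: /spec/template/spec/affinity
  - target:
      kind: Deployment
      name: argocd-repo-server
    patch: |-
      - op: remove
        path: /spec/template/spec/affinity
```

Keeping this in Git has the advantage over `kubectl patch` that the change survives re-applying the upstream manifest.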
### Redis HA connection issues

Check Redis Sentinel status:

```shell
kubectl exec -it argocd-redis-ha-server-0 -n argocd -c sentinel -- \
  redis-cli -p 26379 SENTINEL masters
```

Check HAProxy status:

```shell
kubectl logs -n argocd deploy/argocd-redis-ha-haproxy
```

### Performance issues

Review metrics and adjust resource limits:

```shell
kubectl top pods -n argocd
kubectl describe pod -n argocd <pod-name>
```

Consider enabling profiling to identify bottlenecks (see the Monitoring section).
## Next Steps

- **Configure SSO**: Set up Single Sign-On for your team
- **Add Clusters**: Register external Kubernetes clusters
- **Monitoring**: Set up Prometheus and Grafana
- **Backup & Restore**: Implement disaster recovery