Monitoring

Every Argo CD component exposes Prometheus metrics that you can use to monitor performance, health, and operational status.

Metrics Endpoints

Each Argo CD component exposes metrics on dedicated ports:

Application Controller

Endpoint: argocd-metrics:8082/metrics
Monitors application reconciliation, cluster connections, and sync operations.

API Server

Endpoint: argocd-server-metrics:8083/metrics
Tracks API requests, authentication, and user activity.

Repo Server

Endpoint: argocd-repo-server:8084/metrics
Measures Git operations, manifest generation, and caching.

ApplicationSet Controller

Endpoint: argocd-applicationset-controller:8080/metrics
Monitors ApplicationSet reconciliation and generation.

Key Metrics by Component

Application Controller Metrics

The controller exposes metrics about application state and cluster management.

Application Health & Sync Status

argocd_app_info (gauge)

Labels: name, namespace, project, sync_status, health_status, dest_server
Information about applications including sync and health status

# Count applications by health status
count by (health_status) (argocd_app_info)

# Applications out of sync
argocd_app_info{sync_status="OutOfSync"}
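To make the grouping concrete, the sketch below reproduces what count by (health_status) computes, working directly on the text a metrics endpoint returns. The sample payload, application names, and helper names (parse_labels, count_by) are hypothetical and only illustrate the idea; this is not how Prometheus evaluates the query internally.

```python
import re
from collections import Counter

# Hypothetical excerpt of a controller /metrics response (names are made up).
SAMPLE = '''\
# TYPE argocd_app_info gauge
argocd_app_info{name="web",namespace="argocd",project="default",sync_status="Synced",health_status="Healthy",dest_server="https://kubernetes.default.svc"} 1
argocd_app_info{name="api",namespace="argocd",project="default",sync_status="OutOfSync",health_status="Degraded",dest_server="https://kubernetes.default.svc"} 1
argocd_app_info{name="db",namespace="argocd",project="default",sync_status="Synced",health_status="Healthy",dest_server="https://kubernetes.default.svc"} 1
'''

# Matches key="value" pairs, allowing backslash-escaped characters in values.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_labels(line):
    """Return the labels of one exposition line as a dict ({} if none)."""
    start, end = line.find("{"), line.rfind("}")
    if start == -1 or end == -1:
        return {}
    return dict(LABEL_RE.findall(line[start + 1:end]))


def count_by(text, metric, label):
    """Count series of `metric`, grouped by the value of `label`."""
    counts = Counter()
    for line in text.splitlines():
        if line.startswith(metric + "{"):
            counts[parse_labels(line).get(label, "")] += 1
    return counts


if __name__ == "__main__":
    print(dict(count_by(SAMPLE, "argocd_app_info", "health_status")))
```

Each application contributes one argocd_app_info series, so counting series per label value gives the number of applications in each state.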

Application Performance

argocd_app_reconcile (histogram)

Application reconciliation performance in seconds
Use to identify slow reconciliation

# 95th percentile reconciliation time across all applications
histogram_quantile(0.95,
  sum by (le) (rate(argocd_app_reconcile_bucket[5m]))
)
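The percentile returned for a histogram is an estimate interpolated from cumulative bucket counts, so its precision depends on the bucket boundaries. The sketch below mimics that estimation on hypothetical bucket data; it is a simplified illustration of the interpolation idea, not Prometheus's actual implementation, and the bucket boundaries shown are not Argo CD's real ones.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must contain at least one finite bucket plus the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need finite buckets and a +Inf bucket")

    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total  # position of the requested observation
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile lies beyond the last finite bucket:
                # the best available answer is that bucket's upper bound.
                return buckets[-2][0]
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return buckets[-2][0]


if __name__ == "__main__":
    # Hypothetical cumulative counts of reconciliations per duration bucket.
    sample = [(0.25, 10), (0.5, 40), (1.0, 80), (2.0, 95), (4.0, 100), (math.inf, 100)]
    print(histogram_quantile(0.5, sample))
```

Because values inside a bucket are assumed to be evenly spread, a result that sits exactly on a bucket boundary often means the true value is hidden by coarse buckets.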

argocd_app_k8s_request_total (counter)

Number of Kubernetes API requests per application
High values may indicate performance issues

Cluster Metrics

argocd_cluster_info (gauge)

Information about managed clusters
Labels: name, server, version

argocd_cluster_connection_status (gauge)

Cluster connection health (1 = healthy, 0 = unhealthy)

# Disconnected clusters
argocd_cluster_connection_status == 0

argocd_cluster_api_resource_objects (gauge)

Number of cached Kubernetes resources

argocd_cluster_cache_age_seconds (gauge)

Age of cluster cache data

Sync Operations

argocd_app_sync_total (counter)

Counter for application sync history
Labels: name, namespace, phase, project

argocd_app_sync_duration_seconds_total (counter)

Total time spent syncing applications

# Sync success rate (aggregate both sides so the series match across phases)
sum(rate(argocd_app_sync_total{phase="Succeeded"}[5m])) /
sum(rate(argocd_app_sync_total[5m]))
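As a rough illustration of what a fleet-wide success ratio measures, the sketch below computes it from two hypothetical scrapes of argocd_app_sync_total, keyed by application name and phase. The sample values and the success_ratio helper are assumptions made for this example.

```python
def success_ratio(before, after):
    """Share of syncs that succeeded between two scrapes.

    `before` and `after` map (app name, phase) to the counter value
    observed at each scrape. Returns None if no syncs happened.
    """
    succeeded = total = 0.0
    for key, end in after.items():
        start = before.get(key, 0.0)
        # A counter that went down was reset (for example after a
        # controller restart), so count its increase from zero.
        increase = end - start if end >= start else end
        total += increase
        if key[1] == "Succeeded":
            succeeded += increase
    return succeeded / total if total else None


if __name__ == "__main__":
    # Hypothetical counter values five minutes apart.
    before = {("web", "Succeeded"): 10, ("web", "Failed"): 1, ("api", "Succeeded"): 5}
    after = {("web", "Succeeded"): 13, ("web", "Failed"): 2,
             ("api", "Succeeded"): 5, ("api", "Error"): 1}
    print(success_ratio(before, after))
```

The increases are added up across all applications and phases before dividing, which is why the ratio reflects the whole fleet rather than a single series.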

Repo Server Metrics

Metrics for Git operations and manifest generation.

Git Operations

argocd_git_request_total (counter)

Number of Git requests performed
Labels: repo, request_type (ls-remote, fetch)

argocd_git_request_duration_seconds (histogram)

Git request duration

argocd_git_fetch_fail_total (counter)

Number of failed Git fetch operations

# Git fetch error rate by repo
sum by (repo) (rate(argocd_git_fetch_fail_total[5m]))

Repository Cache

argocd_repo_pending_request_total (gauge)

Number of pending requests requiring repository lock
High values indicate repository contention

# Alert on high pending requests
argocd_repo_pending_request_total > 10

API Server Metrics

gRPC & REST API

grpc_server_handled_total (counter)

Total RPCs completed on the server
Labels: grpc_code, grpc_method, grpc_service

argocd_login_request_total (counter)

Number of login requests

# Failed login attempts
rate(argocd_login_request_total{status="failed"}[5m])

To also collect gRPC request-duration histograms, set the environment variable ARGOCD_ENABLE_GRPC_TIME_HISTOGRAM=true. This histogram is expensive to query and store, so enable it only when you need to troubleshoot performance.
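The PromQL expressions in this section can also be evaluated from a script through the Prometheus HTTP API's instant-query endpoint (/api/v1/query). The following is a minimal sketch that assumes a Prometheus server reachable at http://localhost:9090; that address and the helper names are placeholders for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of your Prometheus server, not of Argo CD itself.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def instant_query(base_url, promql, timeout=10):
    """Run an instant query and return the list of result samples."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


if __name__ == "__main__":
    # Only prints the URL; call instant_query() against a live server.
    print(build_query_url(PROMETHEUS_URL, "rate(argocd_login_request_total[5m])"))
```

Calling instant_query(PROMETHEUS_URL, ...) returns one entry per matching series, each with its labels and current value.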

Prometheus ServiceMonitor Configuration

With the Prometheus Operator, deploy one ServiceMonitor per component so that its metrics are scraped automatically. The example below targets the application controller:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-metrics
  namespace: argocd
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-metrics
  endpoints:
    - port: metrics

Replace release: prometheus-operator with the label selected by your Prometheus installation.
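The other components can be covered by the same pattern. The sketch below generates one ServiceMonitor per metrics service and prints them as a JSON list that kubectl can apply. It assumes that each Service carries an app.kubernetes.io/name label equal to its own name and exposes a port named metrics, as the argocd-metrics example does; verify both against your installation before using it.

```python
import json

# Assumption: these Services exist in your installation and are labelled
# app.kubernetes.io/name=<service name>, like argocd-metrics above.
METRICS_SERVICES = ["argocd-metrics", "argocd-server-metrics", "argocd-repo-server"]


def service_monitor(service, namespace="argocd", release="prometheus-operator"):
    """Return a ServiceMonitor manifest (as a dict) for one metrics Service."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": service,
            "namespace": namespace,
            "labels": {"release": release},
        },
        "spec": {
            "selector": {"matchLabels": {"app.kubernetes.io/name": service}},
            "endpoints": [{"port": "metrics"}],
        },
    }


def render_all(services=METRICS_SERVICES):
    """Render all ServiceMonitors as a single JSON List document."""
    items = [service_monitor(name) for name in services]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)


if __name__ == "__main__":
    print(render_all())
```

Saved as a script, the output can be piped to kubectl apply -f - once the label and port assumptions have been checked.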

Grafana Dashboards

Argo CD provides an official Grafana dashboard for visualizing metrics.

Installing the Dashboard

Download dashboard JSON

Get the official dashboard from the Argo CD repository:

curl -o argocd-dashboard.json \
  https://raw.githubusercontent.com/argoproj/argo-cd/master/examples/dashboard.json

Import to Grafana

Navigate to Grafana UI
Click + → Import
Upload argocd-dashboard.json
Select Prometheus datasource
Click Import

View live dashboard

Access the dashboard at: https://grafana.apps.argoproj.io (demo instance)

Dashboard Panels

The official dashboard includes:

Application Health: Count by health status (Healthy, Progressing, Degraded, Missing)
Application Sync Status: Count by sync status (Synced, OutOfSync)
Reconciliation Performance: Histogram of reconciliation times
Git Fetch Operations: Rate and duration of Git operations
Cluster Connections: Status of managed cluster connections
API Request Rate: gRPC and REST API request rates
Redis Operations: Cache hit rates and operation counts

Alerting Rules

Recommended Prometheus alerting rules:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-alerts
  namespace: argocd
spec:
  groups:
  - name: argocd
    interval: 30s
    rules:
    - alert: ArgoCDAppUnhealthy
      expr: argocd_app_info{health_status="Degraded"} == 1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "ArgoCD Application {{ $labels.name }} is unhealthy"
        description: "Application {{ $labels.name }} in project {{ $labels.project }} has been Degraded for more than 5 minutes."
    
    - alert: ArgoCDAppOutOfSync
      expr: argocd_app_info{sync_status="OutOfSync"} == 1
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: "ArgoCD Application {{ $labels.name }} is out of sync"
        description: "Application {{ $labels.name }} has been OutOfSync for more than 15 minutes."
    
    - alert: ArgoCDClusterDisconnected
      expr: argocd_cluster_connection_status == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "ArgoCD cluster {{ $labels.name }} is disconnected"
        description: "Cluster {{ $labels.name }} at {{ $labels.server }} has been disconnected for more than 5 minutes."
    
    - alert: ArgoCDRepoServerHighPendingRequests
      expr: argocd_repo_pending_request_total > 50
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "ArgoCD Repo Server has high pending requests"
        description: "Repo server has {{ $value }} pending requests, indicating potential performance issues."
    
    - alert: ArgoCDGitFetchFailures
      expr: rate(argocd_git_fetch_fail_total[5m]) > 0.1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "ArgoCD Git fetch failures detected"
        description: "Git fetch operations are failing at rate of {{ $value | humanize }} per second for repo {{ $labels.repo }}."
    
    - alert: ArgoCDReconciliationSlow
      expr: |
        histogram_quantile(0.95, 
          rate(argocd_app_reconcile_bucket[5m])
        ) > 60
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "ArgoCD reconciliation is slow"
        description: "95th percentile reconciliation time is {{ $value | humanize }}s, exceeding 60s threshold."
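The for: field in each rule means the expression must hold at every evaluation for that long before the alert fires; a single healthy evaluation resets the timer. The sketch below simulates that behavior on hypothetical samples evaluated every 30 seconds. It is a simplified model for building intuition, not Prometheus's rule engine.

```python
def firing_times(samples, condition, for_seconds):
    """Return the evaluation timestamps at which an alert would be firing.

    `samples` is a time-ordered list of (timestamp_seconds, value) pairs,
    one per rule evaluation. `condition` decides whether a value breaches
    the threshold.
    """
    firing = []
    pending_since = None  # time the condition first became true
    for ts, value in samples:
        if condition(value):
            if pending_since is None:
                pending_since = ts
            if ts - pending_since >= for_seconds:
                firing.append(ts)
        else:
            # Any healthy evaluation clears the pending state.
            pending_since = None
    return firing


if __name__ == "__main__":
    # Hypothetical connection status: down the whole time except one
    # healthy evaluation at t=150s.
    samples = [(t, 1 if t == 150 else 0) for t in range(0, 600, 30)]
    # Mirrors "argocd_cluster_connection_status == 0" with "for: 5m".
    print(firing_times(samples, lambda v: v == 0, 300))
```

In this example the brief recovery at 150 seconds restarts the five-minute window, so the alert first fires at 480 seconds rather than 300.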

Advanced Monitoring Features

Exposing Application Labels as Metrics

Enable custom application labels in metrics for team-based routing:

containers:
- command:
  - argocd-application-controller
  - --metrics-application-labels
  - team-name
  - --metrics-application-labels
  - business-unit

Result:

argocd_app_labels{label_business_unit="bu-id-1",label_team_name="my-team",name="my-app-1",namespace="argocd",project="important-project"} 1
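Note how team-name appears as label_team_name in the result: the application label is prefixed and characters that are not valid in Prometheus label names are replaced. The sketch below reproduces that mapping as inferred from the example output only; the controller's exact normalization rules are an assumption here.

```python
import re


def metric_label_name(app_label):
    """Map an application label key to the metric label name.

    Inferred from the example above: prefix with "label_" and replace
    every character outside [a-zA-Z0-9_] with an underscore.
    """
    return "label_" + re.sub(r"[^a-zA-Z0-9_]", "_", app_label)


if __name__ == "__main__":
    for key in ("team-name", "business-unit"):
        print(key, "->", metric_label_name(key))
```

Knowing the resulting name up front helps when writing alert routing rules that match on these labels.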

Metrics Cache Expiration

For environments with frequent application creation/deletion, configure cache expiration:

containers:
- command:
  - argocd-application-controller
  - --metrics-cache-expiration=24h0m0s

CPU/Memory Profiling

Enable profiling endpoints for performance troubleshooting:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
data:
  controller.profile.enabled: "true"
  server.profile.enabled: "true"
  reposerver.profile.enabled: "true"

Access profiling data:

# Run the port-forward in one terminal, then pprof in another
kubectl -n argocd port-forward svc/argocd-metrics 8082:8082
go tool pprof http://localhost:8082/debug/pprof/heap

Monitoring Best Practices

Set Baselines

Establish baseline metrics for reconciliation times, sync rates, and API request patterns during normal operations.

Alert Tuning

Tune alert thresholds based on your environment to reduce false positives while catching real issues.

Dashboard Review

Regularly review dashboards to identify trends and potential capacity issues before they impact users.

Metric Cardinality

Monitor metric cardinality, especially with custom labels, to avoid overwhelming Prometheus storage.
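One simple way to keep an eye on cardinality is to count the distinct series each metric contributes in a scrape. The sketch below does this on the raw text of a metrics endpoint; the sample payload is hypothetical, and the parser assumes lines carry no trailing timestamp.

```python
from collections import defaultdict

# Hypothetical excerpt of a /metrics response.
SAMPLE_SCRAPE = '''\
# TYPE argocd_app_labels gauge
argocd_app_labels{label_team_name="a",name="app-1"} 1
argocd_app_labels{label_team_name="b",name="app-2"} 1
argocd_app_labels{label_team_name="b",name="app-3"} 1
# TYPE argocd_cluster_connection_status gauge
argocd_cluster_connection_status{server="https://kubernetes.default.svc"} 1
'''


def series_per_metric(text):
    """Count distinct series (metric name plus label set) per metric."""
    series = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Drop the sample value; what remains identifies the series.
        identity = line.rsplit(" ", 1)[0]
        name = identity.split("{", 1)[0]
        series[name].add(identity)
    return {name: len(ids) for name, ids in series.items()}


if __name__ == "__main__":
    ranked = sorted(series_per_metric(SAMPLE_SCRAPE).items(), key=lambda kv: -kv[1])
    for name, count in ranked:
        print(f"{count:6d}  {name}")
```

Running this before and after enabling a custom label shows how many extra series the label introduces.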
