GitOps practices for declarative, automated, and auditable infrastructure management
15 min read
Introduction to GitOps
GitOps is a modern operational framework that takes DevOps best practices used for application development and applies them to infrastructure automation. It uses Git as a single source of truth for declarative infrastructure and applications, providing automated delivery and continuous deployment.
Core Principles of GitOps
1. Declarative Configuration
# Everything is described declaratively
# kubernetes/manifests/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
          ports:
            - containerPort: 80
2. Version Controlled and Immutable
# Git history provides an audit trail
git log --oneline
# a1b2c3d (HEAD -> main) feat: scale nginx to 5 replicas
# e5f6a7b feat: add nginx deployment
# c8d9e0f initial commit

# Every change is tracked and revertible
git revert a1b2c3d
3. Automated Delivery
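# Automated synchronization
# .github/workflows/gitops-sync.yaml
name: GitOps Sync
on:
  push:
    branches: [ main ]
jobs:
  sync:
    runs-on: ubuntu-latest
    env:
      # Assumed to be configured as repository secrets
      ARGOCD_SERVER: ${{ secrets.ARGOCD_SERVER }}
      ARGOCD_TOKEN: ${{ secrets.ARGOCD_TOKEN }}
    steps:
      - name: Sync with ArgoCD
        run: |
          curl -fsS -X POST "$ARGOCD_SERVER/api/v1/applications/my-app/sync" \
            -H "Authorization: Bearer $ARGOCD_TOKEN"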
4. Continuous Reconciliation
# The system continuously reconciles actual state with desired state
# If someone manually changes replicas to 10...
kubectl scale deployment/nginx-deployment --replicas=10
# ...the GitOps operator detects the drift and reconciles back to 3
# Actual State (10 replicas) → Desired State (3 replicas)
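The reconciliation idea can be sketched as a small comparison-and-repair step. This is a toy model, not Argo CD or Flux code: plain dictionaries stand in for the state declared in Git and the live cluster state, and `detect_drift` and `reconcile` are hypothetical names chosen for illustration.

```python
def detect_drift(desired: dict, actual: dict) -> dict:
    """Return the fields whose live value differs from the value declared in Git."""
    return {
        key: {"desired": value, "actual": actual.get(key)}
        for key, value in desired.items()
        if actual.get(key) != value
    }


def reconcile(desired: dict, actual: dict) -> dict:
    """Return a copy of the live state with every drifted field reset to its declared value."""
    healed = dict(actual)
    for key, change in detect_drift(desired, actual).items():
        healed[key] = change["desired"]
    return healed


# Git declares 3 replicas; someone scaled the live Deployment to 10 by hand.
desired_state = {"replicas": 3, "image": "nginx:1.21"}
live_state = {"replicas": 10, "image": "nginx:1.21"}

drift = detect_drift(desired_state, live_state)
live_state = reconcile(desired_state, live_state)
```

A real operator runs this comparison on a timer or in response to cluster events, and applies the correction through the Kubernetes API rather than by editing a dictionary.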
GitOps vs Traditional DevOps
Traditional DevOps Pipeline
# Traditional approach
Developer → Commit Code → CI Pipeline → Build Image → Update Manifest → Deploy to Cluster → Manual Verification
GitOps Pipeline
# GitOps approach
Developer → Commit to Git (Infrastructure as Code) → GitOps Operator → Automatic Sync → Continuous Reconciliation
GitOps Tools Ecosystem
Popular GitOps Tools
- ArgoCD: Declarative Kubernetes CD
- FluxCD: GitOps Kubernetes operator
- Jenkins X: CI/CD with GitOps
- Tekton: Cloud-native CI/CD
- Weave GitOps: Enterprise GitOps platform
Implementing GitOps with ArgoCD
ArgoCD Application Configuration
# applications/production/app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: production-app
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/gitops-repo
    targetRevision: HEAD
    # Matches the overlay directory in the repository layout
    path: kubernetes/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
      - CreateNamespace=true
      - PruneLast=true
GitOps Directory Structure
Recommended Structure
gitops-repo/
├── applications/              # ArgoCD Application definitions
│   ├── production/
│   │   ├── app.yaml
│   │   └── kustomization.yaml
│   ├── staging/
│   │   └── app.yaml
│   └── development/
│       └── app.yaml
├── kubernetes/                # Kubernetes manifests
│   ├── base/                  # Common base configurations
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   ├── configmap.yaml
│   │   └── kustomization.yaml
│   ├── overlays/
│   │   ├── development/       # Development specific patches
│   │   │   ├── kustomization.yaml
│   │   │   └── patch.yaml
│   │   ├── staging/           # Staging specific patches
│   │   │   ├── kustomization.yaml
│   │   │   └── patch.yaml
│   │   └── production/        # Production specific patches
│   │       ├── kustomization.yaml
│   │       └── patch.yaml
│   └── cluster/               # Cluster-level resources
│       ├── namespaces.yaml
│       ├── network-policies.yaml
│       └── rbac.yaml
├── helm/                      # Helm charts
│   ├── my-app/
│   │   ├── Chart.yaml
│   │   ├── values.yaml
│   │   └── templates/
│   └── dependencies/
├── scripts/                   # Utility scripts
│   ├── bootstrap.sh
│   └── health-check.sh
└── README.md
Kustomize for Environment Management
Base Configuration
# kubernetes/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - configmap.yaml
commonLabels:
  app: my-application
  version: v1.0.0
Environment Overlays
# kubernetes/overlays/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
  - ../../base
replicas:
  - name: my-application
    count: 5
# `patches` replaces the deprecated `patchesStrategicMerge` field
patches:
  - path: patch.yaml
images:
  - name: my-application
    newTag: v1.2.3
# kubernetes/overlays/production/patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-application
spec:
  template:
    spec:
      containers:
        - name: app
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
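To make the overlay mechanism concrete, here is a simplified sketch of how a patch can be layered onto a base manifest. It is not Kustomize's actual strategic-merge implementation: `merge_patch` is a hypothetical helper that merges dictionaries key by key and matches list items by their `name` field, which roughly mirrors how containers are matched.

```python
import copy


def merge_patch(base, patch):
    """Recursively merge `patch` into a copy of `base`.

    Dicts are merged key by key, lists of named dicts are merged by their
    "name" field, and any other value is replaced by the patch value.
    """
    if isinstance(base, dict) and isinstance(patch, dict):
        merged = copy.deepcopy(base)
        for key, value in patch.items():
            merged[key] = merge_patch(base[key], value) if key in base else copy.deepcopy(value)
        return merged
    if isinstance(base, list) and isinstance(patch, list) and all(
        isinstance(item, dict) and "name" in item for item in base + patch
    ):
        merged = copy.deepcopy(base)
        position = {item["name"]: index for index, item in enumerate(merged)}
        for item in patch:
            if item["name"] in position:
                index = position[item["name"]]
                merged[index] = merge_patch(merged[index], item)
            else:
                merged.append(copy.deepcopy(item))
        return merged
    return copy.deepcopy(patch)


# Hypothetical base manifest and production patch, reduced to the relevant fields.
base_deployment = {
    "spec": {
        "replicas": 1,
        "template": {"spec": {"containers": [{"name": "app", "image": "my-application:v1.0.0"}]}},
    }
}
production_patch = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "app", "resources": {"limits": {"memory": "512Mi", "cpu": "500m"}}}
                ]
            }
        }
    }
}

rendered = merge_patch(base_deployment, production_patch)
container = rendered["spec"]["template"]["spec"]["containers"][0]
```

The base stays untouched and the patch only names the fields that differ, which is what keeps the per-environment overlays small.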
GitOps Workflow Patterns
1. Environment Promotion
# Development → Staging → Production promotion
# Each environment has its own branch or directory

# Feature branch workflow
feature/login-page → develop → staging → main (production)

# Directory-based workflow
kubernetes/overlays/development/
kubernetes/overlays/staging/
kubernetes/overlays/production/
2. Blue-Green Deployment
# blue-green-deployment.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-blue
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/gitops-repo
    targetRevision: blue
    path: kubernetes/blue
  destination:
    server: https://kubernetes.default.svc
    namespace: blue
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-green
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/gitops-repo
    targetRevision: main
    path: kubernetes/green
  destination:
    server: https://kubernetes.default.svc
    namespace: green
Security in GitOps
Secrets Management
# Using Sealed Secrets
# Install the Sealed Secrets controller (the kubeseal CLI is installed separately)
kubectl apply -f https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.18.0/controller.yaml
# Create sealed secret (kubeseal emits JSON unless a format is given)
kubectl create secret generic my-secret \
  --from-literal=password=supersecret \
  --dry-run=client -o yaml | kubeseal --format yaml > sealed-secret.yaml
# sealed-secret.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: my-secret
spec:
  encryptedData:
    password: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq...
RBAC and Access Control
# gitops-rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gitops-deployer
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gitops-deployer-binding
subjects:
  - kind: ServiceAccount
    name: argocd-application-controller
    namespace: argocd
roleRef:
  kind: ClusterRole
  name: gitops-deployer
  apiGroup: rbac.authorization.k8s.io
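A quick way to reason about what this role permits is to treat each rule as a set of allowed combinations. The sketch below is a simplified illustration only: `is_allowed` is a hypothetical helper, and it ignores wildcards, resource names, and namespaces, all of which the real Kubernetes authorizer takes into account.

```python
# The two rules of the gitops-deployer ClusterRole, copied as plain data.
FULL_ACCESS = ["get", "list", "watch", "create", "update", "patch", "delete"]
gitops_deployer_rules = [
    {
        "apiGroups": [""],
        "resources": ["pods", "services", "configmaps", "secrets"],
        "verbs": FULL_ACCESS,
    },
    {
        "apiGroups": ["apps"],
        "resources": ["deployments", "replicasets"],
        "verbs": FULL_ACCESS,
    },
]


def is_allowed(rules, api_group: str, resource: str, verb: str) -> bool:
    """Return True if any single rule lists the API group, the resource, and the verb."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )


can_patch_deployments = is_allowed(gitops_deployer_rules, "apps", "deployments", "patch")
can_delete_secrets = is_allowed(gitops_deployer_rules, "", "secrets", "delete")
can_create_jobs = is_allowed(gitops_deployer_rules, "batch", "jobs", "create")
```

On a live cluster, `kubectl auth can-i` answers the same question against the actual authorizer.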
Monitoring and Observability
GitOps Health Metrics
# Prometheus metrics for GitOps
# Key metrics to monitor:
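# Application sync and health status
# (current Argo CD releases expose both as labels on argocd_app_info;
# argocd_app_sync_status and argocd_app_health_status are legacy gauges)
argocd_app_info{name="my-app",namespace="argocd",sync_status="Synced",health_status="Healthy"} 1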
# Sync duration
argocd_app_sync_duration_seconds{name="my-app"}
# Reconciliation performance
gitops_reconciliation_duration_seconds
gitops_reconciliation_errors_total
GitOps Dashboard
# Grafana dashboard for GitOps
# Track:
# - Deployment frequency
# - Lead time for changes
# - Mean time to recovery (MTTR)
# - Change failure rate
# - Sync success rate
# Alert rules (sync and health state are read from the labels of argocd_app_info)
groups:
  - name: gitops.alerts
    rules:
      - alert: ApplicationOutOfSync
        expr: argocd_app_info{sync_status="OutOfSync"} == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Application {{ $labels.name }} is out of sync"
      - alert: ApplicationUnhealthy
        expr: argocd_app_info{health_status="Degraded"} == 1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Application {{ $labels.name }} is unhealthy"
Multi-Cluster GitOps
Cluster Management
# clusters-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: clusters-config
  namespace: argocd
data:
  clusters.yaml: |
    clusters:
      - name: development
        server: https://dev-cluster.example.com
        labels:
          environment: dev
          region: us-west
      - name: staging
        server: https://staging-cluster.example.com
        labels:
          environment: staging
          region: us-east
      - name: production
        server: https://prod-cluster.example.com
        labels:
          environment: prod
          region: eu-central
ApplicationSet for Multi-Cluster
# applicationset-multi-cluster.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: multi-cluster-apps
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            # Must match the label value set on the cluster (prod, not production)
            environment: prod
  template:
    metadata:
      name: '{{name}}-web-app'
    spec:
      project: default
      source:
        repoURL: https://github.com/my-org/gitops-repo
        targetRevision: main
        path: kubernetes/overlays/production
      destination:
        server: '{{server}}'
        namespace: web-app
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        retry:
          limit: 3
          backoff:
            duration: 5s
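Conceptually, a cluster generator filters the registered clusters by label and stamps out one Application per match. The sketch below models that with plain data; it is an illustration under assumptions, not the ApplicationSet controller's code. The function names are hypothetical, and the example selector `environment: prod` is chosen to match the labels in the cluster list above.

```python
def matches(labels: dict, selector: dict) -> bool:
    """Return True if every selector key is present in labels with the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def render(template, params: dict):
    """Substitute {{key}} placeholders in every string of a nested template."""
    if isinstance(template, dict):
        return {key: render(value, params) for key, value in template.items()}
    if isinstance(template, str):
        for key, value in params.items():
            template = template.replace("{{" + key + "}}", value)
    return template


def generate_applications(clusters: list, selector: dict, template: dict) -> list:
    """Render the template once for each cluster whose labels match the selector."""
    return [
        render(template, {"name": cluster["name"], "server": cluster["server"]})
        for cluster in clusters
        if matches(cluster["labels"], selector)
    ]


clusters = [
    {"name": "development", "server": "https://dev-cluster.example.com",
     "labels": {"environment": "dev", "region": "us-west"}},
    {"name": "staging", "server": "https://staging-cluster.example.com",
     "labels": {"environment": "staging", "region": "us-east"}},
    {"name": "production", "server": "https://prod-cluster.example.com",
     "labels": {"environment": "prod", "region": "eu-central"}},
]
template = {
    "metadata": {"name": "{{name}}-web-app"},
    "spec": {"destination": {"server": "{{server}}", "namespace": "web-app"}},
}

applications = generate_applications(clusters, {"environment": "prod"}, template)
```

A selector value that no cluster carries yields an empty list, which is why the label values on the clusters and in the generator have to agree exactly.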
Disaster Recovery with GitOps
Git as Recovery Source
# Disaster recovery procedure
# 1. Provision new cluster
# 2. Install GitOps operator
# 3. Point to existing Git repository
# 4. All applications automatically deploy

# Bootstrap script for disaster recovery
#!/bin/bash
# bootstrap-recovery.sh
# Stop at the first failing command
set -euo pipefail

# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for ArgoCD to be ready
kubectl wait --for=condition=available deployment/argocd-server -n argocd --timeout=300s

# Create applications from Git
kubectl apply -f https://raw.githubusercontent.com/my-org/gitops-repo/main/applications/production/app.yaml

echo "Disaster recovery initiated. Applications will sync automatically."
Best Practices for GitOps
1. Git Branching Strategy
# Recommended branching strategy
main      → Production (protected)
staging   → Staging environment
develop   → Development environment
feature/* → Feature development
hotfix/*  → Production hotfixes

# Protection rules for main branch
- Require pull request reviews
- Require status checks to pass
- Require linear history
- Include administrators
- Restrict who can push to main
2. Commit Conventions
# Conventional commits for GitOps
feat: add redis cache deployment
fix: correct nginx configmap
docs: update deployment documentation
style: format kustomization.yaml
refactor: reorganize kubernetes manifests
test: add e2e test deployment
chore: update argo-cd to v2.4

# Scope for environment-specific changes
feat(production): increase replicas to 10
fix(development): fix database connection string
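A convention like this is easiest to keep when it is checked automatically, for example in a commit hook or a CI step. The sketch below shows one possible check; the regular expression covers only the commit types listed above, and `parse_commit` is a hypothetical helper, not part of any Git tooling.

```python
import re

# type, optional (scope), then ": " and a non-empty subject
COMMIT_PATTERN = re.compile(
    r"^(?P<type>feat|fix|docs|style|refactor|test|chore)"
    r"(?:\((?P<scope>[a-z0-9-]+)\))?"
    r": (?P<subject>.+)$"
)


def parse_commit(header: str):
    """Return the type, scope, and subject of a commit header, or None if it does not conform."""
    match = COMMIT_PATTERN.match(header)
    return match.groupdict() if match else None


scoped = parse_commit("feat(production): increase replicas to 10")
unscoped = parse_commit("fix: correct nginx configmap")
rejected = parse_commit("updated stuff")
```

The parsed scope can also drive automation, such as requiring an extra reviewer whenever the scope is `production`.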
3. Code Review Process
# Pull request template
## Description

## Type of change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Testing
- [ ] Manually tested in development
- [ ] Automated tests pass
- [ ] Security scan completed

## Checklist
- [ ] Follows GitOps principles
- [ ] No secrets in plain text
- [ ] Backward compatible
- [ ] Documentation updated
Advanced GitOps Patterns
1. Canary Deployments
# canary-deployment.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  service:
    port: 9898
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        # Minimum success rate in percent
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        # Maximum request duration in milliseconds
        thresholdRange:
          max: 500
        interval: 1m
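The analysis settings imply a traffic schedule: the canary's share grows by `stepWeight` after each passing check until it reaches `maxWeight`, and the rollout is abandoned once `threshold` checks have failed. The following is a rough model of that progression under those assumptions. It is not Flagger's controller logic, and `canary_weights` and `run_analysis` are hypothetical names.

```python
def canary_weights(step_weight: int, max_weight: int) -> list:
    """List the traffic percentages the canary passes through on an all-green rollout."""
    return list(range(step_weight, max_weight + 1, step_weight))


def run_analysis(check_results, step_weight=10, max_weight=50, threshold=5):
    """Simulate a rollout: advance one step per passing check, roll back after `threshold` failures."""
    weight, failures = 0, 0
    for passed in check_results:
        if passed:
            weight = min(weight + step_weight, max_weight)
            if weight == max_weight:
                return "promoted", weight
        else:
            failures += 1
            if failures >= threshold:
                return "rolled back", 0
    return "in progress", weight


schedule = canary_weights(10, 50)
all_green = run_analysis([True] * 5)
failing = run_analysis([True, False, False, False, False, False])
partial = run_analysis([True, True])
```

With a 1m interval, the all-green case takes about five analysis rounds before promotion, which gives a feel for how `stepWeight` trades rollout speed against exposure.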
2. GitOps with Feature Flags
# feature-flags.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
data:
  # Keys are valid environment variable names so a Deployment can load them with envFrom
  NEW_UI_ENABLED: "false"
  EXPERIMENTAL_API: "true"
# In application code
if (process.env.NEW_UI_ENABLED === 'true') {
  // Render new UI
} else {
  // Render old UI
}
# Rollout new feature (after setting NEW_UI_ENABLED to "true" in feature-flags.yaml)
git commit -am "feat: enable new UI feature flag"
git push origin main
# ArgoCD automatically syncs the ConfigMap; pods must restart to pick up the new value
Troubleshooting GitOps
Common Issues and Solutions
# 1. Application out of sync
argocd app sync my-app
argocd app get my-app

# 2. Sync conflicts
argocd app diff my-app
# Resolve in Git, then sync

# 3. Permission issues
kubectl auth can-i create deployments --as=system:serviceaccount:argocd:argocd-application-controller

# 4. Resource constraints
kubectl top pods -n argocd
kubectl describe node

# 5. Network connectivity
argocd repo list      # connection state of each Git repository
argocd cluster list   # connection state of each registered cluster
Real-World GitOps Implementation
Enterprise GitOps Setup
# Complete GitOps workflow for enterprise
1. **Source Control**: GitHub Enterprise with branch protection
2. **CI Pipeline**: Jenkins/GitHub Actions for building images
3. **Image Registry**: Harbor/ECR with vulnerability scanning
4. **GitOps Operator**: ArgoCD with SSO and RBAC
5. **Secrets Management**: HashiCorp Vault with external secrets operator
6. **Monitoring**: Prometheus/Grafana with GitOps dashboards
7. **Security**: Aqua/Twistlock for runtime security
8. **Backup**: Velero for cluster state backup

# Success metrics
- Deployment frequency: Multiple times per day
- Lead time for changes: Less than 1 hour
- Mean time to recovery: Less than 15 minutes
- Change failure rate: Less than 5%
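The four success metrics can be computed from deployment history once each deployment is recorded with a few timestamps. The sketch below assumes a hypothetical record shape (commit time, deploy time, and a restore time for deployments that failed) and made-up sample data; in practice the records would come from Git history and the GitOps operator's sync events.

```python
from datetime import datetime, timedelta


def delivery_metrics(deployments: list, window_days: int) -> dict:
    """Summarise deployment records (assumed non-empty) over a window of `window_days` days."""
    lead_times = [d["deployed_at"] - d["committed_at"] for d in deployments]
    failures = [d for d in deployments if d["restored_at"] is not None]
    recovery_times = [d["restored_at"] - d["deployed_at"] for d in failures]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "mean_lead_time": sum(lead_times, timedelta()) / len(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recovery_times, timedelta()) / len(recovery_times)
            if recovery_times
            else timedelta()
        ),
    }


def record(committed: str, deployed: str, restored: str = None) -> dict:
    """Build one hypothetical deployment record from 'YYYY-MM-DD HH:MM' strings."""
    def parse(text):
        return datetime.strptime(text, "%Y-%m-%d %H:%M") if text else None

    return {
        "committed_at": parse(committed),
        "deployed_at": parse(deployed),
        "restored_at": parse(restored),  # set only when the deployment failed
    }


# Two days of made-up history: four deployments, one of which failed and was restored.
history = [
    record("2024-01-01 09:00", "2024-01-01 09:30"),
    record("2024-01-01 13:00", "2024-01-01 13:20", restored="2024-01-01 13:30"),
    record("2024-01-02 10:00", "2024-01-02 10:40"),
    record("2024-01-02 15:00", "2024-01-02 15:30"),
]

metrics = delivery_metrics(history, window_days=2)
```

Comparing these values against the targets above (lead time under 1 hour, recovery under 15 minutes, failure rate under 5%) turns the list of goals into something a dashboard can track.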
Conclusion
GitOps represents a paradigm shift in how we manage infrastructure and applications. By treating infrastructure as code and using Git as the single source of truth, organizations can achieve higher reliability, better audit trails, and faster deployment cycles.
Key Benefits of GitOps:
- Increased Reliability: Automated and consistent deployments
- Enhanced Security: Everything is versioned and auditable
- Better Collaboration: Standardized workflow for all teams
- Faster Recovery: Git serves as disaster recovery source
- Improved Compliance: Complete audit trail of all changes
Adopting GitOps requires cultural change and proper tooling, but the benefits in reliability, security, and velocity make it an essential practice for modern cloud-native organizations.