Operational Workflows¶
Policy update workflows, backup procedures, performance tuning, and operational best practices.
Policy Updates Workflow¶
Rolling Out Policy Changes¶
Safe rollout process:
graph TD
DEV[1. Test in Dev] --> QAC[2. Deploy to QAC]
QAC --> AUDIT[3. Monitor in Audit Mode]
AUDIT --> VALIDATE[4. Review Violations]
VALIDATE --> FIX[5. Fix Issues]
FIX --> STG[6. Deploy to Staging]
STG --> ENFORCE[7. Switch to Enforce]
ENFORCE --> PRD[8. Deploy to Production]
%% Ghostty Hardcore Theme
style DEV fill:#a7e22e,color:#1b1d1e
style PRD fill:#65d9ef,color:#1b1d1e
Never Skip Audit Phase
Deploying policies directly to Enforce mode in production causes outages. Always start with Audit, monitor for 1+ week, fix violations, then switch to Enforce.
Timeline:
- Dev: Immediate
- QAC: 1 week in Audit mode
- Staging: 1 week in Enforce mode
- Production: After 0 violations in Staging
Backup and Recovery¶
Policy Backup¶
# Backup all policies
kubectl get clusterpolicy -o yaml > backup-policies.yaml
kubectl get policyexception -A -o yaml >> backup-policies.yaml
# Backup Kyverno configuration
helm get values kyverno -n kyverno > backup-kyverno-values.yaml
helm get values policy-reporter -n policy-reporter > backup-policy-reporter-values.yaml
Disaster Recovery¶
Restore from backup:
# Reinstall Kyverno
helm install kyverno kyverno/kyverno \
--namespace kyverno \
--values backup-kyverno-values.yaml
# Restore policies
kubectl apply -f backup-policies.yaml
Automate Backups
Set up a cronjob to backup policies nightly. Store backups in Git or S3 for disaster recovery.
Performance Tuning¶
Admission Controller Scaling¶
# kyverno-values.yaml
admissionController:
replicas: 3 # High availability
resources:
limits:
memory: 512Mi
cpu: 500m
requests:
memory: 256Mi
cpu: 200m
Background Scan Interval¶
Resource Filters¶
Exclude unnecessary resources:
resourceFilters:
resourceFiltersExcludeNamespaces:
- kube-system
- kube-public
- gmp-system
- cnrm-system
resourceFiltersExcludeResources:
- "[Event,*,*]"
- "[*,kube-system,*]"
- "[Node,*,*]"
Best Practices¶
1. Version Everything¶
- Policy repos: Semantic versions
- policy-platform container: Tagged versions
- Helm deployments: Track release history
2. Monitor Continuously¶
- PolicyReporter dashboard: Real-time violation tracking
- Prometheus alerts: Automated alert notifications
- Slack notifications: Team communication channels
3. Test Before Enforce¶
- New policies start in Audit mode: Never deploy directly to Enforce
- Monitor for 1+ week: Collect violation data before enforcement
- Fix violations before Enforce: Remediate issues in Audit phase
4. Document Exceptions¶
- Require JIRA tickets: Track all policy exceptions
- Set expiration dates: Enforce time-bound exceptions
- Quarterly reviews: Audit exception validity regularly
5. Automate Evidence Collection¶
- Monthly policy exports: Snapshot policy state regularly
- Automated compliance reports: Generate reports automatically
- Audit trail preservation: Retain evidence for audits
Key Takeaways¶
✅ Local dev: Run policies in containers before commit
✅ CI integration: Automated validation in pipelines
✅ Runtime enforcement: Kyverno admission control
✅ Multi-source: Aggregate policies from multiple repos
✅ Operations: Monitor, update, and audit
Related Resources:
- SDLC Hardening: Broader enforcement patterns
- Kyverno Guide: Deep dive on Kyverno
- Pre-commit Hooks: Complementary local checks