Fail Secure¶
Fail secure means that when a security control errors out or cannot reach a decision, the system denies access. Every failure resolves to the safe state.
Core Principle¶
When security controls fail, the system should deny access rather than allow it.
Key Properties:
- Default-deny posture
- Explicit allow rules required
- Failures block operations (don't bypass)
- All decisions audited and logged
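The properties above can be sketched as a small wrapper around any access check. This is an illustrative sketch only: `fail_secure`, the `check` callable, and the logger name are hypothetical and not part of any Kubernetes API.

```python
import logging
from typing import Callable

log = logging.getLogger("access")  # hypothetical logger name


def fail_secure(check: Callable[[dict], bool], request: dict) -> bool:
    """Allow only when `check` explicitly returns True; deny on anything else."""
    try:
        # None, 0, or a truthy string is not an explicit allow.
        allowed = check(request) is True
    except Exception:
        # The control itself failed: block the operation instead of bypassing it.
        log.exception("check failed, denying request")
        allowed = False
    # Record every decision, allow or deny.
    log.info("decision=%s request=%r", "allow" if allowed else "deny", request)
    return allowed
```

A check that raises, times out, or returns an ambiguous value produces a deny, which is the behaviour the rest of this page configures at the admission layer.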
Admission Control¶
Admission controllers intercept API requests before objects are persisted.
ValidatingAdmissionWebhook¶
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: security-policy-validator
webhooks:
  - name: validate.security.example.com
    admissionReviewVersions: ["v1", "v1beta1"]
    clientConfig:
      service:
        name: security-webhook
        namespace: security-system
        path: "/validate"
      caBundle: LS0tLS1CRUdJTi... # Base64-encoded CA cert
    failurePolicy: Fail # CRITICAL: Block on webhook failure
    sideEffects: None
    rules:
      - operations: ["CREATE", "UPDATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
    namespaceSelector:
      matchExpressions:
        - key: security-policy
          operator: NotIn
          values: ["exempt"]
Critical Configuration:
- failurePolicy: Fail - Rejects requests if the webhook is unavailable
- sideEffects: None - Webhook does not modify cluster state
- namespaceSelector - Scopes which namespaces are validated
failurePolicy: Ignore Is Dangerous
With failurePolicy: Ignore, requests are admitted unchecked whenever the webhook is unreachable or returns an error, so any outage becomes a policy bypass. Use Fail in production, and accept temporary deployment blocks over a policy bypass.
Failure Policy Comparison¶
| Policy | Behavior on Webhook Failure | Use Case |
|---|---|---|
| Fail | Rejects all matching requests | Production security policies |
| Ignore | Allows all requests through | Development, testing only |
Security Implication:
failurePolicy: Fail # Secure: Blocks if webhook down
failurePolicy: Ignore # DANGEROUS: Bypasses policy if webhook down
When to Use Ignore:
- Never in production for security policies
- Development environments only
- Non-critical validation only
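The difference between the two policies can be shown with a simplified model of a single webhook call. This is a sketch under stated assumptions, not the API server's actual code: `api_server_admits` and `call_webhook` are hypothetical names.

```python
from typing import Callable


def api_server_admits(call_webhook: Callable[[], bool], failure_policy: str) -> bool:
    """Simplified model of how one webhook call influences admission."""
    try:
        # The webhook answered, so its verdict stands under either policy.
        return call_webhook()
    except Exception:
        # Webhook unreachable or timed out.
        # "Fail" rejects the request; "Ignore" admits it unchecked.
        return failure_policy == "Ignore"
```

The two policies only diverge on the error path: a healthy webhook that denies a request denies it under both, while an unavailable webhook blocks under Fail and silently admits under Ignore.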
Webhook Implementation¶
See integration.md for complete webhook implementation examples in Go.
Policy Enforcement Patterns¶
Pattern 1: Image Registry Allowlist¶
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: image-policy
webhooks:
  - name: validate.image.example.com
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: image-validator
        namespace: security-system
        path: "/validate-image"
    failurePolicy: Fail
    sideEffects: None # Required in admissionregistration.k8s.io/v1
    rules:
      - operations: ["CREATE", "UPDATE"]
        apiGroups: ["", "apps"]
        apiVersions: ["*"]
        resources: ["pods", "deployments", "replicasets"]
Threat Mitigated: Prevents pulling images from untrusted registries.
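The check behind the `/validate-image` endpoint might look like the sketch below. The registry prefixes and the function name `rejected_images` are assumptions made for illustration. Only bare Pod objects are handled here; Deployments and ReplicaSets keep their pod spec under `spec.template`.

```python
from typing import Iterable, List

# Hypothetical allowlist. The trailing slash stops a look-alike host such as
# "registry.example.com.evil.io" from matching "registry.example.com".
ALLOWED_REGISTRIES = ("registry.example.com/", "ghcr.io/example-org/")


def rejected_images(pod: dict, allowed: Iterable[str] = ALLOWED_REGISTRIES) -> List[str]:
    """Return every container image in the pod that is outside the allowlist."""
    spec = pod.get("spec") or {}
    containers = []
    for field in ("containers", "initContainers", "ephemeralContainers"):
        containers.extend(spec.get(field) or [])

    prefixes = tuple(allowed)
    rejected = []
    for container in containers:
        image = container.get("image") or ""
        # str.startswith accepts a tuple of prefixes. A missing image, or an
        # unqualified one such as "nginx", matches nothing and is rejected.
        if not image.startswith(prefixes):
            rejected.append(image)
    return rejected
```

A webhook built on this would deny the request whenever the returned list is non-empty and include the offending images in the denial message.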
Pattern 2: Resource Quota Enforcement¶
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: resource-quota-validator
webhooks:
  - name: validate.quota.example.com
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: quota-validator
        namespace: security-system
        path: "/validate-resources"
    failurePolicy: Fail
    sideEffects: None # Required in admissionregistration.k8s.io/v1
    rules:
      - operations: ["CREATE", "UPDATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
Threat Mitigated: Prevents resource exhaustion and noisy neighbor problems.
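The page does not spell out what the `/validate-resources` endpoint checks. One plausible rule, assumed here purely for illustration, is that every container must declare CPU and memory limits. `missing_limits` and `REQUIRED_LIMITS` are hypothetical names.

```python
from typing import List

# Assumed policy: each container must set both of these limits.
REQUIRED_LIMITS = ("cpu", "memory")


def missing_limits(pod: dict) -> List[str]:
    """List each required resource limit that a container fails to declare."""
    problems = []
    for container in (pod.get("spec") or {}).get("containers") or []:
        limits = (container.get("resources") or {}).get("limits") or {}
        name = container.get("name", "<unnamed>")
        for resource in REQUIRED_LIMITS:
            if resource not in limits:
                problems.append(f"{name}: missing {resource} limit")
    return problems
```

An empty result means the pod passes; anything else becomes the denial reason.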
Pattern 3: Security Context Validation¶
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: security-context-validator
webhooks:
  - name: validate.securitycontext.example.com
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: security-validator
        namespace: security-system
        path: "/validate-security"
    failurePolicy: Fail
    sideEffects: None # Required in admissionregistration.k8s.io/v1
    rules:
      - operations: ["CREATE", "UPDATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
What This Enforces: Pod security contexts, runAsNonRoot, allowPrivilegeEscalation, and readOnlyRootFilesystem requirements.
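A minimal sketch of the three checks named above, assuming the webhook treats a missing field the same as an insecure value. The function name `security_violations` is hypothetical. `runAsNonRoot` may be set at pod level and overridden per container, while the other two fields exist only on containers.

```python
from typing import List


def security_violations(pod: dict) -> List[str]:
    """Return one message per container setting that violates the policy."""
    spec = pod.get("spec") or {}
    pod_sc = spec.get("securityContext") or {}
    violations = []
    for container in spec.get("containers") or []:
        sc = container.get("securityContext") or {}
        name = container.get("name", "<unnamed>")

        # The container-level value takes precedence over the pod-level one.
        run_as_non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot"))
        if run_as_non_root is not True:
            violations.append(f"{name}: runAsNonRoot must be true")
        # Unset fields count as violations: only an explicit secure value passes.
        if sc.get("allowPrivilegeEscalation") is not False:
            violations.append(f"{name}: allowPrivilegeEscalation must be false")
        if sc.get("readOnlyRootFilesystem") is not True:
            violations.append(f"{name}: readOnlyRootFilesystem must be true")
    return violations
```

Requiring explicit values keeps the validator itself fail secure: a pod that says nothing about privilege escalation is denied rather than given the benefit of the doubt.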
MutatingAdmissionWebhook¶
Mutating webhooks modify objects before admission.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: security-defaults
webhooks:
  - name: mutate.security.example.com
    admissionReviewVersions: ["v1"]
    clientConfig:
      service:
        name: security-mutator
        namespace: security-system
        path: "/mutate"
    failurePolicy: Fail
    sideEffects: None # Required in admissionregistration.k8s.io/v1
    rules:
      - operations: ["CREATE"]
        apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
Use Case: Inject secure defaults when a pod omits them, then let the validating webhook verify the result. Mutating webhooks run before validating webhooks, so validation sees the final object.
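A mutating webhook returns its changes as a base64-encoded JSON Patch inside the AdmissionReview response. The sketch below builds such a response for missing security context fields. The default values and the names `default_patch` and `mutation_response` are assumptions for illustration, and only `spec.containers` is covered.

```python
import base64
import json
from typing import List

# Assumed defaults to inject when a container leaves them unset.
SECURE_DEFAULTS = {
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
    "readOnlyRootFilesystem": True,
}


def default_patch(pod: dict) -> List[dict]:
    """Build JSON Patch operations that add only the missing defaults."""
    ops = []
    for i, container in enumerate((pod.get("spec") or {}).get("containers") or []):
        base = f"/spec/containers/{i}/securityContext"
        sc = container.get("securityContext")
        if sc is None:
            # No securityContext at all: add the whole object in one operation.
            ops.append({"op": "add", "path": base, "value": dict(SECURE_DEFAULTS)})
            continue
        for key, value in SECURE_DEFAULTS.items():
            if key not in sc:  # never overwrite a value the author set
                ops.append({"op": "add", "path": f"{base}/{key}", "value": value})
    return ops


def mutation_response(uid: str, pod: dict) -> dict:
    """Wrap the patch in an AdmissionReview response."""
    response = {"uid": uid, "allowed": True}
    ops = default_patch(pod)
    if ops:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(ops).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}
```

Because explicit values are left untouched, a container that sets `runAsNonRoot: false` still reaches the validating webhook unchanged and is rejected there.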
Audit Logging¶
Audit logs record all API requests for forensics and compliance.
Audit Policy¶
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log modifications to secrets, ConfigMaps, and RBAC objects
  # NOTE: RequestResponse writes secret payloads into the audit log; restrict access to the log
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Log pod and workload creation/deletion
  - level: Request
    verbs: ["create", "delete"]
    resources:
      - group: ""
        resources: ["pods"]
      - group: "apps"
        resources: ["deployments", "statefulsets", "daemonsets"]
  # Catch-all: log every other request, which includes all denials
  - level: RequestResponse
    omitStages:
      - RequestReceived
What This Logs:
- All secret and ConfigMap modifications
- RBAC changes
- Pod creation and deletion
- All admission denials
Audit Log Analysis¶
Query denied admissions:
# Find all denied pod creations
jq 'select(.verb=="create" and .objectRef.resource=="pods" and .responseStatus.code>=400)' \
/var/log/kubernetes/audit.log
# Find RBAC modifications
jq 'select(.objectRef.apiGroup=="rbac.authorization.k8s.io")' \
/var/log/kubernetes/audit.log
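When a summary is more useful than raw events, the same filter can be applied in a short script. This sketch assumes the log is written as one JSON event per line, as the jq queries above do; the function name `denied_pod_creations_by_user` is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def denied_pod_creations_by_user(lines: Iterable[str]) -> Counter:
    """Count denied pod creations per user from JSON-lines audit events."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        code = (event.get("responseStatus") or {}).get("code")
        is_pod_create = (
            event.get("verb") == "create"
            and (event.get("objectRef") or {}).get("resource") == "pods"
        )
        # Events without a response status (early stages) are skipped.
        if is_pod_create and isinstance(code, int) and code >= 400:
            user = (event.get("user") or {}).get("username", "<unknown>")
            counts[user] += 1
    return counts
```

Passing an open file handle for `/var/log/kubernetes/audit.log` as `lines` yields a per-user tally, which makes a single account that is repeatedly probing the policy easy to spot.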
Webhook High Availability¶
With failurePolicy: Fail, an unavailable webhook blocks every matching request, so the webhook itself must be highly available.
Deployment Configuration¶
apiVersion: apps/v1
kind: Deployment
metadata:
  name: security-webhook
  namespace: security-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: security-webhook
  template:
    metadata:
      labels:
        app: security-webhook
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: security-webhook
              topologyKey: kubernetes.io/hostname
      containers:
        - name: webhook
          image: security-webhook:1.0
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8443
              scheme: HTTPS
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: 8443
              scheme: HTTPS
            initialDelaySeconds: 5
            periodSeconds: 5
High Availability Properties:
- 3 replicas for redundancy
- Pod anti-affinity (different nodes)
- Liveness and readiness probes
- Resource limits prevent starvation
PodDisruptionBudget¶
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: security-webhook-pdb
  namespace: security-system
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: security-webhook
What This Ensures: At least 2 webhook pods remain available during voluntary disruptions such as node drains for maintenance.
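The arithmetic behind the budget is simple enough to sketch. This is a simplified model for an integer `minAvailable`, and `allowed_disruptions` is a hypothetical helper name.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Voluntary evictions the budget permits right now (never negative)."""
    return max(0, healthy_pods - min_available)
```

With the 3 replicas and `minAvailable: 2` used on this page, one pod can be evicted at a time. If one replica is already unhealthy, the result drops to zero and a node drain waits until the webhook recovers.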
Testing Admission Webhooks¶
See integration.md for complete testing examples including valid/invalid pod tests and webhook failure simulations.
Common Patterns¶
Pattern: Exempt System Namespaces¶
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: security-policy
webhooks:
  - name: validate.security.example.com
    # clientConfig, rules, sideEffects, and admissionReviewVersions omitted for brevity
    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system", "kube-public", "kube-node-lease"]
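Why: System components may need privileged access, and a webhook with failurePolicy: Fail that covers kube-system can block control-plane pods from restarting while the webhook is down.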
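How a `namespaceSelector` decides which namespaces are validated can be modelled in a few lines. This is a sketch of standard label-selector semantics, not the API server's code, and `selector_matches` is a hypothetical name. The detail worth noticing is that `NotIn` also matches namespaces that lack the label entirely.

```python
from typing import Dict, List


def selector_matches(labels: Dict[str, str], expressions: List[dict]) -> bool:
    """Return True when the namespace labels satisfy every matchExpression."""
    for expr in expressions:
        key = expr["key"]
        operator = expr["operator"]
        values = expr.get("values") or []
        if operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            # A namespace without the label also satisfies NotIn.
            ok = key not in labels or labels[key] not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {operator}")
        if not ok:
            return False  # expressions are ANDed together
    return True
```

Applied to the `security-policy NotIn ["exempt"]` selector from the first example, an unlabeled namespace is still validated. Exemption is opt-out, which keeps the default on the deny side.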
Threat Model¶
| Threat | Mitigation |
|---|---|
| Configuration bypass | failurePolicy: Fail (blocks if webhook down) |
| Untrusted images | Image registry allowlist webhook |
| Privilege escalation | Security context validation webhook |
| Resource exhaustion | Resource quota enforcement webhook |
| Audit evasion | Audit policy logs all denials |
Fail secure: deny by default, audit everything, enforce at admission time.