Evidence Collection Strategies
This page covers automated evidence capture in CI/CD workflows, retention policies, and aggregation patterns: how to collect evidence at scale without manual intervention.
Automation First
Real-time evidence capture in CI/CD workflows reduces the risk of losing data before it is collected. S3 lifecycle policies manage retention automatically. Evidence aggregation creates release bundles for auditors.
Automated Capture in CI/CD
Pattern: Every workflow generates evidence as artifacts.
name: Evidence Collection Workflow
on:
  push:
    branches: [main]
  pull_request:
jobs:
  collect-evidence:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      actions: read
    steps:
      - uses: actions/checkout@v4
      - name: Collect Evidence Bundle
        env:
          # gh needs a token in CI. Reading branch protection may require a
          # token with repository administration access, not the default one.
          GH_TOKEN: ${{ github.token }}
        run: |
          mkdir -p evidence
          # Branch protection
          gh api repos/${{ github.repository }}/branches/main/protection \
            > evidence/branch-protection.json
          # Workflow metadata
          gh run view ${{ github.run_id }} --json \
            databaseId,event,headBranch,headSha,status,conclusion,createdAt \
            > evidence/workflow-metadata.json
          # Commit signature verification (git reports the result on stderr)
          git verify-commit ${{ github.sha }} > evidence/commit-signature.txt 2>&1 || true
      - uses: actions/upload-artifact@v4
        with:
          name: audit-evidence-${{ github.run_id }}
          path: evidence/
          retention-days: 365
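A completeness check can run before the upload step so that an empty or malformed file does not end up archived as evidence. The following is a minimal sketch, not part of the original workflow: the three file names come from the workflow above, while the function name `find_evidence_problems` and the validation rules (file exists, is non-empty, and parses as JSON where applicable) are assumptions made for illustration.

```python
import json
from pathlib import Path

# File names match the workflow above; the validation rules are assumptions.
EXPECTED_FILES = (
    "branch-protection.json",
    "workflow-metadata.json",
    "commit-signature.txt",
)


def find_evidence_problems(evidence_dir):
    """Return a list of problems; an empty list means the bundle looks complete."""
    problems = []
    root = Path(evidence_dir)
    for name in EXPECTED_FILES:
        path = root / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        if path.stat().st_size == 0:
            problems.append(f"{name}: empty")
            continue
        # Only the JSON files are parsed; the signature output is free text.
        if path.suffix == ".json":
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                problems.append(f"{name}: not valid JSON")
    return problems


if __name__ == "__main__":
    for problem in find_evidence_problems("evidence"):
        print(problem)
```

A workflow step could call this script after collection and fail the job when the returned list is non-empty, which keeps incomplete bundles out of the artifact store.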
Retention Policies
| Evidence Type | Retention Period | Storage Class | Cost Optimization |
|---|---|---|---|
| Workflow logs | 1 year | S3 Standard | Archive to Glacier after 90 days |
| Security scans | 1 year | S3 Standard | Archive to Glacier after 180 days |
| SBOMs | Permanent | S3 Glacier IR | Immediate retrieval for active versions |
| Approvals | 3 years | S3 Standard | Archive to Deep Archive after 1 year |
| Deployments | Permanent | S3 Standard | Current deployments only |
| Branch protection | 1 year | S3 Standard | - |
S3 Lifecycle Policy Example:
{
  "Rules": [
    {
      "ID": "Archive workflow logs after 90 days",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "workflow-logs/"
      },
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    },
    {
      "ID": "Archive security scans after 180 days",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "scans/"
      },
      "Transitions": [
        {
          "Days": 180,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
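Writing lifecycle rules by hand becomes error-prone as the retention table grows, so the policy can be generated from a list of rows instead. This is a sketch under stated assumptions: the helper names `lifecycle_rule` and `lifecycle_policy` are hypothetical, the two rows mirror the example policy above, and the rule identifier is emitted under the key `ID`, which is the spelling the S3 lifecycle API uses.

```python
import json


def lifecycle_rule(rule_id, prefix, transition_days, storage_class, expire_days=None):
    """Build one S3 lifecycle rule; expire_days=None keeps objects indefinitely."""
    if expire_days is not None and expire_days <= transition_days:
        raise ValueError("expiration must come after the storage transition")
    rule = {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [{"Days": transition_days, "StorageClass": storage_class}],
    }
    if expire_days is not None:
        rule["Expiration"] = {"Days": expire_days}
    return rule


def lifecycle_policy(rows):
    """Turn (id, prefix, transition_days, storage_class, expire_days) rows into a policy."""
    return {"Rules": [lifecycle_rule(*row) for row in rows]}


# These two rows reproduce the example policy above.
RETENTION_ROWS = [
    ("Archive workflow logs after 90 days", "workflow-logs/", 90, "GLACIER", 365),
    ("Archive security scans after 180 days", "scans/", 180, "GLACIER", 365),
]

if __name__ == "__main__":
    print(json.dumps(lifecycle_policy(RETENTION_ROWS), indent=2))
```

Keeping the rows in one place makes it easier to review the policy against the retention table, and the guard in `lifecycle_rule` rejects a rule that would expire objects before they are archived.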
Evidence Aggregation
Pattern: Bundle all evidence for a release into a single archive.
- name: Create Release Evidence Bundle
  run: |
    VERSION="${{ github.ref_name }}"
    mkdir -p evidence-bundle
    # Collect all evidence for this release
    aws s3 cp "s3://audit-evidence/sboms/${VERSION}/" evidence-bundle/sbom/ --recursive
    aws s3 cp "s3://audit-evidence/scans/${VERSION}/" evidence-bundle/scans/ --recursive
    cp multiple.intoto.jsonl evidence-bundle/slsa-provenance.jsonl
    # Create signed archive (--yes skips cosign's interactive confirmation in CI)
    tar -czf "evidence-${VERSION}.tar.gz" evidence-bundle/
    cosign sign-blob --yes --bundle "evidence-${VERSION}.tar.gz.sig" "evidence-${VERSION}.tar.gz"
    # Upload bundle
    aws s3 cp "evidence-${VERSION}.tar.gz" s3://audit-evidence/bundles/
    aws s3 cp "evidence-${VERSION}.tar.gz.sig" s3://audit-evidence/bundles/
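The same bundling step can be sketched in Python when the archive contents should be verified before signing. This is an illustration only, not the workflow's actual implementation: the entry names (`sbom`, `scans`, `slsa-provenance.jsonl`) come from the step above, while `create_bundle`, `missing_entries`, and the rule that each directory must contain at least one file are assumptions.

```python
import tarfile
from pathlib import Path

# Top-level entries copied into evidence-bundle/ by the step above.
REQUIRED_ENTRIES = ("sbom", "scans", "slsa-provenance.jsonl")


def create_bundle(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tar archive, overwriting any old one."""
    source = Path(source_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        tar.add(str(source), arcname=source.name)


def missing_entries(archive_path, root="evidence-bundle"):
    """Return required entries that have no regular file in the archive."""
    with tarfile.open(str(archive_path), "r:gz") as tar:
        files = [member.name for member in tar.getmembers() if member.isfile()]
    missing = []
    for entry in REQUIRED_ENTRIES:
        target = f"{root}/{entry}"
        # A file entry must match exactly; a directory entry needs a file under it.
        if not any(name == target or name.startswith(target + "/") for name in files):
            missing.append(entry)
    return missing


if __name__ == "__main__":
    create_bundle("evidence-bundle", "evidence.tar.gz")
    print(missing_entries("evidence.tar.gz"))
```

Running the check before the signing command avoids signing and uploading a bundle whose downloads silently produced empty directories.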
Real-Time vs Batch Collection
Real-Time (Preferred):
- Evidence captured during workflow execution
- Lower risk of data loss
- Immediate availability
- Example: Workflow logs uploaded to S3 at end of each run
Batch (Fallback):
- Periodic aggregation from multiple sources
- Useful for GitHub API data (PRs, reviews)
- Risk: Data may be deleted before collection
- Example: Weekly cron job to export all merged PRs
Recommendation: Use real-time for critical evidence (SLSA provenance, security scans). Batch is acceptable for historical aggregation (PR statistics, contribution metrics).
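For the batch case, the weekly export of merged PRs reduces to filtering pull request records by merge time. The sketch below assumes the records have already been fetched and carry the `number` and `merged_at` fields that the GitHub REST API returns for pull requests; the function names and the seven-day default window are illustrative choices, and the fetching itself is left out.

```python
from datetime import datetime, timedelta, timezone


def parse_github_time(value):
    """Parse a GitHub timestamp such as '2024-05-01T12:00:00Z' as UTC."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def merged_since(pull_requests, now, days=7):
    """Select PRs merged within the last `days` days, relative to `now`."""
    cutoff = now - timedelta(days=days)
    selected = []
    for pr in pull_requests:
        merged_at = pr.get("merged_at")
        if merged_at is None:
            continue  # still open, or closed without merging
        if parse_github_time(merged_at) >= cutoff:
            selected.append({"number": pr["number"], "merged_at": merged_at})
    return selected


if __name__ == "__main__":
    sample = [
        {"number": 1, "merged_at": "2024-05-07T10:00:00Z"},
        {"number": 2, "merged_at": None},
    ]
    print(merged_since(sample, now=datetime(2024, 5, 8, tzinfo=timezone.utc)))
```

Passing `now` explicitly keeps the export reproducible, and choosing a window slightly longer than the cron interval lets consecutive runs overlap so that a delayed run does not leave a gap.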
Related Patterns
- Audit Evidence Collection - Main overview
- Evidence Types - What to collect
- Compliance Reporting - How to retrieve evidence
- Implementation - Complete workflow examples
Real-time capture reduces the risk of data loss. Lifecycle policies manage retention automatically. Evidence aggregation creates audit-ready bundles. Automate collection. Enforce retention. Prove compliance.