The Coverage That Mattered: When 99% Became a Security Signal¶
The OpenSSF Best Practices Passing badge doesn't mandate a specific coverage percentage.
We set our bar at 95% minimum. Above even Gold (90%). Self-imposed. Strategic.
We started at 0%. We wanted the Passing badge. But we knew something important: it's easier to build high standards into a young project than to retrofit them later. By the time we go for Gold, 95% will already be habit.
The Security Requirement¶
OpenSSF Best Practices isn't a suggestion. It's certification that your project follows security best practices.
We were targeting Passing badge. But we set standards that exceed Gold:
- ✅ Automated test suite (Passing requirement)
- ✅ Tests invocable with standard command (`go test ./...`) (Passing requirement)
- ✅ Passing: No specific coverage % required
- ✅ Gold: 90% statement + 80% branch coverage
- ✅ Our standard: 95% coverage (exceeding Gold before we needed it)
- ✅ Race detection enabled (Passing requirement)
- ✅ Tests run in CI (Passing requirement)
Young projects benefit from stringent standards; discipline is cheaper to build in early than to retrofit.
Phase 1: The Sprint (0% → 85%)¶
Starting from zero tests, we built comprehensive test suites for five packages in one focused effort.
Initial Coverage by Package:
| Package | Coverage | Challenge |
|---|---|---|
| pkg/output | 99.6% | JSON, table, markdown formatters |
| pkg/markdown | 95.1% | Parser, AST traversal, admonitions |
| pkg/analyzer | 94.7% | Metrics, analysis, helpers |
| pkg/config | 89.8% | Load, defaults, overrides |
| cmd/readability | 36.8% | CLI integration |
| Total | 85.8% | Below our 95% bar |
The output formatters were easy because they were pure functions. The parser required mocking file I/O. The analyzer needed table-driven tests for metric variations.
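The table-driven pattern looked roughly like this. Everything in the sketch is a self-contained stand-in (`sentenceCount` is not the project's real API); the shape is what matters:

```go
package analyzer_test

import (
	"strings"
	"testing"
)

// sentenceCount is a stand-in metric so the sketch compiles on its own;
// the real analyzer's functions and signatures differ.
func sentenceCount(text string) int {
	if strings.TrimSpace(text) == "" {
		return 0
	}
	return strings.Count(text, ".") + strings.Count(text, "!") + strings.Count(text, "?")
}

func TestSentenceCount(t *testing.T) {
	tests := []struct {
		name string
		text string
		want int
	}{
		{"empty input", "", 0},
		{"single sentence", "Go is fun.", 1},
		{"mixed terminators", "Really? Yes! Truly.", 3},
	}
	for _, tt := range tests {
		tt := tt
		t.Run(tt.name, func(t *testing.T) {
			if got := sentenceCount(tt.text); got != tt.want {
				t.Errorf("sentenceCount(%q) = %d, want %d", tt.text, got, tt.want)
			}
		})
	}
}
```

Each metric variation becomes one more row instead of one more test function.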
But the CLI package dragged down the total. 85.8% wasn't our 95% target.
We wrote more tests. Coverage didn't move.
The Wall¶
gocyclo revealed why:
```text
cmd/readability/main.go:45: main.run() complexity = 35 (limit: 15)
pkg/markdown/parser.go:78: Parse() complexity = 21 (limit: 15)
```
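The `(limit: 15)` framing above suggests a thin wrapper; underneath, the standard invocation is gocyclo's `-over` flag:

```bash
# Report every function whose cyclomatic complexity exceeds 15;
# exits non-zero if any are found, which makes it CI-friendly.
gocyclo -over 15 .
```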
Cyclomatic complexity 35 means 35 different execution paths through one function.
Testing all 35 paths in `main.run()` required:
- All flag combinations (config path, threshold, format, target)
- Config loading variations (explicit file vs auto-detect)
- Flag override scenarios
- File vs directory analysis
- Success vs failure for each layer
- Error injection at multiple depths
The tests would be unmaintainable. The problem wasn't test coverage. It was code structure.
The Refactoring¶
`main.run()` did everything in one 35-complexity function. We extracted 11 focused functions:
| Function | Complexity | Responsibility |
|---|---|---|
| `loadConfig` | 5 | Load config from file or auto-detect |
| `applyFlagOverrides` | 5 | Apply CLI flags to config |
| `analyzeTarget` | 5 | Analyze file or directory |
| `outputResults` | 7 | Format and write output |
| `checkResults` | 3 | Validate against thresholds |
| `countFailures` | 8 | Count failures by category |
| `printFailureGuidance` | 4 | Print guidance messages |
| `printLengthGuidance` | 1 | Length-specific guidance |
| `printReadabilityGuidance` | 1 | Readability-specific guidance |
| `printAdmonitionGuidance` | 1 | Admonition-specific guidance |
| `run` | 6 | Main orchestration (was 35) |
Each function became testable independently. Table-driven tests worked.
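After extraction, `run` reads as straight-line orchestration. A compilable sketch (every type, signature, and stub below is an assumption; only the function names come from the table above):

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Stub types and helpers so the sketch compiles on its own.
type config struct{ threshold float64 }
type results struct{ failures int }

func loadConfig(args []string) (*config, error)     { return &config{threshold: 95}, nil }
func applyFlagOverrides(cfg *config, args []string) {}
func analyzeTarget(cfg *config) (*results, error)   { return &results{}, nil }
func outputResults(cfg *config, r *results) error   { return nil }
func checkResults(cfg *config, r *results) bool     { return r.failures == 0 }
func countFailures(r *results) int                  { return r.failures }
func printFailureGuidance(n int)                    { fmt.Printf("%d checks failed\n", n) }

// run only sequences the helpers and routes errors; with all branching
// pushed down into them, its own cyclomatic complexity stays single-digit.
func run(args []string) error {
	cfg, err := loadConfig(args)
	if err != nil {
		return fmt.Errorf("loading config: %w", err)
	}
	applyFlagOverrides(cfg, args)

	res, err := analyzeTarget(cfg)
	if err != nil {
		return fmt.Errorf("analyzing target: %w", err)
	}
	if err := outputResults(cfg, res); err != nil {
		return err
	}
	if !checkResults(cfg, res) {
		printFailureGuidance(countFailures(res))
		return errors.New("thresholds not met")
	}
	return nil
}

func main() {
	if err := run(os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Each helper then gets its own table-driven tests; `run` itself needs only a few wiring tests.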
Same pattern for the parser:
- Before: `Parse()` complexity = 21
- After: max complexity = 10 across 6 focused functions
The Breakthrough¶
After refactoring, coverage jumped:
Final Coverage:
| Package | Before | After | Improvement |
|---|---|---|---|
| cmd/readability | 36.8% | 97.5% | +60.7% |
| pkg/analyzer | 94.7% | 99.1% | +4.4% |
| pkg/config | 89.8% | 100.0% | +10.2% |
| pkg/markdown | 95.1% | 98.8% | +3.7% |
| pkg/output | 99.6% | 99.6% | - |
| Total | 85.8% | 99.0% | +13.2% |
From 85% to 99% without exotic test infrastructure. By making code simple enough to test.
293 tests total. Every test focused on one responsibility.
The Enforcement: Single Source of Truth¶
We raised the threshold from 80% to 95% and made codecov.yml the single source of truth.
Both CI and pre-commit hooks read from it:
```bash
# In both .github/workflows/ci.yml and .pre-commit-config.yaml
THRESHOLD=$(grep -A2 "project:" codecov.yml | grep "target:" | head -1 | sed "s/.*target: *\([0-9]*\).*/\1/")
```
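For that one-liner to find anything, `codecov.yml` has to nest `target` under `project`; a minimal sketch of the relevant section (a real file usually carries more settings):

```yaml
coverage:
  status:
    project:
      default:
        target: 95%  # single source of truth; CI and hooks grep this line
```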
One definition. Zero drift. Change the threshold in one place, enforcement updates everywhere.
The pre-push hook blocks pushes that fall below the threshold:
```yaml
- id: go-coverage
  name: Check test coverage threshold
  entry: >
    bash -c '
    THRESHOLD=$(grep -A2 "project:" codecov.yml | grep "target:" | head -1 | sed "s/.*target: *\([0-9]*\).*/\1/") || THRESHOLD=95;
    gotestsum --format testdox -- -race -coverprofile=/tmp/coverage.out -covermode=atomic ./... &&
    COVERAGE=$(go tool cover -func=/tmp/coverage.out | grep total | awk "{print \$3}" | sed "s/%//") &&
    if (( $(echo "$COVERAGE < $THRESHOLD" | bc -l) )); then
      echo "Coverage ${COVERAGE}% is below ${THRESHOLD}%"; exit 1;
    fi
    '
  stages: [pre-push]
```
Coverage became a blocking gate, not a suggestion.
The Security Signal¶
This wasn't about quality for quality's sake. It was about certification.
OpenSSF Best Practices Passing badge criteria we met (and exceeded):
- ✅ `test` - Automated test suite exists (Passing requirement)
- ✅ `test_invocation` - `go test ./...` works (Passing requirement)
- ✅ `test_most` - "Most code" tested (Passing guideline, not enforced)
- ✅ Our standard - 95% coverage (exceeds even Gold's 90%/80%)
- ✅ `test_continuous_integration` - CI runs tests (Passing requirement)
- ✅ `dynamic_analysis` - Race detector enabled (Passing requirement)
Without the Passing badge, you don't signal security maturity to auditors and enterprise users. By targeting 95%, we set our young project up for eventual Gold certification with standards already ingrained.
Coverage became a security signal. Not because tests prevent bugs (they do), but because coverage thresholds prove you take quality seriously enough to measure and enforce it.
The Tooling Stack¶
Getting to 99% required the right tools:
Coverage Reporting:
- Codecov with component-based tracking (see the sketch after this list)
- Per-package coverage breakdown
- Trend analysis over time
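Component tracking is plain Codecov config; a minimal sketch with our package layout (the component ids are made up, the keys are Codecov's documented ones):

```yaml
component_management:
  individual_components:
    - component_id: analyzer
      paths:
        - pkg/analyzer/**
    - component_id: output
      paths:
        - pkg/output/**
    - component_id: cli
      paths:
        - cmd/readability/**
```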
Test Execution:
- gotestsum for readable output and JUnit XML
- Race detector (`-race`) in every run
- Codecov Test Analytics for failure tracking
Enforcement:
- `codecov.yml` as single source of truth
- Pre-push hooks for local enforcement
- CI threshold checks blocking merges
Complexity Management:
- gocyclo to identify refactoring targets
- Strict mode enforcing complexity limits
- Pre-commit hooks catching violations early
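The early-catch hook can be a local pre-commit entry; a sketch (the hook id and limit are assumptions; gocyclo exits non-zero when anything exceeds `-over`):

```yaml
- id: gocyclo
  name: Check cyclomatic complexity
  entry: bash -c 'gocyclo -over 15 .'
  language: system
  pass_filenames: false
```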
What Changed¶
Before: "We have tests and they're comprehensive."
After: "We have 99% coverage enforced at commit time, verified in CI, tracked in Codecov, and certified by OpenSSF."
The difference isn't the tests. It's the enforcement and verification.
Coverage went from a quality metric to a security certification requirement.
Implementation Details
See Test Coverage Patterns for refactoring techniques and Coverage Enforcement for CI integration and threshold management.
Related Patterns¶
- The Wall at 85% - Complexity blocking coverage (same journey, different angle)
- OpenSSF Best Practices Badge - The Passing badge we earned (with standards exceeding Gold)
- SDLC Hardening - Testing in audit context
Started at 0%. OpenSSF Passing has no coverage requirement. We targeted 95% (above Gold's 90%). Hit a wall at 85%. Refactoring broke through. 99% became enforced. Coverage became the security signal auditors recognize.