The Coverage That Mattered: When 99% Became a Security Signal¶
The OpenSSF Best Practices Passing badge doesn't mandate a specific coverage percentage.
We set our bar at 95% minimum. Above even Gold (90%). Self-imposed. Strategic.
We started at 0%. We wanted the Passing badge. But we knew something important: it's easier to build high standards into a young project than to retrofit them later. By the time we go for Gold, 95% will already be habit.
The Security Requirement¶
OpenSSF Best Practices isn't a suggestion. It's certification that your project follows security best practices.
We were targeting Passing badge. But we set standards that exceed Gold:
- ✅ Automated test suite (Passing requirement)
- ✅ Tests invocable with standard command (`go test ./...`) (Passing requirement)
- ✅ Passing: No specific coverage % required
- ✅ Gold: 90% statement + 80% branch coverage
- ✅ Our standard: 95% coverage (exceeding Gold before we needed it)
- ✅ Race detection enabled (Passing requirement)
- ✅ Tests run in CI (Passing requirement)
Young projects benefit from stringent standards; discipline is cheaper to build in early than to retrofit.
Phase 1: The Sprint (0% → 85%)¶
Starting from zero tests, we built comprehensive test suites for five packages in one focused effort.
Initial Coverage by Package:
| Package | Coverage | Challenge |
|---|---|---|
| pkg/output | 99.6% | JSON, table, markdown formatters |
| pkg/markdown | 95.1% | Parser, AST traversal, admonitions |
| pkg/analyzer | 94.7% | Metrics, analysis, helpers |
| pkg/config | 89.8% | Load, defaults, overrides |
| cmd/readability | 36.8% | CLI integration |
| Total | 85.8% | Below our 95% bar |
The output formatters were easy because they were pure functions. The parser required mocking file I/O. The analyzer needed table-driven tests for metric variations.
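The table-driven pattern looked roughly like this. Everything in the sketch is a self-contained stand-in (`sentenceCount` is not the project's real API); the shape is what matters:

```go
package analyzer_test

import (
	"strings"
	"testing"
)

// sentenceCount is a stand-in metric so the sketch compiles on its own;
// the real analyzer's functions and signatures differ.
func sentenceCount(text string) int {
	if strings.TrimSpace(text) == "" {
		return 0
	}
	return strings.Count(text, ".") + strings.Count(text, "!") + strings.Count(text, "?")
}

func TestSentenceCount(t *testing.T) {
	tests := []struct {
		name string
		text string
		want int
	}{
		{"empty input", "", 0},
		{"single sentence", "Go is fun.", 1},
		{"mixed terminators", "Really? Yes! Truly.", 3},
	}
	for _, tt := range tests {
		tt := tt
		t.Run(tt.name, func(t *testing.T) {
			if got := sentenceCount(tt.text); got != tt.want {
				t.Errorf("sentenceCount(%q) = %d, want %d", tt.text, got, tt.want)
			}
		})
	}
}
```

Each metric variation becomes one more row instead of one more test function.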
But the CLI package dragged down the total. 85.8% wasn't our 95% target.
We wrote more tests. Coverage didn't move.
The Wall¶
gocyclo revealed why:
```text
cmd/readability/main.go:45: main.run() complexity = 35 (limit: 15)
pkg/markdown/parser.go:78: Parse() complexity = 21 (limit: 15)
```
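The `(limit: 15)` framing above suggests a thin wrapper; underneath, the standard invocation is gocyclo's `-over` flag:

```bash
# Report every function whose cyclomatic complexity exceeds 15;
# exits non-zero if any are found, which makes it CI-friendly.
gocyclo -over 15 .
```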
Cyclomatic complexity 35 means 35 different execution paths through one function.
Testing all 35 paths in `main.run()` required:
- All flag combinations (config path, threshold, format, target)
- Config loading variations (explicit file vs auto-detect)
- Flag override scenarios
- File vs directory analysis
- Success vs failure for each layer
- Error injection at multiple depths
The tests would be unmaintainable. The problem wasn't test coverage. It was code structure.
The Refactoring¶
`main.run()` did everything in one 35-complexity function. We extracted 11 focused functions:
| Function | Complexity | Responsibility |
|---|---|---|
| `loadConfig` | 5 | Load config from file or auto-detect |
| `applyFlagOverrides` | 5 | Apply CLI flags to config |
| `analyzeTarget` | 5 | Analyze file or directory |
| `outputResults` | 7 | Format and write output |
| `checkResults` | 3 | Validate against thresholds |
| `countFailures` | 8 | Count failures by category |
| `printFailureGuidance` | 4 | Print guidance messages |
| `printLengthGuidance` | 1 | Length-specific guidance |
| `printReadabilityGuidance` | 1 | Readability-specific guidance |
| `printAdmonitionGuidance` | 1 | Admonition-specific guidance |
| `run` | 6 | Main orchestration (was 35) |
Each function became testable independently. Table-driven tests worked.
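After extraction, `run` reads as straight-line orchestration. A compilable sketch (every type, signature, and stub below is an assumption; only the function names come from the table above):

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Stub types and helpers so the sketch compiles on its own.
type config struct{ threshold float64 }
type results struct{ failures int }

func loadConfig(args []string) (*config, error)     { return &config{threshold: 95}, nil }
func applyFlagOverrides(cfg *config, args []string) {}
func analyzeTarget(cfg *config) (*results, error)   { return &results{}, nil }
func outputResults(cfg *config, r *results) error   { return nil }
func checkResults(cfg *config, r *results) bool     { return r.failures == 0 }
func countFailures(r *results) int                  { return r.failures }
func printFailureGuidance(n int)                    { fmt.Printf("%d checks failed\n", n) }

// run only sequences the helpers and routes errors; with all branching
// pushed down into them, its own cyclomatic complexity stays single-digit.
func run(args []string) error {
	cfg, err := loadConfig(args)
	if err != nil {
		return fmt.Errorf("loading config: %w", err)
	}
	applyFlagOverrides(cfg, args)

	res, err := analyzeTarget(cfg)
	if err != nil {
		return fmt.Errorf("analyzing target: %w", err)
	}
	if err := outputResults(cfg, res); err != nil {
		return err
	}
	if !checkResults(cfg, res) {
		printFailureGuidance(countFailures(res))
		return errors.New("thresholds not met")
	}
	return nil
}

func main() {
	if err := run(os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Each helper then gets its own table-driven tests; `run` itself needs only a few wiring tests.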
Same pattern for the parser:
- Before: `Parse()` complexity = 21
- After: max complexity = 10 across 6 focused functions
The Breakthrough¶
After refactoring, coverage jumped:
Final Coverage:
| Package | Before | After | Improvement |
|---|---|---|---|
| cmd/readability | 36.8% | 97.5% | +60.7% |
| pkg/analyzer | 94.7% | 99.1% | +4.4% |
| pkg/config | 89.8% | 100.0% | +10.2% |
| pkg/markdown | 95.1% | 98.8% | +3.7% |
| pkg/output | 99.6% | 99.6% | - |
| Total | 85.8% | 99.0% | +13.2% |
From 85% to 99% without exotic test infrastructure. By making code simple enough to test.
293 tests total. Every test focused on one responsibility.
The Enforcement: Single Source of Truth¶
We raised the threshold from 80% to 95% and made codecov.yml the single source of truth.
Both CI and pre-commit hooks read from it:
```bash
# In both .github/workflows/ci.yml and .pre-commit-config.yaml
THRESHOLD=$(grep -A2 "project:" codecov.yml | grep "target:" | head -1 | sed "s/.*target: *\([0-9]*\).*/\1/")
```
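For that one-liner to find anything, `codecov.yml` has to nest `target` under `project`; a minimal sketch of the relevant section (a real file usually carries more settings):

```yaml
coverage:
  status:
    project:
      default:
        target: 95%  # single source of truth; CI and hooks grep this line
```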
One definition. Zero drift. Change the threshold in one place, enforcement updates everywhere.
The pre-push hook blocks pushes that fall below the threshold:
```yaml
- id: go-coverage
  name: Check test coverage threshold
  entry: >
    bash -c '
    THRESHOLD=$(grep -A2 "project:" codecov.yml | grep "target:" | head -1 | sed "s/.*target: *\([0-9]*\).*/\1/") || THRESHOLD=95;
    gotestsum --format testdox -- -race -coverprofile=/tmp/coverage.out -covermode=atomic ./... &&
    COVERAGE=$(go tool cover -func=/tmp/coverage.out | grep total | awk "{print \$3}" | sed "s/%//") &&
    if (( $(echo "$COVERAGE < $THRESHOLD" | bc -l) )); then
      echo "Coverage ${COVERAGE}% is below ${THRESHOLD}%"; exit 1;
    fi
    '
  stages: [pre-push]
```
Coverage became a blocking gate, not a suggestion.
The Security Signal¶
This wasn't about quality for quality's sake. It was about certification.
OpenSSF Best Practices Passing badge criteria we met (and exceeded):
- ✅ `test` - Automated test suite exists (Passing requirement)
- ✅ `test_invocation` - `go test ./...` works (Passing requirement)
- ✅ `test_most` - "Most code" tested (Passing guideline, not enforced)
- ✅ Our standard - 95% coverage (exceeds even Gold's 90%/80%)
- ✅ `test_continuous_integration` - CI runs tests (Passing requirement)
- ✅ `dynamic_analysis` - Race detector enabled (Passing requirement)
Without the Passing badge, you don't signal security maturity to auditors and enterprise users. By targeting 95%, we set our young project up for eventual Gold certification with standards already ingrained.
Coverage became a security signal. Not because tests prevent bugs (they do), but because coverage thresholds prove you take quality seriously enough to measure and enforce it.
The Tooling Stack¶
Getting to 99% required the right tools:
Coverage Reporting:
- Codecov with component-based tracking (see the sketch after this list)
- Per-package coverage breakdown
- Trend analysis over time
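Component tracking is plain Codecov config; a minimal sketch with our package layout (the component ids are made up, the keys are Codecov's documented ones):

```yaml
component_management:
  individual_components:
    - component_id: analyzer
      paths:
        - pkg/analyzer/**
    - component_id: output
      paths:
        - pkg/output/**
    - component_id: cli
      paths:
        - cmd/readability/**
```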
Test Execution:
- gotestsum for readable output and JUnit XML
- Race detector (`-race`) in every run
- Codecov Test Analytics for failure tracking
Enforcement:
- `codecov.yml` as single source of truth
- Pre-push hooks for local enforcement
- CI threshold checks blocking merges
Complexity Management:
- gocyclo to identify refactoring targets
- Strict mode enforcing complexity limits
- Pre-commit hooks catching violations early
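The early-catch hook can be a local pre-commit entry; a sketch (the hook id and limit are assumptions; gocyclo exits non-zero when anything exceeds `-over`):

```yaml
- id: gocyclo
  name: Check cyclomatic complexity
  entry: bash -c 'gocyclo -over 15 .'
  language: system
  pass_filenames: false
```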
What Changed¶
Before: "We have tests and they're comprehensive."
After: "We have 99% coverage enforced at commit time, verified in CI, tracked in Codecov, and certified by OpenSSF."
The difference isn't the tests. It's the enforcement and verification.
Coverage went from a quality metric to a security certification requirement.
Implementation Details
See Test Coverage Patterns for refactoring techniques and Coverage Enforcement for CI integration and threshold management.
Related Patterns¶
- The Wall at 85% - Complexity blocking coverage (same journey, different angle)
- OpenSSF Best Practices Badge - The Passing badge we earned (with standards exceeding Gold)
- SDLC Hardening - Testing in audit context
Started at 0%. OpenSSF Passing has no coverage requirement. We targeted 95% (above Gold's 90%). Hit a wall at 85%. Refactoring broke through. 99% became enforced. Coverage became the security signal auditors recognize.