
The Coverage That Mattered: When 99% Became a Security Signal

The OpenSSF Best Practices Passing badge doesn't mandate a specific coverage percentage.

We set our bar at 95% minimum. Above even Gold (90%). Self-imposed. Strategic.

We started at 0%. We wanted the Passing badge. But we knew something important: it's easier to build high standards into a young project than to retrofit them later. By the time we go for Gold, 95% will already be habit.


The Security Requirement

OpenSSF Best Practices isn't a suggestion. It's certification that your project follows security best practices.

We were targeting the Passing badge. But we set standards that exceed Gold:

  • ✅ Automated test suite (Passing requirement)
  • ✅ Tests invocable with standard command (go test ./...) (Passing requirement)
  • Passing: No specific coverage % required
  • Gold: 90% statement + 80% branch coverage
  • Our standard: 95% coverage (exceeding Gold before we needed it)
  • ✅ Race detection enabled (Passing requirement)
  • ✅ Tests run in CI (Passing requirement)

Young projects benefit from stringent standards. Easier to build discipline in than retrofit it later.


Phase 1: The Sprint (0% → 85%)

Starting from zero tests, we built comprehensive test suites for five packages in one focused effort.

Initial Coverage by Package:

Package            Coverage   Challenge
pkg/output         99.6%      JSON, table, markdown formatters
pkg/markdown       95.1%      Parser, AST traversal, admonitions
pkg/analyzer       94.7%      Metrics, analysis, helpers
pkg/config         89.8%      Load, defaults, overrides
cmd/readability    36.8%      CLI integration
Total              85.8%      Below requirement

The output formatters were easy because they were pure functions. The parser required mocking file I/O. The analyzer needed table-driven tests for metric variations.
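The table-driven shape we leaned on looks roughly like this. It's a self-contained sketch: ComputeScore, its placeholder metric, and the expected values are hypothetical stand-ins, not the analyzer's real API.

package analyzer

import "testing"

// ComputeScore is a hypothetical stand-in for a real metric function.
func ComputeScore(text string) int {
	if text == "" {
		return 0
	}
	return len(text) // placeholder metric for the sketch
}

// Each case is one row in the table; adding a variation is one line,
// not one new test function.
func TestComputeScore(t *testing.T) {
	tests := []struct {
		name  string
		input string
		want  int
	}{
		{name: "empty input scores zero", input: "", want: 0},
		{name: "short sentence", input: "Go is fun.", want: 10},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := ComputeScore(tt.input); got != tt.want {
				t.Errorf("ComputeScore(%q) = %d, want %d", tt.input, got, tt.want)
			}
		})
	}
}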

But the CLI package dragged the total down: 85.8%, well short of our 95% target.

We wrote more tests. Coverage didn't move.


The Wall

gocyclo revealed why:

cmd/readability/main.go:45: main.run() complexity = 35 (limit: 15)
pkg/markdown/parser.go:78: Parse() complexity = 21 (limit: 15)
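(Output abridged. gocyclo's -over flag filters the report to functions above a given limit, which is how we surfaced these:)

gocyclo -over 15 .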

A cyclomatic complexity of 35 means 35 linearly independent paths through a single function.

Testing all 35 paths in main.run() required:

  • All flag combinations (config path, threshold, format, target)
  • Config loading variations (explicit file vs auto-detect)
  • Flag override scenarios
  • File vs directory analysis
  • Success vs failure for each layer
  • Error injection at multiple depths

The tests would be unmaintainable. The problem wasn't test coverage. It was code structure.


The Refactoring

main.run() did everything in one 35-complexity function. We extracted 11 focused functions:

Function                    Complexity   Responsibility
loadConfig                  5            Load config from file or auto-detect
applyFlagOverrides          5            Apply CLI flags to config
analyzeTarget               5            Analyze file or directory
outputResults               7            Format and write output
checkResults                3            Validate against thresholds
countFailures               8            Count failures by category
printFailureGuidance        4            Print guidance messages
printLengthGuidance         1            Length-specific guidance
printReadabilityGuidance    1            Readability-specific guidance
printAdmonitionGuidance     1            Admonition-specific guidance
run                         6            Main orchestration (was 35)

Each function became testable independently. Table-driven tests worked.
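The post-refactor orchestration looks roughly like this. It's a sketch: types, signatures, and stub bodies are illustrative assumptions, not the project's exact code.

package readability

import "errors"

// Stand-in types for the sketch.
type config struct{}

type results struct{ failed bool }

// Each extracted function owns one decision; stubs here stand in for
// the real bodies listed in the table above.
func loadConfig(args []string) (*config, error)     { return &config{}, nil }
func applyFlagOverrides(cfg *config, args []string) {}
func analyzeTarget(cfg *config) (*results, error)   { return &results{}, nil }
func outputResults(cfg *config, res *results) error { return nil }
func checkResults(cfg *config, res *results) bool   { return !res.failed }
func printFailureGuidance(res *results)             {}

// run is now a thin coordinator: each step is one call, and each call
// is independently testable with a table-driven test.
func run(args []string) error {
	cfg, err := loadConfig(args)
	if err != nil {
		return err
	}
	applyFlagOverrides(cfg, args)

	res, err := analyzeTarget(cfg)
	if err != nil {
		return err
	}
	if err := outputResults(cfg, res); err != nil {
		return err
	}
	if !checkResults(cfg, res) {
		printFailureGuidance(res)
		return errors.New("readability thresholds not met")
	}
	return nil
}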

Same pattern for the parser:

Before: Parse() complexity = 21
After: max complexity = 10 across 6 focused functions
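Applied to a parser, the extraction becomes a thin dispatcher over focused block handlers. A self-contained sketch; the node type, splitting logic, and function names are hypothetical:

package markdown

import "strings"

// Node is a hypothetical stand-in for the parser's AST type.
type Node struct {
	Kind string
	Text string
}

// Parse stays a thin dispatcher; each block type gets its own small,
// independently testable function.
func Parse(src string) []Node {
	var nodes []Node
	for _, block := range strings.Split(src, "\n\n") {
		nodes = append(nodes, parseBlock(block))
	}
	return nodes
}

func parseBlock(block string) Node {
	switch {
	case strings.HasPrefix(block, "#"):
		return parseHeading(block)
	case strings.HasPrefix(block, "> [!"):
		return parseAdmonition(block)
	default:
		return Node{Kind: "paragraph", Text: block}
	}
}

func parseHeading(block string) Node {
	return Node{Kind: "heading", Text: strings.TrimLeft(block, "# ")}
}

func parseAdmonition(block string) Node {
	return Node{Kind: "admonition", Text: block}
}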


The Breakthrough

After refactoring, coverage jumped:

Final Coverage:

Package            Before    After     Improvement
cmd/readability    36.8%     97.5%     +60.7%
pkg/analyzer       94.7%     99.1%     +4.4%
pkg/config         89.8%     100.0%    +10.2%
pkg/markdown       95.1%     98.8%     +3.7%
pkg/output         99.6%     99.6%     -
Total              85.8%     99.0%     +13.2%

From 85% to 99% without exotic test infrastructure. By making code simple enough to test.

293 tests total. Every test focused on one responsibility.


The Enforcement: Single Source of Truth

We raised the threshold from 80% to 95% and made codecov.yml the single source of truth.

Both CI and pre-commit hooks read from it:

# In both .github/workflows/ci.yml and .pre-commit-config.yaml
THRESHOLD=$(grep -A2 "project:" codecov.yml | grep "target:" | head -1 | sed "s/.*target: *\([0-9]*\).*/\1/")

One definition. Zero drift. Change the threshold in one place, enforcement updates everywhere.
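For reference, the slice of codecov.yml that the one-liner parses looks something like this (a minimal sketch following Codecov's status-check schema; the value is ours):

# Minimal codecov.yml sketch; the grep pulls the integer out of
# the target line below.
coverage:
  status:
    project:
      default:
        target: 95%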

A pre-push hook blocks pushes that fall below the threshold:

- id: go-coverage
  name: Check test coverage threshold
  entry: >
    bash -c '
    THRESHOLD=$(grep -A2 "project:" codecov.yml | grep "target:" | head -1 | sed "s/.*target: *\([0-9]*\).*/\1/") || THRESHOLD=95;
    gotestsum --format testdox -- -race -coverprofile=/tmp/coverage.out -covermode=atomic ./... &&
    COVERAGE=$(go tool cover -func=/tmp/coverage.out | grep total | awk "{print \$3}" | sed "s/%//") &&
    if (( $(echo "$COVERAGE < $THRESHOLD" | bc -l) )); then
      echo "Coverage ${COVERAGE}% is below ${THRESHOLD}%"; exit 1;
    fi
    '
  stages: [pre-push]

Coverage became a blocking gate, not a suggestion.
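On the CI side, the equivalent gate is a workflow step that extracts the same threshold and fails the job. A sketch of a GitHub Actions step, not the project's exact workflow:

# Sketch of the CI-side gate in .github/workflows/ci.yml; it mirrors
# the pre-push hook so both read the same codecov.yml value.
- name: Enforce coverage threshold
  run: |
    THRESHOLD=$(grep -A2 "project:" codecov.yml | grep "target:" | head -1 | sed "s/.*target: *\([0-9]*\).*/\1/")
    go test -race -coverprofile=coverage.out -covermode=atomic ./...
    COVERAGE=$(go tool cover -func=coverage.out | grep total | awk '{print $3}' | sed 's/%//')
    if (( $(echo "$COVERAGE < $THRESHOLD" | bc -l) )); then
      echo "Coverage ${COVERAGE}% is below ${THRESHOLD}%"
      exit 1
    fi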


The Security Signal

This wasn't about quality for quality's sake. It was about certification.

OpenSSF Best Practices Passing badge criteria we met (and exceeded):

  • test - Automated test suite exists (Passing requirement)
  • test_invocation - go test ./... works (Passing requirement)
  • test_most - "Most code" tested (Passing guideline, not enforced)
  • Our standard - 95% coverage (exceeds even Gold's 90%/80%)
  • test_continuous_integration - CI runs tests (Passing requirement)
  • dynamic_analysis - Race detector enabled (Passing requirement)

Without the Passing badge, you don't signal security maturity to auditors and enterprise users. By targeting 95%, we set our young project up for eventual Gold certification with standards already ingrained.

Coverage became a security signal. Not because tests prevent bugs (they do), but because coverage thresholds prove you take quality seriously enough to measure and enforce it.


The Tooling Stack

Getting to 99% required the right tools:

Coverage Reporting:

  • Codecov with component-based tracking
  • Per-package coverage breakdown
  • Trend analysis over time

Test Execution:

  • gotestsum for readable output and JUnit XML
  • Race detector (-race) in every run
  • Codecov Test Analytics for failure tracking

Enforcement:

  • codecov.yml as single source of truth
  • Pre-push hooks for local enforcement
  • CI threshold checks blocking merges

Complexity Management:

  • gocyclo to identify refactoring targets
  • Strict mode enforcing complexity limits
  • Pre-commit hooks catching violations early

What Changed

Before: "We have tests and they're comprehensive."

After: "We have 99% coverage enforced at commit time, verified in CI, tracked in Codecov, and certified by OpenSSF."

The difference isn't the tests. It's the enforcement and verification.

Coverage went from a quality metric to a security certification requirement.

Implementation Details

See Test Coverage Patterns for refactoring techniques and Coverage Enforcement for CI integration and threshold management.



Started at 0%. OpenSSF Passing has no coverage requirement. We targeted 95% (above Gold's 90%). Hit a wall at 85%. Refactoring broke through. 99% became enforced. Coverage became the security signal auditors recognize.
