Measuring Engineering Productivity: What Actually Works (And What Doesn’t)

Intro
Engineering productivity is one of those metrics that everyone wants, few understand clearly, and even fewer measure well. How should teams define it? Which metrics are useful — and which can mislead? Drawing on Laura Tacho’s piece “Measuring Software Engineering Productivity” (as published in The Pragmatic Engineer) plus related research, here’s a guide to help Presence Digital set up productivity measurement in a way that helps, not hurts.
What Is Engineering Productivity, Really?
One of the biggest challenges: productivity means different things to different people.
- Developers may think of productivity in terms of activity — lines of code, number of pull requests, commits, etc.
- Managers often focus on outcomes — delivery of features, impact, meeting deadlines, business goals.
- Executives may care most about efficiency, risk reduction, return on investment.
Common Measurement Approaches
- DORA metrics
  - What It Measures: Four key metrics: deployment frequency, lead time for changes, change failure rate, and MTTR (mean time to recovery).
  - Pros / When It Helps: Good for tracking DevOps maturity and for benchmarking over time and across organisations.
  - Caveats / Watch-Outs: Can be gamed or misinterpreted, and often doesn’t capture qualitative aspects (developer satisfaction, code maintainability). Only useful if teams have good monitoring and observability.
- MTTR (mean time to recovery)
  - What It Measures: How long it takes from when a production failure starts affecting customers until full recovery.
  - Pros / When It Helps: Measures reliability and resilience, which is critical for customer-facing systems. When outages are rare, periodic recovery drills can supply the data.
  - Caveats / Watch-Outs: Requires careful definitions of “failure start” and “recovery end”. Don’t rely on MTTR alone: a single number hides many separate delays (detection, fix deployment, verification, etc.).
- SPACE framework
  - What It Measures: A broader view across five dimensions: satisfaction and well-being, performance, activity, collaboration, and efficiency/flow.
  - Pros / When It Helps: Captures more dimensions and helps avoid over-focusing on output alone. Helps spot human- or team-level issues, not just code metrics.
  - Caveats / Watch-Outs: More complex to implement. Some dimensions are qualitative, so data may be noisy or subjective. Requires a culture of trust.
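To make the delivery metrics above concrete, here is a minimal sketch of how deployment frequency, lead time for changes, and change failure rate could be computed from a list of deployment records. The `Deployment` record, its field names, and the sample data are assumptions made for illustration; real data would come from whatever your CI/CD and incident tooling exports, and the definitions (for example, what counts as a "failure") need to be agreed on first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    # Hypothetical record shape; adapt to whatever your CI/CD system exports.
    committed_at: datetime   # when the change was committed
    deployed_at: datetime    # when the change reached production
    caused_failure: bool     # whether it led to a production incident


def delivery_metrics(deployments, window_days):
    """Summarise deployment frequency, lead time, and change failure rate."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    ]
    failures = sum(1 for d in deployments if d.caused_failure)
    return {
        "deploys_per_week": len(deployments) / (window_days / 7),
        "median_lead_time_hours": median(lead_hours),
        "change_failure_rate": failures / len(deployments),
    }


# Made-up sample: four deployments in a 14-day window, one of which failed.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=2), False),
    Deployment(start + timedelta(days=3), start + timedelta(days=3, hours=4), False),
    Deployment(start + timedelta(days=7), start + timedelta(days=7, hours=6), True),
    Deployment(start + timedelta(days=10), start + timedelta(days=10, hours=8), False),
]
metrics = delivery_metrics(sample, window_days=14)
print(metrics)
```

The median is used for lead time here because a single slow change can distort an average; that is a design choice, not a requirement of the metric.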
Practical Guidelines
- Define what matters for you
  - Align on what productivity means: Is it delivery speed? Reliability? Innovation? Customer satisfaction?
  - Involve developers, managers, product, and possibly customers.
- Start small, measure wisely
  - Focus on a few metrics that matter, not dozens.
  - Combine quantitative data (e.g. deployment frequency, MTTR) with qualitative feedback (surveys, retrospectives).
- Break the work into phases
  - For example, for MTTR: detecting → diagnosing → developing the fix → testing → deploying → verifying. Measuring each subphase helps locate bottlenecks.
- Beware of metrics as targets
  - If you reward people based solely on, say, the number of PRs merged, they may optimise for that number rather than for what actually improves the business.
  - Metrics should inform decisions, not dictate behaviour without context.
- Review and adapt over time
  - Productivity definitions may need to evolve as your product, team size, or market changes.
  - Use benchmarking (internal and external), but don’t blindly copy others.
- Use tooling and observability carefully
  - Monitoring and alerting must be reliable, otherwise MTTR and similar metrics cannot be tracked accurately.
  - Tools like Flow (formerly GitPrime) are useful, but only when used with awareness of their limitations.
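The MTTR phase breakdown described above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: the milestone names (`failure_start`, `detected`, and so on) and the sample incident are hypothetical, and in practice the timestamps would come from your alerting and incident-management tools.

```python
from datetime import datetime

# Hypothetical milestone names; each consecutive pair bounds one phase.
MILESTONES = ["failure_start", "detected", "diagnosed", "fix_ready",
              "tested", "deployed", "verified"]
PHASES = ["detection", "diagnosis", "fix development",
          "testing", "deployment", "verification"]


def phase_minutes(incident):
    """Return the minutes spent in each recovery phase of one incident."""
    times = [incident[name] for name in MILESTONES]
    if any(later < earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("milestone timestamps are out of order")
    return {
        phase: (later - earlier).total_seconds() / 60
        for phase, earlier, later in zip(PHASES, times, times[1:])
    }


# Made-up incident for illustration only.
incident = {
    "failure_start": datetime(2024, 3, 5, 10, 0),
    "detected": datetime(2024, 3, 5, 10, 20),
    "diagnosed": datetime(2024, 3, 5, 10, 35),
    "fix_ready": datetime(2024, 3, 5, 11, 5),
    "tested": datetime(2024, 3, 5, 11, 15),
    "deployed": datetime(2024, 3, 5, 11, 20),
    "verified": datetime(2024, 3, 5, 11, 30),
}
phases = phase_minutes(incident)
total_minutes = sum(phases.values())          # time to recovery for this incident
slowest_phase = max(phases, key=phases.get)   # where the most time went
print(phases, total_minutes, slowest_phase)
```

Averaging each phase across incidents, rather than only the total, shows whether the bottleneck is detection, the fix itself, or the release process.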
Common Pitfalls
- Overemphasising activity metrics (lines of code, number of merges) without context.
- Ignoring developer satisfaction, burnout, and code health.
- Trying to measure everything at once, which leads to overwhelming dashboards and low adoption.
- Treating metrics as a stick rather than a guide.
Suggested Next Steps for Presence Digital
- Kickoff workshop: gather representatives (engineers, product, leadership) to co-define “productivity” and its success criteria.
- Pilot metrics in one or two teams, perhaps a product team and an infrastructure/ops team, to test what works, what data is available, and what is hard to collect.
- Set up a few baseline metrics:
  - MTTR (with subphases)
  - Deployment frequency
  - Change failure rate
  - Qualitative feedback from the team (e.g. “What blocked us this week?”)
- Build dashboards and hold regular reviews, but make sure conversations happen alongside the data: “What’s behind the numbers?”
- Iterate: adjust which metrics are emphasised, improve instrumentation, and refine definitions.
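Qualitative feedback can be summarised just as simply as the quantitative baselines. The sketch below tallies recurring themes from weekly "What blocked us this week?" answers. It assumes someone has already tagged each free-text answer with short theme labels; the `team` and `blockers` field names and all the sample answers are hypothetical.

```python
from collections import Counter


def top_blockers(responses, n=3):
    """Count how often each blocker theme appears and return the n most common."""
    counts = Counter(tag for response in responses for tag in response["blockers"])
    return counts.most_common(n)


# Made-up, pre-tagged answers from two pilot teams.
responses = [
    {"team": "product", "blockers": ["slow CI", "unclear requirements"]},
    {"team": "product", "blockers": ["slow CI"]},
    {"team": "infrastructure", "blockers": ["slow CI", "flaky tests"]},
    {"team": "infrastructure", "blockers": ["unclear requirements"]},
]
summary = top_blockers(responses, n=2)
print(summary)
```

A tally like this is a conversation starter for the regular reviews, not a score: the point is to see which themes keep coming back next to the delivery numbers.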
