61 posts tagged with "engineering-metrics"

Observability Stack: Datadog vs Grafana vs Honeycomb

· 9 min read
Artur Pan
CTO & Co-Founder at PanDev

An SRE lead at a mid-size fintech told me the quote that defines 2026 observability decisions: "Datadog is the iPhone of observability — expensive, polished, and I wish I had a choice." The market has three credible positions now: Datadog as the integrated default, Grafana as the open-source-first alternative, and Honeycomb as the wide-events specialist. Each is optimized for a different failure mode, and picking the wrong one doesn't show up in the first quarter — it shows up as a $2M annual bill and a team that still can't answer "why was latency spiky on Tuesday?"

CNCF's 2024 Annual Survey reported that 86% of cloud-native organizations use OpenTelemetry in some form — which sounds like the market is standardizing. In practice OTel is a pipeline, not a destination; every shop running it still picks one of these three stacks (or Splunk, New Relic, Dynatrace — we'll touch those briefly) to actually store, query, and visualize the data. Honeycomb's own observability maturity research shows that teams adopting wide-events cut investigation time on novel incidents by 40-60%, but only when the culture adapts — tooling alone doesn't deliver the lift.

Retail Engineering: Online + Brick-and-Mortar Metrics

· 10 min read
Artur Pan
CTO & Co-Founder at PanDev

An engineering director at a 400-store regional retailer put it cleanly: "Every time we ship a feature that makes the website faster, we hear applause from marketing. Every time we ship a feature that lets a store associate do their job in half the clicks, we hear silence — and then the quarterly numbers move." Retail engineering is the discipline of serving two populations (shoppers and store associates) and two physical realities (the warehouse and the store floor) from the same codebase.

McKinsey's 2024 State of Retail report found that 73% of shoppers used multiple channels for a single purchase journey — browse mobile, try in-store, buy online, return curbside. Every one of those transitions is an engineering surface: the product-detail page has to know store availability, the BOPIS (buy online, pickup in store) flow has to reserve inventory atomically, the returns kiosk has to un-reserve it. A 2023 IHL Group study documented $1.75 trillion in global retail out-of-stock losses — many of which trace back to inventory-service latency or sync failures, not physical stockouts.
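The "reserve inventory atomically" requirement is the crux: if the availability check and the reservation are two separate steps, two concurrent BOPIS checkouts can both claim the last unit. A minimal sketch of the check-and-decrement pattern, using an in-process lock as a stand-in for what would really be a conditional `UPDATE ... WHERE on_hand - reserved >= qty` in the inventory database (class and method names here are hypothetical, not from any real retail stack):

```python
import threading

class StoreInventory:
    """Sketch of atomic reserve/release for a BOPIS flow.
    A lock stands in for a conditional database update."""

    def __init__(self, on_hand: int):
        self._on_hand = on_hand
        self._reserved = 0
        self._lock = threading.Lock()

    def reserve(self, qty: int) -> bool:
        # The availability check and the decrement must be one
        # atomic step, or two checkouts can both win the last unit.
        with self._lock:
            if self._on_hand - self._reserved >= qty:
                self._reserved += qty
                return True
            return False

    def release(self, qty: int) -> None:
        # Returns-kiosk path: un-reserve so the unit is sellable again.
        with self._lock:
            self._reserved = max(0, self._reserved - qty)

sku = StoreInventory(on_hand=1)
print(sku.reserve(1))  # True: last unit reserved for pickup
print(sku.reserve(1))  # False: the second checkout loses the race
sku.release(1)
print(sku.reserve(1))  # True again once the return is processed
```

The same shape applies in reverse at the returns kiosk: `release` must be just as atomic, or the un-reserved unit briefly exists twice.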

Time Zones and Engineering Velocity: Real Data

· 8 min read
Artur Pan
CTO & Co-Founder at PanDev

A distributed team with 5 hours of timezone spread has a median lead time of 6.8 days per change. A colocated team in the same codebase — same language, same size, same PR size — has a median lead time of 3.2 days. That's not a rounding error. That's the timezone tax, and it roughly doubles at every additional 3-4 hours of spread. GitLab's 2023 remote-work report estimated "3-5 hours of overlap" as the sweet spot for async-friendly teams, and our IDE-heartbeat data across 100+ B2B companies says the same — but with the extra detail of where exactly the time goes.
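The "3-5 hours of overlap" sweet spot is easy to make concrete: convert each location's working window to UTC and intersect. A small helper (hypothetical, assuming windows that don't wrap past midnight once shifted to UTC):

```python
def overlap_hours(start_a, end_a, utc_offset_a,
                  start_b, end_b, utc_offset_b):
    """Shared working hours between two locations.
    Times are local 24h-clock hours; offsets are UTC offsets in hours."""
    # Shift each local window into UTC, then intersect the intervals.
    a_start, a_end = start_a - utc_offset_a, end_a - utc_offset_a
    b_start, b_end = start_b - utc_offset_b, end_b - utc_offset_b
    return max(0, min(a_end, b_end) - max(a_start, b_start))

# New York (UTC-5) 9-17 vs Berlin (UTC+1) 9-17: 6h of spread -> 2h shared
print(overlap_hours(9, 17, -5, 9, 17, 1))  # 2
```

Two hours of overlap is below the async-friendly band, which is exactly where the lead-time doubling starts to bite.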

This isn't an article about whether remote work is good (it is, for many teams). It's about the specific ways that timezone spread slows delivery, and what measurements tell you whether your distributed team is paying a 2× lead-time penalty or learning to live with it.

Payments and Banking Engineering: Compliance + Speed

· 10 min read
Artur Pan
CTO & Co-Founder at PanDev

A payments engineering director told me the sentence that captures the whole vertical: "We have two stopwatches running. One measures how fast we ship. The other measures how many years we'll be paying for the mistake we ship fast." Everything else in payments engineering is a tradeoff on that pair.

The Bank for International Settlements' 2024 Annual Economic Report documents that global cross-border payments cleared $190 trillion in 2023, with payment technology handling roughly 1.4 billion daily transactions. Nilson Report, the card-industry reference, tracks industry fraud losses at around $33 billion globally per year — that's roughly 6 basis points on card volume, paid for by the engineering quality of the platforms in the middle. An engineering team shipping a regression into the authorization path doesn't get fired for shipping slowly; they get fired for the 40-basis-point spike on the next week's reconciliation report.
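Basis points are the unit this whole vertical thinks in, so the arithmetic is worth spelling out: 1 bp is 0.01% of volume. A quick worked example (the ~$55T card-volume figure is an assumption back-solved from the article's 6 bp rate, not a sourced number):

```python
def basis_points(losses: float, volume: float) -> float:
    """Loss rate in basis points: 1 bp = 0.01% of volume."""
    return losses / volume * 10_000

# ~$33B in fraud losses against ~$55T of card volume (assumed) ~= 6 bp
print(round(basis_points(33e9, 55e12), 1))   # 6.0

# The feared reconciliation spike: 40 bp on $100M of daily volume
print(basis_points(400_000, 100e6))          # 40.0 -> $400k lost that day
```

That second line is why the stopwatch metaphor holds: a few basis points of regression in the authorization path translates directly into dollars on next week's reconciliation report.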

Terraform Adoption: Metrics for Infrastructure Teams

· 8 min read
Artur Pan
CTO & Co-Founder at PanDev

Your team adopted Terraform 18 months ago. Deploys are slower than the old click-ops setup, reviews take longer, and three of your best engineers now spend a full day per week poring over terraform plan output. Senior leadership asks whether the migration was worth it, and nobody has a clean answer. The honest one is: you never defined what "worth it" looks like in metrics. HashiCorp's 2024 State of Cloud Strategy reported that 76% of enterprises adopted IaC, but only 31% measured its outcomes against pre-adoption baselines. The CNCF's 2023 Annual Survey found a similar gap for infrastructure-as-code tooling generally.

This article is a measurement framework for infrastructure teams already using Terraform, OpenTofu, or Pulumi. It doesn't debate whether IaC is worthwhile — that ship sailed. It defines six metrics that show whether your adoption is healthy or decaying, plus the benchmark ranges from 37 companies in our dataset that run Terraform in production.

Kubernetes Engineering Observability: What to Track in 2026

· 7 min read
Artur Pan
CTO & Co-Founder at PanDev

A platform team running 11 production Kubernetes clusters has 94,000 metrics scraped every 15 seconds, 2.4 TB of logs per day in Loki, and a Grafana instance with 340 dashboards. When their VP of Engineering asked "are our teams shipping reliably on K8s?", nobody could answer in under an hour. They had cluster observability. They had zero engineering observability.

These are two different problems. Cluster observability tells you whether pods are healthy. Engineering observability tells you whether engineering on top of those clusters is healthy — whether deployments are fast, whether rollbacks are rare, whether developers are waiting on infrastructure or fighting with it. Most K8s shops have solved the first and ignored the second. The 2024 CNCF annual survey reported that 68% of enterprise K8s users struggle with "making observability actionable", which is a polite way of saying they have metrics but no decisions come out of them.

Travel and Hospitality Engineering: Booking Platform Teams

· 10 min read
Artur Pan
CTO & Co-Founder at PanDev

A former Expedia engineer told me the quote that should be pinned above every travel-engineering team's desk: "We don't ship software — we ship promises about the future availability of physical objects." An Amadeus GDS query returns inventory that's simultaneously being consumed by 50+ competing distribution channels. Your code has to reconcile that in under 400ms or the user gives up.

Phocuswright's 2024 travel-technology report pegs the global online-travel industry at $1.06 trillion in gross bookings, with roughly 38% flowing through technology platforms that sit between travelers and suppliers. Amazon Web Services' travel-vertical analysis documents that peak-season traffic on booking engines routinely exceeds 15× the yearly baseline — more extreme than any other e-commerce vertical except Black Friday retail. Engineering teams built on "just scale horizontally" assumptions discover, in their first December peak, that search-cache misses against an unreachable GDS generate cascading failures 90 seconds deep.
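The standard defense against that cascade is a hard deadline on the upstream call plus a stale-but-fast fallback: blow the ~400ms budget and you serve the last cached availability snapshot rather than queueing behind a dead GDS. A minimal sketch, assuming a hypothetical `gds_query`/`cache_lookup` pair (not any real Amadeus API):

```python
import concurrent.futures as cf
import time

def search_with_deadline(gds_query, cache_lookup, deadline_s=0.4):
    """Sketch: enforce a ~400ms budget on an upstream GDS call and
    fall back to the last cached snapshot instead of letting slow
    calls stack into a multi-second cascade."""
    pool = cf.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(gds_query)
        try:
            return future.result(timeout=deadline_s), "live"
        except cf.TimeoutError:
            return cache_lookup(), "stale-cache"
    finally:
        pool.shutdown(wait=False)  # never block on a stuck upstream

def slow_gds():          # stands in for an unreachable GDS
    time.sleep(1.0)
    return {"seats": 3}

def cached_snapshot():   # stands in for the last good cache entry
    return {"seats": 2}

result, source = search_with_deadline(slow_gds, cached_snapshot)
print(source)  # stale-cache: the upstream blew the 400ms budget
```

Production versions add a circuit breaker so a GDS that is known-down isn't even attempted, but the deadline-plus-fallback core is what keeps a cache miss from becoming a 90-second pile-up.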

AdTech Engineering: Data-Heavy Teams and Productivity

· 7 min read
Artur Pan
CTO & Co-Founder at PanDev

In our IDE dataset of 100+ B2B companies, engineers on AdTech platforms ship 38% fewer pull requests per month than engineers in SaaS tooling — and produce more customer revenue per head. Meanwhile The Trade Desk disclosed it processes over 13 million ad requests per second. Scale like that reshapes what "productive" means. A PR count that would look alarming in a consumer app is perfectly normal when a single configuration line is deployed across 10 million QPS.

AdTech engineering is different, and measuring it with generic DORA-only dashboards misses the point. This article lays out what data-heavy teams actually spend time on, what the numbers look like across the 14 AdTech companies in our dataset, and which productivity signals matter more than throughput for real-time bidding, attribution, and ad-server work.

Media and Streaming Engineering: Building for Peak Load

· 9 min read
Artur Pan
CTO & Co-Founder at PanDev

When Super Bowl LVIII streamed on CBS and Paramount+ in 2024, the audience averaged 123 million viewers across broadcast and streaming — a number that isn't a KPI, it's a physics problem. Disney+'s Ahsoka finale generated 14 million account logins in a 15-minute window. Netflix's Tyson-Paul fight in late 2024 failed in full public view — buffering complaints all over Twitter — as the streaming stack buckled at ~60 million concurrent streams. Media engineering is not optimizing for average throughput. It's optimizing for the one hour per quarter where your graphs go vertical.

The companies that do this well share a specific team shape, a specific release cadence, and a specific set of measurement habits that don't apply to most B2B SaaS. Pulling DORA metrics off a streaming platform and comparing them to a CRM is apples and typhoons. This is a field guide for the engineering leaders who run — or are about to run — a media platform through peak.

Logistics Engineering Metrics for Delivery Platform Teams

· 7 min read
Artur Pan
CTO & Co-Founder at PanDev

A delivery platform's engineering team runs a fundamentally different workload from a B2B SaaS team. The courier mobile app pings location every 3-5 seconds. The dispatcher console expects sub-200ms order assignments. Route-optimization jobs crunch combinatorial problems overnight and need to finish before dawn shifts start. A 2024 McKinsey report on last-mile logistics pegged the cost of a single hour of dispatcher downtime at $12,000-$35,000 for a mid-size regional carrier.
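Those courier-app ping intervals imply a concrete ingest load, which is worth computing before sizing anything. Quick arithmetic (the 10,000-courier fleet is an illustrative assumption, not a figure from the article):

```python
def ping_rate(active_couriers: int, interval_s: float) -> float:
    """Location updates per second the ingest path must absorb,
    assuming each courier pings once per interval."""
    return active_couriers / interval_s

# 10,000 active couriers (assumed) pinging every 4s on average
print(int(ping_rate(10_000, 4)))  # 2500 writes/sec, before any retries
```

That steady 2,500 writes/sec baseline — on top of sub-200ms dispatch reads and overnight batch jobs — is why a generic SaaS request-latency dashboard tells you almost nothing about a logistics platform's health.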

This shape of work changes which engineering metrics actually matter. The DORA four keys still apply, but the team-health and delivery-performance picture shifts. Here's the metric stack that fits logistics platform teams — and the places where "copy a SaaS DORA dashboard" misleads you.