Real-time data pipelines have become a competitive necessity. When our retail client approached us with 5 million daily events and batch jobs running every hour, the business pain was clear: merchandising decisions were being made on yesterday's data. The solution was a streaming architecture built on Apache Kafka and Google BigQuery.

Architecture Overview

Our architecture uses Kafka as the central nervous system — all event producers (POS systems, mobile apps, warehouse management) publish to dedicated topics. A fleet of Kafka Connect connectors handles source-side ingestion, while custom Kafka Streams processors handle enrichment and deduplication before data lands in BigQuery via the BigQuery Storage Write API.
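To make the deduplication step concrete, here is a minimal Kafka Streams sketch of that part of the topology. It assumes events are keyed by a unique event ID and serialized as strings; the topic names (pos-events-raw, pos-events-clean), the state store name, and the broker address are illustrative placeholders rather than our production values.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.Stores;

public class DedupTopology {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Persistent store of event IDs that have already been forwarded downstream.
        builder.addStateStore(Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("seen-event-ids"),
            Serdes.String(), Serdes.Long()));

        KStream<String, String> raw = builder.stream("pos-events-raw");

        // Drop any record whose event ID has been seen before; forward the rest.
        KStream<String, String> deduped = raw.transform(
            () -> new Transformer<String, String, KeyValue<String, String>>() {
                private ProcessorContext context;
                private KeyValueStore<String, Long> seen;

                @Override
                public void init(ProcessorContext ctx) {
                    this.context = ctx;
                    this.seen = (KeyValueStore<String, Long>) ctx.getStateStore("seen-event-ids");
                }

                @Override
                public KeyValue<String, String> transform(String eventId, String value) {
                    if (seen.get(eventId) != null) {
                        return null; // duplicate: returning null drops the record
                    }
                    seen.put(eventId, context.timestamp());
                    return KeyValue.pair(eventId, value);
                }

                @Override
                public void close() {}
            },
            "seen-event-ids");

        deduped.to("pos-events-clean");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "dedup-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        new KafkaStreams(builder.build(), props).start();
    }
}
```

In production the value type is an Avro record rather than a string and enrichment happens in the same topology, but the dedup pattern is the same: a persistent key-value store keyed by event ID, consulted before each record is forwarded.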

The Challenge: Schema Evolution at Scale

The biggest engineering challenge wasn't throughput; it was schema evolution. With 40+ event types and producers maintained by different teams, schemas change frequently. We implemented Schema Registry with Avro, enforcing backward-compatibility rules at the connector level. Any backward-incompatible schema change triggers an automated PR review process before it can be deployed (see the sketch below).
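As an illustration of the kind of check that runs in that PR pipeline, the sketch below uses Avro's built-in SchemaCompatibility checker to verify that a proposed schema can still read data written with the currently registered one. The OrderEvent record and the added channel field are hypothetical examples, not our real event definitions.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaCompatibility.SchemaCompatibilityType;

public class BackwardCompatCheck {
    public static void main(String[] args) {
        // Currently registered schema: data already in the topic was written with this.
        Schema current = new Schema.Parser().parse("""
            {"type":"record","name":"OrderEvent","fields":[
              {"name":"order_id","type":"string"},
              {"name":"amount","type":"double"}]}""");

        // Proposed schema: adds a field with a default, a backward-compatible change.
        Schema proposed = new Schema.Parser().parse("""
            {"type":"record","name":"OrderEvent","fields":[
              {"name":"order_id","type":"string"},
              {"name":"amount","type":"double"},
              {"name":"channel","type":"string","default":"in_store"}]}""");

        // Backward compatibility: a reader using the new schema must be able to
        // decode records written with the old one.
        SchemaCompatibility.SchemaPairCompatibility result =
            SchemaCompatibility.checkReaderWriterCompatibility(proposed, current);

        if (result.getType() != SchemaCompatibilityType.COMPATIBLE) {
            throw new IllegalStateException(
                "Schema change is not backward compatible: " + result.getDescription());
        }
        System.out.println("Schema change is backward compatible");
    }
}
```

Adding a field with a default is the canonical backward-compatible change; removing a field consumers still rely on is exactly what this gate is there to catch before deployment.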

BigQuery as the Analytical Sink

BigQuery's Storage Write API was a game-changer for our use case. Unlike the legacy streaming insert API, it supports exactly-once delivery semantics and significantly reduces costs for high-volume writes. We partition all event tables by ingestion date and cluster by event_type and customer_region, which improved query performance by 60% compared to our initial non-clustered setup.
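For reference, the partitioning and clustering setup looks roughly like the following, expressed with the google-cloud-bigquery Java client. The dataset, table, and schema fields shown are simplified placeholders; only event_type and customer_region mirror the clustering columns described above.

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Clustering;
import com.google.cloud.bigquery.Field;
import com.google.cloud.bigquery.Schema;
import com.google.cloud.bigquery.StandardSQLTypeName;
import com.google.cloud.bigquery.StandardTableDefinition;
import com.google.cloud.bigquery.TableId;
import com.google.cloud.bigquery.TableInfo;
import com.google.cloud.bigquery.TimePartitioning;
import java.util.List;

public class CreateEventTable {
    public static void main(String[] args) {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // Simplified event schema for illustration.
        Schema schema = Schema.of(
            Field.of("event_type", StandardSQLTypeName.STRING),
            Field.of("customer_region", StandardSQLTypeName.STRING),
            Field.of("payload", StandardSQLTypeName.STRING),
            Field.of("event_ts", StandardSQLTypeName.TIMESTAMP));

        StandardTableDefinition definition = StandardTableDefinition.newBuilder()
            .setSchema(schema)
            // Daily ingestion-time partitioning (no partitioning field specified).
            .setTimePartitioning(TimePartitioning.of(TimePartitioning.Type.DAY))
            // Cluster within each partition by the two most common filter columns.
            .setClustering(Clustering.newBuilder()
                .setFields(List.of("event_type", "customer_region"))
                .build())
            .build();

        bigquery.create(TableInfo.of(TableId.of("analytics", "pos_events"), definition));
    }
}
```

Because most merchandising queries filter on event type and region within a recent date range, partition pruning plus clustering keeps scanned bytes, and therefore cost, roughly proportional to the slice actually being analyzed.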

Results

After three months in production, end-to-end latency dropped from 2–4 hours to under 8 seconds at the 99th percentile. The merchandising team now runs inventory rebalancing hourly instead of daily, a change we directly credit for a 12% reduction in stockout events during the holiday season.