Shift Left: Preventing and Fixing Bad Data in Event Streams
At a high level, bad data is any data that doesn’t conform to the expected format or standard. It can creep into data sets in many ways and cause serious problems for every downstream consumer. In Apache Kafka®, event streams are built on an immutable log, meaning that once data is written, it cannot be edited or deleted in place. This immutability is a core feature, but it also introduces unique challenges and calls for extra caution when producing data to Kafka and managing it there.
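
To make the consequence of immutability concrete, here is a minimal sketch of checking an event before it reaches an append-only log. It does not use a Kafka client: `AppendOnlyLog`, `EXPECTED_FIELDS`, `validate`, and `produce` are hypothetical names invented for illustration, and the two-field format is an assumed example rather than anything prescribed by Kafka.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AppendOnlyLog:
    """Hypothetical in-memory stand-in for a log: records can be appended and read, nothing else."""
    _records: list = field(default_factory=list)

    def append(self, record: dict) -> int:
        self._records.append(dict(record))  # store a copy so the caller cannot mutate it later
        return len(self._records) - 1       # offset of the record just written

    def read(self, offset: int) -> dict:
        return dict(self._records[offset])


# Assumed expected format for this example: field name -> required type.
EXPECTED_FIELDS = {"order_id": str, "amount": float}


def validate(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in event:
            problems.append(f"missing field: {name}")
        elif not isinstance(event[name], expected_type):
            problems.append(f"wrong type for {name}: {type(event[name]).__name__}")
    return problems


def produce(log: AppendOnlyLog, event: dict) -> int:
    """Append only events that pass validation; reject the rest before they are written."""
    problems = validate(event)
    if problems:
        raise ValueError("; ".join(problems))
    return log.append(event)
```

Because the log offers no way to rewrite a record, the only place this sketch can stop a malformed event is in `produce`, before the append happens.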
