Pioneers of modern incident management

Modern incident management draws inspiration from the past but owes much of its recent evolution to the bold experimentation of tech pioneers. Here are some key players worth noting:


Steve Capps: Developer of the “monkey.” In 1983, Apple Macintosh engineer Steve Capps developed a “monkey” that generated a rapid-fire series of random user interface inputs. The monkey proved to be a valuable testing tool for exposing failures in software applications, and it eventually inspired the idea of “chaos engineering” and the “chaos monkey.”

Jesse Robbins: Creator of “game day.” In the early 2000s, Amazon engineer Jesse Robbins, known as the “Master of Disaster,” created a program called Game Day. The idea was to simulate real-world incidents by deliberately introducing significant failures into software systems. Game days are still used to help teams improve their incident response and operational readiness.

Netflix Engineers: Chaos engineering & chaos monkey. A major Netflix outage in 2008 led the company to migrate from hardware servers to AWS, and its engineers needed a way to ensure system resilience in the new, highly distributed architecture.

Inspired by Jesse Robbins’ game day, they devised the idea of chaos engineering, which intentionally introduces disruptions into critical systems. The goal is to identify and fix systemic weaknesses before they cause real issues. Chaos monkey, inspired by Capps’ software monkey, was built to randomly turn off servers and test how well systems handle failures and failovers.