Red Team

by Micah Zenko

Red Team by Micah Zenko explores military and security strategies that anticipate adversarial moves, unveiling vulnerabilities from the raid that killed Osama bin Laden to corporate security breaches. It offers insights into protecting organizations by thinking like the enemy, emphasizing the importance of red teams in both public and private sectors.

Seeing From the Outside In

How can you challenge your own assumptions before reality does it for you? In Red Team, Micah Zenko argues that institutions—from militaries to corporations—can escape groupthink only by learning to see themselves from the outside in. Red teaming, he explains, is not dissent for its own sake but a structured process for testing plans, vulnerabilities, and beliefs against realistic adversary behavior. The discipline’s origins trace back to the Vatican’s Advocatus Diaboli—the Devil’s Advocate charged with scrutinizing arguments for sainthood. That office, later abolished, embodied the principle that progress depends on empowered critique.

Zenko’s book builds a unified theory of institutional self-questioning. A red team creates the conditions to think what others dare not; it acts as a mirror held by skeptics who emulate adversaries, simulate crises, and construct alternative interpretations of evidence. The final goal is not rebellion but resilience—helping leaders confront blind spots and make decisions that survive friction.

Core Methods and Philosophy

You learn three foundational methods: simulations, vulnerability probes, and alternative analyses. Simulations rehearse the future so you can test how plans behave under stress (as with the Navy SEAL raid on Osama bin Laden’s compound, repeatedly rehearsed and “red teamed to death”). Probes emulate adversaries—like Sandia National Laboratories’ IDART breaking into nuclear software and airport systems to uncover hidden weaknesses. Alternative analyses create competing narratives from the same data, as President Bush’s two parallel teams did in evaluating Syria’s Al Kibar site in 2007. Each technique helps expose assumptions that untested plans conceal.

Historical Roots and Institutional Lessons

Military origins shape modern practice. The U.S. Army’s University of Foreign Military and Cultural Studies (UFMCS) institutionalized critical thinking through metacognition and structured doubt. Yet exercises such as Millennium Challenge 2002—the $250 million war game scripted to guarantee a friendly win—illustrate how hierarchy can corrupt testing. Zenko shows that red teaming thrives only under honest leadership: General James Amos’ effort to embed red teams in Marine Corps command structures succeeded only where commanders valued dissent. The lesson applies widely—without top-level buy-in, red teams become ceremonial “kids at a card table.”

Applications Beyond the Military

In intelligence, red teaming balances analysis against bias. The CIA’s post‑9/11 Red Cell was built precisely to “make seniors uncomfortable” through contrarian memos that questioned orthodoxy. By contrast, Team B’s politicized 1976 experiment—stacked with ideological hawks—shows what happens when alternative analysis becomes propaganda. In homeland security, vulnerability probes like the FAA’s pre‑9/11 team proved prophetic but were ignored, whereas the NYPD’s Ray Kelly used red‑team‑style tabletop exercises to transform training for Mumbai‑style attacks. In each domain, success correlates with seriousness: leaders who act on findings gain foresight; those who simply commission tests gain paperwork.

People, Tools, and Culture

Zenko stresses that red teaming is human work. The best practitioners are curious misfits—skeptical yet diplomatic, fearless but tactful. They use structured methods such as weighted anonymous feedback, premortems, and dissent mapping to surface unobvious insights. Training and rotation prevent institutional capture; diversity in viewpoint prevents bias replication. Sandia’s IDART formalized this professionalism, turning red teaming into a repeatable craft with clear engagement rules and validation cycles. Whether probing cyber systems, running business war games, or stress‑testing infrastructure, the guiding norm remains autonomy combined with accountability.

From Discipline to Ethos

The final message is ethical as much as procedural: “You cannot grade your own homework.” Red teaming, done well, builds organizational humility—a willingness to test cherished ideas against harsh reality. When ignored or abused (scripted war games, politicized panels, performative audits) it becomes theater. When respected and institutionalized through leadership support, balanced independence, and genuine follow‑through, it becomes a societal safeguard against surprise and hubris. In short, Zenko urges you to make red teaming not an event but a way of thinking.


Designing Effective Red Teams

Building a red team isn’t about adding rebels to an org chart—it’s about creating a disciplined system of realism. Zenko identifies six best practices that transform red teams from symbolic dissenters into practical reform engines, grounded in lessons from both military and private sectors.

Leadership Support and Structural Independence

First, a red team must have visible support from the top. Without executive backing, its access and audience vanish. General Amos’ Marine Corps reform illustrated this truth: where commanders embraced his 2010 mandate, teams produced meaningful critiques; where staff dismissed it, work degenerated into “special projects.” Independence must also be balanced with empathy—teams too distant lack relevance, while embedded ones lose objectivity. The Al Kibar intelligence test achieved this equilibrium by dividing analysts into separate cells—one proving, one disproving—a structure that preserved both access and impartiality.

Recruiting the Right People

You need courageous, literate skeptics who combine intellect with finesse. UFMCS and the CIA Red Cell select intellectually curious contrarians who can brief and write persuasively. Rotation prevents capture—staff spend limited time in red‑team roles to keep fresh perspective. Training matters: courses emphasize metacognition and bias awareness so analysts recognize their own cognitive traps before judging others. (Note: This parallels IARPA’s later Macbeth project for bias gamification.)

Tools, Variety, and Adaptability

Method diversity keeps targets honest. The Army’s “weighted anonymous feedback” note‑card exercise and premortems illustrate how controlled anonymity surfaces unfiltered ideas. IDART’s simple playbook—open‑source hacking tools and iterative validation—makes re‑testing repeatable. You should rotate techniques across scenarios: simulations for planning, probes for defense auditing, and alternative analyses for intelligence or business strategy.
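The book describes these techniques narratively rather than as algorithms, but the tallying step of weighted anonymous feedback is easy to sketch. Below is a minimal illustration, assuming each participant anonymously distributes a fixed point budget across concerns (the concern names and budgets are hypothetical):

```python
from collections import defaultdict

def tally_weighted_feedback(ballots):
    """Aggregate anonymous weighted votes: each ballot maps a concern
    to the points one participant assigned from a fixed budget."""
    totals = defaultdict(int)
    for ballot in ballots:
        for concern, points in ballot.items():
            totals[concern] += points
    # Rank concerns by total points, highest first
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Three anonymous ballots, each distributing a 10-point budget
ballots = [
    {"plan lacks fuel reserve": 6, "timeline too tight": 4},
    {"timeline too tight": 7, "single comms channel": 3},
    {"plan lacks fuel reserve": 4, "single comms channel": 6},
]
ranking = tally_weighted_feedback(ballots)
print(ranking[0])  # the group's top concern
```

Because ballots carry no names, the ranking reflects the weight of the room's concerns without exposing who raised them—the point of the anonymity.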

Follow‑Through and Cadence

Findings must provoke action, not vanish into a bureaucratic archive. HealthCare.gov’s ignored pre‑launch warnings epitomize failure through inaction. Establish feedback loops: validate fixes after implementation, set a Good Idea Cut‑Off Time to prevent paralysis, and prioritize high‑impact decisions for deep testing. Red teaming is about rhythm—probe major projects hard and routine processes lightly.

Designing red teams demands culture, not just structure. Recruit adaptive contrarians, rotate them before capture, diversify techniques, and—most of all—secure leadership that listens and acts.

If you build a red team as Zenko prescribes, you’ll gain a disciplined sensor for blind spots and flawed plans—an organizational immune system that strengthens rather than destabilizes.


Military and Intelligence Lessons

The modern red‑team tradition was forged in the crucible of the military and refined in intelligence. These domains demonstrate both the necessity and fragility of structured dissent—how power hierarchies need challenge yet instinctively suppress it.

War Games and Cultural Resistance

From Cold War simulations to Iraq wargames, the military used red teams to emulate opponents and test doctrine. UFMCS institutionalized the art of “thinking about thinking,” equipping officers with frameworks for premortems and cultural empathy. But Millennium Challenge 2002 remains the cautionary classic: Lieutenant General Paul Van Riper’s OPFOR defeated the U.S. fleet—and was then restrained by scripted rules to produce a politically convenient win. Zenko highlights this as red teaming’s dark side: decision‑makers often crave validation more than truth.

Command Climate and Implementation

James Amos’ Marine Corps challenge showed that institutional adoption depends on culture. Where leaders valued feedback, red teams influenced safety and withdrawal planning; where staff dismissed dissent, they were marginalized. This pattern mirrors corporate dynamics: red teams succeed when executives treat critique as an operational asset, not a nuisance.

Intelligence Red Teams and Analytical Integrity

In the Intelligence Community, alternative analysis evolved to counter cognitive bias. Team B’s 1976 misfire and politicization proved that composition matters more than rhetoric. George Tenet’s post‑9/11 CIA Red Cell corrected this by institutionalizing permanent contrarian analysis—memos designed to discomfort senior readers. The 1998 cruise‑missile strike on Sudan’s Al Shifa plant showed the cost of omitting red teaming entirely: a misjudged attack based on a single soil sample and narrow compartmentalization.

Competitive Analysis as Risk Stewardship

The 2011 Bin Laden raid illustrates disciplined use. Multiple teams debated probabilities and produced independent confidence levels, enabling the president to synthesize competing judgments into a reasoned decision. Zenko’s insight: alternative views don’t eliminate uncertainty; they structure it for better executive choices.
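The book reports the practice, not a formula. As a minimal sketch of the idea—presenting independent estimates as a structured range instead of one blended number—the teams' confidence levels might be summarized like this (team names and values are hypothetical):

```python
from statistics import mean, stdev

def summarize_estimates(estimates):
    """Present independent probability estimates as a structured range
    rather than collapsing them into a single number."""
    values = list(estimates.values())
    return {
        "low": min(values),
        "high": max(values),
        "mean": round(mean(values), 2),
        "spread": round(stdev(values), 2),  # sample std. dev. across teams
    }

# Hypothetical confidence levels from three independent cells
estimates = {"team_a": 0.75, "team_b": 0.40, "team_c": 0.60}
summary = summarize_estimates(estimates)
```

Reporting the low, high, and spread preserves the disagreement between cells, which is exactly the information a decision-maker loses when estimates are averaged into one figure.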


Homeland and Infrastructure Security

In homeland security, red teaming shifts from analytical critique to physical and operational testing. Zenko uses these cases to show how bureaucracies react to uncomfortable evidence—and why acting on it can save lives.

Unheeded Warnings and Lessons Learned

The FAA’s covert team, created after the Lockerbie bombing, repeatedly breached checkpoints and documented systemic failures. Yet leadership ignored its reports, leaving vulnerabilities that 9/11 exploited. Zenko’s lesson: evidence unacted upon is failure disguised as diligence.

Constructive Probing and Real Action

MANPADS assessments after the Mombasa missile attack applied adversary modeling to pinpoint feasible launch sites near American airports, enabling realistic counter‑measures. Similarly, the NYPD’s tabletops under Commissioner Ray Kelly turned hypothetical crises into tangible policy. His buy‑in produced concrete logistics: armed narcotics officers stationed citywide, rapid recall systems, and blueprint repositories for major hotels.

Infrastructure Vulnerabilities

GAO’s probes of radiation smuggling and PG&E’s Metcalf substation shooting expose infrastructure fragility. Zenko urges leaders to use red teams as preemptive diagnosis—not reactive autopsy. Successful homeland red teaming requires not just detection but the authority to enforce fixes.

Red teaming for public safety is accountability in its purest form: test what could kill you, listen when it hurts, and act before the next headline forces your hand.

These examples underline the price of denial—and the power of credible adversary simulation when leaders choose humility over comfort.


Business and Technical Applications

Corporations face adversaries too—competitors, hackers, and self‑deception. Zenko translates military realism into executive practice through business war games and penetration‑testing case studies, proving that disciplined dissent crosses every industry boundary.

Business War‑Games: Stress‑Testing Strategy

Mark Chussil’s quantitative simulations and Benjamin Gilad’s psychological role‑plays represent two dominant models. Chussil’s data‑driven games expose profit deltas beyond intuition; Gilad’s rival‑empathy exercises, grounded in Porter’s Four Corners Model, reveal behavioral logic unseen in spreadsheets. Success requires CEO sponsorship and independent facilitation—the same leadership dynamic proven essential to military red teams. Executives must attend but not dominate, enabling frank debate and unexpected insight.

Cyber and Physical Penetration Tests

White‑hat tests mirror battlefield probes: scope, recon, exploitation, and remediation. The iSEC Partners femtocell breach showed how modest hardware flaws expose vast networks, forcing Verizon to patch vulnerabilities promptly. Similarly, physical tests by experts like Chris Nickerson combine social engineering and lock‑bypassing to reveal real‑world exposure. Executives become believers only when they witness their own defenses fail firsthand—a truth Zenko calls the “CEO shock factor.”

Practical Management of Tests

Whether commissioning cyber or physical probes, you must scope realistically, discreetly inform insiders, budget for remediation, and validate fixes. Treating tests as compliance theater yields reports without safety. Red teaming in business is less about menace, more about foresight—and it works only when curiosity outruns comfort.
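As a minimal illustration of that management checklist (the field names are invented for this sketch, not an industry standard), the commissioning requirements could be encoded so an engagement cannot start until scoping, insider briefing, remediation budget, and revalidation are all in place:

```python
from dataclasses import dataclass, field

@dataclass
class Engagement:
    """Illustrative commissioning checklist for a penetration test."""
    scope: list = field(default_factory=list)  # systems fair game for testing
    insiders_briefed: bool = False        # key staff discreetly informed
    remediation_budgeted: bool = False    # money reserved to fix findings
    revalidation_scheduled: bool = False  # re-test planned after fixes

    def ready(self):
        # An engagement should not start until every box is checked
        return bool(self.scope) and all([
            self.insiders_briefed,
            self.remediation_budgeted,
            self.revalidation_scheduled,
        ])

e = Engagement(scope=["vpn gateway", "guest wifi"])
assert not e.ready()  # scoped, but nothing else in place yet
e.insiders_briefed = True
e.remediation_budgeted = True
e.revalidation_scheduled = True
```

Gating the start of testing on the remediation and revalidation items is what separates a real engagement from the compliance theater the text warns against.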


Failure Modes and Ethical Boundaries

Zenko warns that red teaming’s power can be neutralized or abused. Knowing these failure modes keeps the discipline honest—and ensures your dissent doesn’t turn into deception.

Rigged Tests and Politicized Panels

Scripted war games like Millennium Challenge 2002 and ideologically stacked groups like Team B reveal the harm of manipulation. Red teaming demands free play and composition diversity to yield truth rather than confirmation bias.

Capture and Tokenism

Institutions that absorb red teams into bureaucracy—turning them into routine compliance cells—destroy their independence. Zenko calls this “organizational capture.” The FAA’s pre‑9/11 experience also illustrates tokenism: even accurate findings mean nothing when managers lack the authority or courage to act.

Overuse, Underuse, and Neglect

Testing everything paralyzes; testing nothing ossifies. Calibrated cadence and executive attention prevent burnout or complacency. Worst of all is being ignored—the illusion of due diligence covering real exposure. McKinsey’s unheeded HealthCare.gov warnings prove that unread truth is wasted insurance.

Ethical Limits

Misuses extend to freelance or harmful tests. The Kirkwood High School incident—in which journalists staged an unauthorized infiltration—shows why authorization and coordination matter. Responsible red teaming simulates risk for defense, not publicity. The principle is simple: do no harm while revealing how harm could occur.


Automation and the Future of Red Teaming

Red teaming is evolving through automation, real‑time adversary modeling, and cognitive‑bias training. Yet Zenko insists the human element remains central—the mix of creativity and courage that algorithms cannot replicate.

Automated and Continuous Testing

Cyber frameworks like Raphael Mudge’s Cortana scripting and Philip Polstra’s portable labs automate intrusion paths and scale repetitive tests. Labs such as Sandia’s IDART already integrate such tooling to simulate attacks continuously. These technologies create permanent readiness—digital versions of military war games running endlessly in code.
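The text names real tools (Cortana scripting, IDART's kit) but gives no code. As a generic, hypothetical sketch of continuous probing—independent of any particular framework—a scheduler might run a battery of checks each cycle and report which defenses failed:

```python
def run_probe_cycle(probes):
    """Run one cycle of automated probes. Each probe returns True if
    the defense held; the cycle returns the names of those that failed."""
    return [name for name, probe in probes.items() if not probe()]

# Hypothetical probes; real ones would exercise live systems
probes = {
    "default_credentials_rejected": lambda: True,
    "tls_certificate_valid": lambda: False,  # simulated finding
    "badge_reader_logs_tailgating": lambda: True,
}
findings = run_probe_cycle(probes)
# findings lists the defenses that failed this cycle
```

Scheduling such a cycle to run continuously (via cron or a CI pipeline, for instance) is what turns a one-off probe into the "permanent readiness" the paragraph describes.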

Cognitive Training and Bias Reduction

Simultaneously, intelligence agencies pursue metacognitive education. IARPA’s Macbeth platform gamifies bias recognition, teaching analysts to counter anchoring and confirmation heuristics. Combined with red‑team feedback, such programs raise the baseline quality of analysis before dissent becomes necessary.

Human Judgment and Storytelling

Automation scales breadth; humans deliver depth. Improvisation, empathy, and narrative framing still persuade leaders to act. Zenko emphasizes that a concise, vivid scenario—showing a CEO how an attack impacts family photos or market trust—can achieve what data alone cannot. Machines simulate threats; humans convert insight into conviction.

Balanced Future Practice

Red teaming’s future blends continuous automated scanning for hygiene with targeted human campaigns for high‑risk foresight. As Zenko concludes, automation multiplies reach while human judgment ensures action—the combination that makes future institutions truly antifragile.
