The Bestseller Code

by Jodie Archer & Matthew L. Jockers

The Bestseller Code delves into the analytical world of a computer algorithm that unlocks the mystery of bestselling novels. By examining over a thousand hits, the book reveals patterns in themes, style, and characters that predict a book's success, offering invaluable insights for authors and publishers aiming for the top of the charts.

The Science Behind a Bestseller

Why do some books explode onto bestseller lists while others vanish without a trace? In The Bestseller Code, Jodie Archer and Matthew L. Jockers argue that literary success isn’t random—it’s measurable, predictable, and even algorithmic. Drawing on five years of research and thousands of novels, the authors reveal that the DNA of a bestseller can be decoded using computer models that track themes, style, plot, and character patterns that consistently capture readers’ hearts.

At the center of their fascinating study is the “bestseller-ometer”, a data-driven machine learning model that predicts, with about 80% accuracy, whether a manuscript will land on the New York Times bestseller list. What makes this revolutionary isn’t just the algorithm—it’s the idea that language itself carries hidden signals of success. In this model, successful stories aren’t lucky anomalies; they share fundamental linguistic and emotional structures that resonate with human psychology and culture.

From Literary Intuition to Data-Driven Insight

Archer, a former Penguin editor, and Jockers, a Stanford digital humanities researcher, started from a simple question: what makes readers choose one book over another? Could patterns that appeal to millions be identified mathematically? Like archeologists of narrative, they fed thousands of books—both bestselling and obscure—into computational models, analyzing over 20,000 textual features such as verbs, pronouns, nouns, sentence length, and punctuation. What emerged was a startling discovery: bestsellers aren’t accidents of taste. They follow certain recurring narrative rhythms and emotional arcs that pull readers into an experiential trance.

The algorithm that resulted was a new form of literary criticism at scale. It could detect the heartbeat of stories—rising tension, conflict, resolution—across genres, and even assign each book a numeric score for its bestselling potential. This machine didn’t know author names or reputations. Yet, it correctly identified hits like Dan Brown’s Inferno (95.7%), Michael Connelly’s The Lincoln Lawyer (99.2%), and even debut novels like Kate Jacobs’s The Friday Night Knitting Club and Jessica Knoll’s Luckiest Girl Alive with near-perfect confidence.

The Myth of Random Success

Publishing lore long held that bestsellers are like lottery wins—rare, unpredictable flashes of luck. Archer and Jockers dismantle that myth. Editorial “gut instinct” and marketing budgets aren’t enough to explain why a particular story grabs humanity’s imagination. The authors point out that even brilliant editors turned down future classics like Harry Potter, The Help, and Lord of the Flies. If industry veterans can’t reliably foresee success, maybe machines trained on massive datasets can.

Their findings suggest that what we call “memorable literature” reflects structural clarity. The language of bestselling novels tends to balance emotional conflict and resolution while maintaining a rhythm of empathy and human closeness—especially in relationships. These books satisfy readers’ psychological cravings for connection, self-reflection, and transformation. Archer and Jockers discovered, for instance, that the most predictive topic was “human closeness”—moments of intimacy, friendship, or shared vulnerability—not just romance or sex. This universal theme appears across bestsellers from John Grisham to Toni Morrison, from literary to commercial fiction.

The Promise—and Provocation—of Data in Publishing

For writers and industry insiders, The Bestseller Code is both thrilling and unsettling. If machines can analyze unpublished manuscripts and score their potential, could data disrupt human creativity itself? The authors insist their model isn’t about replacing editors or writers—it’s about enhancing understanding. By identifying what resonates with readers, publishers can support new authors who might otherwise languish in slush piles. In theory, the algorithm is democratic—it doesn’t care if you’re famous or unpublished, rich or poor. It cares only about stories that move people.

Beyond pure prediction, Archer and Jockers explore deeper implications: how literature mirrors collective culture, why we’re drawn to certain narrative arcs, and what these patterns reveal about desire, fear, and hope in the modern psyche. In their closing chapters, they show how computers may soon imitate editors, helping publishing shift from intuition to scientific insight. Yet they also remind readers that technology can’t replace the humanity at fiction’s core. Art may yield data, but emotion drives storytelling.

Ultimately, The Bestseller Code invites you to see books not as mysterious works of destiny, but as deliberately engineered language systems: networks of emotion, rhythm, and theme that speak to the human condition. Whether you’re a writer, reader, or skeptic, it challenges the idea that taste is purely subjective and argues that the secret to storytelling success may be written, quite literally, in the code of words themselves.


Decoding Theme: The Power of Human Closeness

In their search for narrative DNA, Archer and Jockers discovered that one central motif repeatedly surfaces across bestsellers: human closeness. Whether it’s Grisham’s lawyers seeking justice or Danielle Steel’s families healing from loss, emotional intimacy remains the strongest predictor of reader engagement. This finding challenges assumptions about what readers want—not sex, violence, or grand spectacle alone, but meaningful connection.

The Machine’s Eye for Empathy

By teaching computers to read millions of words, the authors found clusters of words that signal intimacy: “conversation,” “touch,” “smile,” “understand,” “love,” “together.” These linguistic fingerprints mark where human connection occurs. In novels that chart emotional landscapes—such as The Help, House Rules, or The Paris Wife—the proportion of such moments predicts reader appeal. Bestsellers devote about 30–40% of their narrative to these topics, creating room for readers to feel seen and mirrored.
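As a rough illustration of measuring how much of a text touches on closeness, here is a minimal Python sketch. It swaps in plain keyword matching for whatever modeling the authors actually used, and the cue list (just the six words quoted above), the window size, and the function name `closeness_share` are all assumptions made for this example.

```python
import re

# Hypothetical cue list: only the six words quoted in the summary,
# not the authors' actual topic model.
CLOSENESS_WORDS = {"conversation", "touch", "smile", "understand", "love", "together"}

def closeness_share(text, window=50):
    """Return the fraction of fixed-size word windows that contain a cue word."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    # Split the text into consecutive, non-overlapping windows of words.
    windows = [words[i:i + window] for i in range(0, len(words), window)]
    hits = sum(1 for w in windows if CLOSENESS_WORDS & set(w))
    return hits / len(windows)

sample = "We talk and smile. The car drove north."
print(closeness_share(sample, window=4))
```

A keyword count like this is far cruder than a trained model, since it misses synonyms and context, but it shows the basic idea of turning a theme into a proportion of the narrative.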

Grisham and Steel: Emotional Architects

Archer and Jockers call John Grisham and Danielle Steel the “godparents” of bestselling fiction because their work epitomizes balance between theme and accessibility. Grisham’s courtroom dramas rely on fairness, love of justice, and redemption—his heroes “need” and “want,” signaling active emotional agency. Steel, meanwhile, masters domestic connection. Roughly one-third of her pages dwell inside homes, exploring family bonds and healing. Both authors fulfill readers’ craving for recognition: we see ourselves in their characters’ struggles for belonging.

From Topic to Theme

The difference between a topic (what happens) and a theme (why it matters) reveals why “human closeness” triumphs. Romance, crime, or politics may provide settings—but theme anchors experience. When E. L. James’s Fifty Shades of Grey was fed into the algorithm, it scored high not for eroticism, but for emotional connection. Beneath its sensational premise lies a story about vulnerability and trust, universal themes that transcend genre. As Stephen King notes, “people love to read about work and relationships”—James merges both in moments of conversation, conflict, and intimacy.

Lessons for Writers

If you’re crafting fiction, the data suggests focus beats complexity. Bestselling authors build their novels around one or two dominant themes. Novels that juggle six or more dilute emotional impact. Success favors concentration: roughly a third of the book devoted to one coherent emotional thread. The takeaway? Readers connect when they can emotionally inhabit the world you create.

Key Insight

It’s not the grand adventure or clever twist that makes readers turn pages—it’s the quiet heartbeat of empathy. Bestsellers reflect a culture’s yearning for connection amid chaos.


Mastering Plot: Emotional Rhythms and Perfect Curves

What if the secret to pacing a novel lay in its emotional frequency? Archer and Jockers cracked the code of bestselling plotlines through sentiment analysis—tracking how emotion rises and falls through a story. They discovered seven dominant plot shapes in modern fiction, each echoing ancient storytelling forms. The bestsellers, they found, pulse in rhythmic waves of tension and release, charted with remarkable symmetry.

The Physics of Feeling

Using linguistic markers of positivity and negativity (“love” vs. “hate,” “smile” vs. “cry”), their algorithm plotted hundreds of novels as emotional graphs. Bestsellers like Dan Brown’s The Da Vinci Code and E. L. James’s Fifty Shades of Grey share nearly identical curves: steady alternations of high tension and resolution every few chapters. Readers subconsciously respond to this beat—a dopamine rhythm of hope, fear, relief, and anticipation. Brown’s code-hunting chase and James’s romantic powerplay may differ in genre, but structurally, they trigger the same emotional reward cycle.
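The emotional graph described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the authors' sentiment model: the two tiny word lists, the per-sentence scoring, the moving-average smoothing, and the function names are all invented for the example.

```python
import re

# Hypothetical mini-lexicons; a real analysis would use a much larger one.
POSITIVE = {"love", "smile", "hope", "relief"}
NEGATIVE = {"hate", "cry", "fear", "pain"}

def sentence_scores(text):
    """Score each sentence as (positive word count) minus (negative word count)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    scores = []
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        scores.append(sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words))
    return scores

def moving_average(values, width=3):
    """Smooth with a centered window that shrinks at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def count_turns(curve):
    """Count direction changes (rise to fall, or fall to rise) along the curve."""
    turns, previous = 0, 0
    for a, b in zip(curve, curve[1:]):
        direction = (b > a) - (b < a)
        if direction != 0:
            if previous != 0 and direction != previous:
                turns += 1
            previous = direction
    return turns

scores = sentence_scores("I love you. I hate rain. We smile and hope.")
print(scores, moving_average(scores), count_turns(scores))
```

Counting turns is one simple way to put a number on the "beat" the summary describes: a flat curve has no turns, while a story that alternates tension and release produces many.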

The Seven Plot Archetypes

  • Rags to Riches – growth from struggle to triumph
  • Tragedy – downfall from hubris or error
  • Comedy – confusion to resolution and togetherness
  • Rebirth – transformation through crisis
  • Voyage and Return – adventure leading back to insight
  • Quest – pursuit of purpose or truth
  • Overcoming the Monster – confrontation with evil or fear

A bestseller doesn’t need novelty; it needs rhythmic emotional motion. Archer and Jockers show that plot success lies in proportion and pacing. A story that hooks within 40 pages (Grisham’s rule) and maintains regular emotional fluctuations keeps readers physiologically engaged. Too few shifts cause boredom; too many create fatigue.

Rhythm Over Formula

Rather than copying formulas, writers can internalize rhythm. Bestsellers “breathe”; they alternate emotional highs and lows with consistent spacing. Think of narrative like music—steady beats build momentum. The Da Vinci Code works because its cliffhangers occur predictably, satisfying the reader’s unconscious need for balance. Similarly, Fifty Shades closes each emotional chapter with either physical pleasure or painful separation, guiding tension like a melody.

Key Insight

Plot mastery isn’t about dramatic twists—it’s about composing emotional rhythm. Like music, stories resonate when their highs and lows form perfect curves readers can feel as well as read.


Style DNA: Every Comma Matters

Style, Archer and Jockers show, is more than grammar; it is a fingerprint. Their analysis of thousands of novels uncovered microscopic habits that distinguish compelling prose. The frequency of verbs, conjunctions, commas, and contractions like “I’d” or “don’t” can predict bestseller success as powerfully as themes can.

The Linguistic Genome

Using stylometrics—a computational fingerprinting of style—the authors explore why J. K. Rowling’s secret novel The Cuckoo’s Calling was unmasked as her own by data rather than confession. Every writer unconsciously reveals themselves through patterns in everyday words. When Rowling tried to write “like a man,” computers spotted her consistent use of prepositions and pronouns. Style, it seems, resists disguise.
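To make the idea of a stylistic fingerprint concrete, here is a generic Python sketch that compares two texts by how often they use common function words. It is not the method used in the Rowling case or by the authors; the ten-word list, the cosine comparison, and the names `profile` and `cosine` are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical short list of function words; real stylometry uses hundreds.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "she", "he", "it", "with", "on"]

def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(a, b):
    """Cosine similarity of two frequency vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

known = profile("The door opened and she walked to the window with a cup of tea.")
disputed = profile("He sat on the step and looked to the hills in the rain.")
print(cosine(known, disputed))
```

Function words are useful for this kind of comparison because writers choose them largely without thinking, which makes them hard to disguise on purpose.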

What Bestseller Style Looks Like

Analysis revealed that bestselling writers use simpler, cleaner sentences. They favor contractions and direct speech (“didn’t,” “can’t,” “we’re”) because readers feel conversational connection. They avoid overuse of adjectives and exclamation marks, preferring plain nouns and verbs. Characters “do” more than they “describe.” Sentences move quickly and rhythmically, allowing readers to project emotions rather than parse syntax.

Interestingly, the word “very” declines in successful writing, while “really” rises—a linguistic shift toward authenticity. These findings echo the advice of Strunk and White’s The Elements of Style and Stephen King’s On Writing: clear prose over ornamentation. Bestsellers, Archer and Jockers note, avoid gaudy “Christmas tree” sentences laden with adjectives; they choose clean fir trees—elegant, direct, alive.
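A few of these surface features are easy to count directly. The Python sketch below is a simplified stand-in for the authors' feature extraction, which the summary does not detail; the feature names and the function `style_profile` are assumptions, and the apostrophe check will also count possessives such as "book's" as contractions.

```python
import re

def style_profile(text):
    """Count a handful of surface style features in a passage."""
    # Normalize curly apostrophes so "didn’t" and "didn't" are treated alike.
    text = text.replace("\u2019", "'")
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lower = [w.lower() for w in words]
    n = len(words) or 1
    return {
        # Any word with an apostrophe counts here, including possessives.
        "contractions_per_100_words": 100 * sum("'" in w for w in words) / n,
        "exclamations_per_100_words": 100 * text.count("!") / n,
        "avg_sentence_length": len(words) / (len(sentences) or 1),
        "very": lower.count("very"),
        "really": lower.count("really"),
    }

print(style_profile("I didn't go. We can't stay!"))
```

Running the same function over two manuscripts gives a quick side-by-side view of how conversational each one reads, though a handful of counts is no substitute for the thousands of features the book describes.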

Gender and Training in Style

The authors discovered a surprising pattern: women more often master bestselling style. Female debut authors—Paula Hawkins, Kathryn Stockett, Gillian Flynn—combine professional backgrounds in journalism with linguistic precision. Their prose mirrors article writing: concise, emotional, sensory. Male Pulitzer winners, by contrast, embody a more “literary masculine” style—longer sentences, abstraction, complexity. Yet both can succeed when balance is achieved; Dave Eggers’s The Circle sits perfectly between the two (52% feminine, 48% masculine).

Key Insight

A bestseller’s prose feels spoken, not recited. Every comma matters because rhythm builds trust—the subtle cadence that makes a reader forget they’re reading and simply feel.


Character and Agency: What the Girl Needs

Characters, Archer and Jockers assert, are the engines of reader emotion. Their actions—captured in verbs—predict a novel’s success. Bestseller protagonists act, decide, and transform; non-bestsellers hesitate, murmur, or cling. This dynamic shapes everything from thrillers to domestic dramas.

Action vs. Passivity

Machine-reading thousands of verbs, the authors found that winning characters “need,” “want,” “love,” “tell,” and “work”—verbs of agency. Lesser ones “wish,” “suppose,” and “hesitate.” In novels that sell, characters are creators of fate, not victims. The most successful figures—from Mitch McDeere in Grisham’s The Firm to Lisbeth Salander in The Girl with the Dragon Tattoo—act decisively within moral chaos.
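The contrast between the two verb groups can be tallied with a short Python sketch. This is a toy under stated assumptions: the two sets contain only the verbs quoted above, the suffix stripping is a crude substitute for a real lemmatizer, there is no part-of-speech tagging (so "love" as a noun is counted too), and the names `base_forms` and `verb_balance` are invented for the example.

```python
import re
from collections import Counter

# Only the verbs quoted in the summary; the authors' full lists are not given.
AGENCY = {"need", "want", "love", "tell", "work"}
PASSIVE = {"wish", "suppose", "hesitate"}

def base_forms(word):
    """Yield the word plus crude guesses at its base form (no real lemmatizer)."""
    yield word
    for suffix, replacement in (("ing", ""), ("ing", "e"), ("ed", ""),
                                ("es", ""), ("d", ""), ("s", "")):
        if word.endswith(suffix):
            yield word[: -len(suffix)] + replacement

def verb_balance(text):
    """Return (agency_count, passive_count) for the listed verbs in the text."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        forms = set(base_forms(word))
        if forms & AGENCY:
            counts["agency"] += 1
        elif forms & PASSIVE:
            counts["passive"] += 1
    return counts["agency"], counts["passive"]

print(verb_balance("She wanted answers and needed help. He hesitated and wished."))
```

Irregular forms such as "told" slip past this kind of suffix stripping, so the counts are best read as a rough signal rather than a measurement.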

Rise of the Feminine Noir

Perhaps the most striking recent trend, the authors note, is the surge of “girl” titles—Gone Girl, The Girl on the Train, Luckiest Girl Alive. In each, female leads are not damsels but dark agents. Their verbs reflect rebellion: “decide,” “kill,” “plan,” “escape.” These novels subvert traditional domestic norms, turning home into battleground. The data calls this pattern “domestic noir”—stories where psychological tension replaces glamour. Readers crave this fusion of mystery and intimacy, in part because it mirrors modern unease about identity and control.

Emotion as Fate

The bestseller’s protagonist is always in motion. Even love stories frame emotion as action: characters fight, choose, transform. As the Greek philosopher Heraclitus claimed, character is destiny. Archer and Jockers argue that linguistic motion equals narrative power. Passive verbs lead to forgettable plots; verbs of passion and pursuit create cultural icons. When Rachel in The Girl on the Train kills her abusive ex, her act isn’t only revenge; it’s cathartic purification.

Key Insight

Readers follow verbs. The characters who act—from courtroom fighters to conflicted heroines—drive emotion, plot, and meaning. In fiction and in life, agency sells.


Algorithms and Authorship: The Future of Literary Imagination

One of the most provocative sections of The Bestseller Code asks: if computers can read fiction, can they write it? The authors explore early experiments like Darius Kazemi’s National Novel Generation Month, where programs generate 50,000-word novels through automated scripts. Yet machine-written stories like True Love.wrt remain mechanical—grammatically valid, emotionally hollow.

Human Imagination vs. Artificial Imitation

Computers can mimic syntax and semantics, Archer and Jockers note, but they can’t recreate empathy. The machine can “understand” love linguistically but not emotionally. Referencing science fiction classics like Blade Runner and Her, the authors argue that creativity requires consciousness—the intuitive sense of what it means to be human. Even when AI composes novels with metaphors, readers detect the absence of felt experience.

The Real Frontier: Collaboration

Instead of replacing writers, computation can enhance understanding. Imagine analyzing your manuscript with an algorithm to refine pacing or emotional balance. Jockers predicts that future authors will use analytics as creative mirrors, identifying narrative rhythms much like musicians use metronomes. This partnership of human intuition and machine cognition may yield more insightful storytelling, not less.

The Poet Still Matters

In their epilogue, the authors reaffirm: data can measure clarity, coherence, and rhythm—but not soul. Literary art depends on choice, irony, empathy, and imagination. A computer may identify what makes The Circle a masterpiece of structural design, but only Eggers could write it with human dread and wonder. Algorithms can read, but authors still must dream.

Key Insight

The fusion of humanity and computation defines fiction’s future. Machines can teach us how we read—but only humans can teach machines why stories matter.


The Circle: The Algorithm’s Perfect Novel

In a twist worthy of satire, the authors’ model crowned Dave Eggers’s The Circle as the paradigmatic bestseller—the manuscript scoring a perfect 100%. The irony? Eggers’s novel critiques technology’s invasion of privacy and human complexity. Yet mathematically, it embodies every hallmark of market success: strong female lead, balanced linguistic style, human closeness, modern themes, and rhythmic emotional curves.

Eggers as Model Author

Eggers blends journalistic clarity with literary depth, a hybrid of “feminine precision” and “masculine structure.” His protagonist Mae Holland navigates a utopian tech company that becomes dystopian. Through her desire to belong, Eggers explores transformation and isolation—the book’s most predictive themes. The machine’s selection demonstrates the harmony of emotional, thematic, and stylistic data points: human intimacy (3%), technology (21%), and workplace drama (4%) combine to form the algorithm’s perfect proportions.

Why It Works

Mae’s journey follows the archetype of Rebirth—the emotional plot shape most pleasing to readers. Eggers orchestrates symmetrical highs and lows, from the euphoria of success to moral decline. His prose, clean yet intimate, pulses with the rhythm Archer and Jockers identified: alternating emotional tension every few chapters. The irony deepens as his story reflects society’s obsession with perfection—the same impulse driving publishing’s reliance on data.

The Human Lesson

Eggers’s inclusion as “the one” confirms the authors’ central claim: narrative success is neither accidental nor soulless. Even a dystopia about digital conformity can achieve bestseller resonance when its emotional ratios mimic human experience. The algorithm “winked,” Archer writes, because it unknowingly endorsed a warning against its own logic—a testament that storytelling remains humanity’s final frontier.

Key Insight

The Circle closes the loop: technology can predict emotion, but not replace it. Even the algorithm’s perfect book is, at heart, a plea for human truth.
