Word Entropy Calculator

📖 Shannon distribution analyzer

Measure how evenly words are distributed with Shannon entropy, normalized entropy, effective vocabulary, top-word dominance, stopword toggles, and paragraph-level entropy.

🎯Entropy presets

Load a realistic text sample, then adjust token rules and stopword handling to compare distribution shape rather than simple word frequency.

⚙Entropy settings

Text to analyze

The calculator estimates uncertainty in the next word token: higher entropy means a broader, more even word distribution.

Token mode

Case handling

Stopword toggle

Entropy unit

Minimum token length

Distribution rows shown

Paragraph entropy mode

Dominance alert threshold

Custom stopwords

Load a preset or paste text to calculate Shannon entropy.

Entropy

0.00

bits/token

Distribution uncertainty.

Normalized entropy

0.0%

of maximum

Entropy adjusted for vocabulary size.

Vocabulary diversity

0.0%

type-token ratio

Unique words divided by counted tokens.

Top-word dominance

0.0%

largest share

Most dominant token.

Formula breakdown

Inputs

Shannon steps

Waiting for text

📊Entropy snapshot

Effective vocabulary

2 to the power of entropy in bits.

Evenness gap

0.0%

Distance from a perfectly even distribution.

Paragraph spread

0.00

Highest minus lowest paragraph entropy.

Dominance status

Uses your selected alert threshold.

🔢Token distribution and contribution

Rank	Token	Count	Probability	Entropy contribution	Dominance
Run the calculator to see token probabilities.

📝Entropy by paragraph

Paragraph	Tokens	Vocabulary	Entropy	Normalized	Top token
Paragraph entropy appears after calculation.

📐Entropy reference bands

Normalized entropy	Distribution shape	Top-word signal	Editing interpretation
0-45%	Narrow and repetitive	One term dominates	Check for accidental echo or keyword stuffing.
46-68%	Focused but varied	Theme words stand out	Often useful for notes, summaries, and tightly scoped copy.
69-84%	Balanced distribution	No single extreme word	Common for polished prose with clear subject variety.
85%+	Very broad or diffuse	Flat distribution	Review if the passage feels unfocused or list-like.
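As a rough sketch, the bands in the table above can be expressed as a small lookup. The function name `entropy_band` is hypothetical, and the handling of fractional values between bands (for example 45.5%) is an assumption, since the table only lists whole percentages.

```python
def entropy_band(normalized_percent):
    """Map a normalized entropy percentage to a band name from the table.

    Assumption: each band starts at its listed lower bound (46, 69, 85),
    so a fractional value such as 45.5 falls into the lower band.
    """
    if normalized_percent >= 85:
        return "Very broad or diffuse"
    if normalized_percent >= 69:
        return "Balanced distribution"
    if normalized_percent >= 46:
        return "Focused but varied"
    return "Narrow and repetitive"


for value in (30, 60, 75, 92):
    print(value, entropy_band(value))
```

The band is only a starting point for reading a passage; the editing interpretation column still depends on what the text is meant to do.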

🔍Stopword mode comparison

Mode	What changes	Best for	Expected entropy effect
Include stopwords	Keeps function words	Flow and style balance	Often lowers dominance but may mask topic words.
Exclude common	Removes high-frequency helpers	Topic distribution	Often reveals stronger content-word dominance.
Exclude bookish	Removes text-analysis terms	Reviews and notes	Helps keep book, chapter, and reader from skewing results.
Custom or combined	Uses your house list	Project-specific comparisons	Best when comparing drafts with repeated required terms.

🧭Comparison grid

Balanced prose: Medium-high normalized entropy with moderate type-token ratio and no extreme top-word dominance.

Dialogue: Can show lower entropy because pronouns, names, and short function words intentionally repeat.

Study notes: Often lower entropy because key terms repeat to anchor evidence, themes, and questions.

Abstracts: Usually focused but information-dense, with stopword exclusion revealing method and result terms.

💡Entropy tips

Compare normalized entropy: Raw Shannon entropy grows with vocabulary size, so normalized entropy is better for comparing short and long passages.

Read dominance with context: A high top-word share is not always bad. Names, themes, and required terms can be intentional anchors.


Before you edit, paste some text into this word entropy calculator to see a chart of paragraph-level distribution patterns, stopword effects, effective vocabulary, top-word dominance, normalized diversity, and Shannon entropy. These give you an idea of how your text is structured. Track them over time.

Even though the name might sound technical, everything you write has word entropy: when you’re drafting something, every word you type establishes a pattern of surprise and repetition. Too much repetition makes things feel flat; too much variety makes things feel scattered. Word entropy will help you strike this balance.

How Word Entropy Helps You Write Better

Shannon entropy comes from information theory, where it is a way to measure uncertainty. In other words, it tells you how predictable the next word is likely to be. If sentences repeat the same words over and over, the reader can predict what’s coming, which keeps the score low. When usage is spread fairly evenly across many words, there’s more surprise with each new token, which drives the score up. Once you decide whether to filter common function words and set your token rules, the calculator does all the math for you.
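To make the idea concrete, here is a minimal sketch of the Shannon entropy calculation in Python. It is not the calculator's own code: the `tokenize` rule (lowercase runs of letters and apostrophes) and both function names are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed token rule: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def shannon_entropy(tokens):
    """Entropy in bits per token: the sum of -p * log2(p) over each distinct word."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return sum((-(c / total) * math.log2(c / total) for c in counts.values()), 0.0)


repetitive = "the cat the cat the cat the cat"
varied = "a quick brown fox jumps over lazy dogs"

print(shannon_entropy(tokenize(repetitive)))  # 1.0 (two words, evenly split)
print(shannon_entropy(tokenize(varied)))      # 3.0 (eight words, each used once)
```

The repetitive sample scores one bit per token because only two words alternate, while the varied sample scores three bits because eight words are equally likely.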

Comparing parts of the same document is where the tool works its magic. Passages with fewer long words and more pronouns tend to be dialogue-heavy, and that lowers entropy by design. Research summaries usually aim for a tighter focus on terms related to methods or results. Neither pattern is inherently right or wrong. The key question is: do the patterns match what I intended?

Dominance by a few top words might indicate deliberate theme setting, or it could expose an unconscious echo in your writing. Toggle stopwords on and off to see how much of the pattern is driven by content words versus glue words.
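The stopword toggle can be sketched as follows. The tiny `STOPWORDS` set and the `top_word_dominance` function are hypothetical stand-ins for illustration; the calculator's actual lists and token rules are not shown in this article.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "it", "is", "was"}


def top_word_dominance(text, exclude_stopwords=False):
    """Return the most frequent token and its share of all counted tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if exclude_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if not tokens:
        return None, 0.0
    word, count = Counter(tokens).most_common(1)[0]
    return word, count / len(tokens)


sample = "The storm hit the coast and the storm flooded the town."

# With stopwords kept, "the" leads with 4 of 11 tokens (about 36%).
print(top_word_dominance(sample))
# With stopwords removed, "storm" leads with 2 of 6 tokens (about 33%).
print(top_word_dominance(sample, exclude_stopwords=True))
```

Running both views side by side shows whether the leading word is a glue word or a content word, which is the question the toggle is meant to answer.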

One of the most helpful things about normalized entropy is that it solves a problem with the raw Shannon score: the raw score tends to grow with vocabulary size. It’s not fair to compare the bit score of a long chapter with that of a short paragraph. When you divide the score by the theoretical maximum for the given vocabulary (the base-2 logarithm of the number of distinct words), you get a percentage scale that applies to texts of varying length.
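A minimal sketch of that normalization, assuming entropy is measured in bits and that a text with fewer than two distinct words is reported as 0. The function name `normalized_entropy` is hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(tokens):
    """Entropy divided by its maximum, log2 of the number of distinct words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    vocab = len(counts)
    if vocab < 2:
        return 0.0  # assumption: no meaningful maximum with fewer than two words
    entropy = sum((-(c / total) * math.log2(c / total) for c in counts.values()), 0.0)
    return entropy / math.log2(vocab)


short_text = "red blue red blue".split()
longer_text = ("red blue green gold " * 50).split()

# Raw entropy differs (1 bit vs. 2 bits), but both are perfectly even.
print(normalized_entropy(short_text))   # 1.0
print(normalized_entropy(longer_text))  # 1.0
```

Multiply the result by 100 to get the percentage scale used in the reference bands.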

You’ll find that fiction tends to fall right in the middle of the balanced spread, just where we’d expect it to. Early-reader books score lower, as expected. The bands on the chart turn those percentages into clear categories: narrow and repetitive, focused but still varied, balanced, or very broad.

Few writers realize how much stopword treatment shifts outcomes. If you leave all those “ands” and “thes” in, dominance scores tend to decline, because function words wash out the prominence of nouns. Take them out and the thematic skeleton emerges. Both views are valuable: the filtered view shows topical focus, while the unfiltered view shows stylistic rhythm. You can toggle between these perspectives for a new look at the same text.

Then there’s paragraph-level entropy. You might find that the beginning of a piece is beautifully varied and the middle turns to dull, summarizing prose. Spotting that drop early lets you change course before the reader feels it.
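Paragraph-level entropy can be sketched by splitting on blank lines and scoring each piece separately. The blank-line paragraph rule, the token pattern, and the function names here are assumptions for illustration; the spread is the highest paragraph entropy minus the lowest.

```python
import math
import re
from collections import Counter


def entropy_bits(tokens):
    counts = Counter(tokens)
    total = sum(counts.values())
    return sum((-(c / total) * math.log2(c / total) for c in counts.values()), 0.0)


def paragraph_entropies(text):
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [entropy_bits(re.findall(r"[a-z']+", p.lower())) for p in paragraphs]


draft = "one two three four\n\nsame same same same\n\nred blue red blue"
scores = paragraph_entropies(draft)
spread = max(scores) - min(scores)

print(scores)  # [2.0, 0.0, 1.0]
print(spread)  # 2.0
```

A large spread like this one points to the paragraph worth rereading first, here the middle one that repeats a single word.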

Effective vocabulary gives a gut feeling for how rich a passage is; it is basically the number of equally likely words that would produce the same entropy. This is not a substitute for taste. Gentle repetition is a benefit in a children’s book; precise language serves law, even though it can look constricted on paper.
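The snapshot panel describes effective vocabulary as 2 to the power of entropy in bits. A minimal sketch of that calculation, with a hypothetical function name and made-up sample text:

```python
import math
from collections import Counter


def effective_vocabulary(tokens):
    """2 ** H, where H is the Shannon entropy of the tokens in bits."""
    counts = Counter(tokens)
    total = sum(counts.values())
    entropy = sum((-(c / total) * math.log2(c / total) for c in counts.values()), 0.0)
    return 2 ** entropy


even = "sun moon star sky".split()
skewed = "sun sun sun sun sun moon star sky".split()

# Four distinct words used evenly behave like four words.
print(len(set(even)), effective_vocabulary(even))  # 4 4.0
# Four distinct words dominated by one behave like fewer than four.
print(len(set(skewed)), effective_vocabulary(skewed))
```

Both samples contain four distinct words, but the skewed one has an effective vocabulary of roughly three because "sun" carries most of the weight.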

The calculator isn’t meant to eliminate revision; it takes out the guesswork so you no longer revise in the dark. It makes real a linguistic quality you couldn’t have seen before: something you can measure, compare, and deliberately shape. This isn’t to say writing isn’t an art; it just puts a quiet ruler in your hand, so your drafts no longer sound as if they’re repeating themselves by accident. They’ll begin to sound as though all their echoes were chosen.
