Data Storyology

(This text was first published on the great blog Snarketing 2.0 on April 26th, 2013. Reproduced with permission.)

It’s conventional wisdom by now that, with all the data we have to analyze, we have to find the “story.” Experts like Tufte have done wonders to improve our capabilities regarding data visualization and presentation, but that’s different from understanding the story that the data is telling.

A recent HBR blog post titled How to Tell a Story with Data offers the following points of advice: 1) Find the compelling narrative; 2) Think about your audience; 3) Be objective and offer balance; 4) Don’t censor; and 5) Edit, edit, edit.

My take: My points of advice differ. And I think we need more rigor (dare I say methodology) regarding data storytelling.


I don’t have an issue with “find the compelling narrative” and “think about your audience,” but these points are actually part of a broader process that the article doesn’t define.

Think of data storyology — the art and science of telling stories with data — as having two broad components: 1) Finding the story in the data, and 2) Telling the data story.

If I were to draw a picture, it would look like a yin/yang diagram, not a flow.

Finding the story in the data is an iterative process that involves using data management and statistical tools to cut and analyze data. But it also involves applying human judgment and experience to figure out what the “story” is.

The HBR blog author describes “finding the compelling narrative” as:

“Giving an account of the facts and establishing the connections between them. The narrative has a hook, momentum, or a captivating purpose. Finding the narrative structure will help you decide whether you actually have a story to tell.”

I wish he had left that last sentence off. If you find a narrative structure, you have a story. Whether or not that story is worth telling is a different issue.

Finding the narrative structure is more than “giving an account of the facts and establishing a connection,” however. In fact, the “account of the facts” is probably the least important part of the story because it’s the part that many people either already know or think that they know.

The interesting part of the narrative is the why, who, and when (more so than the what). The “what” is the plot, but the “why” is what gives the plot some depth. And just as poor character development in a book diminishes the quality of the book, leaving out the “who” in a data story produces an incomplete (and potentially boring) story.


Finding the story is just the Yin part of the equation. Telling the story is the Yang.

This is where the “think about the audience” part comes in. Good data storyologists (or data artists) often define or uncover multiple stories in the data. Those stories likely have different levels of appeal to different audiences. Telling the story starts with defining who the audience is for the data story, and which of the data stories that were defined is most relevant, or how those stories tie together.


At this point, however, my opinions veer from the blog author’s.

Telling the data story is anything BUT being objective and balanced. Data storyology is about educating, influencing, and motivating people. As a data artist, the last thing you want to do is be objective and balanced. You want to draw upon your insights, opinions, and experience — which are all subjective — to tell the best story. The article says that “a visualization should be devoid of bias.” Perhaps a point for future discussion, but I think that this is simply impossible.

The article also says that “Balance can come from alternative representations (multiple clusterings; confidence intervals instead of lines; changing timelines; alternative color palettes and assignments; variable scaling) of the data in the same visualization.”

First off, this is a very narrow interpretation of “balance,” in that it relates only to visualization. Data storyology is about more than just data visualization. Visualization is not the story.

In addition, I would encourage any budding data storyologist to “censor like hell.” The absence of censorship equals a data dump.


With a story and an intended audience, there’s still the art of telling the story.

A number of years ago, the analyst firm I worked for brought someone in to train us on the art of storytelling. Still one of the best training sessions I’ve ever had.

The story trainer told us to think about the development of a story in terms of the story’s impact on the audience’s mood, and to strive to achieve the following mood pattern:

To summarize, think of the story development as: 1) Stuff is happening (neutral mood), 2) Things are going to get worse (or the things that are happening will cause problems, doom, despair), and 3) Stuff happens or will happen to make it all better.

Story example: Little Red Riding Hood is walking in the woods (#1), she gets captured by the Big Bad Wolf (#2), she gets saved by the Woodsman (#3).

Data story example (in financial services): Consumers are fed up with paying the high cost of checking accounts (#1), new providers are coming into the market to steal banks’ customers and drive profitability even lower (#2), banks can deploy new technologies and marketing analytical techniques to provide new forms of value to consumers to retain them and make them more profitable (#3).


All the talk about the rise of data scientists misses the boat, in my book. We need people who can take the data and not just find the story in it, but tell that story in a way that educates, influences, and motivates people. That’s not science. It’s art. It’s data storyology.