Monday, 22 August 2011

Extracting data, making conclusions, writing fiction

This is a guest post by Caleb J Ross as part of his Stranger Will Tour for Strange blog tour. He will be guest-posting from the release of his novel Stranger Will in March 2011 to the release of his second novel, I Didn’t Mean to Be Kevin, and his novella, As a Machine and Parts, in November 2011. If you have connections to a lit blog of any type, professional journal or personal site, please contact him. To be a groupie and follow this tour, subscribe to the Caleb J Ross blog RSS feed. Follow him on Twitter. Friend him on Facebook.

Matt Tuckey, curator of this here blog, posted earlier this month about an experiment he conducted to measure the impact of linking existing blog content on Twitter by way of trending topics (or “trend tailing” as I will so cleverly call it from now on). This type of testing for the sake of data collection is right up my alley. And to merge the world of words with the world of data… consider me more interested than a person should be.

Matt’s chart speaks for itself, really. The results seem quite impressive on the surface. So I’ve decided to mine my own content for some interesting data: both my website traffic and general writing data that, honestly, serves my own curiosity more than it serves the possible efforts of others.

First, tag clouds. A tag cloud is a simple visual representation of word density for a specific selection of text: the more often a word appears, the larger it is drawn. I was first truly turned on to tag clouds when developing The StoryVault project (best viewable on a mobile phone or tablet). After categorizing my stories with descriptive tags, I created a quick tag cloud out of curiosity. Little did I know that a visual representation of my work would help me to truly see the overarching thematic content of my fiction. Here’s the cloud:

Where before I would stress over describing my own work, I now had a concrete string of words: grotesque fiction dealing with children, domestic issues, and obsession. That pretty much says it all.

But I went a step further and created a tag cloud using actual words from my novel Stranger Will. This one is different in that I did not first describe the work manually by creating tags, but instead used existing words. The result:

Most satisfying here is the prominence of the words eyes, room, enough, and waits. Each of those words implies an isolation that definitely permeates the novel, but to see my intentions distilled and verified is a rush.
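For readers who want to try the same thing on their own manuscript, here is a minimal Python sketch of the counting step behind a word-based tag cloud. It is not the tool used for the cloud above; the function name `word_weights`, the tiny stop-word list, and the font-size range are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "it", "that",
    "he", "she", "his", "her", "was", "on", "for", "with", "as", "at",
}

def word_weights(text, top_n=50, min_size=10, max_size=48):
    """Return [(word, count, font_size)] for the top_n most frequent words."""
    # Lowercase the text and pull out runs of letters (and straight apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    # Ignore stop words and very short tokens so common filler does not dominate.
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    top = counts.most_common(top_n)
    if not top:
        return []
    highest = top[0][1]
    lowest = top[-1][1]
    spread = max(highest - lowest, 1)  # avoid dividing by zero when all counts match
    # Scale each count linearly into the font-size range.
    return [
        (word, count, min_size + (count - lowest) * (max_size - min_size) // spread)
        for word, count in top
    ]

# Made-up sample text, only to show the shape of the result.
sample = "The eyes wait. The room waits; eyes, eyes, room."
for word, count, size in word_weights(sample, top_n=2):
    print(word, count, size)
```

Feeding in the full text of a novel instead of `sample` gives a list of words and sizes that any tag cloud renderer can draw from. Linear scaling is the simplest choice; a logarithmic scale is a common alternative when a handful of words tower over the rest.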

Second, I want to look at the most popular pages on my site. Keep in mind that because my site is a blog, with constantly updated content, new content obviously doesn’t have the same historical reach as existing content. However, looking only at traffic since January 2011, the pages below are my top ten. Notice that three of these pages are part of my Unexpected Literary References series:


This isn’t surprising, as I have long known that this series is popular. What it tells me is that if I want to leverage some of my traffic in hopes of converting visitors to book buyers, I should find a way to work my own fiction into the series posts.

Lastly, I want to look at outlying peaks to judge any correlation between a specific piece of content or traffic source and site activity. Notice the traffic peak on April 3rd. What went on here?

The culprit: A post about how authors should choose between self-publishing and legacy publishing. The takeaway here (without scrubbing too deep into contextual traffic patterns) is that people like information about publication choices. The rising popularity of eBooks and POD printing is likely the reason for this interest.
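Spotting a peak like this by eye works for one chart, but it can also be automated. The sketch below is one possible approach, assuming daily visit counts have been exported from an analytics tool into a plain dictionary; the function name `traffic_peaks`, the threshold of two standard deviations, and the visit numbers are all invented for illustration and are not my actual traffic.

```python
from statistics import mean, stdev

def traffic_peaks(daily_visits, threshold=2.0):
    """Return the days whose visits sit more than `threshold`
    standard deviations above the average day."""
    counts = list(daily_visits.values())
    if len(counts) < 2:
        return []  # not enough data to measure spread
    average = mean(counts)
    spread = stdev(counts)
    if spread == 0:
        return []  # every day is identical, so nothing stands out
    return [
        day for day, visits in daily_visits.items()
        if (visits - average) / spread > threshold
    ]

# Made-up numbers for illustration only.
visits = {
    "2011-04-01": 100,
    "2011-04-02": 110,
    "2011-04-03": 400,
    "2011-04-04": 105,
    "2011-04-05": 95,
    "2011-04-06": 102,
    "2011-04-07": 98,
}
print(traffic_peaks(visits))
```

Once a day is flagged, the next step is the manual one described above: look at which post or referrer was responsible for the traffic on that date.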

I love data. I will continue to love data.
