Web Stats Data Mining
September 22nd, 2008 · by David Bradley
What can online businesses learn from their web logs? According to Xueping Li of the Intelligent Information and Systems Laboratory at the University of Tennessee, Knoxville, and colleagues, analyzing the information you can gather about the traffic to your website, whether that’s from raw access data on your server or from standalone script-based systems such as Google Analytics, GetClicky, or Statcounter, can shed light on how to design a better site and enhance its performance.
As the Web has grown, so competition has intensified. This is most certainly true among ecommerce sites, with the likes of Amazon and Barnes & Noble vying for each other’s business just as intensely as news organizations battle for regular readers. It is even true in the blogosphere, where feedcount and post:comment ratios are apparently all-important, despite the thin veneer of sociability among the A-listers. Everyone wants more visitors and more interaction because that translates into sales one way or another, whether you are selling books, news, or your personal opinions.
Amid this competition, creating an effective web presence is critical to attracting new customers, readers, or community members and to retaining current ones, and so to the success of the business, community site, or blog.
The features of a website, such as its design and security, and how it evolves over time influence whether a customer will revisit the site or make a transaction.
Li and colleagues have developed a way of looking at web logs so that webmasters can evolve their websites to attract and retain significantly more visitors. Their approach is based on what they describe as simple yet effective descriptive statistical techniques that reveal the relationship between traffic workload and visitors’ domain names and geographic locations. The regularities and patterns that emerge can shed light on how to design a better website and enhance its performance.
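To give a flavor of what such descriptive statistics might look like, here is a minimal sketch in Python. It is not the method from the paper, whose details are not reproduced here. It assumes the raw access data is in the widely used Common Log Format with visitor hostnames already resolved, and it uses the top-level domain as a crude stand-in for visitor domain or location. The function name `summarize` and the sample log lines are invented for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Common Log Format: host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(lines):
    """Count requests per visitor top-level domain and per hour of day."""
    by_domain = Counter()
    by_hour = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        host = match.group("host")
        last_label = host.rsplit(".", 1)[-1]
        # Hosts logged as bare IP addresses carry no domain information.
        if ":" in host or last_label.isdigit():
            domain = "unresolved"
        else:
            domain = last_label.lower()
        by_domain[domain] += 1
        stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        by_hour[stamp.hour] += 1
    return by_domain, by_hour


# Invented sample entries, for illustration only.
SAMPLE_LOG = [
    'lab.example.edu - - [22/Sep/2008:09:15:02 -0400] "GET / HTTP/1.0" 200 5120',
    '192.0.2.17 - - [22/Sep/2008:09:40:11 -0400] "GET /about HTTP/1.0" 200 2048',
    'host.example.com - - [22/Sep/2008:14:05:45 -0400] "GET / HTTP/1.0" 200 5120',
    'pc.example.edu - - [22/Sep/2008:14:30:00 -0400] "GET /feed HTTP/1.0" 304 -',
    'this line is not a log entry',
]

by_domain, by_hour = summarize(SAMPLE_LOG)
for domain, count in by_domain.most_common():
    print(domain, count)
for hour in sorted(by_hour):
    print(f"{hour:02d}:00", by_hour[hour])
```

Cross-tabulating the two counters, say requests per domain for each hour, would be the natural next step if you wanted to relate traffic workload to where visitors come from.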
They point out that websites that remain unchanged for months or even years after their launch (cobwebs, you might call them) do, with a few notable exceptions, lose their initial burst of customers very quickly. With the rapid growth of the web and intensified competition among businesses, creating and maintaining an effective web presence is a significant challenge. It is no coincidence that, perhaps with some exceptions among the “essential” sites, successful websites tend to be those that are dynamic and whose information is the most sophisticated, diverse, and exciting.
But there is no point in simply changing for the sake of it if those changes are not informed by insight gleaned from current and past traffic and visitor interactions. Of course, there is nothing new in suggesting that we check our stats and adapt our content and design to optimize for visitors. However, the system described by Li and colleagues in their IJECRM publication (full reference below) explains how universal trends might be plucked from those stats. Their proof of principle is based on sourced historical data, but they explain that there is no reason why a site should not mine its live web stats and so become iteratively dynamic.
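As a rough illustration of what mining live stats iteratively could mean in practice, the sketch below compares each visitor group's share of traffic in one period with its share in the previous period and flags the groups that moved noticeably. This is an assumption about how one might go about it, not the procedure from the paper. The function names `share` and `shifts`, the 10-percentage-point threshold, and the weekly counts are all invented for the example.

```python
from collections import Counter


def share(counts):
    """Convert raw request counts into each group's fraction of the total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}


def shifts(previous, current, threshold=0.10):
    """Return groups whose traffic share changed by more than threshold."""
    before, after = share(previous), share(current)
    changed = {}
    for key in set(before) | set(after):
        delta = after.get(key, 0.0) - before.get(key, 0.0)
        if abs(delta) > threshold:
            changed[key] = delta
    return changed


# Invented request counts per visitor top-level domain, for illustration only.
last_week = Counter({"com": 60, "edu": 30, "uk": 10})
this_week = Counter({"com": 55, "edu": 15, "uk": 30})

flagged = shifts(last_week, this_week)
for domain, delta in sorted(flagged.items()):
    print(f"{domain}: {delta:+.0%}")
```

Run on a schedule against fresh counts, a check like this would tell a webmaster which audiences are growing or fading, which is the kind of insight that should inform a change rather than changing for the sake of it.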
Xueping Li, Laigang Song, and Alberto Garcia-Diaz (2008). Adaptive web presence and evolution through web log analysis. Int. J. Electronic Customer Relationship Management, 2 (3), 195-214
2 responses so far ↓
Wayne Smallman // Sep 27, 2008 at 9:33 am
Hmm, intriguing.
I’d have liked to have seen a little more detail about how these patterns work.
That said, you have provided a link to the document, so I can always have a read later…
David Bradley // Sep 28, 2008 at 7:48 pm
Yes, maybe I could splice in some additional information. Drop me a line if you’d like a PDF of the paper to do a follow up for blah.