Use and abuse on the web
February 2nd, 2011 by David Bradley
The analysis of user statistics across websites often comes under scrutiny from privacy advocates, who worry that marketing companies are exploiting people's personal data to track their behaviour and target them with advertising. The issue is, of course, a double-edged sword. Many of us would prefer that our online behaviour were not monitored in any way, but if advertising is inevitable, then allowing marketers to display pertinent advertisements could be less annoying than seeing the random offerings that everyone else sees.
But web usage data can be put to much more positive use than propping up the profit lines of marketing companies. Google's Flu Trends (http://www.google.org/flutrends/) has become an essential global tool for monitoring outbreaks of influenza. It works by spotting unusual search activity centred on topics related to the disease. Similar search tracking could readily be used to spot other trends or the emergence of new diseases. Tracking and analysis technology similar to that used by marketing teams could also be put to good use in spotting important global, national and regional trends. It has a darker side too: the same techniques could be used to detect and stifle uprisings and militant activity, which could be a serious problem in particular parts of the world.
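To make "spotting unusual search activity" concrete, here is a minimal sketch in Python. It is not Google's method, which is not described here. It simply assumes weekly counts of flu-related searches and flags any week that sits far above the average of the weeks before it. The function name, the eight-week baseline, the threshold and the numbers are all made up for illustration.

```python
from statistics import mean, stdev

def flag_spikes(weekly_counts, baseline_weeks=8, threshold=3.0):
    """Return indices of weeks whose count is far above the preceding baseline.

    Illustrative only: a plain z-score test against a sliding window,
    with an assumed window length and threshold.
    """
    spikes = []
    for i in range(baseline_weeks, len(weekly_counts)):
        baseline = weekly_counts[i - baseline_weeks:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: the z-score is undefined, so skip this week
        z = (weekly_counts[i] - mu) / sigma
        if z > threshold:
            spikes.append(i)
    return spikes

# Hypothetical weekly search counts; week 9 is a sudden jump.
counts = [100, 104, 98, 101, 99, 103, 97, 102, 100, 180, 105]
print(flag_spikes(counts))
```

A real system would also have to cope with seasonality, news-driven searches and differences between regions, none of which this sketch attempts.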
Regardless of whether you perceive it as a good or a bad thing, tracking is happening across the globe. Its impact on your life, and on the lives of your family and friends, will depend on whether you live in a so-called free or totalitarian state. You might think that cookie-control plugins and next-generation browsers that claim to prevent tracking will help, but if “they” want to track you, “they” will.
Meanwhile, Dong Li of the École des Mines d’Alès in Nîmes, France, and colleagues Anne Laurent and Pascal Poncelet at the University of Montpellier point out that much of the research into tracking and web usage analysis has focused on gathering data on common behaviour. Of far more interest would be data on unusual behaviour: the odd activity that is to the left, or right, of the norm.
The team has now analysed patterns of web usage and discovered that various rules can be extracted that allow unexpected web usage to be predicted. They call their approach WebUser (Web Unexpected Sequence Rules) and describe it as “a belief-driven framework for mining unexpected web usage in session sequence databases.”
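As a rough illustration of what "belief-driven" might mean in practice, the toy sketch below encodes one belief about how visitors move through a site and labels each session according to whether it matches or contradicts that belief. This is not the WebUser algorithm from the paper. The `Belief` class, the `classify` function and the page names are hypothetical, invented only to show the general idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Belief:
    premise: str      # page the visitor is on
    expected: str     # page we believe they will request next
    contradicts: str  # page that would contradict that expectation

def classify(session, belief):
    """Label one session as 'expected', 'unexpected' or 'not applicable'.

    Looks at each pair of consecutive pages and stops at the first pair
    that starts with the premise and either confirms or contradicts the belief.
    """
    for current, nxt in zip(session, session[1:]):
        if current == belief.premise:
            if nxt == belief.contradicts:
                return "unexpected"
            if nxt == belief.expected:
                return "expected"
    return "not applicable"

# Hypothetical belief: after logging in, visitors go to their inbox,
# so logging straight out again is the surprising case.
belief = Belief(premise="/login", expected="/inbox", contradicts="/logout")

sessions = [
    ["/login", "/inbox", "/read"],
    ["/login", "/logout"],
    ["/home", "/search"],
]
labels = [classify(s, belief) for s in sessions]
print(labels)
```

Sessions labelled "unexpected" are the interesting ones here: they are the raw material from which rules about unusual behaviour could then be mined.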
They suggest that such rules could be applied for “web content personalisation and recommendation, site structure optimisation and critical event prediction.” That hints at marketing again, but it could equally allow a company to offer a better service (or a net neutrality skewed service if you like). It could also be used to spot when something is about to go viral, whether that’s the latest pop video or a popular revolution.
Dong Li, Anne Laurent, & Pascal Poncelet (2011). WebUser: mining unexpected web usage. Int. J. Business Intelligence and Data Mining, 6 (1), 90-111
To evaluate their approach, the team performed a series of experiments on three web access log files: a very large log file from a BSD UNIX online discussion forum, a large log file from the customer support forum of an online game provider, and a small log file from a university library web portal. All log files were converted to session sequence databases and anonymised. They then applied their WebUser algorithms to the data to extract the rules for each data sequence.
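The step of turning an access log into a session sequence database can be sketched roughly as follows. The article does not say how the team did the conversion, so this is only one common approach: group requests by visitor, sort them by time, and start a new session after a period of inactivity. The 30-minute gap, the record layout and the function name are assumptions, and dropping the visitor identifier stands in for whatever anonymisation was actually used.

```python
from collections import defaultdict

SESSION_GAP = 30 * 60  # seconds of inactivity that end a session (assumed value)

def build_sessions(records, gap=SESSION_GAP):
    """Group (visitor, timestamp, page) records into per-visit page sequences."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, page in records:
        by_visitor[visitor].append((timestamp, page))

    sessions = []
    for hits in by_visitor.values():
        hits.sort()  # order each visitor's requests by time
        current = []
        last_time = None
        for timestamp, page in hits:
            if last_time is not None and timestamp - last_time > gap:
                sessions.append(current)  # long pause: close the session
                current = []
            current.append(page)
            last_time = timestamp
        if current:
            sessions.append(current)
    # Visitor identifiers are not returned, a crude form of anonymisation.
    return sessions

# Hypothetical log records: (visitor address, seconds since start, page requested).
records = [
    ("10.0.0.1", 0, "/index"),
    ("10.0.0.2", 10, "/index"),
    ("10.0.0.1", 60, "/forum"),
    ("10.0.0.1", 4000, "/index"),
    ("10.0.0.2", 20, "/help"),
]
print(build_sessions(records))
```

Each inner list is one session, and the full list plays the role of a small session sequence database on which sequence rules could then be mined.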