I was reading Ian Thomas’s latest post this morning, and his ideas prompted me to share some of my own that I’ve been mulling over lately. Nothing really definitive yet, but worth sharing with you, I believe, especially since these are the kind of ideas that would need input from a lot of people.
I would like to propose that our field, Web Analytics, should establish standards for how web sites are tagged, so that 1) the data could be analyzed by any application from a complying vendor, and 2) the data would be easy to integrate with other enterprise data, and even loadable into a data warehouse somehow.
Let us first examine the benefits of #1. Having standards for how a web site is tagged would reinforce ownership of the data by its true owners, i.e. the site owners themselves. Data could easily be transferred from one vendor to another, eliminating dependency on proprietary data structures, so vendors would compete solely on functionality. Of course, the data (“logs”) would have to be readily available, which is unfortunately not the case now (and, to me, this is Google Analytics’ biggest drawback, for example). Site owners could also tag and collect the data themselves, as with WebTrends SDC, and then decide where to have it analyzed: either with a locally installed version of a product, or by uploading their data to the vendor’s servers.
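To make the idea of a vendor-neutral record a bit more concrete, here is a rough sketch in Python of what one standardized "hit" could look like. Everything in it is my own assumption for illustration: the `Hit` class, its field names, and the choice of one JSON object per log line are hypothetical, not part of any existing standard or vendor format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Hit:
    """One tagged event in a hypothetical vendor-neutral format."""
    timestamp: str                  # ISO 8601, UTC (assumed convention)
    visitor_id: str                 # site-owned visitor identifier
    url: str                        # page or resource that was tagged
    event: str = "pageview"         # type of action being recorded
    referrer: Optional[str] = None
    campaign: Optional[str] = None


def to_log_line(hit: Hit) -> str:
    """Serialize a hit as one JSON object per line."""
    return json.dumps(asdict(hit), sort_keys=True)


def from_log_line(line: str) -> Hit:
    """Rebuild a hit from a log line produced by to_log_line."""
    return Hit(**json.loads(line))


hit = Hit(
    timestamp="2007-01-15T09:00:00Z",
    visitor_id="v-123",
    url="/products/widget",
    campaign="newsletter",
)
line = to_log_line(hit)
restored = from_log_line(line)
```

The point is only that a log written this way could be read back by any tool that knows the agreed field names, whichever vendor collected it.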
Another big advantage would be that measures (visitors, visits, campaigns, what have you) would basically mean the same thing to everybody. Gone would be the days when analyzing the same logs with different products meant getting different numbers all over the place (well, at least when doing this was even possible, using web server logs).
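As an illustration of what a shared definition buys us, here is a small Python sketch of a "visit" counted in one agreed way. The rule itself is an assumption on my part: a new visit starts when a visitor has been inactive for more than 30 minutes, which is a common convention but not a standard. The names `VISIT_TIMEOUT` and `count_visits` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed rule: more than 30 minutes of inactivity starts a new visit.
VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(hits):
    """Count visits in an iterable of (visitor_id, datetime) pairs."""
    last_seen = {}  # visitor_id -> time of that visitor's previous hit
    visits = 0
    for visitor_id, ts in sorted(hits, key=lambda h: h[1]):
        previous = last_seen.get(visitor_id)
        if previous is None or ts - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[visitor_id] = ts
    return visits


hits = [
    ("a", datetime(2007, 1, 15, 9, 0)),
    ("b", datetime(2007, 1, 15, 9, 5)),
    ("a", datetime(2007, 1, 15, 9, 10)),   # same visit as a's first hit
    ("a", datetime(2007, 1, 15, 10, 0)),   # 50 minutes later: new visit
]
total = count_visits(hits)
```

If every product applied the same rule to the same logs, each would report three visits here; change the timeout in one product and the numbers start to diverge, which is exactly the situation a standard would avoid.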
As for #2, defining the data structure with enterprise integration in mind would certainly force us to revisit what we really need to measure on a web site, due to the practical limitations of data warehouses. Does everything have the same value? Should every site collect every bit, or only the most important actions/events? Well, I don’t know, but we would definitely need to rethink the value of collecting all page views, for example, at least for sites not in publishing (and thus not relying on an advertising-based business model). I am aware that structuring web data so that it is readily available to BI products would also raise the question of what is Web data versus data coming from the Web, i.e. where Web Analytics stops and BI starts. But that’s a different debate.
Ian’s idea of making Google the universal collector of data is interesting. However, I think we can go a step further with what I am saying here by making even the collection entirely a matter of choice. Also, Google would have to make the data/logs available to anyone who needs their data, instead of relying only on the brand for result validation, as is currently the case with GA.
Am I crazy? Is this even feasible? Would vendors see any benefit in this? Well, you are more than welcome to add your own, more intelligent ideas here.