NOTE: I’m starting to republish some of the most interesting posts I wrote for The Big Integration, and I’ll try to do so only with those that still make sense today.
It is a great pleasure to start this new series of interviews about web data and customer data integration with June Dershewitz. June is a seasoned and well-known Web Analytics expert who has had extensive experience with such projects.
So, here it goes:
JW– First off, tell us about your background, and how long you have been fully involved in Web Analytics.
JD– I’ve been involved in our field since 1999, when I took a job as a web analyst at a startup. After that I spent time at a data warehousing consultancy, developing and using a first-generation commercial web analytics application. For the past 3+ years I’ve been a freelance consultant, helping clients in a variety of industries improve their practice of web analytics.
Tell us about one case where you had to integrate web data with back-end data.
I’ve had a number of experiences with integration – I’ve done implementation work to get things set up in the first place, I’ve managed reporting systems that contain web data plus other stuff, and I’ve gotten to do detailed analysis projects using integrated data. The particular case I’ll outline here has to do with implementation.
As I suspect is the case with many large companies, I came into a situation where there was, on one hand, a commercial web analytics tool (owned by my group), and on the other hand, an existing customer data warehouse (owned by another group). Our task involved creating an export from the web analytics app into the data warehouse.
Now, I’ll admit we weren’t starting from scratch. The data warehousing group had been getting a similar export from a legacy tool, but that tool was in the process of being decommissioned as we standardized on our new commercial web analytics application. Since we were replacing an existing export, the structure of our data was predefined – each record was to have a customer id, a URL, the date on which that customer viewed that URL, and a count of page views.
In our web analytics tool we created a custom report that contained the data we needed for the export. Once we got this report working in a test environment we did a round of QA to ensure that it synced reasonably well with the legacy export (we were realistic and therefore not too nuts about getting a perfect match). Since the data warehouse group had very rigid standards regarding the properties of the export, an engineer on our side wrote a postprocessing script in order to prep the data for handover. Last of all we set a delivery schedule and finalized everything with the data warehousing group.
In the end we were able to use our new export and eliminate the legacy one, so I’d call our effort a success.
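To make the postprocessing step concrete, here is a minimal sketch in Python of what such a script might look like. It is not the script June's team wrote. The interview names only the four fields (customer id, URL, date, page views), so the column names, the pipe delimiter, the MM/DD/YYYY input date format, and the `postprocess` function itself are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical column names; the interview only names the four fields.
FIELDS = ["customer_id", "url", "view_date", "page_views"]


def postprocess(raw_rows, delimiter="|"):
    """Aggregate raw report rows and return the export as delimited text.

    raw_rows is an iterable of dicts with the keys customer_id, url,
    date and page_views (assumed shape of the custom report).
    """
    totals = Counter()
    for row in raw_rows:
        # Normalize the date to ISO format (assumes the tool emits MM/DD/YYYY).
        day = datetime.strptime(row["date"], "%m/%d/%Y").date().isoformat()
        key = (row["customer_id"].strip(), row["url"].strip(), day)
        totals[key] += int(row["page_views"])

    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(FIELDS)  # header row, assumed to be required by the warehouse
    for (customer_id, url, day), views in sorted(totals.items()):
        writer.writerow([customer_id, url, day, views])
    return out.getvalue()
```

Called on the rows of the custom report, this returns one line per customer, URL, and day, in a stable sort order, which is the kind of predictable layout a receiving warehouse team would typically ask for.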
What were the biggest challenges?
Our biggest challenges had to do with 1) people and 2) tools. I’ll say a bit about each.
1) Whenever you’re passing data between groups, it’s important to have a clear dividing line regarding responsibilities. It seems the closer you get to that dividing line, the more each group expects the other group to actually do the work. Our groups were pretty good at communicating, though, so we were able to avoid stalling when this sort of issue came up.
2) I’m not going to get into tool-specific details, but I will say that we had to work within the limitations of our tool. We spent a bit of time struggling with questions such as the following: Is there a limit to the amount of data we get in a report, and if so, how much control do we have over extending this limit so that our export doesn’t get truncated? When we export to a file, how much choice do we have regarding file naming conventions, or delimiter, or header row? Why the heck is it so hard to get date as a dimension in a report? None of these limitations made our job totally impossible, but we had to work within the constraints of what we were able to get from our tool.
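One common way to work around a report row limit is to request the data in small date slices and flag any slice whose row count reaches the limit. The sketch below assumes this approach. It does not describe any particular vendor's tool, and the function names and the idea of a known `row_limit` are hypothetical.

```python
from datetime import date, timedelta


def daily_slices(start, end):
    """Yield each day from start to end inclusive, one report request per day."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def looks_truncated(row_count, row_limit):
    """Treat a report that hits the row limit as possibly cut off."""
    return row_count >= row_limit
```

A slice that comes back with exactly `row_limit` rows is not proof of truncation, but it is a reasonable signal to request that day again in smaller pieces before handing the file over.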
What were the most important lessons you learned from doing that integration? Did it add a lot more to the picture?
In this particular case the integration point already existed, so the data didn’t shed any new light on our customers. It was an important first step for the business, though, because it showcases the fact that we can integrate using data from our commercial web analytics tool. I’m certain it will pave the way for other integration projects, and it’ll definitely be easier now that we’ve got an example to work from.
What do you think, in general, of how well Web Analytics and back end applications can be integrated?
I wouldn’t say integration is a piece of cake, but it can be done. I’d love to see more businesses actually doing it, and those who already are, doing more of it. I think the limiting factors are – once again – people and tools. Most every integration project will take the coordinated effort of people from different groups, so everyone must be willing to work together. Also, we all know there’s a staffing shortage in web analytics; hopefully more integration projects will make it onto the roster as companies hire people who are qualified to handle the work. In terms of tools, the basics are in place but the features are limited. It would be great if commercial web analytics vendors improve the ways in which we are able to pull data out of their applications (but I won’t hold my breath).
Say, three years from now, do you believe that integrating all those databases will be common practice? If so, who will take the lead: Web Analytics or Business Intelligence?
Without a doubt, I believe that more companies will be looking beyond stand-alone web activity data by then. Whether the integration happens within a web analytics application or within another business intelligence application will be a matter of preference on the part of the company. As I mentioned earlier, many large companies already have an established corporate data warehouse, and in such cases the web activity integration is likely to happen as a feed out from the web analytics tool to that warehouse. However, for some companies (especially companies that consider the web channel of primary importance) it may make sense to pull data in from other sources and treat the web analytics application as the center of everything. Whatever winds up happening, I know we’ll all be better off because of it and I look forward to continued involvement in this area.
Thanks for the questions, Jacques! I hope your readers find this interview useful.
Don’t worry, June, I am sure they did! Thanks again.
This interview was conducted via email. I didn’t edit the answers, except for typos.