Allen Stern suggested yesterday on Centernetworks that the time is approaching when somebody will sue Alexa over its poor web metric tracking. That it hasn’t happened yet is perhaps a mystery, and I don’t disagree that it might one day happen, but it presumes one thing: that we take any Alexa-type service seriously.
Web metrics are an interesting game. Like TV ratings, web metric firms take a sample of viewers and extrapolate those figures to represent the entire internet audience. Statistical measurement may be a science, but ultimately the figures are only ever as good as the sample, and the smaller the sample, the higher the chance that the results are inaccurate. Every web analytics service has the same issue: it can’t possibly know the true traffic of every site, it can only make a calculated guess based on its sample.
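To make the sample-size point concrete, here is a minimal sketch of the arithmetic. It is not any firm's actual method: the function name, the panel sizes, and the 200 million audience figure are all made up for illustration, and the error range uses the textbook normal approximation for a proportion.

```python
import math

def extrapolate_visitors(panel_size, panel_visitors, population, z=1.96):
    """Scale a panel's visit share up to a whole audience.

    Hypothetical helper for illustration only. Returns
    (estimate, low, high), where low/high are an approximate 95%
    range from the normal approximation for a proportion.
    """
    if panel_size <= 0 or not 0 <= panel_visitors <= panel_size:
        raise ValueError("panel_visitors must be between 0 and panel_size")
    share = panel_visitors / panel_size
    std_err = math.sqrt(share * (1 - share) / panel_size)
    margin = z * std_err
    low = max(0.0, share - margin) * population
    high = min(1.0, share + margin) * population
    return share * population, low, high

# Same 1% share of the panel, very different certainty (made-up numbers).
small = extrapolate_visitors(1_000, 10, 200_000_000)
large = extrapolate_visitors(1_000_000, 10_000, 200_000_000)

for label, (estimate, low, high) in (("small panel", small), ("large panel", large)):
    print(f"{label}: about {estimate:,.0f} visitors (range {low:,.0f} to {high:,.0f})")
```

Both panels produce the same headline estimate, but the range around the small panel's figure is far wider. Note that the calculation assumes the panel is a random sample of the audience. A bigger panel does nothing to fix a skewed one, such as a panel made up only of people willing to install a toolbar.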
What follows is a brief overview of each player, how they gather their stats, and what I think of them. Some will no doubt disagree with my conclusions, but that is only to be expected when the stats each service provides are themselves subject to debate. Ultimately though, the better we understand the methods used, the better we can understand the results.
Alexa has long been the web’s favorite whipping boy in web analytics. The Amazon-owned service was founded in 1996 primarily as a recommendation service, with traffic stats being pulled from each installation of the Alexa toolbar. For most of its history, the toolbar was Alexa’s only source of statistics, and was only available for Internet Explorer.
Things changed earlier this year when Alexa started fresh, announcing that they would be resetting their stats and using external data as well as their toolbar data to report web traffic.
The methodology has always been flawed, and although it may be marginally better now, it is far from perfect, and yet people keep on using Alexa. The simple reason why: Alexa has become the de facto standard in web analytics because they have always been the most open in reporting their stats. Alexa also allows for easy comparison between sites, creating demand for the tool as a way to measure sites against competitors, even when the statistics themselves may be flawed. After all, they are flawed for everyone, including the competition.
NASDAQ-listed comScore has a $610 million market cap, so although you may not be able to freely access their statistics, they are a true success story in the space. I’ve had access to a comScore account in the past, and the depth and variety of statistics they offer is brilliant. They also have huge sample audiences, said to be in the millions in the United States alone, but the methodology is rather interesting.
comScore gathers data through monitoring software distributed under brands including PermissionResearch and OpinionSquare. One such offering is the Marketscore Internet Accelerator, a Windows tool that offers users (as the name suggests) quicker internet access. comScore’s particularly corporate-friendly Wikipedia entry suggests that comScore is no longer using such services, so I can only go on that, but in previous years comScore built its sample audience with software that leading publications have called spyware and malware. There was also a confirmed case (per the Wikipedia entry) of this software being installed without permission, and even stories at one stage of comScore tracking tools being bundled with Kazaa.
comScore will no doubt claim they are clean today, but they still rely on installed tracking software that is often bundled as part of other packages as an incentive to install. The real question is whether this data collection method delivers the best and most accurate statistics in the market.
As with comScore, we don’t see a lot of Hitwise data in the wild, as they sell the data and don’t offer it freely on demand. We do see Hitwise stats on a regular basis through controlled releases though, and the company has a solid outreach program with blogs and the media.
Hitwise gathers its data by buying the information directly from internet service providers. In Australia they buy surfing data from Telstra, the country’s biggest ISP, and in other places have similar deals in place.
Hitwise data isn’t absolute, and like the others they take the sample data and extrapolate it across all internet users, but the data isn’t sullied by toolbars or internet accelerators, so it is somewhat closer to being representative. Of all the web metrics firms, I’d put the most faith in Hitwise data, with the proviso that, like any set of stats, it can never be perfect when it doesn’t capture all data.
Relative newcomer Quantcast is growing in popularity with a service that offers free statistics comparable to Alexa’s.
Quantcast gathers its data two ways: from a panel, and directly. The methods behind the panel data aren’t explained by Quantcast, but it would be fair to presume that they are somewhat similar to comScore’s, given that Quantcast doesn’t offer a toolbar or install of its own, at least from the Quantcast site. Quantcast also allows web site owners to embed code on their site so Quantcast can track traffic directly, delivering actual statistics for participating sites as opposed to panel-only estimates.
Quantcast claims that their combination of panel and direct data gathering delivers a holistic model of internet audiences. In my experience, their embed under-reports stats, but not so outrageously as to make it unusable. Quantcast is the natural successor to Alexa: it may not be perfect, but they at least give aggrieved site owners the opportunity to correct the stats by feeding real data into the system, and that is always going to be a positive.
Compete freely offers basic data on US traffic but, unlike the others, ignores non-US traffic, which makes it a somewhat stunted statistical tool. Its data comes from a combination of ISPs, opt-in panels, application providers, and users of the Compete toolbar.
Compete has become popular in some circles, but I’m not sure why. The stats seem reasonable, and US-only traffic might have some use for advertisers, but in a global economy it will never show the complete picture. Compete stats should only ever be used when considering a US audience. They should NEVER be used as being indicative of true traffic for any site.
Google launched Google Trends for Websites earlier this year, and it looked like Mountain View might have been on a winner. The service pulls data from various external services, Google search and Google Analytics, although as Google has told me previously, not directly, and only anonymously…which I think means that the Google Trends stats don’t actually reflect reality even if Google has the data at hand. The service has possibilities, and expect Google to offer more data in the future.
No provider of web metrics offered today is perfect, but some services are better than others. For corporate metrics Hitwise offers the best bet, gathering its data from ISPs so the incoming data spread is wider than relying on panel statistics from installed tracking software. For free stats, Quantcast offers a similar depth to Alexa but with the added bonus that over 10 million sites are contributing real data.
(image credit: McHumor)