Analytics spam – How accurate is your Google Analytics reporting?

We’ve all suffered from email spam for years. Despite the combined efforts of Google, Microsoft and our corporate IT departments, we still get unsolicited mail in our inboxes and lose genuine mail to the dreaded spam folder.

For most, it’s an annoying fact of email life. Whilst it can certainly affect productivity, for most business users it doesn’t pose any major risks.

A new kind of spam in town

Since the beginning of 2015 there has been an increase in a new, altogether more insidious kind of spam. Unscrupulous individuals are polluting our web analytics data with “ghost” traffic, fake referrals and even goal completions.

Unlike email spam, this is more than a minor annoyance. Left unchecked, the inflated numbers could have a serious effect on existing marketing and business reporting. That in turn could impact decisions made based on Google Analytics data.

Analytics spam, really?

Yes, really. It’s not a small-scale problem either. Take a look at the chart below, which shows a real example from a site with a good level of traffic.

[Chart: traffic profile for the example site, spam included]

If you were the marketing manager responsible for the website you’d be feeling pretty happy about the traffic profile for the last six or seven months.

However, the same chart with the spam removed shows that a large proportion of the traffic since April has been fake.

[Chart: the same traffic profile with the spam removed]

Imagine the potential impact on a smaller site where the spam traffic could easily overwhelm the real numbers.

What are they doing?

Spammers are hitting your analytics in two ways. First, they are hijacking your Google Analytics code and adding it to a dummy site. That creates genuine-looking “ghost” traffic that is recorded against your Google Analytics account.

Importantly, it also allows them to create false Google Analytics “Events”. If you’re using events to drive your GA goals, this can have serious implications for your KPI reporting.

Secondly, they are setting up links to your site on a dummy page and using an automated “bot” to click on them. This shows in your reporting as a genuine referrer sending traffic to your site.

What are they trying to achieve?

Like all spammers, they want you to visit their site. When you look in your reporting, here’s what you are likely to see:

[Screenshot: referral report listing the spam sources]

Wow! Look at all the traffic that came from… So you visit the site where they try to sell you something, usually analytics related.

There are also genuine companies who are being duped by unscrupulous “Agencies” to drive traffic to their sites using these techniques.

What does that mean for my reporting?

The first thing you’ll notice is a big increase in “Referral” traffic, especially to your Home page. That will increase the number of Users, Sessions and Pageviews:

[Screenshot: Users, Sessions and Pageviews]

Because the spam bots only visit one page, and only for a short time, the engagement figures get worse: Pages per session and Time on site go down, and Bounce rate goes up:

[Screenshot: Pages per session, Time on site and Bounce rate]
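To see how quickly single-page spam sessions distort the averages, here is a small worked example in Python. The session, pageview and bounce counts are invented purely for illustration, and the `engagement` helper is our own, not part of any Google Analytics tool.

```python
def engagement(sessions, pageviews, bounces):
    """Return the two ratio metrics computed from raw counts."""
    return {
        "pages_per_session": pageviews / sessions,
        "bounce_rate": bounces / sessions,
    }

# Hypothetical month of real traffic: 1,000 sessions, 3 pages each on average.
real = {"sessions": 1000, "pageviews": 3000, "bounces": 400}

# Hypothetical spam: 500 sessions, each viewing one page and leaving.
spam = {"sessions": 500, "pageviews": 500, "bounces": 500}

# What the report shows when both are mixed together.
polluted_counts = {key: real[key] + spam[key] for key in real}

clean = engagement(**real)
polluted = engagement(**polluted_counts)

print(f"Clean:    {clean['pages_per_session']:.2f} pages/session, "
      f"{clean['bounce_rate']:.0%} bounce rate")
print(f"Polluted: {polluted['pages_per_session']:.2f} pages/session, "
      f"{polluted['bounce_rate']:.0%} bounce rate")
```

With these made-up numbers, the real visitors have not changed their behaviour at all, yet the reported bounce rate jumps from 40% to 60% and pages per session falls from 3.00 to about 2.33.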

How can you fix it?

Thankfully, fixing the initial problem is relatively simple if you understand Google Analytics filters and segments.

To stop people “ghosting” your website, you need to set up a hostname “Filter” so that Google Analytics only records traffic generated by your website. Care needs to be taken, if you are recording traffic from multiple sites, to include all the valid domains.
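As a rough illustration of what a hostname filter does, the Python sketch below builds a single pattern from a list of valid hostnames and checks incoming hits against it. The hostnames are placeholders and the function names are ours; Google Analytics applies its own filter logic, so treat this as a way to sanity-check a pattern before you use it, not as a description of how GA works internally.

```python
import re

# Hypothetical hostnames that legitimately carry your tracking code.
# Replace these with every domain and subdomain you really track.
VALID_HOSTNAMES = ["example.com", "www.example.com", "shop.example.com"]


def build_hostname_pattern(hostnames):
    """Build one anchored regex that matches only the listed hostnames."""
    escaped = [re.escape(name) for name in hostnames]  # dots become literal
    return "^(" + "|".join(escaped) + ")$"


def is_valid_hit(hostname, pattern):
    """True if the hit's hostname is one of ours, False for ghost traffic."""
    return re.match(pattern, hostname.lower()) is not None


PATTERN = build_hostname_pattern(VALID_HOSTNAMES)

for host in ["www.example.com", "ghost-site.example.net", "(not set)"]:
    verdict = "keep" if is_valid_hit(host, PATTERN) else "exclude"
    print(f"{host}: {verdict}")
```

Anchoring the pattern with `^` and `$` and escaping the dots matters: without them, a hostname such as `example.com.ghost-site.example.net` would slip through as a partial match.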

Analytics Edge have a great post explaining the technical process.

Filtering only works from the day it is set up so you should aim to do this as soon as possible.

That leaves your existing data, which will still be polluted with the spam traffic. To clean the historical reports you’ll need to create a new “Segment” in Google Analytics that filters out the spam.

Analytics Edge also explain how to set this up and have created a shared segment in the Google Analytics solutions gallery.

Once imported, the new segment will show in the drop-down list:

[Screenshot: the imported segment in the drop-down list]

It’s important to remember to use this for future reporting to ensure spam is filtered out.
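If you export report data and want to apply the same kind of exclusion outside Google Analytics, the logic of a spam-excluding segment can be sketched in a few lines of Python. The spam source names, the row layout and the `remove_spam` helper below are all hypothetical; a real list would come from your own referral report or from a maintained segment such as the one mentioned above.

```python
import re

# Hypothetical spam referrers. These are placeholders, not real domains.
SPAM_SOURCES = ["free-seo-offers.example", "traffic-bot.example"]

# One pattern that matches any listed source, including its subdomains.
SPAM_PATTERN = re.compile("|".join(re.escape(s) for s in SPAM_SOURCES))


def remove_spam(rows):
    """Return only the rows whose source does not match a spam referrer."""
    return [row for row in rows
            if not SPAM_PATTERN.search(row["source"].lower())]


# Invented rows in the shape of an exported referral report.
rows = [
    {"source": "google", "sessions": 120},
    {"source": "traffic-bot.example", "sessions": 300},
    {"source": "news.free-seo-offers.example", "sessions": 45},
    {"source": "partner-site.example", "sessions": 30},
]

clean_rows = remove_spam(rows)
clean_sessions = sum(row["sessions"] for row in clean_rows)
print(f"{clean_sessions} real sessions out of "
      f"{sum(row['sessions'] for row in rows)} reported")
```

Keeping the spam sources in one list makes the periodic updates easier: when a new spam referrer appears, you add one entry and rerun the report.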

An ongoing battle

As with its email cousin, the battle against the new spammers is ongoing. Each week, they find different, more devious ways to pollute our data.

The hostname filter is a one-off fix which all sites should implement as a matter of course. This will remove a large percentage of the fake traffic.

The filtered segment will need to be updated with new domains and referrers periodically to exclude new traffic sources.

If you outsource your analytics to us then you can rest easy. We’ve already added the filters to your GA account and will be regularly updating/distributing the filtered segment.

If you’re unsure about how to add the filters and segments then drop us a line. It’s a relatively quick process to fix the initial problem and will restore confidence that the reported numbers are from real users.

About the Author

Over the past twenty years, Mike has had an exciting and eclectic range of roles, from Global Web Manager for a FTSE 100 corporate, via CTO for an internet startup, to COO at one of the largest B2B and B2C digital publishers. As MD at Compound Partners, Mike is responsible for content marketing consulting on strategy, delivery and analysis.