My wife has a small blog for her hobby. I am, for my sins, her IT guy for it. She posts to it about once a week, maybe twice, and of course, after every post there’s a small uptick in views. All nice and normal.
A couple of months ago, I began to notice something weird. One post in particular was getting 50-75% of all the views per day. There was nothing too striking about the post itself, but we laughed it off, imagining that it was obviously remarkable enough to keep attracting new views (the blog engine sets a cookie when you visit a post for the first time, so that your subsequent views don’t get registered again). After a little while, my reasoning was that some heavily frequented page out there (my guess: Pinterest) had a link to this particular post and people were intrigued enough to click it. In droves.
And the views kept on piling up for this post. At the time of writing, it has 4 times as many views as the next most popular post on her blog and 10 times as many as the “About This Site” page.
Finally, I decided to write a bit of code to log the HTTP_REFERER for visits to the blog. Oh my.
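For the curious, the logging idea can be sketched roughly like this. It is not the code running on the blog: it is a minimal Python illustration that assumes a CGI/WSGI-style `environ` dictionary and uses SQLite as a stand-in for the real database. The table name `referer_log` and the helpers `open_log` and `log_visit` are made up for the example.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path=":memory:"):
    """Open (or create) the hypothetical referer log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS referer_log ("
        "visited_at TEXT NOT NULL, path TEXT NOT NULL, referer TEXT)"
    )
    return conn


def log_visit(conn, environ, now=None):
    """Record one page view along with whatever Referer the client sent."""
    now = now or datetime.now(timezone.utc)
    # CGI-style servers expose the Referer header as HTTP_REFERER.
    # It is None when the client sent no such header.
    referer = environ.get("HTTP_REFERER")
    path = environ.get("PATH_INFO", "/")
    conn.execute(
        "INSERT INTO referer_log (visited_at, path, referer) VALUES (?, ?, ?)",
        (now.isoformat(), path, referer),
    )
    conn.commit()
    return referer
```

Calling `log_visit(conn, environ)` once per request is enough to build up a table you can query later by post and by referer.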
This is a quick screenshot from SQL Server Management Studio of the site’s log for this blog post, showing a very dubious set of HTTP_REFERER values. Over Thanksgiving. Every one of them hits the page three times in very quick succession, obviously discarding the cookie. In other words, this is scripting at work; it’s not like there’s some dumbass clicking on a link. I visited a couple of these sites (in a very locked-down, incognito browser) and most of them look like article and post aggregators.
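The "three hits in very quick succession" pattern is easy to pick out by eye, but it could also be flagged automatically. Here is a rough Python sketch of one way to do it. The function name `find_bursts`, the 5-second window and the three-hit threshold are all my assumptions for illustration, not anything measured from the real log.

```python
from collections import defaultdict
from datetime import timedelta


def find_bursts(hits, min_hits=3, window=timedelta(seconds=5)):
    """Return the referers with at least `min_hits` views inside `window`.

    `hits` is an iterable of (timestamp, referer) pairs for a single post.
    """
    by_referer = defaultdict(list)
    for stamp, referer in hits:
        if referer:  # skip views that arrived with no Referer at all
            by_referer[referer].append(stamp)

    flagged = set()
    for referer, stamps in by_referer.items():
        stamps.sort()
        # Slide over each run of `min_hits` consecutive views and check
        # whether the first and last fall within the time window.
        for i in range(len(stamps) - min_hits + 1):
            if stamps[i + min_hits - 1] - stamps[i] <= window:
                flagged.add(referer)
                break
    return flagged
```

A human following a link would normally show up once, or a few times spread over hours, so only referers that fire several views within seconds get flagged.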
Yes, agreed, HTTP_REFERER is pretty much useless these days. What with Google encrypting search terms for privacy reasons, and many sites/browsers not even using it, and it being way too easy to fake (like these sites), it’s ultimately discardable and dodgy information.