Saturday, February 1, 2014

Crowd-Frauding: Why the Internet is Fake

Power in human societies derives from the ability to get people to act together. Armies, religions, governments, and businesses have dominated societies using weapons, beliefs, laws and money to exert collective effort. In modern societies, mass media have emerged as a similar organizing power.

A new kind of collective organization, mediated by the internet, code and connections, is emerging as another avenue of power. It's no longer ridiculous to think that social networks, crowd-sourcing and crowd-funding could achieve the social consensus, action and compulsion that were the province of governments, armies and religions. As the founder of a crowd-funding site for ebooks, I'm naturally optimistic that the crowd, connected by the social internet, will be an immensely powerful force for good.

I'm also continually reminded that the bad guys will use the crowd, too. And it won't be pretty.

In December, our site, Unglue.it, began to get a surge of new users. But there was something fishy. The registrations were all from hotmail, outlook, and various dodgy-sounding email hosts. The names being registered were things like Linette, Ophelia, Rhys, Deanne, Agueda, Harvey, Darcy, Eleanore and Margene. Nothing against the Harveys of the world, but those didn't look like user handles. Of course it was registration bots coming from many different IP addresses.

But they were stupid registration bots: they never completed the registration, so the fake accounts couldn't leave comments or anything. It wasn't causing us any harm, except that it was inflating our user numbers. It was mystifying. So I started studying why bots around the world might start making inactive accounts on our site.
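For the technically inclined, the warning signs above (a webmail host, a bare first name as a handle, a registration that is never completed) can be turned into a crude score. This is only a minimal sketch, not the code Unglue.it runs: the `Signup` fields, the `SUSPECT_HOSTS` list and the one-point-per-sign scoring are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Assumed for illustration: webmail hosts that dominated the suspicious signups.
SUSPECT_HOSTS = {"hotmail.com", "outlook.com"}

# A handle that is just a capitalized first name, e.g. "Linette" or "Harvey".
BARE_FIRST_NAME = re.compile(r"[A-Z][a-z]+")


@dataclass
class Signup:
    username: str
    email: str
    completed: bool  # did the user finish the registration step?


def suspicion_score(signup: Signup) -> int:
    """Count how many of the three warning signs a signup shows (0 to 3)."""
    host = signup.email.rsplit("@", 1)[-1].lower()
    score = 0
    if host in SUSPECT_HOSTS:
        score += 1
    if BARE_FIRST_NAME.fullmatch(signup.username):
        score += 1
    if not signup.completed:
        score += 1
    return score


signups = [
    Signup("Linette", "linette@hotmail.com", completed=False),
    Signup("bookworm_42", "reader@example.org", completed=True),
]
for s in signups:
    print(s.username, suspicion_score(s))
# prints "Linette 3" and then "bookworm_42 0"
```

No single sign proves anything (plenty of real Harveys use Hotmail), which is why the sketch counts signs instead of blocking on any one of them.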

And that's how I learned about the dark side of the crowd-force. The best example of this is a program called Jingling, also known as FlowSpirit. It's been around for 5 years or so.

Jingling is an example of a "cooperative traffic generation" tool. It's software-organized crime. Crowd-frauding, if you will.

It works like magic. You download the Jingling software and install it on your computer. You then enter the URLs for four webpages that you want to promote (or more, if you have a good internet connection). Although the user interface is in Chinese, you can get annotated instructions in English on YouTube or websites like theBot.net. Once you've activated Jingling, the webpages you want to promote start getting hundreds of visitors from around the world. The visitors look real: they click around your page, they click on the advertisements, they register accounts on websites, they click "like" buttons and follow you on Twitter.

Meanwhile, your computer starts running a website-visiting, ad-clicking daemon. It visits websites specified by other Jingling users. It clicks ads, registers on sites, watches videos, makes spam comments and plays games. In short, your computer has become part of a botnet. You get paid for your participation with web traffic. What you thought was something innocuous to increase your Alexa ranking has turned you into a foot-soldier in a software-organized crime syndicate. If you forgot to run it in a sandbox, you might be running other programs as well. And who knows what else.

The thing that makes cooperative traffic generation so difficult to detect is that the ads really are being shown and clicked. The only problem for advertisers is that they're paying to advertise to robots, and robots do everything except buy stuff. The internet ad networks work hard to battle this sort of click fraud, but they have incentives to do a middling job of it. Ad networks get a cut of those ad dollars, after all.

The crowd wants to make money and organizes via the internet to shake down the merchants who think they're sponsoring content. Turns out, content isn't king, content is cattle.

Jingling is by no means alone; there are all sorts of bots you can acquire for "free". Diversity of bots is enforced because the click fraud countermeasures only attack the most popular bots; new bots are being constantly developed and improved.

What does this mean for advertising, ad-supported websites, and the internet in general?

It means that the internet rich will get richer and power will concentrate. Let me explain.

I used to run my own mail server. It was a small process on one of my old machines. I was a small independent internet entity. The NSA couldn't scan my emails and I could control my mail service. But as spammers cranked up their assault, it became more and more complicated to run a mail server. At first, I could block some bad IP addresses. But when dictionary attacks could be run by script kiddies, running my own email server got to be a real drag. And because I was nobody, other people running mail servers started blocking the email I tried to send. So I gave up and now I let big brother Google run my email. And Google gets to decide whether email reaches me or gets blocked as spam. That doesn't make me happy, and I still get a fair amount of spam. (Somehow the SEO and traffic generation scammers still get through!)

It's probably the same way that kingdoms and countries arose. Farmers farmed and hunters hunted until some bad guys started making trouble. People accepted these kings and armies because it was too much trouble for farmers and hunters to deal with the bad guys on their own. But sooner or later the bad guys figured out how to be the kings. Power concentrated, the rich got richer.

So with the crowd-frauders attacking advertising, the small advertiser will shy away from most publishers except for the least evil ones: Google or maybe Facebook. Ad networks will become less and less efficient because of the expense of dealing with click-fraud. The rest of the internet will become fake as collateral damage. Do you think you know how many users you have? Think again, because half of them are already robots; soon it will be 90%. Do you think you know how much traffic you have? Sorry, 60% of it is already robots.

Sooner or later the VCs will catch on to the fact that companies they've funded are chasing after bot-clicks and bot views. They'll start demanding real business models; those of us older than forty may remember those from college. And maybe reality will have a renaissance. But more likely, the absolute power of Google, Amazon, Apple and the rest will corrupt them absolutely and we'll suffer through internet centuries of dark ages (5 solar years at least) before the arrival of an internet enlightenment.

Until then, let's not give in to the dark side of the force, OK?

Notes:
  1. Last year, I wrote about some strange Twitter bots. I now think it's likely that the encoded messages I saw are part of a cooperative traffic generation scheme. If you're trying to orchestrate a vast network of click-bots, what better way to communicate with them than Twitter?
  2. There are now disposable email hosts that will autoclick confirmation links. These email hosts are the registration spammer's best friends. A list is here.
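Checking a signup address against a list like that takes only a few lines. Here's a minimal sketch; the two domains in `DISPOSABLE_DOMAINS` are made-up placeholders, not entries from the real list, and `is_disposable` is a hypothetical helper name.

```python
# Placeholder entries, assumed for illustration; a real blocklist would be
# loaded from a maintained list of disposable email hosts.
DISPOSABLE_DOMAINS = frozenset({"tempmail.example", "disposable.example"})


def is_disposable(email, blocklist=DISPOSABLE_DOMAINS):
    """Return True if the address's domain, or a parent domain, is blocklisted."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    labels = domain.split(".")
    # For "mx.tempmail.example", check "mx.tempmail.example" and then
    # "tempmail.example"; the bare top-level label is never checked alone.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels) - 1))


print(is_disposable("bot@mx.tempmail.example"))  # prints True
print(is_disposable("reader@example.org"))       # prints False
```

Matching parent domains matters because these hosts can hand out addresses on subdomains, so an exact-match lookup alone would be easy to dodge.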

3 comments:

  1. Fake robots are not the only problem; there is also an iceberg of unused or limited-use real account profiles. How many times do you create an account to "test" out a promising new service, only to find that the account remains largely inactive after the initial wave of usage, and sometimes dormant for years?

    The intention is to "come back" for more exploration at some later date when the service might prove more useful or when you have the time for further review. But in reality, you often never get the time to re-evaluate, or you get the service from a competing offering.

    Additionally -- you have "account inflation" from the "log-in-to-see-anything" policy used by many sites (the "viral" account model). I can think of several major web service accounts I maintain only because the website doesn't offer any open "read-only" mode -- you have to log in to view content.

  2. I see the same thing with FOSS4Lib -- accounts are requested, seem valid enough to make it through the manual activation process (I review account requests before they are created -- this comes after Drupal's built-in please-validate-this-request-by-following-a-link-in-an-email process), then lie dormant forever. I've been meaning to write a job that deletes these new accounts after a certain amount of inactivity to reduce the chance that one of these credentials will be used later, but all-in-all the whole thing is annoying.

    Thanks for the link to the disposable e-mail accounts list. It'll be worthwhile to see if any of these have been used on FOSS4Lib, but a lot of the account requests FOSS4Lib sees are like what you are seeing -- lots of Hotmail, Outlook, and Yahoo addresses.
