Types of Bots: An Overview

Learn more about all the different varieties of bots, and what they can do for you

Introduction

There are tons of bots out there.  So many bots, with so many different reasons to exist.  And the context that’s used to discuss them can vary wildly – some people are focused on the utopian possibilities of bots, others are focused entirely on all the bad stuff bots can do.

With the space that bots exist in being so large, it can be challenging to get your head around all the different things that people can mean when they refer generally to “bots”:

group mis-understanding

“I’m glad we’re all on board with the bot initiative….”

 

In this article, we’ll lay out a framework for thinking about the space.

If you Google “what are bots?”, 5 of the top 10 results returned are focused on the negative aspects.  From the security company Norton, you see the title “Cybercrime – What Are Bots?”.  From reputable publications like Time magazine, there are alarming titles like “How Bad Bots Are Destroying the Internet”.  You see similar results (~50% negative results) when you Google “different types of bots”.

In fact, if you take the Google search engine results page for “what are bots” and run it through AlchemyAPI, the keyword extraction and sentiment analysis scores show exactly how bleak the picture is:

google what are bots SERP

Well, that’s interesting!  And it paints a pretty scary picture of the world of bots, which I don’t think is (entirely) deserved.

Yes, there are bad people out there who create virtual mountains of spam, and even worse people who break the law, and sometimes they use bots.  But the promise of bots as agents for good, in the form of gained productivity and new business opportunities, is also tremendous.

So, the first lens we’ll use to look at the bot landscape is “good bots” versus “bad bots”, along with some examples of bots in each category.

Good Bots

  • Chatbots
  • Crawlers
  • Transactional bots
  • Informational bots
  • Entertainment bots: Art bots, Game bots

Bad Bots

  • Hackers
  • Spammers
  • Scrapers
  • Impersonators

Good Bots

Chatbots

  • My definition of chatbots is very narrow, on purpose: chatbots are bots that are designed to carry on conversations with humans, usually just for fun, and to test the limits of the technology.  Chatbots usually have a “personality” similar to a human, and there usually isn’t a goal for the interaction other than to see what the chatbot says.
  • There’s another common usage of “chatbots” that essentially includes ALL bots (i.e., if it’s automated, and you carry on a conversation with it, it’s a chatbot).  I think this is confusing, and misses the point.  Specifically, there’s a promise inherent in the word “chatbots”.  The promise is that you will be carrying on a conversation with the bot, and understanding human language is a really hard problem.  Therefore, we shouldn’t be setting the expectation that users are going to be having a “chat” with a bot.  People do weird things like anthropomorphize bots and even try to have sex with them.  When given space to ask natural language questions, people will routinely say BIZARRE things.  For all these reasons, “chatbot” is fine in the narrow context of human-like AI that is designed to be chatted with for fun.  But we shouldn’t be setting the expectation that the bots we’re building will have a natural language, conversational interface by branding them as “chatbots”.
  • ELIZA is the godmother of all chatbots.  The bot runs a simple question-and-response script that automatically generates responses to questions, in a style similar to a psychotherapist.
  • Cleverbot is a more advanced example that uses AI to learn from interactions.
  • Tay is a Microsoft AI chatbot that converses with people via Twitter. It got big press early in 2016 when it was launched, and people quickly began trying to break it, making it say awful things.
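
To make the ELIZA idea concrete, here is a minimal sketch of a pattern-and-response script in Python. The rules and replies below are made up for illustration and are not ELIZA’s actual script; the point is only that a handful of patterns can produce a therapist-like reply without any real understanding of language.

```python
import re

# Hypothetical rules for illustration only (not ELIZA's original script).
# Each rule pairs a pattern with a reply template that reuses the user's words.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bbecause\b", re.IGNORECASE), "Is that the real reason?"),
]
DEFAULT_REPLY = "Please tell me more."


def respond(message: str) -> str:
    """Return the reply for the first rule that matches, else a stock prompt."""
    text = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT_REPLY


print(respond("I need a vacation."))  # Why do you need a vacation?
print(respond("Hello there"))         # Please tell me more.
```

Anything that doesn’t match a rule falls through to the stock reply, which is a big part of why these conversations feel human for a few turns and then fall apart.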

Crawlers

  • These bots run continuously in the background, primarily fetch data from other APIs or websites, and are “well-behaved” in that they respect directives you give them.
  • For example, you can “hide” your entire website from search engines by blocking search engine spiders in your site’s robots.txt file, keeping all of your site’s content out of Google or Bing, or Yandex, or whatever.
  • Search engine spiders are crawlers that extract URLs from documents, which are then passed off to the indexing infrastructure to download the content from each URL, which is then parsed, and built into a searchable index.
  • Googlebot and bingbot are the two most common examples of search engine spiders.
  • Other crawlers include bots that monitor other systems for change.
  • Pricing Assistant is a bot that monitors ecommerce websites for price changes.
  • AlertBot monitors websites for server uptime, website errors, bugs and performance issues.
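
Here is a rough sketch of what “respecting directives” looks like from the crawler’s side, using Python’s standard-library `urllib.robotparser`. The robots.txt contents, the `can_crawl` helper and the example.com URLs are assumptions for illustration; a real crawler would fetch the file from the site it is about to visit.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt that hides an entire site from every crawler.
HIDE_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Example robots.txt that only hides one directory.
HIDE_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_crawl(HIDE_EVERYTHING, "Googlebot", "https://example.com/page"))    # False
print(can_crawl(HIDE_PRIVATE, "Googlebot", "https://example.com/page"))       # True
print(can_crawl(HIDE_PRIVATE, "Googlebot", "https://example.com/private/x"))  # False
```

Note that nothing enforces this check. A well-behaved crawler runs it before every fetch; a bad bot simply skips it.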

Transactional bots

  • Bots in this category act as agents on behalf of humans, and interact with external systems to accomplish a specific transaction, moving data from one platform to another.
  • Since bots can interact with any endpoint that has an API, Transactional bots can do LOTS of things, and lots of custom solutions are to be expected here.
  • Transactional bots fit into the area of robotic business process automation (BPA), which is expected to grow from $180MM in 2013 to $5B by 2020.  
  • The upside of BPA is increased productivity.  The downside is losing human jobs.  Oxford released a study in 2013 estimating that 47% of U.S. jobs were in jeopardy due to automation.  This tension between technological progress and human costs is very contentious, to say the least.
  • Birdly is a Slackbot, which you activate via specific /slash commands, and it will go retrieve specific data for you, e.g. a customer record from Salesforce.
  • x.ai has given their bot a human persona, Amy Ingram, who interacts with people via email to automatically find meeting times for distributed teams.
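
As a rough illustration of the slash-command pattern, here is a sketch of a transactional bot that turns a command into a record lookup. Everything in it is hypothetical: the `/customer` command, the `handle_command` function and the in-memory `CUSTOMERS` dictionary, which stands in for a call to a real system like a CRM. It does not use the actual Slack or Salesforce APIs.

```python
# Stand-in for an external system of record (hypothetical data).
CUSTOMERS = {
    "acme": {"name": "Acme Corp", "plan": "Enterprise"},
}

USAGE = "Usage: /customer <name>"


def handle_command(text: str) -> str:
    """Parse a '/customer <name>' command and return a one-line summary."""
    parts = text.strip().split(maxsplit=1)
    if len(parts) != 2 or parts[0] != "/customer":
        return USAGE
    record = CUSTOMERS.get(parts[1].lower())
    if record is None:
        return f"No customer found for '{parts[1]}'."
    return f"{record['name']} is on the {record['plan']} plan."


print(handle_command("/customer Acme"))  # Acme Corp is on the Enterprise plan.
```

The command grammar is deliberately tiny. The bot moves one piece of data from one platform into another, and that is the whole transaction.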

Informational bots

  • Bots in this category surface helpful information, often as push notifications, and include things like breaking news stories.
  • Techcrunch has a personalized news recommendation bot that pushes content to you via Facebook Messenger or Telegram.
  • Some informational bots broadcast data as it becomes available.
  • @MassBudgetBot tweets when Massachusetts state budget earmarks are approved.

Entertainment bots: Art bots, Game bots

  • Art bots are designed to be appreciated aesthetically.
  • Deep Drumpf uses deep learning, applied to transcripts of speeches, to learn how to speak like Donald Trump.
  • RealHumanPraise takes positive movie reviews from Rotten Tomatoes, and replaces actors with Fox News personalities, and tweets every 2 minutes.
  • Video game bots function as characters, often for humans to play against or to practice and develop skills in first-person shooter games.
  • There are tons of video game bots.
  • Another category is bots you play games against.
  • Detective is an example of this kind of bot, which is a variation of a Turing test. You text back and forth with another “agent” (human or bot) and attempt to figure out which it is.
  • Google Assistant has all kinds of games you can play, including Emoji Riddles, Emoji Detective, etc.

Bad Bots

If you’re interested in really getting into the weeds on bad bots, I highly recommend Distil’s 2017 Bad Bots Report. It’s one of the most comprehensive and digestible summaries of bad bot activity I’ve come across.

Hackers

  • Hacker bots are designed to distribute malware, deceive individual people, attack websites, and sometimes entire networks.
  • These bots exploit security vulnerabilities to inject code into the victim’s site.
  • Hacker bots can create distributed denial-of-service (DDoS) attacks by distributing their attack across many different proxies, and are designed to have browser-like signatures.
  • Google has said that 180% more sites were hacked in 2015 vs 2014.
  • Once enough computers have been hacked and taken over, they can be used for a variety of nefarious purposes.
  • Individual computers that are affected are known as “zombies”.
  • Networks of infected computers are known as “botnets”.

Scrapers

  • Scraper bots are designed to steal content (email addresses, images, text, etc.) from other websites.
  • Scraped content is often remixed and pumped back out as published pages.
  • Published pages are designed to capture human visitors who are searching for specific keywords, and those visitors are monetized via advertising (AdSense is a classic example).

Spammers

  • Spambots are designed to post crappy promotional content around the web, and ultimately drive traffic to the spammer’s website.
  • Forum and comment spambots are a classic example here. Bots post garbage content into a forum or blog comments, with links to their spam site.
  • The volume of spambots has declined in recent years, due to search engines making these tactics unprofitable.

Impersonators

  • Bots in the Impersonator category are designed to mimic natural user characteristics, making them hard to identify.
  • Impersonators also include propaganda bots that are designed to sway political opinion one way or another, often by drowning out dissenting opinions.
  • Turkey, Mexico, and other nations have used Twitter impersonator bots for these purposes.
  • Researchers at the University of Southern California studied the use of social bots in the 2016 U.S. Presidential election, and concluded that “the presence of social media bots can indeed negatively affect democratic political discussion rather than improving it, which in turn can potentially alter public opinion.”

How Should We Categorize Bots?

Bot classification is arbitrary. To be clear, we’re all just making it up. Literally.

In the examples I pointed out, above, I made five primary categories for “good bots”.  John Borthwick, CEO of betaworks, says there are six types of bots. Others say seven, or four. What’s the right way to categorize bots? Is it by the domain they function in? Their purpose? Who chooses?

Inevitably, the author (me included) comes up with some model that makes sense to their mind at that moment. The problem is that the moment changes, while the model stays the same, and you’re left with a messy jumble of opinions.

If you look at two bot directories, BotList and Chatbottle, you can see the differences in how personal definitions impact classification.

BotList data shows that the Personal bot category is one of the largest categories for Facebook Messenger bots. Chatbottle data shows that Personal bots is one of the smallest categories. Who’s right?

Because the market is going to change, I’m going to focus on a heuristic rather than a strict classification system.  It’s a high-level classification of bots into either “generalist bots” or “specialist bots”.

Generalist bots vs Specialist bots

The heuristic of “generalist bots” versus “specialist bots” is helpful in that it recognizes a primary market dynamic: Huge companies like Google, Facebook, Amazon, Apple, etc, are all building bot-like services.

On the one hand, this can seem discouraging for many bot developers. How can my dinky little bot compete with Siri? Or Google Now? The answer is that your individual bot doesn’t have to compete.

Big companies seem to be focusing on the “generalist” aspect of bots. Take for example this statement from Google’s Hal Varian on predicting what to build:

One easy way to forecast the future is to predict that what rich people have now, middle class people will have in five years, and poor people will have in ten years. It worked for radio, TV, dishwashers, mobile phones, flat screen TV, and many other pieces of technology.

What do rich people have now? Chauffeurs? In a few more years, we’ll all have access to driverless cars. Maids? We will soon be able to get housecleaning robots. Personal assistants? That’s Google Now. This area will be an intensely competitive environment: Apple already has Siri and Microsoft is hard at work at developing their own digital assistant.

Most of the big companies seem to be focusing on “generalist” bots, while many companies and individual bot developers are building “specialist” bots.

These “generalist” bots will have an important role to play, in that they may often be the first step in a multi-step process to complete a task.

I may start my interaction by saying “OK Google”, but where I go from there will depend on what job I need done.  Google can route my request, but if I need to fight a parking ticket, I need a very specialized bot like DoNotPay.

In this model, individual “specialist bots” will be the last mile of connection between the customer and the job-to-be-done.

Generalist bots can understand what you’re asking, but they don’t have the specific domain expertise to follow through with action. This is especially true in the enterprise, where solutions are hard to generalize. Specialist bots will provide the domain-specific knowledge necessary to accomplish many valuable tasks.
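
One way to picture the generalist-to-specialist handoff is as a simple router. The sketch below is an assumption about how such routing could work, not a description of how Google or anyone else actually does it: the `SPECIALISTS` table and the `route` function are made up, and the matching is plain substring search rather than real language understanding.

```python
from typing import Optional

# Hypothetical routing table: phrase in the request -> specialist bot to hand off to.
SPECIALISTS = {
    "parking ticket": "DoNotPay",
    "meeting": "x.ai",
}


def route(request: str) -> Optional[str]:
    """Return the specialist bot for a request, or None if no specialist fits."""
    text = request.lower()
    for phrase, bot in SPECIALISTS.items():
        if phrase in text:
            return bot
    return None


print(route("OK Google, I need to fight a parking ticket"))  # DoNotPay
print(route("OK Google, what's the weather?"))               # None
```

The generalist only has to recognize what kind of job it is looking at. The domain knowledge needed to finish the job lives in the specialist.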

This dynamic of generalists-vs-specialist bots is one of the most helpful heuristics I’ve come across in thinking about how to classify bots.

generalist vs specialist bots

The thing I like about this heuristic is that it recognizes the massive size of the space we’re in, and points out the potential there is in building highly specialized bots. DoNotPay is not only an amazing, practical and useful service, but it was built by a single 18-year-old. Think about that.

Bot Intelligence

The next lens we’ll use to think about bots is the intelligence of the bot.  Some bots use elements of machine learning (ML) and artificial intelligence (AI) in order to understand language, process complex requests, and manage dynamic outputs.  And while it’s true that some bots are heavily reliant on AI & ML, other bots are far simpler.

The media seems to spend a fair amount of time talking about bots in the context of AI.  As a result, my suspicion is that many people are conflating the two concepts (bots + AI = all bots are intelligent agents).  That’s not the case at all.

Bots exist along a continuum.  At the simple end there are Script Bots, and at the complex end are Intelligent Agents (called “Cutting edge bots” in the graphic):

bot intelligence continuum

Script Bots

The simplest bots are script bots.  The entire interaction is based off of a pre-determined model (the “script”) that determines what the bot can and cannot do.  The “script” is a decision tree where responding to one question takes you down a specific path, which opens up a new, pre-determined set of possibilities.  It’s basically like a Choose Your Own Adventure (for those old enough to remember Choose Your Own Adventure, or books).

The important thing to recognize with a script bot is that the bot’s domain is necessarily limited.  If a customer service bot allows you to select from red, blue, or green, and you try to select magenta, the interaction fails.  Limiting the interaction with a bot by defining a narrow set of acceptable inputs might feel restrictive, but there are strong arguments for it.  By being very explicit about the limits of the bot’s domain (and grammar of acceptable responses), you keep the interaction very directed, and the quality of the user experience stays very high.
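
Here is a minimal sketch of a script bot as a decision tree, using the red/blue/green example. The `SCRIPT` structure, the state names and the prompts are all made up for illustration. What matters is that every state lists its acceptable inputs explicitly, and anything else (like magenta) keeps you where you are and restates the options.

```python
from typing import Tuple

# Hypothetical script: each state has a prompt and a map of accepted input -> next state.
SCRIPT = {
    "start": {
        "prompt": "Which color would you like: red, blue, or green?",
        "options": {"red": "size", "blue": "size", "green": "size"},
    },
    "size": {
        "prompt": "Small or large?",
        "options": {"small": "done", "large": "done"},
    },
    "done": {"prompt": "Thanks, your order is placed.", "options": {}},
}


def step(state: str, user_input: str) -> Tuple[str, str]:
    """Advance the script one turn; return (new_state, bot_reply)."""
    node = SCRIPT[state]
    choice = user_input.strip().lower()
    next_state = node["options"].get(choice)
    if next_state is None:
        # Off-script input: stay in the same state and restate the grammar.
        allowed = ", ".join(node["options"])
        return state, f"Sorry, I can only accept: {allowed}."
    return next_state, SCRIPT[next_state]["prompt"]


print(step("start", "magenta"))  # ('start', 'Sorry, I can only accept: red, blue, green.')
print(step("start", "Red"))      # ('size', 'Small or large?')
```

The failure message does the work of keeping the interaction directed: it tells the user exactly what the bot’s domain is instead of pretending to understand.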

Sometimes a script bot may use natural language processing (NLP) on the front end of the interaction, to parse out words that may match an answer in their script.  This is enticing, but kinda dangerous from a user experience perspective.  Language is a really hard problem.  If you give people the impression that they can talk with the bot the way they would talk with a human, the bot may have a hard time understanding the inputs.  This leads to aggravating error-recovery behavior, as in this example with the Poncho weather bot:

poncho-nlp-interaction

Script bots that want to use NLP as a “chatty” front end need to think very carefully about this.  People will be people, and they will go off script.  How does your bot handle these unplanned-for interactions?

One method is to fail over to a human customer service agent, which brings us up the Bot Intelligence Continuum to Smart Bots.

Smart Bots

Much of the excitement around bots centers on the *possibilities* of bots, given the massive advances in ML and AI in recent years.  And some of this excitement is well-founded.  Many bots have a heavy server-side processing component, which allows them access to massive computing power in understanding and responding to queries.  Couple that with the open-sourcing of AI software libraries like Theano and TensorFlow, and you have the ingredients for some amazing human-bot interactions.

Many of the bots getting the most media coverage leverage AI for the first response mechanism.  If the interaction takes a turn that the AI can’t handle, the system falls back on a human agent to sort things out.  Examples of this are Clara, Fin and Facebook M.

When you think about the AI + Human Agent model, it seems like a natural fit for customer service applications.  Maybe you just want to know how much your next bill is, or when it’s due, which would be easy enough for a bot to handle.  If the query gets much more complex (“Why didn’t I get the bill credit I expected?”) then the interaction is transferred to a human agent.
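
The AI + Human Agent model boils down to a confidence check. The sketch below uses a crude keyword-overlap score as a stand-in for a real ML classifier, and every name, keyword set, canned answer and threshold in it is an assumption made for illustration. The shape is what matters: answer when confident, hand off when not.

```python
import re
from typing import Tuple

# Hypothetical intents and the keywords that signal them.
INTENT_KEYWORDS = {
    "bill_amount": {"how", "much", "bill"},
    "due_date": {"when", "due"},
}
# Made-up canned answers; a real bot would look these up per customer.
CANNED_ANSWERS = {
    "bill_amount": "Your next bill is $42.00.",
    "due_date": "Your bill is due on the 1st.",
}
HANDOFF_REPLY = "Let me connect you with a human agent."
CONFIDENCE_THRESHOLD = 0.6


def classify(message: str) -> Tuple[str, float]:
    """Score each intent by the fraction of its keywords found in the message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    scores = {
        intent: len(keywords & words) / len(keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


def answer(message: str) -> str:
    """Answer automatically when confident, otherwise fall back to a human."""
    intent, confidence = classify(message)
    if confidence < CONFIDENCE_THRESHOLD:
        return HANDOFF_REPLY
    return CANNED_ANSWERS[intent]


print(answer("How much is my next bill?"))
print(answer("Why didn't I get the bill credit I expected?"))
```

The first question matches every keyword for the bill amount intent and gets an automatic answer. The second only matches “bill”, scores below the threshold, and is handed to a human.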

Intelligent Agents

“Intelligent Agents” is a deliberate catch-all for all the other customer-facing AI technology.  Examples range from DeepMind’s AlphaGo to Tesla’s self-driving cars.  This is a very diverse, rapidly accelerating space.

The main differentiator between Intelligent Agents and Smart Bots is that Intelligent Agents are designed to be autonomous.  If operating correctly, they should require no human intervention in order to perform their tasks correctly.  Google’s self-driving cars are designed without steering wheels for humans, because they shouldn’t be necessary.  x.ai has a bot that schedules meetings for you, Amy Ingram, and she manages all the back-and-forth with zero oversight.

Because research in artificial intelligence and machine learning is accelerating so rapidly, this area is the most difficult to make predictions about, and the most difficult to encapsulate.  But the public also needs to have their expectations set correctly.  We’re still at least 50 years from being able to expect something that is true Artificial General Intelligence (AGI).  So we can marvel at self-driving cars, at the same time we realize it’ll be 2066 before we get to fall in love with Her.

How Many Bots Exist?

Coming up with credible numbers around how many bots exist is very difficult. Each bot platform (messenger app) has a vested interest in making their ecosystem look healthy, and so they’re inclined to inflate their numbers.  As of spring 2017, the claims are as follows:

Topline numbers like these from technology companies should be looked at with a jaundiced eye.  Tech companies want to give the impression of critical mass and momentum, so they tell us how many people signed up.  Along this line, the number I’m most inclined to believe is from Microsoft, which says developers have “signed up and gotten started.”  The topline “signed up” numbers are going to be WAY larger than the number of developers who actually create things, much less launch them.

On the side of Good Bots, we tend to have platform operators who will inflate numbers to make their platform ecosystem look healthy. On the Bad Bots side, we have developers who are deliberately trying to obfuscate themselves.

If we want to narrow our focus to Good bots that live and work in messaging platforms, one of my favorite services for surveying the landscape is Chatbottle. They rely on users submitting their bots, which means that these bots are intended to be used (as opposed to a developer tinkering on the weekend) and they’re at least somewhat production-ready.  Chatbottle’s counts are:

If we expand our focus beyond bots that live in messaging apps, on the Good bots side, Udger has been documenting known internet bots that interact with websites (which ends up being a mix of Crawlers, Transactional and Informational bots according to our framework).  It counts 868 known bots – some active, some inactive.  And these are just the bots that interact with websites!  It doesn’t even begin to count how many bots exist inside other platforms (e.g. Slackbots, Telegram bots, WeChat bots, Twitter bots, etc).

On the Bad bots side, there is a list called the Register Of Known Spam Operators (ROKSO) that is maintained by Spamhaus.  This list provides very detailed information on individual spam operators, including their names, addresses, shell companies they’ve created, examples of their spam, etc.

It’s impossible to know exactly how many bots of each category and kind there are.  And it’s probably not hugely important what the exact numbers are.  What’s more interesting is trying to understand the market dynamics relative to where the market is moving.

The market is developing quickly.  I think with the launch of major bot platforms including the Microsoft Bot Framework and bots for Facebook Messenger, we’re about to see an explosion in Transactional and Informational bots that are focused on B2C audiences.  People will be building bots that allow you to make some sort of commercial transaction or get a tidbit of information just to capitalize on the huge audience of Messenger.

On the B2B side, I also think we’ll see an explosion of Transactional bots, but the transactions will likely be exchanging data between platforms, rather than buying stuff.  The classic examples here are Slackbots that people build to help them accomplish specific tasks at work.  Slackbotlist counts 121, and there are probably many more than that.

I certainly think that some standouts will appear early on (e.g. Birdly for filing expense reports, Statsbot which surfaces Google Analytics data and was acquired by Google, etc).  But the market is still in its infancy, and I think that more and more companies will look to develop their own bots for their own specific needs as they realize just how powerful these bots can be for communication and productivity.