How far are we from building systems with commonsense? One often-heard answer is: not in the near future, while the realistic answer is: we don’t know. Last year, I spent some time trying to build a system that could do better than an information retrieval baseline at taking a fourth-grade science exam (a baseline which itself still has a ways to go to reach the passing score of 65%). I failed hard. Here’s an example to give a sense of the difficulty of these questions.
Tay was built to learn the way millennials converse on Twitter, with the aim of being able to hold a conversation on the platform. In Microsoft’s words: “Tay has been built by mining relevant public data and by using AI and editorial developed by a staff including improvisational comedians. Public data that’s been anonymised is Tay’s primary data source. That data has been modelled, cleaned and filtered by the team developing Tay.”
As IBM elaborates: “The front-end app you develop will interact with an AI application. That AI application — usually a hosted service — is the component that interprets user data, directs the flow of the conversation and gathers the information needed for responses. You can then implement the business logic and any other components needed to enable conversations and deliver results.”
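The division of labour described in that quote can be sketched in a few lines. This is a minimal illustration, not IBM's API: `interpret` is a hypothetical stand-in for the hosted AI service (here it is only keyword matching), and `business_logic`, `handle_message`, and the intent names are likewise assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Interpretation:
    """What the (hypothetical) AI service returns for one user message."""
    intent: str
    slots: dict = field(default_factory=dict)


def interpret(utterance: str) -> Interpretation:
    """Stand-in for the hosted AI application: interprets the user's text.

    A real service would use a trained language-understanding model;
    this placeholder only looks for keywords.
    """
    words = utterance.lower().replace("?", " ").replace("!", " ").split()
    if "order" in words and "status" in words:
        return Interpretation("order_status")
    if "hello" in words or "hi" in words:
        return Interpretation("greet")
    return Interpretation("unknown")


def business_logic(result: Interpretation) -> str:
    """Application-side logic: turns an interpreted intent into a reply."""
    if result.intent == "greet":
        return "Hi! How can I help?"
    if result.intent == "order_status":
        # A real app would look the order up in its own backend here.
        return "Let me check on that order for you."
    return "Sorry, I didn't catch that."


def handle_message(utterance: str) -> str:
    """Front-end entry point: send text to the AI layer, then apply logic."""
    return business_logic(interpret(utterance))


if __name__ == "__main__":
    for message in ["Hello there", "What is my order status?", "Sing a song"]:
        print(message, "->", handle_message(message))
```

Keeping interpretation and business logic in separate functions mirrors the split in the quote: the hosted component could be swapped out without touching the application's own rules.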
The promise of artificial intelligence (AI) has permeated the enterprise, raising hopes of amping up automation, enriching insights, streamlining processes, augmenting workers, and in many ways making our lives as consumers, employees, and customers a whole lot better. Senior management salivates over the exponential gains AI is supposed to deliver to their business. Kumbayah […]
If AI struggles with fourth-grade science question answering, should AI be expected to hold an adult-level, open-ended chit-chat about politics, entertainment, and weather? It is thus encouraging to see that Microsoft’s Satya Nadella did not give up on Tay after its debacle, and that Amazon’s Jeff Bezos is sponsoring an Alexa social chatbot competition. I love the quote below from Jeff:
The upcoming TODA agents are good at one thing, and one thing only. As Facebook found out with the ambitious Project M, building general personal assistants that can help users with multiple tasks (cross-domain agents) is hard. Think awfully hard. Beyond the obvious increase in scope, knowledge, and vocabulary, there is no built-in data generator that feeds the hungry learning machine (sans an unlikely concerted effort to aggregate the data silos from multiple businesses). The jury is still out on whether the army of human agents that Project M employs can scale, even with Facebook’s kind of resources. In addition, cross-domain agents will probably need major advances in areas such as domain adaptation, transfer learning, dialog planning and management, reinforcement/apprenticeship learning, automatic dialog evaluation, etc.
The plugin aspect to Chatfuel is one of the real bonuses. You can link up to all sorts of different services to add richer content to the conversations that you're having. This includes linking up to Twitter, Instagram and YouTube, as well as being able to request that the user share their location, serve video and audio content, and build out custom attributes that can be used to segment users based on their inputs. This last part is a killer feature.
When you have a desperate need for a java fix with minimal human interaction and effort, this bot has you covered. According to a demo led by Gerri Martin-Flickinger, the coffee chain's chief technology officer, the bot even understands complex orders with special requests, like "double upside down macchiato half decaf with room and a splash of cream in a grande cup."
Some brands already seem to be getting the balance right. A bot needs to capture a user's attention quickly and display a healthy curiosity about its new acquaintance, but too much curiosity can easily push it into creepy territory and turn people off. Bots have to display more than a basic knowledge of human conversational patterns, but they can't claim to be an actual human -- again, let's keep things from getting too creepy here.
Poor user experience. The bottom line: chatbots frustrate your customers if you view them as a replacement for humans. Do not ever, ever try to pass off a chatbot as a human. If your chatbot suffers from any of the issues above, you’re probably creating a poor customer experience overall, and setting up an angry phone call to a poor unsuspecting call center rep.
Unlike Tay, Xiaoice remembers little bits of conversation, like a breakup with a boyfriend, and will ask you how you're feeling about it. Now, millions of young teens are texting her every day to help cheer them up and unburden their feelings — and Xiaoice remembers just enough to help keep the conversation going. Young Chinese people are spending hours chatting with Xiaoice, even telling the bot "I love you".
In 1950, Alan Turing's famous article "Computing Machinery and Intelligence" was published, which proposed what is now called the Turing test as a criterion of intelligence. This criterion depends on the ability of a computer program to impersonate a human in a real-time written conversation with a human judge, sufficiently well that the judge is unable to distinguish reliably (on the basis of the conversational content alone) between the program and a real human. The notoriety of Turing's proposed test stimulated great interest in Joseph Weizenbaum's program ELIZA, published in 1966, which seemed to be able to fool users into believing that they were conversing with a real human. However, Weizenbaum himself did not claim that ELIZA was genuinely intelligent, and the introduction to his paper presented it more as a debunking exercise: