In the era before smartphones, compulsive list makers like me used to rely on notepads and diaries to jot down tasks, make to-do lists and schedule meetings. On a really relaxed day, you could find me with colored post-its, labels and highlighters. Staying organized required a lot of effort (and was also a guilty pleasure).
Then came the digital era, and the executive diary and the notepad slowly began to disappear. Technology ushered in convenience and flexibility in the form of smartphones. These days, even during the busiest of times, I still manage to make my lists, book appointments and set reminders, all through a single application: the virtual assistant.
Siri, Jeannie and a multitude of apps with human monikers have replaced my dog-eared diary. As efficient and understanding as these digital assistants are (have you ever known Siri to lose her temper or show attitude?), you have to share them with millions of other users.
What if you didn’t want a mass-produced, generic assistant? What if you wanted your own personal assistant? Perhaps one with Scarlett Johansson’s voice? Turns out, it’s not so hard.
To find out how you could go about building one, I spoke with our resident language-processing guru, Siva, who walked me through the steps needed to have your own digital Jeeves.
As Siva explained, at its most basic level, a digital assistant mimics human behavior. The key is to identify those actions that have clear patterns and formalize them into rules. Of course, no system can be programmed to respond to every possible combination of questions or to understand every variation in language.
But the system can teach itself (self-learning). Over time it can identify common elements and incorporate your preferences to deliver only those results that matter to you.
So how does this work algorithmically? The functions that a virtual assistant performs can be divided into two types: questions and tasks.
Consider the question: “Who is the President of USA?”
This is an example of a precise question. The computer identifies keywords such as the name of the country (USA) and the title (President). It also picks up temporal information, that is, details about time, from the word “is”. That removes any ambiguity: out of the whole list of US presidents, the question is about the current one.
Suppose instead I had posed the question as: “Who was the President of USA?”
Now the assistant is likely to get confused: there are many past presidents, and without more details it won’t know which one I am looking for.
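To make the keyword idea concrete, here is a minimal Python sketch of how an assistant might pull a title, a country and a tense cue out of a question. The lookup tables and the function name `parse_question` are my own illustrative assumptions, not something Siva described, and a real assistant would use far richer language processing.

```python
import re

# Hypothetical lookup tables, kept tiny for illustration.
COUNTRIES = {"usa", "india", "france"}
TITLES = {"president", "prime minister", "chancellor"}
TENSE_CUES = {"is": "present", "was": "past"}

def parse_question(question):
    """Pull the title, country and tense cue out of a simple question."""
    text = question.lower()
    words = re.findall(r"[a-z]+", text)
    title = next((t for t in TITLES if t in text), None)
    country = next((w for w in words if w in COUNTRIES), None)
    tense = next((TENSE_CUES[w] for w in words if w in TENSE_CUES), None)
    return {
        "title": title,
        "country": country,
        "tense": tense,
        # Only a present-tense question points at a single office holder;
        # anything else (past tense, or no cue at all) is flagged as ambiguous.
        "ambiguous": tense != "present",
    }
```

Calling `parse_question("Who is the President of USA?")` marks the question as unambiguous, while the “was” version is flagged so the assistant could ask a follow-up.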
The more specific your question or task is, the more accurate the assistant’s response will be. Let’s consider another case. I ask my assistant: “What is the address of the Mexican restaurant?”
The phrase “Mexican restaurant” is too ambiguous: with no information about location, the application is likely to search indiscriminately across all Mexican restaurants. You could add “nearby” or, better still, rephrase the question as “What is the address of the Mexican restaurant on 21st and Lexington, New York?” This is easier to handle because there are more keywords, so the application is clear about the requirement and can fetch relevant details.
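One simple way to picture this is a check for location cues before searching. The sketch below is only an assumption about how such a check could look: the cue words, the function names and the follow-up wording are all hypothetical.

```python
import re

# Hypothetical cue words suggesting the user has said where to look.
LOCATION_CUES = {"nearby", "near", "on", "in", "at"}

def has_location(query):
    """Return True if the query contains any location cue word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return any(word in LOCATION_CUES for word in words)

def respond(query):
    """Search when a location is given; otherwise ask a follow-up question."""
    if has_location(query):
        return "search: " + query
    return "Which location did you have in mind?"
```

With this check, the bare “Mexican restaurant” question triggers a follow-up, while the “21st and Lexington” version goes straight to a search. A cue-word list this crude will misfire on many real sentences, so treat it as a picture of the idea rather than a working detector.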
Moving on to tasks, here are a few scenarios.
I tell my digital assistant: “Please schedule an appointment with Dr. Jim Stevens in Oakbridge at 3:00 pm on Nov 24th.”
This task is quite straightforward to execute. That’s because the attributes for “who”, “what”, “when” and “where” are clearly stated and the application performs exactly as directed.
If instead I just say: “Please schedule a meeting with Jim, on Wednesday.” Here the exact date, time and location are missing, so the computer can’t complete the task exactly as I intended. Because these details are not explicitly stated, the application falls back on default values that are already stored in the system.
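Here is a rough Python sketch of that behavior: pull out the attributes that are stated, and fill in stored defaults for the rest. The regular expressions, the default values (“10:00 am”, “office”) and the name `parse_task` are assumptions I made up for illustration; they only handle requests phrased like the two examples above.

```python
import re

# Hypothetical stored defaults, used when the request leaves a detail out.
DEFAULTS = {"time": "10:00 am", "location": "office"}

# One simple pattern per attribute; these rely on capitalized names and places.
PATTERNS = {
    "who": re.compile(r"with ((?:Dr\. )?[A-Z][a-z]+(?: [A-Z][a-z]+)*)"),
    "day": re.compile(
        r"\bon ((?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day"
        r"|[A-Z][a-z]{2} \d{1,2}(?:st|nd|rd|th)?)"
    ),
    "time": re.compile(r"\b(\d{1,2}:\d{2} ?(?:am|pm))", re.IGNORECASE),
    "location": re.compile(r"\bin ([A-Z][a-z]+)"),
}

def parse_task(request):
    """Return (slots, assumed): the extracted attributes and which were defaulted."""
    slots, assumed = {}, []
    for name, pattern in PATTERNS.items():
        match = pattern.search(request)
        if match:
            slots[name] = match.group(1)
        elif name in DEFAULTS:
            slots[name] = DEFAULTS[name]
            assumed.append(name)  # remember that this value was a guess
        else:
            slots[name] = None
    return slots, assumed
```

For the Dr. Jim Stevens request every slot is found and nothing is assumed. For “a meeting with Jim, on Wednesday”, the time and location come back from the defaults, and the `assumed` list tells the assistant which values it might want to confirm with me.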
But it’s not very interesting or practical if every question that I ask or every task that I assign has to be spelled out in detail. The key to getting maximum value out of your virtual assistant is customization.
There are two approaches to training your virtual man Friday: supervised or unsupervised algorithms.
When the application is trained by means of a supervised algorithm, there is a set of predefined instructions and tools that the application mechanically picks up while performing a task or carrying out a request.
In the unsupervised approach, the application is not required to follow predefined instructions or rules, because by design it arrives at the answer or task on its own. Here the application extracts a set of keywords based on its user’s past behavior, and these keywords in turn trigger a relevant action. For instance, the user’s language preferences, daily alarm timings and so on are configured in this manner.
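As a very small illustration of learning from past behavior, the sketch below picks the most frequent value a user has chosen for each setting. Plain frequency counting is my own stand-in here, not a real unsupervised learning algorithm, and the `history` format and the name `learn_preferences` are hypothetical.

```python
from collections import Counter

def learn_preferences(history):
    """Pick the value the user chose most often for each setting.

    `history` is a list of (setting, value) pairs observed over time.
    """
    counts = {}
    for setting, value in history:
        counts.setdefault(setting, Counter())[value] += 1
    return {
        setting: counter.most_common(1)[0][0]
        for setting, counter in counts.items()
    }

# Example usage with made-up observations.
history = [
    ("alarm", "6:30"),
    ("alarm", "6:30"),
    ("alarm", "7:00"),
    ("language", "English"),
    ("language", "English"),
]
preferences = learn_preferences(history)
```

The learned preferences could then replace the fixed defaults the assistant falls back on, so its guesses drift toward what I actually do.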
This is a simplistic view of what is actually a complex system, but at a high level it describes how you might think about building your own personal Jeeves. In follow-on posts we’ll dive deeper into the realms of pattern recognition and machine learning.