
Many people would agree that scheduling meetings is tedious. Perhaps you have experienced an email chain like this:

Jenn, a potential client: Hey! What day/time works for a quick call next week?
You: (toggling between calendar app and email) I’m wide open Monday.
Jenn: (several hours later) Sorry. Traveling that day. How about Wednesday at 10 AM?
You: (checking your calendar app again) That should work. Your office?
Jenn: My office is great. Maybe we should see if Emad can join?

This back-and-forth can carry on, and it can get even more challenging when people use different calendaring systems or meet across different time zones. Not only are these exchanges time-consuming, they also obliterate our ability to focus on more demanding tasks.

An informal survey confirmed our suspicion that others felt similarly. We asked about 100 information workers, in a wide range of industries and roles, to identify tiresome tasks they regularly do that are not part of their primary work duties. The most onerous task people cited was scheduling meetings.

While online calendar-sharing tools like Outlook and Google Calendar and polling tools like Doodle make scheduling less cumbersome, we still have to stop what we’re doing and switch from the task at hand to fiddle with our scheduling tool of choice. Things get even more complicated when the people we want to meet with use different tools, since many don’t work well together.

With all the progress happening in artificial intelligence (AI), we wondered if we could create a virtual assistant that could handle the conversational back-and-forth required for scheduling meetings, much the same way that executive admins schedule meetings for CEOs.

There is a long history of AI research around how to build digital personal assistants, but none of the early work on AI scheduling has taken off. There are several reasons for this. First, in the workplace, business users have very little tolerance for mistakes. If the AI assistant is not a model digital employee, people will quickly lose patience and stop using it. The current state of AI is not yet ready to guarantee such a high-performing assistant. Second, there is the chicken-and-egg problem: Good AI needs a lot of data, and to get a lot of data you need real usage, but that’s hard to achieve without a reliable system.

Third, scheduling scenarios can be complex — there are one-on-one versus many-person meetings, in-person versus remote meetings, meetings that are postponed and need to be rescheduled. People also use their calendars differently: Some use appointments as tasks, and others block out time when they are free. And people also have unique preferences over time that are hard to capture: Some prefer clustered meetings, while others like them spread out. Finally, there are subtle social considerations involved with scheduling, like the relative status between people or the urgency of a meeting.

Our virtual assistant solution would need to solve for all of these issues, as a human assistant would. But where to start? We took a step back and considered long-standing rapid-prototyping approaches in design. These involve building and testing lo-fi prototypes before gradually iterating on higher-fidelity — and more expensive — designs. For example, a designer might initially show paper prototypes to a group of users to rapidly collect feedback. Then they might build some wire-frame mock-ups and test them with users in a slightly more realistic setting, mirroring the types of interactions users see in the end product. Finally, the designers move to a “Wizard of Oz” prototype, where users experience an interface that looks and feels real, but behind the curtain a human researcher is pulling the strings and controlling the interface.

We decided to go with the Wizard of Oz approach, but took it one step further. Here is how it worked. We invited a handful of people from a few companies to sign up for our system, which we later named Calendar.help. Then they simply added the virtual assistant to the Cc line when sending meeting invitation emails. The longer-term goal was that the AI-driven virtual assistant would take it from there, looking at schedules and creating calendar invites for optimal days and times. But at the outset, the virtual assistant was actually us: parsing every single invitation email, looking for optimal solutions, and scheduling meetings. Although it was a lot of work, spanning nearly two years of iterations, this approach allowed us to get a product into people’s hands early so we could observe their behavior and iterate quickly. It also gave us a deeper understanding of the problem we were trying to solve — and let us start evaluating which portions of the work could eventually be performed through AI. We were delivering accuracy and collecting excellent data that could potentially bootstrap an AI solution down the road.

This approach showed us what people would expect from a virtual scheduling assistant. It allowed us to create workflows that could be broken into narrower microtasks, such as extracting the location of a meeting or determining whether the meeting should take place face-to-face or on the phone. With that done, we continued with our humans-in-the-loop model and hired a staff of workers to perform those microtasks. They formed a sort of digital assembly line, with one person surveying people’s calendars to suggest optimal times, another looking at available locations, and another working to reschedule the meeting if needed.
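As a rough illustration — not the actual Calendar.help implementation, and with all names hypothetical — the assembly line described above can be sketched as a sequence of narrow microtasks, each of which could be handled by a human worker or, eventually, an automated model:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a microtask pipeline: each station handles one
# narrow decision, so a worker (human or model) can pick it up without
# seeing the whole scheduling conversation.

@dataclass
class SchedulingRequest:
    email_text: str
    results: dict = field(default_factory=dict)

def extract_location(req):
    # One microtask: pull a proposed location out of the email, if any.
    req.results["location"] = "office" if "office" in req.email_text.lower() else None
    return req

def detect_meeting_mode(req):
    # Another microtask: face-to-face versus a phone call.
    text = req.email_text.lower()
    req.results["mode"] = "phone" if "call" in text or "phone" in text else "in_person"
    return req

PIPELINE = [extract_location, detect_meeting_mode]

def run_pipeline(req):
    # Tasks run in sequence, like stations on an assembly line; any
    # station can later be swapped from a human to a trained model.
    for task in PIPELINE:
        req = task(req)
    return req.results

print(run_pipeline(SchedulingRequest("Quick call next week? My office works.")))
# → {'location': 'office', 'mode': 'phone'}
```

The key design point is that each station is independently replaceable, which is what let individual microtasks be automated one at a time.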

The advantage of creating these microtasks was threefold. First, it focused each worker’s attention on one thing at a time, reducing errors and making the workflow systematic enough that task workers could come in and out of the system. Second, because the microtasks were well designed, they helped us collect high-quality data that we would later use to automate the process. Third, a system with this level of granularity helped us create a variety of machine learning models to understand natural language, so that specific microtasks could be executed automatically. For example, we used people’s responses to proposed meeting times to help us understand how they express their preferences. This enabled us to build and train a machine learning classifier to automatically perform this step in the future.

A key takeaway is that the scheduling bot was fully functional from day one but became more efficient through use. A benefit of having humans in the loop early on was that they were able to understand people’s actual needs first, while collecting conversational data on how people interacted with the assistant. Each additional interaction provided data that helped us better understand which scenarios were most important to automate and what data was needed to do so. The team has since continued to improve on Calendar.help, which is now publicly available for people with Office 365 or Google calendars to preview.

We learned that keeping humans in the loop when building a virtual assistant does have its limitations and costs. For example, we observed that some invitees did not like the idea of working with a virtual assistant. Some people saw it as producing extra work for them because they still had to respond to the assistant’s emails; others were put off by the perception that the assistant was a bot. The social dynamics of involving bots in existing business interactions are evolving, but it’s definitely something to keep an eye on. Over time, if virtual assistants become more common, some of the friction might be reduced.

We also learned that it’s important to be transparent about the human-in-the-loop architecture so that users can make product decisions that align with their privacy expectations. Having a virtual assistant with humans in the loop is not always the right option for everyone. For example, medical doctors have heightened responsibilities to their patients’ privacy, and may be selective about including a third party for scheduling. Transparency lets people decide what is right for them.

Overall, we believe that creating and using systems like Calendar.help to manage routine tasks is an easy way for companies to leverage AI in their daily business practice. We didn’t want to constrain ourselves to what was currently possible with AI. Instead, we wanted to build something people need and want, and then use that product vision to determine how to make smart investments in automation and language understanding. Furthermore, the approach we took to build Calendar.help can be used to create in-house AI systems that, like ours, make use of off-the-shelf AI technologies. We were not AI experts going into this — and we’re still not. The AI tools are already out there. Our job was to figure out the process that would best take advantage of them.
