Vision Assistants
2/27/2018


Piccolo was founded on the idea that the most magical interactions happen when the underlying technologies are invisible. With Piccolo, you don’t need to press any buttons, find any remotes, get out your phone, wear a watch, or say anything.

A Vision Assistant is similar to a Voice Assistant in that both aim to make life simpler. The difference is that a Vision Assistant uses a camera and computer vision algorithms to figure out what’s going on. We’re convinced that 10 years from now, the most exciting experiences will be built around vision, not voice. And we think about the future in roughly three phases.


Phase 1 - Gestures

Today our product lets users control things with gestures. Users can point at devices to turn them on or off and use subtler gestures to change the TV volume, adjust light brightness, and so on. What we hear overwhelmingly is that gestures are faster, more intuitive, more discreet, and just more fun to use. Which seems more like the future: (1) saying “Hey Alexa, can you mute the volume on the TV?” while sound is playing and she can barely hear you, or (2) doing a shush gesture to instantly mute?
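
To make the gesture-to-action mapping concrete, here is a minimal sketch in Python of how classified gestures could drive device commands. The gesture labels, device names, and the SmartHome client are all hypothetical placeholders for illustration, not our actual API.

    from dataclasses import dataclass

    @dataclass
    class GestureEvent:
        person_id: str        # who performed the gesture
        gesture: str          # e.g. "point", "shush", "volume_down"
        target_device: str    # device being pointed at, "" if none

    class SmartHome:
        # Stand-in for a smart-home control client (hypothetical).
        def toggle(self, device: str) -> None:
            print(f"toggle {device}")
        def mute(self, device: str) -> None:
            print(f"mute {device}")
        def adjust_volume(self, device: str, delta: int) -> None:
            print(f"volume {device} {delta:+d}")

    def handle_gesture(event: GestureEvent, home: SmartHome) -> None:
        # Pointing toggles whatever device the person is aimed at.
        if event.gesture == "point" and event.target_device:
            home.toggle(event.target_device)
        # A shush gesture instantly mutes the TV.
        elif event.gesture == "shush":
            home.mute("tv")
        # Subtler gestures nudge continuous settings like volume.
        elif event.gesture == "volume_up":
            home.adjust_volume("tv", +5)
        elif event.gesture == "volume_down":
            home.adjust_volume("tv", -5)

    handle_gesture(GestureEvent("alice", "shush", ""), SmartHome())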


Phase 2 - Autopilot

With vision, you have complete awareness of the environment, which means you can put the home on autopilot. Your home will understand you so well that you rarely have to tell it to do anything. That means:

  • Some devices, like lamps, will rarely be adjusted. They will know to be on if you’re around, dim as it gets later, turn off if you take a nap on the sofa, and just slightly illuminate the room if you have to get something in the middle of the night.
  • Each person has their own set of preferences, so settings like volume, temperature, and airflow will simply update as the people and the environment change (a toy version of this idea is sketched after this list).
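
As a rough illustration, here is a toy autopilot policy for a single lamp. The inputs and thresholds are invented for the sketch; in practice these rules would be learned per person rather than hard-coded.

    from datetime import datetime

    def lamp_brightness(someone_present: bool, napping: bool, now: datetime) -> int:
        # Toy autopilot policy: return a lamp brightness from 0 to 100.
        # The thresholds are invented; a real system would learn each
        # person's preferences over time.
        if not someone_present or napping:
            return 0      # off when the room is empty or someone naps on the sofa
        hour = now.hour
        if hour < 6:
            return 10     # faintly illuminate for a middle-of-the-night trip
        if hour >= 20:
            return 40     # dim as it gets later
        return 80         # normal daytime level

    print(lamp_brightness(True, False, datetime(2018, 2, 27, 22, 30)))  # -> 40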

Phase 3 - Platform

Similar to app stores for phones, countless vision apps will be created for the home. And just as apps like Lyft and Airbnb were hard to imagine before we had smartphones, we suspect that the most interesting applications aren’t obvious today. Here are some of the things we’re excited about:

  • New apps. Can an app detect when an elderly person falls or recognize other medical emergencies? Can an app guide you through different exercises and tell you if your form is bad? Can an app tell you exactly where in the room you left your phone and keys?
  • Integrations with existing apps. Could Netflix provide recommendations for the specific people watching, instead of the one person who’s signed in? Could scheduled reminders play a sound if you’re sleeping but vibrate otherwise?
  • Smarter hardware. Could fans follow people, instead of oscillating back and forth? Could your espresso machine make your favorite drink with a single button press because it knows who pressed it?
  • Voice-vision fusion. Could you trigger Alexa just by gazing at the Echo instead of saying “Alexa”? Could vision give context to voice, so that if you’re holding something and say “Order 5 more of these”, Alexa knows what to do?

Since no one company can make all the applications, and great ideas come from many places, there should be a platform that lets anyone make vision apps. But this is only possible if someone handles the foundational computer vision technology, privacy controls, app deployment, and the other obstacles in the way. Our ultimate goal is to be the platform that removes those obstacles and lets anyone create and deploy vision apps.
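
To suggest what developing for such a platform might feel like, here is a hypothetical sketch of a vision app. The VisionApp SDK, the event names, and the payload fields are all invented for illustration; the point is that the platform handles detection, privacy, and deployment, while the app only declares what it reacts to.

    from typing import Callable, Dict

    class VisionApp:
        # Toy stand-in for a platform SDK (hypothetical names throughout).
        def __init__(self, name: str):
            self.name = name
            self.handlers: Dict[str, Callable[[dict], None]] = {}

        def on(self, event: str):
            # Register a handler for a platform-detected event.
            def register(fn: Callable[[dict], None]):
                self.handlers[event] = fn
                return fn
            return register

        def dispatch(self, event: str, payload: dict) -> None:
            if event in self.handlers:
                self.handlers[event](payload)

    app = VisionApp("fall-alert")

    @app.on("person_fell")
    def alert(payload: dict) -> None:
        # The platform supplies who fell and where; the app decides what to do.
        print(f"Alerting caregiver: {payload['person']} fell in the {payload['room']}")

    # Simulated platform event, standing in for a real detection:
    app.dispatch("person_fell", {"person": "grandpa", "room": "kitchen"})

An event-driven interface like this would also keep raw video inside the platform, so apps react to detections without ever seeing camera frames directly.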


TLDR
  1. Make a camera that lets you control stuff with gestures.
  2. Make the camera really smart, so that it can also control things on its own.
  3. Create a platform that lets anyone build and deploy vision apps.

It’s an exciting time to work in computer vision - most of these ideas were hard to imagine 2 years ago. If you’re excited about building the first true Vision Assistant, you can contact us.


Marlon & Neil