@tedlouie pointed me to this post on in-home non-cloud voice control: http://interconnected.org/home/2020/05/26/voice. It starts with “Why can’t I point at a lamp and say ‘on’ and the light come on?” I’m all for the multimodal concepts described in the post, at least in spirit. 1/
But the layers to get something like this to work for everyone are very complex. Just from a usability perspective, some people will never be comfortable with saying only “on” or “off” when their dialect uses different vocabulary. 2/
Off the top of my head, I know that in some regions people say “cut the light,” for example. And then some people never spontaneously use short phrases when addressing anyone, including devices that use speech input (e.g. “Light please”). 3/
And some people are not comfortable using their speech in a command-and-control way; they prefer stating their motivation, need, or problem (e.g. “It’s too dark in here”). 4/
There are acoustic problems with short phrases as well, even with multimodal inputs. A lot of parallel processing, contextual understanding, and reasoning is needed to get something like this to work. 5/
You can follow @maryparks.