“We think we’re on the cusp of the next evolution, where AI happens not just in that chatbot and gets naturally integrated into the hundreds of millions of experiences that people use every day,” says Yusuf Mehdi, executive vice president and consumer chief marketing officer at Microsoft, in a briefing with The Verge. “The vision that we have is: let’s rewrite the entire operating system around AI, and build essentially what becomes truly the AI PC.”

…yikes

  • sugar_in_your_tea@sh.itjust.works · 1 day ago

    Even if they solve the regional dialect problem, there’s still the problem of people being really imprecise with natural language.

    For example, I may ask, “what is the weather like?” I could mean:

    • today’s weather in my current location (most likely)
    • if traveling, today or tomorrow’s weather in my destination
    • weather projection for the next week or so (local or destination)
    • current conditions outside (i.e., I’m about to head out)

    An internet search would be “weather <location> <time>”. That’s it. Typing that takes a few seconds, whereas a voice command requires processing the message (usually a couple of seconds) and probably an iteration or two to get what you want. Even if you get it right the first time, it still takes at least as long as typing the query.
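
    Roughly, that gap looks like this (a toy sketch; every name in it is made up for illustration):

    ```python
    # Toy illustration: a typed search query is already structured,
    # while a spoken question has to be fanned out into candidate intents.

    TYPED_QUERY = "weather chicago tomorrow"  # location and time are explicit


    def interpret_spoken(utterance: str) -> list[str]:
        """Return every plausible reading of an ambiguous weather question."""
        if utterance.strip().lower() != "what is the weather like?":
            return [utterance]  # anything unambiguous passes through
        # With no extra context, the assistant has to guess among all of these:
        return [
            "current conditions at my current location",
            "today's forecast at my current location",
            "today's or tomorrow's forecast at my travel destination",
            "week-ahead forecast, local or destination",
        ]


    print(interpret_spoken("what is the weather like?"))  # four candidates
    print(TYPED_QUERY)                                    # exactly one reading
    ```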

    Even if voice activation is perfect, I’d still prefer a text interface.

    • setVeryLoud(true);@lemmy.ca · 1 day ago

      My autistic brain really struggles with natural language and its context-based nuances. Human language just isn’t built for precision; it’s built for conciseness and efficacy. I don’t see how a machine can do better than my brain.

      • sugar_in_your_tea@sh.itjust.works · 1 day ago

        Agreed. A lot of communication is non-verbal. Me saying something loudly could be due to other sounds in the environment, frustration/anger, or urgency. Distinguishing between those relies on facial expressions, hand/arm gestures, or any number of other non-verbal cues. Many autistic people have difficulty picking up on those cues, and machines are at best comparable to the most extreme end of that difficulty, so they tend to fall back on rules like “elevated volume means frustration/anger” when that may very much not be the case.
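
        Written out as code, that brittle heuristic looks something like this (purely hypothetical, just to show the failure mode):

        ```python
        # Hypothetical version of the rule "elevated volume means frustration/anger".
        # Volume alone can't separate anger from background noise or urgency.

        def classify_tone(volume_db: float) -> str:
            """Naive rule: loud speech is angry speech."""
            return "frustrated/angry" if volume_db > 70 else "neutral"

        # Three very different situations, all spoken at roughly 80 dB:
        for situation in ("shouting over a lawnmower", "urgent warning", "actual anger"):
            print(situation, "->", classify_tone(80.0))  # always 'frustrated/angry'
        ```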

        Verbal communication is designed for human interaction, whether long-form (conversations) or short-form (issuing commands), and it relies heavily on shared human experience. Human-to-computer interaction should play to the computer’s strengths rather than imitate human interaction, because the imitation will always fail at some point. If I get driving instructions from my phone, I want them terse (“turn right on Hudson Boulevard”), whereas if my SO is giving me directions, I’m happy with something more long-form (“at that light, turn right”), because my SO knows how to communicate unambiguously with me and my phone does not.

        So yeah, I’ll probably always hate voice activation, because it’s just not how I prefer to communicate w/ a computer.