Voice Input
I went for a bike ride this morning and while I was riding I thought of something I need to do today. I didn’t want to stop, take out my phone, remove my sunglasses, and type the “to do” into my calendar.
So instead, I took out my phone, opened up a calendar entry, hit the microphone button, and spoke into my phone, then hit save. I did all of that in about three or four seconds. When I got home and looked at my calendar, the entry was perfect.
But you know what? I rarely, if ever, do that. I could do it all the time. But I don’t. I think I use voice input on my phone a few times a year.
I am not sure why that is. The voice input on Android is very good. I suspect the same is true on iPhone. And I am increasingly using our Amazon Echo for information. So why don’t I talk to my phone more often? I am not sure.
So I started a Twitter poll this morning to see what others do.
Here it is:
Do you use voice input on your phone?
— Fred Wilson (@fredwilson) August 18, 2016
Please participate in the poll and/or leave a comment. I am curious to see what people are doing these days with voice input.
Comments (Archived):
What happens if we go to all voice on phones and apps are gone?
Voice into a phone. Whoever heard of such a thing. “We shall not cease from exploration, and the end of all our exploring will be to arrive where we started and know the place for the first time.” – T.S. Eliot
But who is listening, chat bots?
How would we play ‘Pokemon Go’ or ‘Angry Birds’ with voice? How do we tell our avatar to aim at a specific pixel and to fire with a specific velocity and style?
“SHOOT!!!!….NOW!!!….HIGHER!!!…FASTER!!! OH COME ON!!!!!!” 😉
That’s a very interesting thought wow this worked I have voice let’s see how many mistakes voice commenting will be that’s not too bad I’m very impressed how do you put in punctuation
Siri or Cortana take a grammar course.
> how do you put in punctuation
Speak into your phone, call a human and ask them to do it. YOU did it all by voice!
Windows phone does it automatically. Miss it.
I agree that the voice input works well. I have an Android phone and often use voice input to dictate a quick text message rather than typing it in, particularly when stopped at a red light in traffic. It’s fast and accurate. I think I shy away from using voice input on my phone in front of others. I don’t think there is wide social acceptance of people talking to their phones when not in an obvious conversation. Not sure why, as I don’t hesitate to talk on a phone call when walking down the street. But, dictating a text or a note to myself still feels awkward. But, again, I’m old school anyway.
There are some things it’s better to text so that eavesdroppers can’t hear us calendar where we’ll be at some date-time. We don’t know if the eavesdropper is a potential burglar collecting information on when we won’t be at home.
I use the voice input for taking notes occasionally when driving. But it’s not very accurate. I think the simpler and shorter the voice input is, the better the accuracy. I use it in the car more for fun & experimentation than real utility. I tried CarPlay yesterday, and it requires Siri to be ON, so I had some fun with hits & misses. We used voice on The UBI, Echo’s predecessor, in the sunroom for a few months & the most fun was to ask it to play a song, or get the weather. Last May while walking in Paris, we couldn’t find a store we were looking for after circling the area, so I stopped and asked someone if they knew about it. The guy pulled out his smartphone, voice searched the question on Google, and got the answer. I thought, Hmm, I could have done that!
Ha! LMGTFY on the street. Would have been funnier if he held his phone to your mouth and asked you to repeat what you said.
is that a minority pressure group?
No
I often use voice input for setting timers/alarms. When I rented a car the other weekend I connected my phone to the car audio system using a USB cable (and my handy vent clip) to use as navigation and music. While it was clear the voice commands understood the words I was saying “using Hey Siri!” which was neat, the phone did not successfully find the playlists I asked for. A fun one to do when you are out with friends is ask Siri “what song is this” and Siri will listen and tell you.
Humid day for a bike ride. My wife uses voice input all the time to set reminders. Up until recently, I rarely used voice input for anything. I’m starting to use it but mostly to look up addresses. I started getting frustrated with the default keyboard I was using, and after a couple of epic incorrect spellings, I said to myself “Dope, just hit the microphone!” Don’t know why I don’t use it more.
All the kids are doing it for search. And by “all the kids”, I mean my two. A lot of other people feel weird speaking into a phone. Go figure.
I think one of the issues is the modern trend of open offices means that voice input is akin to no privacy.
Whisper mode
I rarely use it except to tell Siri to dial someone when I am driving.
I do it many times every day. Top use cases for me:
- to get weather (Siri)
- to create a to do (Siri)
- to create a calendar entry (@Fred Siri/Google Now can do this. You don’t have to open an app, which saves you a few more clicks)
- to create a quick timer (when doing laundry or exercising) (Siri)
- to set an alarm (Siri)
IMHO the main obstacle is initiation via touch. Echo solves this problem but the product is still a bit immature. When apps can be initiated and interacted with entirely by voice with very few false positives in a rich ecosystem then I believe voice will emerge with great significance.
Agree that friction is a big obstacle. Once you get used to using voice when texting (on iPhone for example) it becomes habit. The thing is you already have the screen open so instead of hitting a key for a letter (many many times) you just hit the microphone key and talk. This is way easier than typing, the exception being maybe if you just want to ack something with “k”. Apple could solve part of this with the watch fairly easily (where that means even I can figure it out).
True. Is great to talk to a Moto or Windows phone without touching it.
I use voice input to respond to texts and emails once in a while, but that’s about it. Android’s voice recognition is really good, so I also wonder why I don’t use voice input more often. Fred or anyone else on this thread, do you ever use voice input to write blog posts? It makes so much sense to just dictate then edit, but I haven’t come around to doing it.
I just left this comment with my voice only
Voice input is the only way my 6 year old will interact with our TV. “Alexa, play bubble guppies”. My wife is constantly googling things by saying “ok google…” Personally, I’d rather not say my queries aloud 😉
“OK Robo-suck…”
My use has grown from basically zero to a few times per week in recent months. When we talk to teens about how they use their phone/apps, voice is huge.
summer #lolz
love this! 2 things: 1. in my experience android really good with commands. 2. if you drive R and A approach 0.
Blame Siri. I’ve been so frustrated/disappointed with Siri that I guess I gave up on all voice commands, assuming (falsely) that Siri was as good as it got.
Just tried Siri and it worked perfectly. I was surprised.
“Call John Smith, work” “Call Peter Dietz?” “No, call John Smith work” “Calling Susan Smith, mobile” FACK!!!
I tempt anyone with an iPhone right now to pull up Siri, say “calendar” and see what happens.
It summarized my calendar for today. When I said “open calendar” it opened up the Calendar app. Seems to work well.
Worked perfectly.
You sir are a liar! Or as Siri would say, you are a “calling Bob Marsh, mobile”
I am glad you mentioned this because uploading on Soundcloud with my phone is what I use. The quality is impressive. (Soundcloud plug could be inserted here.)
I’m infrequent. But this motivated me to try a couple new things while eating blueberry pancakes for breakfast. Couldn’t find microphone in calendar app at a glance and then realized Siri must have it. Within seconds had scheduled an appointment to call myself at 2pm tomorrow. (Screenshots attached for other novices) Along with learning about other conflicting appointments I wasn’t even aware of. Perfect productivity lesson for the day. Thanks.
You can do it that way (my example above); however, the microphone in the calendar app, like anything else, is located on the keyboard….
Gmail app on Android. Then, why someone can’t seem to get their head around enabling Android tablets to create labels for saving Gmail from the Gmail app is perplexing. I’m confident one of you can figure this out.
Sam Lessin and Andrew Kortina feel your pain, big part of why they’re building Fin
This is where the Apple Watch (/other wearables) would be incredibly valuable. I’ve added reminders from my Watch while cycling, BUT, asking Siri for directions is terrible. It takes too long to load, I have to drop my wrist before completing the search. Or Siri sends me to a different state. Pulling out my phone while cycling is a huge pain point. Efficient voice controlled directions from the Watch would be a game changer.
+1. When I run or cook, it is awkward to use my phone. It is much easier to tell my Apple Watch “set a 10 minute timer” or “show me my calendar”.
I use Siri to listen/respond to text messages while commuting on my bike especially to tell my wife where I’m at (I have a long-ish commute and she worries). I also use voice input for reminders and calendar events. These are the main use cases for voice that make my life a little easier
Do I answer this question I have now pushed the voice button on the iPhone and entering thiscomment without making any corrections so let me hear your laugh now
Why is there no space between thiscomment. voice is usually very good at separating words
Here’s what would be better. A device connected to your bike that’s connected to your phone (or a smartwatch, Fitbit-type device, whatever) that you could just click, tell it what to do, then continue riding. Here’s a really cool device I’ve been watching. It would be excellent to tie in biometric (fingerprint, maybe) activation with voice recognition – https://www.smarthalo.bike/
I also think that voice adoption is tied to how people think. Some people use talking as an ideation process, some use writing, some use sketching, some use visualization. I bet the best input on how to improve/refine voice features will come from blind people.
Yes, this is a good point. I know the professors that Apple bought out to develop their touch interface. They were building input systems for disabled people that could not use a mouse and keyboard.
I had to learn the hard way, paid “full tuition”, that some women have some (A) astoundingly high capabilities and/or (B) mind-cracking, severe limitations, with both (A) and (B) just beyond belief for men and just super tough for men to understand. Still it is essential to understand. So, about the time my guesses seem to fit all the data I have, along you come showing that you are doing well on (A) and with no hint of (B) and, thus, in my efforts to understand pushing me back to kindergarten to start over. Gads. Ah, some men don’t understand that they should understand and don’t care. Maybe they are the lucky ones. Some men understand that they should understand but just give up: too much botheration. Some men, like me, really try to understand, and then along you come and toss all our hard work into the trash to get us to start over. I’m not nearly the first such frustrated man: IIRC G. B. Shaw wrote about a man who wondered “why can’t a woman be more like a man?” and where in his example of a man the evidence was that point by point he still couldn’t be sure if she was or wasn’t. I’m willing to learn, wanting to learn, waiting to learn, but it doesn’t look like I’m learning! G. B. Shaw couldn’t figure it out. Apparently Shakespeare in Taming couldn’t. The men who knew Helen of Troy failed to understand. My failures have a lot of company! Gee, poor Tristan, I can feel his pain: https://supercultshow.files… She’s drop dead gorgeous but won’t even look at him! I knew a girl like that, drop dead gorgeous but wouldn’t look back. Somewhere there’s a secret course where they teach girls to do that? Only the drop dead gorgeous girls? Why shouldn’t she be more like the woman in http://imgc.artprintimages…. There she is being really nice to him, but, of course, he may fail to return! So, she might be nice to him if he’s about to die! Is that the going price? Not much future in that! Of course, both of those pictures have the same artist, Edmund Blair Leighton. Maybe he understood women! Ah, even with his good evidence here, not much chance! I thought that I had a lot of it figured out, and then you ruined it. I think I’ll just give up! And then I’ll blame you :-)!
True, Susan. We hacked together Digital Cane with Alexa and GE Predix traffic flow data to help blind people, the elderly and kids cross the road safely and navigate their terrain. The iPhone camera was to be their eyes whilst Alexa was their voice.
I just checked out digital cane. Really fantastic idea.
99% of the time I hit the voice input accidentally on my phone. I like to use it a few times a year to dictate something long, especially while driving. I think a large issue is the voice is very loud and you have to share what you’re doing with everyone around you.
frequently. never when riding my bike, which is my safe haven from the intrusions of web tech. i try to remember thoughts in the old fashioned way.
This stuff has come a long way, thanks to recent predictive tech, and I use it almost exclusively when writing on Android. I fondly remember (Jamie Siminoff’s) Phonetag in 2008-10 or so. That produced some hilarious transcriptions at times, but gets tons of credit for being so early in a consumer application.
I change my wife’s contact in my phone to ‘naggy farty pants’. I then walk into the room she’s in and say “Siri call Naggy Farty Pants”, her phone rings and I fall around laughing. Not sure it’s a use case you’re looking for.
a gentleman and a scholar, sir. *hat tip*
🙂
I’m a Swype-typer (the thing that finally pried my Blackberry out of my hands) but the Google voice interface is WAY more accurate than even careful Swyping (or typing for that matter). Even the suggestions are better/more intelligent/complete. However, if I need to go from voice back to typing (e.g., for a word that won’t be in the dictionary) it’s awkward. The main reason I don’t use voice more is I don’t want to be bothering people around me. It’s good for the car except you still almost always have to look at the phone and hit buttons to get to where the voice input starts. It would be a home run if from the home screen I could say “OK Google, open a new email to Jane with subject ‘late for dinner’ and body ‘I’m stuck in traffic but will pick up a bottle of Chardonnay on the trip home.’ then send.”
I’m suspicious that chat bots could go the same way. Every time I test doing something with Siri (weather, music, messaging) I find it works well, but nothing triggers me to use it automatically. Chat bots may get past this easily as many people find speaking to their phone a weird experience, but it’ll be a challenge.
Read the Wikipedia article on haptic tech. UI/UX psychology has been nearly hard-wired to haptic input. Voice input is the opposite end of UX “feeling”. This may be the biggest challenge to establishing mass, voice input market-base? https://en.wikipedia.org/wi…
I use Siri sometimes when I’m driving. That’s about it. Talking to a bot in public feels socially awkward to me. I don’t own an Echo, but I imagine the difference there is that you are in the privacy of your home. It’s more of a social issue than a technical one, I suspect.
sorry, but which calendar app is Fred talking to? on Android or iOS?
Academic/Conceptual Question: What will happen to the written language as Voice, Video, AR and VR take over the world? Will there eventually be no need for it?
Siri is too bad for me to use it frequently. The Google search app has good voice search but unfortunately the app takes forever to open. The only voice input I use every day, and wish were everywhere, is Alexa on the Amazon Echo.
Whether it’s a person or an IVR, voice to me seems very bad at accepting structured data with a high level of accuracy. I hate shouting on the phone. I hate having to spell my last name, composed of two very common English words, and people still misspelling it (or the machine getting it wrong). I like typing because I recognize that I’m inputting a set of structured data and, Six Sigma fashion, if the data goes in right a whole bunch of errors are prevented further in the process.
It’s because typing is a “never not”. Adoption usually tracks much much closer to “never not works” than “easy and works really well”, “works well”, “works”, “it’s getting better”, and “it’s possible”. My experience is voice is getting better on Google. Siri is dumber than my dog.
This post has inspired me to use it more. I have not tried the weather Kama stop navigation comma oral arm in the past. But we’ll do it in the future
The paragraph above was voice..won’t try that again!
Yeah, I use it a lot while driving, “Siri, set a reminder for 4pm to call John” or the like.
I use the voice input often, but many of my friends don’t mostly because they are not familiar with the capabilities and not used to it. I have shown many friends how to use voice input and now a lot of them use it. I think that this is mostly an issue of a slowness of voice input adoption. We are so used to entering info into our computers/phones by typing that the leap to voice input is a little strange and different. It might take a while, but input by voice will continue to rise and eventually be the main way a majority of information will be entered into our devices.
I use Siri (and/or voice input) on the iPhone to
a) email myself a note or a reminder. Works very well instead of typing.
b) set alarms and calendar events. One step, no need to open an app on iPhone….
I actively made myself use voice 18 months ago and after about 3 months I was using it almost daily on my android phone. Add tasks, set alarms, send quick messages and navigation primarily. I then switched back to iPhone and Siri just doesn’t work at the same level. It struggles with background noise more than Google seemed to do and seems to be much more sensitive to changes in my voice – more often than not it doesn’t recognise me – rarely had that problem with Google Now.
I’m in the exact same situation as you Fred. The few times I have used it are while I’m biking to work and need to respond to a message. It’s incredibly useful and gets the job done, but I never use it otherwise. Two reasons I can think of why I don’t use it as often as I would expect:
- I type differently than I speak. This creates weird messages that simply don’t work or get my point across the way I would like.
- There are still many mistakes being made by Siri. The point is that I don’t want to use my hands, but if it hears a word wrong then I have to use them anyway. Defeats the purpose.
I definitely started using voice input on my phone more after being “trained” at home by my Echo.
I use it all the time on Android (calling, texting, WhatsApp, Google search). It is awkward if you have to edit anything, though. And other people talking can get in the way. I used to have a Moto X which was always listening (most phones are only listening when the screen is on or when it is charging). I used voice input even more with that phone since it was truly hands free.
get an android watch and then you don’t need to get your phone out and as an aside also get golfshot for your phone and watch and have the distances to the green on your wrist, a first for me today but was great
Fred, you clicked too much. Here’s how I add a reminder to my calendar:
1. ‘Ok Google’
2. ‘Add reminder to reply to Fred Wilson for 10:30am today’
3. Google says ‘Reminder added to reply to Fred Wilson for 10.30am today’
All done in 5 seconds without touching the screen. It doesn’t always get names right, but it’s usually close enough to remind me what it is I have to do.
I use Siri every day to send texts, start timers, set alarms/reminders and to google stuff periodically (e.g. “Google how tall is Juan del Potro”). Nice time saver when I’m walking or driving.
Android voice recognition is rather good. The more you use it, the better it gets. Voice is my first choice to bring up the phone dialer, SMS, maps, Google search, and phone settings (turn Bluetooth on or off, launch Spotify). I also use voice with my calendar to create notifications on my phone for my to-do list, which I dictate as I drive to work in the morning.
Whatsapp’s voice message in place of text to friends is my most used app for this tech. It allows for far easier story telling and obviously voice intonation gives a different vibe to pure text and emoji…..and Google maps while driving.
I would love to use voice more, and am training myself. At first, I found it cringeworthy to talk to my device, just like I did not love public speaking at first. But now I’m working on my Tony Stark think out loud/talk to JARVIS, so I will sound very fabulous soon. This will lead to much more voice input!
I have long believed that voice interfaces are the future, but the current implementations are incomplete and that limits the usefulness. Voice should be a first class interface on our devices, not an afterthought. Every app installed should have a complete set of actions available, and these should integrate into the conversational interface on the device. There should be objects or categories that can be queried, such as “events on my calendar” and I should be able to ask “when is my next appointment with Dr. X”. I think there are some business opportunities here, since at its core advertising is information. A “yellow pages with an opinion” could incorporate Foursquare or Yelp ratings. I currently use navigation features in Google Now and am frustrated on a daily basis by its limitations. If I’m in the car I should be able to ask for a business phone number, confirm what it thinks I wanted, then have it let me dial it. I should be able to ask what city a business is in, so it doesn’t start navigating me across the country. I should be able to tell navigation to avoid interstates or toll roads, not have to stop and find an option in the confusing menu system.
I agree with you. Once voice input is seamless and 99.99% accurate, it will likely become commonplace. Despite slick advertising, voice input is currently essentially an afterthought that is in beta. I predict that once the promise of wearables (such as Google Glass) is realized, voice will largely displace typing. In other words, I predict the designers of the Dick Tracy, Jetsons, and Star Trek communication systems will be proven correct.
Once your phone is out of your pocket, I think we’re all conditioned to just type, perhaps given poor early experiences with voice to text. I find myself more frequently talking into my Android smartwatch when pulling out my phone is inconvenient, but there is still social friction with speaking into your wrist. Dick Tracy may have been the only character capable of not looking strange doing it, though it should become more acceptable over time.
I use voice quite a lot. Android voice is (IMHO) significantly better than iPhone.
The rain in Spain stays mainly in the plain. We apologize for any inconvenience this. Not very accurate right? AAA are you listening? 8 x 8 I AAI artificial artificial intelligence are you listening? This was dictated directly into the discus app.
I am always trying to use Siri, usually for taking short notes and setting alarms. Using it through the watch is quite amazing. I think it still lacks a bit of intelligence to interpret what you say in the proper context. It is voice plus context that makes the magic. When it does work it shines, but that is not always the case.
I think there are a couple of reasons why voice input is not more widely adopted:
1) Stigma: There is a lack of social acceptance walking around talking to “yourself”. It reminds me of the bluetooth headset stigma. A big reason why Amazon Echo has succeeded where Siri and Google haven’t fared as well is that the use is limited to a comfortable home setting. No wonder Google decided to follow suit.
2) Habit: We’ve been trained for years to use the keyboard/touch as the main form of input for human-computer interactions, be it a laptop or a smartphone. So that is now our natural instinct. I reckon this will begin to change as the next generation grows up with voice-based devices like the Echo.
In May, Google announced that 1 in 5 searches on its mobile app in the US are voice searches and that the share is growing. The attached charts from Mary Meeker’s Internet Trends report also paint a rosy picture.
> Stigma: There is a lack of social acceptance walking around talking to “yourself”.
I think this varies by the person. I mean in theory who really gives a shit what some complete stranger thinks when they see you talking. I got over this with bluetooth and for that matter hands free in the car (way way back) within maybe a few days.
That’s true but I can tell you based on conversations I’ve had with a few people that they are uncomfortable using Siri in a public environment because they might feel judged. Granted this is anecdotal evidence but still. Hands free in the car is again a contained, largely personal environment like your home.
I don’t want to talk to a smart phone. Of course, I don’t have a smart phone. Then, again, I don’t want one. And if I had one, I still wouldn’t want to talk to it. E.g., a smart phone would be a second computer that would partition my computer-based information, and the associated system management would be more work for me. And, yes, I would wonder if some voice response software really got the command correct. Put bluntly, computers are far beneath me, so far beneath I decline to talk to them! If I want a computer to do something, then I will give it a command; I don’t want to talk to it; and I especially don’t want any back talk, e.g., “Are you sure you want to close this file without saving changes?”, “Are you sure you want to delete this file?” after I gave completely unambiguous commands to close or delete the file. I can see it now: I’m driving at 70 MPH and the 18 wheel truck 50 yards ahead of me just had the tread of a tire come off; I want to slam on the brakes and go to the right shoulder of the road; and my car says “Do you really …” “CRASH” and I hit the tire tread and wreck the front of my car. I don’t want to talk to computers, and I don’t want back talk.
I’d suggest you get an android phone and experience it for yourself because the accuracy with which Google Now can interpret commands or take dictation is phenomenal
Thanks. Hmm …. Maybe! I have too much stuff and need to pack a lot of it into boxes. But, I want on my main computer a text file with, for each box, a list of what it contains. So, I want to work with just the text via my favorite text editor. But, sure, when packing, it would be good just to talk into some good voice to text software and then e-mail the resulting text to my desktop computer! The spelling of the voice to text doesn’t have to be perfect because my desktop computer has a good spell checker with, right, my additions for my interests, e.g., authors of math texts. Sounds good. Thanks.
One thing I use voice input for constantly is texting. Much much easier than typing and I find it highly accurate on iphone.
I never use voice. Main reason is the feedback circle is too slow. When I’m typing, if there is a typo or an autocorrect error, I recognize it and can correct it immediately. If I try to use voice control or dictation, I can’t detect errors in real time. One error can screw up the whole UX. So whilst the overall error rate of voice may even be lower than typing, even that 1% chance is enough to have me steer away from it, just because the I/O mechanism isn’t designed to deal with that 1% chance.
My 11 year old son talks to the phone all the time as he is also constantly asking questions to Alexa and has gotten very comfortable with the idea of talking to a device. Most of us are way too used to texting and using the phone with our eyes and hands. We really have to change memory and behavior to use the phone in a different way. For the next generation, it may be a more natural way. I bet your poll would have different results for people < 16 vs. over.
I have a lot of hope in this truly 21st century generation. I have a few ‘test subjects’ around which amaze me every time I see them interacting with technology. One little guy is 1 1/2 years old and the word ‘iPad’ is already part of his lexicon. The other day he tried to grab one when he saw the white cord plugged, he turned his head towards me and asked.. charge? These people will rule the world really soon.
Yep. Technology to them is like the alphabet. They grow up with it, are not intimidated by it and it is a natural part of life.
Every time I try to use Siri, it works for the first one or two times, and then everything goes seriously off the rails and stoops herking.
I’d fall into the sometimes camp but this is actually a primary use for me of my Apple Watch. Often I’ll be doing something like driving, think about something I want to do and use Hey Siri to add it. If it’s something that’s not easy to add via Siri I’ll just add a reminder to do the more complex thing. What I think is vastly underused are location based reminders. “When I get home, remind me to…” etc. That’s helpful because the reminder might be for something not urgent, but something that you want to do the next time you’re at a particular place (a store, work, whatever).
i liked the always listening google now on my moto x; since i switched to a samsung galaxy, hardly use voice input these days
It just feels weird talking to your phone 🙂 I’ve just started using voice input now because if I leave my finger on the button too long voice input pops up.
I think picking your nose makes double tapping and swiping awkward no matter the proximity to other people.
Touché