Is Speech Recognition Suitable
For You
Can this software help make you more
productive?
The dream of sitting
back in one's chair and dictating directly to the computer
everything that one formerly typed is appealing, but for
most of us, unrealistic.
This is part of a series on
speech recognition software. See related articles
listed on the right.
Reliable speech recognition is
something that has long been sought after, but has only recently
become practical on normal computers.
The extraordinary computing
power of a modern home computer and the evolving
capabilities of speech recognition software now offer the
promise, and almost the reality, of being able to effortlessly
control and communicate with and via one's computer merely by
talking normally to it.
Read on to understand what
speech recognition is now capable of, and how to best use it in
your own work environment.
Is Speech Recognition Suitable
for You
It
goes without saying that not all tasks are improved by using
speech recognition software.
Clearly, speech recognition
is at its best when it is replacing the need to type words on a
keyboard. Nearly all of us can speak faster than we can
type (150+ words per minute for speaking, 60 or fewer words per
minute for typing), and so (at least in theory - see discussion below about
creative thought) talking rather than typing should make us more
productive.
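As a rough back-of-the-envelope sketch, the theoretical gain can be worked out from the two rates quoted above. The 1,000-word document length is a hypothetical figure chosen only for illustration, and the calculation deliberately ignores time spent watching for and correcting errors, which in practice eats into the advantage.

```python
def minutes_to_produce(words: int, words_per_minute: float) -> float:
    """Return the minutes needed to produce `words` at a steady rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return words / words_per_minute


SPEAKING_WPM = 150      # from the text: 150+ words per minute for speaking
TYPING_WPM = 60         # from the text: 60 or fewer words per minute for typing
DOCUMENT_WORDS = 1_000  # hypothetical document length, for illustration only

speaking_minutes = minutes_to_produce(DOCUMENT_WORDS, SPEAKING_WPM)
typing_minutes = minutes_to_produce(DOCUMENT_WORDS, TYPING_WPM)

# How many times faster speaking is than typing, before any corrections.
speedup = typing_minutes / speaking_minutes

print(f"Speaking: {speaking_minutes:.1f} min")
print(f"Typing:   {typing_minutes:.1f} min")
print(f"Theoretical speed-up: {speedup:.1f}x")
```

On these assumed figures, dictation is about two and a half times faster on paper. Whether that survives contact with real-world error correction is a separate question.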
Non-typing tasks
But using a computer is more
than simply typing text. We have to move between
programs and scroll up and down. We need to move between fields
on forms, click on buttons, and edit and
correct text that we have typed earlier. Even when we are
typing text, it isn't always in large sections; sometimes it
is cut up into lots of small pieces.
The good news is that there
are ways and workarounds to perform all these tasks using speech
recognition. The bad news is that almost always, using
speech recognition to perform these tasks is more cumbersome,
clumsy, and slow than reaching for our mouse, pushing the cursor
around the screen, and possibly banging away at a few keys on
the keyboard too.
When it comes to tasks done with a mouse, voice control is seldom as good as
pushing the mouse around by hand.
Special words and phrases
It also goes without saying
that the computer needs to know a word before it can recognize
it. If you use a lot of special technical terms, this may
pose two problems. The first problem is that the computer
simply does not know the word you are using - this can usually
be corrected by adding it to the computer's vocabulary.
The
second problem is more obstinate. Even if the computer is
taught the word, it cannot be taught the word's
meaning quite so readily, which makes it more difficult for the computer to guess
when this new word is the word you are actually using in
a sentence.
There are special
vocabularies for the legal and medical professions that provide
turnkey solutions to people in these fields.
Still more considerations
It should also go without
saying that the computer needs to be able to hear you clearly
and recognize/understand your speech. Yes, of course this
means that you must speak clearly to start with. It also
means that you should use as good a microphone as possible, and
position it appropriately.
It also means that you
should be in a moderately quiet environment where there will not
be other background sounds to "distract" the computer and to
interfere with its ability to clearly hear and recognize what
you are saying. If you have your own office, and few
interruptions or external noises, then speech recognition should work
well for you. If you are in a large open plan area, and
the photocopier or water cooler is immediately behind you,
then speech recognition is not so well suited.
A related concept is the
privacy of the material you are dictating. No one knows
what you are typing when you are using a keyboard, but anyone
who can hear you knows exactly what you are saying when you are
dictating to a speech recognition system and possibly also
playing back some sections of what you said.
Speech
Recognition and Creativity
This is an issue that you
may not even think about up front.
As regards the issue of
creative thought, I can only speak from personal experience.
I am a very fast touch typist, and over many decades and
countless millions of words, touch typing has become automatic and
instinctive to me, in the same way that walking is. When
we are walking somewhere, we are not concentrating on
moving our legs and where we put our feet; we just look to
where we are going and walk there "automatically".
Typing is like that for me. There is no interference
between having a thought and putting it onto the computer
screen. The fingers work by themselves. Indeed, sometimes I
can even talk to someone about one thing while typing a
different thing on the computer.
But when I am dictating
rather than typing, I find I have to split my concentration
several different ways. As always, I have to
think carefully about what it is I wish to express. But
then I also need to watch the words that appear on the screen
to detect errors as they occur.
When an error does occur, it
is more disruptive to correct. When I am typing, it is
nothing to quickly hit the backspace key several times, correct
an error, and then keep on typing, but when I am dictating, it
interrupts the workflow and takes more time to correct an error.
The disruption involved in
correcting an error has a subtle further impact as well.
When touch typing, it is never a big deal to change or
delete any piece of text. When dictating, the additional
complexity of editing with speech recognition software acts as a
subtle disincentive to polishing and perfecting one's prose.
One further point about
errors. Nine times out of 10, I instinctively know when I
type the wrong key. I do not need to be looking at the
keyboard when I'm typing, and neither do I need to be looking at
the screen. It just happens automatically.
But when I am dictating, I
have no way of knowing whether the computer has
correctly recognized the words I say, and so I have
to watch the output like a hawk all the time. I also need
to speak slightly more clearly, and the net result is that
the process of putting my thoughts onto the computer screen is
now interfering with the underlying thinking.
If you are not an
accomplished touch typist, then perhaps the difference between
speech recognition and less instinctive typing is not so marked.
You might find it easier to
adapt to a speech recognition system if you are used to
dictating letters (either directly or through a dictation
recording system) for your secretary to type.
Using speech to create written
communication
One last thing about the
creative process. I don't know this for sure, but I have a
gut feeling that when one is typing, one is writing in a
different style than when one is speaking. We all know
there is a difference between spoken and written English, and
oral and written expression. When one is speaking as a way
to create a written expression, I have a sense that the words as
captured on the page do not flow as smoothly when someone
subsequently reads them, as would be the case if typing the
text.
If you want an example of
this, look no further than the preceding paragraph. Upon
rereading, it seems terribly jumbled. I won't edit it, so
you can see it in its original form.
Summary Table of Considerations
Consideration | Good for Speech Recognition | Bad for Speech Recognition |
Ambient noise level | Quiet, controllable | Noisy, uncontrollable |
Type of computer | New and powerful | Older, less powerful |
Language used | Typical, normal, limited specialty | Unusual, many one-off terms |
Typical computing tasks | Word processing, e-mail - lots of typing, not so much mousing | Design, fiddly formatting, less typing, editing, more mousing |
Acceptable level of error | Moderate - such as less formal e-mails and memos | Low - such as creating financial data and official company statements |
Degree of thought/creativity | Lower - more routine tasks | Higher - more creative and complex tasks |
Pre-existing typing skills | Low/slow | High/fast |
Keyboarding time per day | High | Low |
Need for confidential content creation | Low | High - if somewhere where other people can overhear you |
Summary of Part 2 of this
Article Series
Obviously, if you work in a
very noisy environment, it will be difficult to use speech
recognition software. And if you are a very fast touch
typist, you'll get less benefit than if you hunt and peck with
two fingers.
The chances are that most
people will get some degree of benefit by selectively using
speech recognition software where it makes most sense, but not
everyone will benefit from this new productivity tool all the
time.
In the
third part of our
series, we talk about accuracy rates, the type of
computer hardware needed to effectively handle the demands
placed on it by speech recognition software, and the surprising
difference between your experience with a system that might be,
say, 97% accurate and one that might be 98%
accurate.
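To see why a single percentage point can matter, here is a small arithmetic sketch. It is only an illustration and not necessarily the argument made in part three. It assumes a hypothetical 1,000-word document and assumes each word is recognized independently at the stated accuracy.

```python
def expected_errors(words: int, accuracy_percent: float) -> float:
    """Expected number of misrecognized words, assuming every word is
    recognized independently at the given accuracy (a simplification)."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return words * (100 - accuracy_percent) / 100


DOCUMENT_WORDS = 1_000  # hypothetical document length, for illustration only

errors_at_97 = expected_errors(DOCUMENT_WORDS, 97)
errors_at_98 = expected_errors(DOCUMENT_WORDS, 98)

# Fraction of the corrections that disappear when accuracy rises by one point.
reduction = (errors_at_97 - errors_at_98) / errors_at_97

print(f"97% accurate: about {errors_at_97:.0f} errors to fix")
print(f"98% accurate: about {errors_at_98:.0f} errors to fix")
print(f"Fewer corrections needed: {reduction:.0%}")
```

Under these assumptions, one extra point of accuracy removes roughly a third of the corrections, which is a much bigger change than "97 versus 98" suggests at first glance.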
(And, of course, there's
lots more good stuff in the subsequent parts of the series too,
to be released next week.)
Originally published
7 May 2010, last update
21 Jul 2020
You may freely reproduce or distribute this article for noncommercial purposes as long as you give credit to me as original writer.