Dictation

Nov. 19th, 2024 06:45 pm
[personal profile] rowyn
 

My wrists hurt. 


Partly because of this and partly because it's really hard physically typing over 5,000 words in a day, I finally decided to try dictation software. After however many years it's been since people told me “you should try dictation software”. I remember Maggie read a book 7 or 8 years ago about how to write 10,000 words per day and one critical tip was “use dictation software”. Imagine typing 10,000 words every day. Wrists. What even are they. I’ve been writing 5200-ish words per day for the last 22 days and my wrists are already annoyed with me.


But I didn't try dictation software. One reason I didn't try dictation software is that I don't think coherently out loud. My thoughts just ramble in many different directions, especially trying to tell a story out loud. It always seems like an insurmountable barrier: how would you even do that? When I'm having a conversation and telling someone about a story I wrote, or plan to write, I'm keenly aware of how directionless and wandering I am. 


I'm looking at the text that I'm writing -- well, speaking -- and it's a mess. But I’ll keep going.


I don't know where I'm going with this. One reason this is a mess is that this software doesn't punctuate anything.  It’s just building one enormous paragraph of Oops All Words, no periods, no commas, nothing.


So to decide on which transcription software to use, I did a DuckDuckGo search. Yeah, I'm too hipster for Google these days. Did I already write a blog post about why I don't use Google? I don't think I did; I think I wrote a thread on the fediverse? 


I don't use Google anymore because Google kept giving me a front page of Oops All Reddit and Quora. And I didn't want to know what Quora and Reddit had to say about, e.g., whether all near-sightedness can be corrected with glasses. I think Google has an agreement with Reddit now where only they can give search results from Reddit, or something like that? So I was saved from Reddit in my search results by going to DuckDuckGo, which was great in my opinion. That's a feature, not a bug. I can't believe Google paid you for the privilege of giving me irrelevant results from Reddit. Anyway, that was the thing that did it, even more than the AI summaries. I hate AI summaries, but I could ignore them. But when the entire first page of results is just Reddit posts: nope. 


The best thing about this post so far is how it illustrates what I don't like about talking vs typing. Where am I going with this? Who even knows. Not me!


I said “woo!” but the transcription software noped out on transcribing it. “That's not a word. That's just a filler noise, like ‘er’ or ‘um’ or ‘uh’. I’m not transcribing those either.” (I typed them in for you). It doesn't transcribe when I laugh, either, which is good.


Anyway, I did a DuckDuckGo search for “best transcription software for Chromebook for writers”. Because I don't have a microphone for my desktop.


Believe it or not, it's the year 2024 and I don't have a microphone for my desktop. I used to have a headset that my workplace gave me, but I gave it back to them when I retired, with everything else. I guess I could use Lut’s microphone. He had a microphone that he used for gaming. I didn't even think about that. He had a gigantic monitor that was wider than both my screens put together, though somewhat shorter. I still don't use it. It’s hooked up to his computer. Eliyahu used it when they were visiting. I'm not impressed by much about this software, but it spelled Eliyahu right on the first try, so I have to give it that. It did not spell it right on the second try, but hey, good job, you got it right once.


Anyway, the article from the search results, which I haven't gotten to yet because that's not how my brain works when I'm talking, suggested, as the top possibility for dictation software, Google Docs voice typing. Which is just an option on the Google Docs Tools menu: you select “voice typing” and click the microphone icon that comes up: there you go. So on the one hand, really easy to get started. On the other hand, no punctuation. Which is a downside in my opinion. For a while, I was talking on the Chromebook and also editing on the Chromebook, like a normal human being. And then it struck me that I didn't need to use the Chromebook for the editing part. I could talk with the Chromebook next to me so its microphone can hear me, but use my desktop’s keyboard to edit the same document on my desktop. Which is important, because half the point is to save my hands from having to do all this typing, and using the Chromebook’s keyboard to type is not the ideal ergonomic experience. It’s okay, but I’m trying to aim for minimal hand impact. This is not minimal impact: I'm still doing a lot of typing. But less typing than I would have done to get this far otherwise. 


The article that recommended Google Docs’ built-in voice typing said that live transcription software is slow and you have to talk slowly and distinctly in order for it to understand all your words. I looked at this and I thought “are you really telling me in the year of Our Lord 2024 live transcription software is slow and doesn't get all the words?” Because my father uses it on his little Android tablet as an assistive device (he’s hard-of-hearing). And that sucker is fast. It keeps up with live conversation, no problem. It also has punctuation. And paragraphs. I am seriously tempted to go download it instead of continuing to use Google Voice typing because this no punctuation or paragraphs thing is not great.


The same article also suggested that instead of using dictation software you should use transcription services. Just record your book -- your entire book, or whatever it is you're writing -- and send it to a transcription service, have them run it through machine transcription, and then clean up the machine transcription for you. 


And one thing the article author touted as an advantage of this is: if you’re using dictation software, you’ll be tempted to edit it while it’s on the screen. And all I can think is: yes, you are right, I have edited this document a lot already since I started writing it. Speaking it? I don't know. Both.


But it's hard to imagine anything more nightmarish than having spoken an entire book out loud and getting the transcription back and then having to edit that. Because it's not just that I had to add the punctuation and paragraphs. That's fine. Not great, but fine. It's that I keep having to rearrange words because my thoughts don't make any sense when I speak them. This edited version is already much weirder and more rambling than my usual ‘I'm just posting to get some words out this morning before I go work on anything I care about’. I know that this article author’s concept is ‘if you edit while you're writing, you're just going to get caught up in editing, and you won’t get your book written. The first draft doesn’t need to be perfect! You just need to get that first draft out there.’


My dude. I have four first drafts written and ready to be edited. I have gotten the first draft out. I am past that stage. I am so far past that stage. What I don't need is Yet Another First Draft, this one more desperately in need of editing than any first draft I have ever created before. Thank you. I'm good.


So for me, editing while you're writing is a feature, not a bug. I edit when I'm typing too. It's more natural than this “talk for a bit then edit” thing is. But I delete and rearrange sentences and add material and whatnot. That's the whole reason that my 4thewords word count is higher than my words-added-to-documents count. It's just part of the process. 


I keep turning the microphone back on and not saying anything so it times out. It times out in kind of an annoying way if you’ve been talking a while and stop. Because instead of turning the microphone icon from red (listening) to black (not listening) because you stopped speaking, it leaves the icon lit but doesn't do anything when you start talking again. Turning off automatically makes sense: you don't want to use battery power when no one’s doing anything. But I do wish it would turn the microphone icon off.


Anyway, I kept turning the microphone back on and it would time out very quickly (and turn the icon black, so that part was nice?) because I didn't start talking soon enough. Because I don't know what I want to write next on this rambling thing that I have created. If anything. 


Also the time out thing kind of encourages you to keep rambling. I don’t need that. I really don’t need a thing to encourage me to ramble even more when I have no idea what I actually want to add next. Thank you. 


I should maybe try different live transcription software.


Or, alternatively, I could look at the instructions for the dictation software I'm currently using? Yes. That's a thing I could do. 


So it turns out Google voice typing will do punctuation, you just have to say the punctuation. “Comma”, “period”, “question mark”, “new paragraph”, etc. (I typed that bit, as you might guess.) I think some live transcription will guess at the punctuation? Pretty sure Zoom’s does, though it’s not very good at it. But I don't hate having to speak it. I have noticed that it doesn't automatically capitalize after a period, although it was clever enough to figure out that saying “a period” is not me telling it to place a period punctuation mark. Good work, Google Voice. 


I also fetched Lut’s microphone to plug into my computer. I even correctly figured out which socket it went in on the first try! (It's not a USB microphone; our computers are so old we have audio jacks and audio jack-using equipment.) Which was more guessing than anything else. “Well, I can't identify this icon. The plug is red and one port is orange and the other port is green. I guess I'll try the orange one? Orange is kind of like red? Maybe they’re color-coded.”


I was hoping using a microphone on my desktop, which doesn't have battery power, would convince Google Voice not to just turn itself off when I stop talking for a few minutes. This hope did not prove fruitful. Time to look at the instructions again!


Now, I've been playing with the voice commands. Saying “voice commands” brings up the list of voice commands, which is good to know I guess. It turns out that when it stops listening, it's supposed to start listening again if you say the word “resume.” So now I have to let it time out and test this. Yeah, no, it does not resume typing when I say the word “resume”. I also can't figure out what's up with the capitalizing. Sometimes it capitalizes! It capitalized the “yeah.” But not “sometimes” or “it” or “but”. Mostly it doesn't capitalize. Saying “capitalize this word” does not make it capitalize the word. Although sometimes it writes “capitalize this” and then deletes it and shows in the microphone icon “Google Voice heard ‘capitalize this’.” Maybe I need to select a word to capitalize?


So selecting a word and then telling it “capitalize this” is awkward. Maybe not more awkward  than grabbing the keyboard and editing everything. Especially since I have not yet gotten the hang of talking for long enough that it doesn't time out on me. And it seems like the workaround for that is just to leave the mouse cursor on top of the icon.  I've spent like the last 5 minutes editing this paragraph with my voice to fix the capitalization and punctuation. It's sort of magical and sort of super annoying? I guess more magical than annoying. It is definitely easier on my wrists.


Okay, no. Using voice to fix the capitalization problem is too annoying. I feel like this is a problem with my desktop and Chrome and that I didn't have this problem on my Chromebook? A cursory DuckDuckGo search suggests this is not normal behavior for voice typing and gave me a “why don't you try resetting your settings in Chrome?” suggestion from Google Support. Which did nothing. So I'm gonna grab the Chromebook, and I'll try voice typing on it again.


Yeah, no. It has the same problem on the Chromebook. I hadn't noticed when I was using the Chromebook because I hadn't figured out punctuation at all when I was using the Chromebook, so there weren't any periods for it to fail to capitalize things after. I wouldn't say this makes Google Docs’ Voice Typing useless, but it is pretty annoying. 


I think I've played with this enough that I'm willing to try actually writing something that isn't about using dictation software. Although I would like to know a keyboard shortcut for turning the microphone on and off when it times out. Oh, it’s ctrl-shift-S and it’s labeled right in the menu. D’oh.


Subvocalisations

Date: 2024-11-21 11:43 am (UTC)
From: (Anonymous)
This reminds me of a programmer who for whatever reason could no longer type, and had to use voice transcription to code.
He set up a bunch of subvocalisations (like, clicks and grunts and whatever) to be shortcuts in his voice recognition software/coding UI of choice, and got to the point where it was just as fast as typing.

This was pre-2020 though, and I've no idea who it was. I do recall they were using quite high end transcription software. Something about dragons in its name?

Re: Subvocalisations

Date: 2024-11-21 06:45 pm (UTC)
From: [personal profile] tuftears
That's preeeetty expensive! o.o Maybe for businesses, site licenses and the like, but for individuals?...

Date: 2025-03-17 02:01 am (UTC)
From: [personal profile] alltoseek
I don’t know about the punctuation bit, but I will say that oral storytelling is a skill that improves with practice and you will get better at it the more you do it. If you persist in using voice dictation, you’ll get less rambley and more focused.
