
Are Speech Synthesizers Bass, Midrange or Treble Heavy?


Bhavya shah
 

Dear all,

While the bulk of my question is captured in the subject line, I would
like to contextualize it slightly. Many of us buy headphones, speakers
and other audio products, and as blind people, we listen to screen
readers' speech synthesis more than anything else. I would imagine that
different text to speech engines operate at different frequencies, but
I was wondering whether there is a certain portion of the spectrum
(namely bass, midrange, or treble) that speech synthesizers most
commonly occupy. If, for instance, the answer to this question is bass,
would bass-heavy or extra-bass audio products be conducive to a better
experience for screen reader users? I am more interested in thoughts on
this subject overall, but regardless, I'll share that I use
ETI Eloquence>Reed at 65% pitch and ESpeak-NG>Steph 2 at 39% pitch as
my synthesizers of choice with NVDA.

I would truly appreciate any thoughts and insights on this subject.

Thanks.

--
Best Regards
Bhavya Shah
Stanford University | Class of 2024
E-mail Address: bhavya.shah125@gmail.com
LinkedIn: https://www.linkedin.com/in/bhavyashah125/


DAVID GLOBE
 

Hello Bhavya
Speaking for my own personal comfort, I believe most speech synthesizers work in the mid to high range.
David

-----Original Message-----
From: main@TechTalk.groups.io [mailto:main@TechTalk.groups.io] On Behalf Of Bhavya shah
Sent: January 18, 2021 7:04 PM
To: main; blindtech
Subject: [TechTalk] Are Speech Synthesizers Bass, Midrange or Treble Heavy?



Gene
 

I don't think you can generalize. Eloquence has a lot of bass, and it sounds better through headphones that have less bass or de-emphasize it. On the other hand, many synthesizers that I consider much worse in terms of speech quality, but that have much more natural-sounding voices, probably sound better through full-range headphones. I use those synthesizers so little, however, that this is only my impression or guess; those who use such voices will very likely comment.

Also, I got used to Eloquence (Via Voice, really) with a lot of bass when I got much better headphones a few years ago. While Eloquence sounds better with less bass, the extra bass didn't bother me, even though the default voice, which is what I'm discussing, sounds better with less.

Gene
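
Since the answer seems to differ from one synthesizer to the next, one way to check a particular voice is to measure how its energy is split across the three bands. The following is only a rough sketch under stated assumptions: the band edges (bass below 250 Hz, midrange up to 4000 Hz, treble above that) are a rule of thumb rather than a standard, the function names are made up for illustration, and the input is a synthetic three-tone signal standing in for a real recording of a synthesizer.

```python
import cmath
import math

# Assumed band edges in Hz (a rule of thumb, not a standard).
BASS_MAX_HZ = 250.0
MID_MAX_HZ = 4000.0


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def band_energy_shares(samples, sample_rate):
    """Return the fraction of spectral energy in each assumed band."""
    n = len(samples)
    spectrum = fft(samples)
    energy = {"bass": 0.0, "midrange": 0.0, "treble": 0.0}
    # Skip the DC bin and stop below the Nyquist frequency.
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        power = abs(spectrum[k]) ** 2
        if freq < BASS_MAX_HZ:
            energy["bass"] += power
        elif freq < MID_MAX_HZ:
            energy["midrange"] += power
        else:
            energy["treble"] += power
    total = sum(energy.values())
    return {band: value / total for band, value in energy.items()}


def demo_signal(sample_rate=16000, n=4096):
    """Hypothetical stand-in for recorded speech: three pure tones."""
    tones = [(125.0, 1.0), (1000.0, 2.0), (6000.0, 0.5)]  # (Hz, amplitude)
    return [
        sum(amp * math.sin(2 * math.pi * hz * i / sample_rate)
            for hz, amp in tones)
        for i in range(n)
    ]


if __name__ == "__main__":
    shares = band_energy_shares(demo_signal(), 16000)
    for band, share in shares.items():
        print(f"{band}: {share:.1%}")
```

In the demo the midrange tone has the largest amplitude, so the midrange band gets the largest share. To compare real voices, one would replace demo_signal with samples read from a recording of each synthesizer, trimmed or padded to a power-of-two length, and keep in mind that pitch settings like the ones mentioned above would likely shift the result.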
