
Re: [Emacspeak] Experimenting with multiple voices for mac/windows



1. Though you're seeing this on Windows and Mac, it's more a case of
   "the so-called newer TTS voices don't give you too many params to
   manipulate" (read: almost none). DECtalk and Outloud are formant
   engines and are an entirely different beast. There has been research
   in academia around hybrid engines; see Prof. Sue Hertz's work from
   Cornell (she is also the one who created Eloquent, AKA Outloud, in the '90s).
2. The example you posted sounds fine, but that is also because you're
   "cheating" in a way.
3. It does not sound jarring because you have relatively significant
   stretches of speech in the same voice.
4. The following will likely sound worse, since the voice would change
   every token or two:
   int x, y, z = 0.0;
   char *x ="abc";
5. All that said, switching among voice families might well end up being
   what we can do for the newer engines; we made a similar compromise
   with math readings in ChromeVox in 2012, with the result being orders
   of magnitude poorer than the readings produced by AsTeR using the
   DECtalk in 1994.
6. Another param that could be usefully applied -- but will need work
   with the newer voices -- is spatialization; read about "SOFA" to
   understand where the audio world is heading.
7. Hopefully the newer engines will eventually expose some params for
   influencing emotion etc. -- we even worked on an Emotional Markup
   spec about 20 years ago at the W3C -- but that went nowhere.
8. And no surprise that a different voice for notifications works well;
   that should never be jarring.
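
To put a rough number on points 3 and 4, here is a minimal sketch that
counts how often the voice would change while speaking a line. It is not
Emacspeak's voice-lock code: the token categories, the voice labels and
the tiny keyword set are all made up for illustration, and a real setup
would map many more categories.

```python
import re

# Hypothetical voice labels, one per token category (placeholders only).
VOICE_FOR = {
    "keyword": "keyword-voice",
    "identifier": "identifier-voice",
    "string": "string-voice",
    "other": "default-voice",  # numbers and punctuation
}

# Deliberately tiny keyword set, just enough for the two sample lines.
KEYWORDS = {"int", "char"}

TOKEN_RE = re.compile(
    r'(?P<string>"[^"\n]*")'
    r'|(?P<word>[A-Za-z_]\w*)'
    r'|(?P<other>\d+(?:\.\d+)?|[^\s\w])'
)


def voice_runs(line):
    """Group consecutive tokens that would be spoken in the same voice."""
    runs = []
    for match in TOKEN_RE.finditer(line):
        kind = match.lastgroup
        if kind == "word":
            kind = "keyword" if match.group() in KEYWORDS else "identifier"
        voice = VOICE_FOR[kind]
        if runs and runs[-1][0] == voice:
            runs[-1][1].append(match.group())
        else:
            runs.append((voice, [match.group()]))
    return runs


def switches(line):
    """Number of voice changes needed to speak the line."""
    return max(len(voice_runs(line)) - 1, 0)


if __name__ == "__main__":
    for sample in ('int x, y, z = 0.0;', 'char *x ="abc";'):
        print(switches(sample), "switches:", sample)
```

Under this toy mapping the first line needs 6 voice changes and the
second needs 5, each within a second or two of speech, whereas a
sentence of prose or a comment spoken in one voice needs none.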
   
-- 

