Synthetic Speech and Vocals

I used these online speech generators quite a few times in tracks submitted to the Disquiet Junto a couple of years ago - either reading a book section, or computer-generated lyrics from another site, or saying phonetics to use as beat elements. They’re here on Bandcamp:


Hey, thanks for the prompt - I’ve come across Schoenberg’s name since exploring the further reaches of electronic music these past few years but not got round to listening to his music yet. Maybe that’s tantamount to heresy or something in circles like these?

Anyway I watched a performance of “Pierrot Lunaire” (this one:) whilst stuffing pasta into my face this lunchtime (just to hammer home what a philistine I am). It took approximately 2 minutes and 30 seconds for me to shed my first tear. Compelling indeed.

Having done some cursory research I believe the term you refer to is “Sprechstimme” (literally “speech-voice” apparently). In his foreword to the score Schoenberg talks about it a little, saying that the performer should have the melodic precision of singing married to the sort of natural rise & fall of speech - i.e. not sustaining notes like in singing. It sort of reminded me, in theory, of how it feels when I try to speak a poem I’ve been singing a while - though Schoenberg’s admonition that the performer should do anything but lapse into a singsong speech pattern is bracing. I’m not at all skilled in any classical way like these people, but am certainly able to relate on a kind of intuitive/essential level, and do find it edifying.

The deepest part of why or how this kind of thing works for me personally, is in reinforcing the whole constellation of meaning and making it exponentially more powerful/impacting. This is how it started for me this year, already being into Blake (like any good hippy), but then starting to take a closer look - seeing how the decoration of the poems, the forms & the colours around them, basically gave them more life - made them more vivid & energetic - & then learning that he used to sing them too… & all this being a response to my own disillusion at the way that sometimes precious words or special feelings, put down in words - well sometimes they just seem empty & ridiculous… I’m not sure any art is immune to this kind of nihilistic crisis, and I guess such is part of the motivation of enough of it - but certainly great art resists or rises above it more resolutely.

So it would be easy to scoff at, what was it, something about “the scent of ancient fairy-tales”, or the black butterflies of night having eaten the sun, but when it’s really delivered like that there’s just nothing you can do but feel it, and it’s marvellous.

Also I’m not sure I really got atonality before, but something in that piece really got to me, and I love that I can’t remember a single melody from it, some 2 hours later… maybe if I spoke German that’d be different tho.

Some time back I had a fun morning fake dialoguing with my computer, making hir respond to my daily existential crisis in Amiga voice, but the tune I did to accompany it was a bit too messy and I gave up, but I might try and salvage it one of these days soon, made me chuckle at least.

Diemut Strebe’s art installation, “The Prayer”


clicking ‘preview’ loads of times is fun :smiley:

EDIT: slowing the wav down also fun, turns into some nonspecific-european-accented robot

Tom Jennings’ work as World Power Systems was a big influence on me. Here’s a speech synth art Turing machine he built -

Somewhere I have a mix CD I made circa 200X of all the circuitbent/videogame/chip music I could find, which begins with this thing reading Burroughs :slight_smile:

EDIT because I can’t self-reply :frowning: -

Accidentally discovered today that a touch-controlled simple 555 oscillator thru the TC Helicon VoiceOne = ultra playable free improv duck voice. Definitely going to have to experiment further!



Love it! Any more details for this?

would love to know more too, need to find a use for my Axoloti soon :slight_smile:

Thanks! It’s some examples of an experimental set of LPC synthesis objects I wrote for Axoloti (with a lot of help from another Axoloti forum member, latterly).

I have a vague plan to port them in some form to Norns.
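For anyone unfamiliar with LPC: the synthesis side boils down to an all-pole filter driven by a buzz (for voiced sounds) or noise (for unvoiced ones). Here is a minimal sketch in Python. It is not the Axoloti objects mentioned above; the function names are made up for illustration, and the coefficients would normally come from analysing recorded speech frame by frame.

```python
def impulse_train(length, period):
    """Crude 'buzz' excitation: a 1.0 every `period` samples, 0.0 elsewhere."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def lpc_synthesize(excitation, coeffs):
    """Run an excitation signal through an all-pole filter.

    Assumes the convention A(z) = 1 + a1*z^-1 + ... + ap*z^-p, so that
    y[n] = e[n] - sum(a[k] * y[n-k]) for k = 1..p.
    `coeffs` holds a1..ap (illustrative values, not from any real analysis).
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


# One frame of "voiced" output: a buzz through a single-pole filter.
frame = lpc_synthesize(impulse_train(8, 4), [-0.5])
```

In a real LPC voice the coefficients (and the choice of buzz or noise) change every few milliseconds, which is where the speech-like quality comes from.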

I never released them, partly because I sort of lost interest in Axoloti as a platform, to be honest.

I think it will be more fun on Norns, and I can exploit the much more powerful processor to take the project to places that weren’t possible on Axoloti (at least with my meagre coding skills and lack of DSP chops).


Cool, need to get onto building my DIY Norns :slight_smile:

Edit, self-reply:

“DSspeech” homebrew ROM… Fun ultra basic old skool speech synthesis with limited control of speed. Seems to crash if you type more than 7 lines, so my plans for laborious long-form dungeon synth chants were scuppered somewhat.


Another LPC example.

that’s incredible! i don’t use either norns or axoloti but would love to use those sounds in my music recording process!

Thanks! You’re welcome to download the files and use them, if you want. Send me a link to the track though, if you do!

thank you so much! i’ve been exploring all sorts of vocal sampling processing mostly with my own voice but robotizing it and this thread delivers so much fantasy into my brain.

I plan to do some vocal-based stuff myself. Might run my voice through the same process.

Found this nice speech-synth sample pack from Little-Scale

There are a total of 59 samples. These represent all 59 sounded allophones that the SP0256-AL2 is capable of producing via its internal ROM.


Cool! Someone should make that into an SC Engine, and write a text-to-phoneme converter in Lua to trigger it.

Not volunteering, though :wink:
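A rough idea of what the text-to-phoneme half could look like, sketched in Python rather than Lua just to show the shape of it. Everything here is an assumption for illustration: the tiny word dictionary is hand-written guesswork (not taken from the chip's documentation), the `PAUSE` marker and the `samples/NAME.wav` file layout are hypothetical, and a real converter would need letter-to-sound rules for words that are not in the dictionary.

```python
# Hypothetical dictionary: word -> list of allophone names.
# The spellings are illustrative guesses, not datasheet entries.
WORD_TO_ALLOPHONES = {
    "hello": ["HH1", "EH", "LL", "AX", "OW"],
    "robot": ["RR1", "OW", "BB2", "AA", "TT2"],
}

PAUSE = "PAUSE"  # made-up marker meaning "insert a short silence here"


def text_to_allophones(text):
    """Turn text into a flat list of allophone names.

    Words missing from the dictionary are skipped; a PAUSE marker is
    placed between consecutive known words.
    """
    out = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        phones = WORD_TO_ALLOPHONES.get(word)
        if phones is None:
            continue
        if out:
            out.append(PAUSE)
        out.extend(phones)
    return out


def sample_paths(allophones, folder="samples"):
    """Map allophone names to sample files; PAUSE maps to None (silence)."""
    return [None if name == PAUSE else f"{folder}/{name}.wav"
            for name in allophones]
```

Calling `sample_paths(text_to_allophones("Hello, robot!"))` gives a list of files to trigger in order, with `None` where a gap should go. The triggering itself would be the engine's job.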


I’ve been making more of an effort to get back into the synthetic speech world, especially musical applications of physically articulatory synthesis.

One of my first stabs was re-visiting voc (based off of Pink Trombone), and splitting it into its source component (glottis) and its filter component (tract). These are included as a part of my ongoing audio DSP literate program wiki sndkit. They are a bit simplified from the original PT/Voc model, but still quite fun to play with.

I made a little Android app called vocshape that showcases the glottis/tract sndkit components, which allows one to interactively shape the underlying virtual vocal tract, which in turn changes the perceived vowel sound. Unlike the usual formant synthesis techniques used for vocal synthesis, the vowel formant frequencies produced here are implicitly created, rather than explicitly created.
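To illustrate what "implicitly created" can mean, here is a minimal sketch of one piece of a Kelly-Lochbaum style tube model, which as far as I understand is the family of models Pink Trombone belongs to. It is not the sndkit or vocshape code. The tract is described only as a list of cross-sectional areas (the example shapes below are invented), and each junction between neighbouring sections gets a reflection coefficient. No formant frequency appears anywhere; resonances would emerge once waves are propagated through the junctions, which this sketch leaves out.

```python
def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent tube sections.

    Uses k = (A_left - A_right) / (A_left + A_right); sign conventions
    differ between implementations. A junction between two closed
    sections (both areas zero) is treated as k = 0.
    """
    coeffs = []
    for a_left, a_right in zip(areas, areas[1:]):
        total = a_left + a_right
        coeffs.append(0.0 if total == 0 else (a_left - a_right) / total)
    return coeffs


# Invented example shapes (arbitrary units), glottis end first.
uniform_tract = [1.0, 1.0, 1.0, 1.0]
constricted_tract = [1.0, 3.0, 3.0, 1.0]

uniform_k = reflection_coefficients(uniform_tract)
constricted_k = reflection_coefficients(constricted_tract)
```

A uniform tube has no internal reflections, while reshaping the areas changes the coefficients and therefore the resonances, which is the sense in which dragging the tract shape around changes the perceived vowel.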


I ordered that new Casio CT-S1000V voice synthesizer. I have an obsession with synthetic voices and gender and have collected a zillion old chips, plug-ins and devices to play with making non-human voices sound gay/effeminate or not specifically male/female, etc. Machines don’t have gender or sexual orientation so playing with “gender” there is interesting. So adding one more vocal synth to my collection seemed mandatory, especially since Casios were my first non-homemade synths when I was a kid (SK-1, CZ-101).