Disquiet Junto Project 0290: Text-to-Beat


Use computer-generated speech as the rhythmic foundation for a track.

Step 1: This week we’re going to build a track around text-to-speech, the results of a computer-generated voice speaking. In past text-to-speech projects we’ve used pre-existing text as the source. In this case we’re going to build the text to order. Keep this in mind.

Step 2: Find a good text-to-speech system that you think you can work with musically. In macOS, for example, there’s a built-in system shortcut: Just select the text you want to hear (in a browser, or a text editor, wherever) and then hit the ESC key while holding down the OPT key. There are also other tools, including browser-based options, like the one here:


Step 3: Experiment with different combinations of words to produce a rhythm you want to work with.

Step 4: Record the rhythm you developed in Step 3.

Step 5: Produce a track using the rhythm in Step 4 as the foundation. (Level Up: Use more than one text-to-speech pattern to create cross-patterns, phasing, among other polyrhythmic events and effects.)
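For anyone who wants to plan the Level Up idea on paper first, here is a minimal Python sketch of how two looped patterns of different lengths drift apart and realign. It is only an illustration: the function names, the 3-step and 4-step patterns, and the 1/0 hit notation are all assumptions, not part of the project instructions.

```python
from math import gcd


def realign_after(len_a: int, len_b: int) -> int:
    """Steps until two loops of these lengths start together again (least common multiple)."""
    return len_a * len_b // gcd(len_a, len_b)


def cross_pattern(pattern_a, pattern_b, steps):
    """Label each step with which looping pattern has a hit: 'A', 'B', 'AB', or '.' for silence."""
    labels = []
    for i in range(steps):
        hit_a = pattern_a[i % len(pattern_a)]  # pattern A repeats every len(pattern_a) steps
        hit_b = pattern_b[i % len(pattern_b)]  # pattern B repeats every len(pattern_b) steps
        labels.append((("A" if hit_a else "") + ("B" if hit_b else "")) or ".")
    return labels


# Hypothetical example: a 3-step phrase against a 4-step phrase, hit on the first step of each.
phrase_a = [1, 0, 0]
phrase_b = [1, 0, 0, 0]
cycle = realign_after(len(phrase_a), len(phrase_b))
print(cycle, " ".join(cross_pattern(phrase_a, phrase_b, cycle)))
```

The two phrases only land together once per full cycle, so the longer the cycle, the longer the listener waits for the patterns to line up again.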

Five More Important Steps When Your Track Is Done:

Step 1: If your hosting platform allows for tags, be sure to include the project tag “disquiet0290” (no spaces) in the name of your track. If you’re posting on SoundCloud in particular, this is essential to my locating the tracks and creating a playlist of them.

Step 2: Upload your track. It is helpful but not essential that you use SoundCloud to host your track.

Step 3: Please consider posting your track in the following discussion thread at llllllll.co.

Step 4: Annotate your track with a brief explanation of your approach and process.

Step 5: Then listen to and comment on tracks uploaded by your fellow Disquiet Junto participants.

Deadline: This project’s deadline is 11:59pm wherever you are on Monday, July 24, 2017. This project was posted in the morning, California time, on Thursday, July 20, 2017.

Length: The length is entirely up to the participant, though two or three minutes is suggested.

Title/Tag: When posting your track, please include “disquiet0290” in the title of the track, and where applicable (on SoundCloud, for example) as a tag.

Upload: When participating in this project, post one finished track with the project tag, and be sure to include a description of your process in planning, composing, and recording it. This description is an essential element of the communicative process inherent in the Disquiet Junto. Photos, video, and lists of equipment are always appreciated.

Download: It is preferable that your track is set as downloadable, and that it allows for attributed remixing (i.e., a Creative Commons license permitting non-commercial sharing with attribution). Keep an eye on the license of the audio you source, as that may determine the license you end up using.

Linking: When posting the track online, please be sure to include this information, along with details of your source audio, including links to it:

More on this 290th weekly Disquiet Junto project — Text-to-Beat: Use computer-generated speech as the rhythmic foundation for a track. — at:


More on the Disquiet Junto at:


Subscribe to project announcements here:


Project discussion takes place on llllllll.co:


There’s also a Junto Slack. Send your email address to twitter.com/disquiet for Slack inclusion.

Image associated with this project is by Flickr member Jordan, used thanks to a (note: No Derivatives) Creative Commons license:



The project is now live.


PS: As @LeeRosevere just mentioned on Twitter, the little microphone on Google Translate is also a fun tool for this:



This is @LeeRosevere’s use of Google Translate for “weird percussives”:



7 0 0 9, ‘Alice’ at http://www.fromtexttospeech.com


I’ve been working on a piece called Space Drug Nixon so I thought I’d appropriate that idea for this song.

Did the text-to-speech with that link you provided (for some reason “Space” comes out sounding like “Paste,” hey fine), dumped it in Ableton, chopped it up, stole drum loops from Skull Snaps, dropped in a bass line stolen from every Parliament Funkadelic song, threw in some gratuitous scratchy rhythm and lead guitar. Boom, it’s the 70s man!


Hey All. Started with the poem Bleecker Street, Summer by Derek Walcott and converted it to ye ole triad setting. Pitched that recording up and down 3 and 6, which created a weird fx that I looped with a straight version by crossfading, putting a different beat with each. Did some modular for robotic sweetness. Hope all are having a great summer, or down under you guys be chilling. Kinda funny I got a message from Disco Robot when logging in here.

Peace, Hugh


If you press the speaker a second time, the speech is slower :grinning:



This is my message to Elon Musk… he (and others) should not be afraid of AI taking over the world and destroying humankind… it’s evolution! Maybe the world would be better off without humans!

Have a nice day,




Using the same text-to-speech website as shadow_machine, I chose a female voice for the word ‘treble’ and a male voice for ‘bass’. The samples were imported into Logic, cut to one bar lengths, looped throughout the whole track and manipulated using iZotope’s Stutter Edit. A drum track that followed the vocals was added and messed around with using an effects sequencer.
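For anyone cutting samples to bar lengths by hand, the arithmetic can be sketched in Python as below. This is a rough illustration only: the tempo, time signature, and sample rate are assumed values (the post does not state them), and the function names are made up for the example.

```python
def bar_length_samples(bpm: float, beats_per_bar: int = 4, sample_rate: int = 44100) -> int:
    """Length of one bar in samples: (beats per bar * 60 / bpm) seconds times the sample rate."""
    seconds = beats_per_bar * 60.0 / bpm
    return round(seconds * sample_rate)


def slice_starts(total_samples: int, bar_samples: int) -> list:
    """Start offsets of every complete bar-length slice that fits in the recording."""
    return list(range(0, total_samples - bar_samples + 1, bar_samples))


# Assumed settings: 120 bpm, 4/4, 44.1 kHz, and a 5-second spoken sample.
bar = bar_length_samples(120)
print(bar, slice_starts(5 * 44100, bar))
```

A spoken word is usually shorter than a bar, so in practice the slice would be padded with silence or stretched to fill the loop.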


For text-to-speech I used a V-tech Alphabet Desk that’s been popular with my kids for over a decade now.

I experimented with a few combinations of letters but, once loaded into Ableton Live, used two instances of X for snare and hi-hat.

You can also hear a lot of B. The track needed a kick, so I settled on an 808 but the rest of the sounds come from the V-tech toy.

There’s more on my blog, as well as another track using another V-tech toy from about a decade ago.



I decided to see if I could make Radiohead’s “Fitter Happier” less depressing. I did the electric piano part by running audio-to-MIDI on the “Fitter Happier” sample, and then harmonizing it in thirds. I stretched and stuttered the sample in various places as well. I programmed the 909, and the percussion and oud loops are from GarageBand. The background texture is the “Fitter Happier” sample Paulstretched by a factor of about sixty.


I am going to play this on repeat today and see if I get more done.


The playlist is now live:


As with all of my content, this entire process was documented in a live stream at www.twitch.tv/zeromeaning

The spoken word is percussive by its very nature. Rhythm, flow, plosives, alliteration, shape, and so on all combine to form not only language but emotion that we can understand. If a robotic voice were to read the most affecting message, even though hard data remains, the original intention - the affect or feeling behind that message - is undermined.

A short poem was written prior to the livestream. An online text-to-speech was used to convert the text for the assignment. The setting used was American English ‘Allison’. Find this tool here: https://text-to-speech-demo.mybluemix.net/

A few short piano notes were recorded and looped to form a backbone for the piece. A bowed, broken violin was pitch-bent downwards and reverb added. This violin was also used to form both the phased, plucked strings and the distorted elements.

The speech was cut to fit the music. Tap-delay and reverb were added, and additional voices were roboticised subtly underneath.

the crowd roars
blisters swell like waves upon the shore
all bones and flesh and teeth and nails
their mouths agape
waiting for something, anything more

like raging hornets
ten thousand eyes flash
dripping as poisoned needles
burrow beneath putrid skin
and coronas of covenants unkept
emerge in complex splendour

but the world is small
we are nothing

zero sight
and zero mind
and zero place
and zero time
we’re blind to what we think we know
meaning is null and void

and zero and zero is nothing but zero

zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by zero divided by


Made me smile during the complete song :smiley: Very nice!


“Someone who doesn’t speak for a day has no idea what it’s like to not speak at all,” wrote the late film critic Roger Ebert, who spent the last seven years of his life speaking through a computer after thyroid cancer necessitated the removal of his lower jaw. He communicated verbally using Alex, the same text-to-speech (TTS) voice system that powers Apple’s VoiceOver engine for people who cannot see.

Mr Ebert bemoaned the lack of realism of computer-synthesized voices, describing how the comparisons between human speech and TTS were “relative, not absolute.” Communication isn’t simply about words, he explained; we also derive meaning from inflection, delivery, timing and tone.

For this short piece, Suss Müsik explored the relative (but not absolute) musical parameters of Apple VoiceOver. We recorded four quotes and identified one rhythmic phrase from each, which were then assembled to create a new sentence. The foundation of the piece is the combination of breaths, hiccups and nonverbal noises that accompany everyday human speech. Treated piano and metallic percussion were overdubbed.

The piece is titled Singularity in homage to Ray Kurzweil, who among other achievements is credited with inventing the first TTS synthesizer. The image is the Braille alphabet.

The quotes used in the piece are as follows:

“Those are my principles, and if you don’t like them… well, I have others.” ~ Groucho Marx

“Life is too short to work on inconsequential problems.” ~ C.K. Prahalad

“Somebody asked me, ‘If you had to give advice to a young actor, what would it be?’ I never even knew I was thinking this, but I said, ‘Always, even in a limo, wear your seat belt.’ To me, that’s good advice.” ~ Christopher Walken

“I’m seven people away from myself at the moment, but getting closer all the time.” ~ Don Van Vliet, aka Captain Beefheart


Hey all, first piece here: https://soundcloud.com/fiona-caldarevic/nothing-peculiar-disquiet0290
I used text out of the first sentence of a newspaper article and picked the most rhythmic words to keep. Kept it really simple. Then found a couple of other words that fitted, although I had to stretch them out a bit to keep them in time, once I worked out what the tempo was. Then added a beat, then some more instruments. That was a heap of fun to do, looking forward to listening to others’.


I used LumenVox text-to-speech (www.lumenvox.com/products/tts/#) to generate the voice, using the Disquiet tagline itself: Listening to art. Playing with audio. Sounding out technology. Composing in code.
The main layers use the whole sample as a keyboard pattern, plus one layer live-cutting the piece so it is fractured. That layer was then slowly distorted with delays and chorus as well as time distortions.


I remember the Latin drills I went through as a kid and how we would chant the verb conjugations like mantras, and thought I would use this as a starting point.
I fed the present tense conjugations for “to love” into Alter Ego and used Numerology to experiment with different trigger patterns. The vocal synth pitch and rhythm info was also sent to Sinevibes zap and Demogorgon synth.
I had earlier tried using the Pink Trombone web app dood.al/pinktrombone/ as a vocal synth and blended some of my efforts into the mix, added a drum loop and then got a bit carried away into my own private sound hole and veered off the main brief :slight_smile: