Week 4 Blog Post
Exploring AI voice generation with Eleven Labs was a fascinating experience that resonated with Simon Willison’s insights in his “Catching up on the weird world of LLMs” talk at North Bay Python 2023. Just as he highlighted the diverse and quirky behaviour of LLMs, our class dove into the intriguing realm of AI-generated voices. I first played around with the Blip outers script, but I thought it would be more interesting to feed the AI some of my favourite childhood Dr. Seuss stories.
Our interactions with Eleven Labs revealed a spectrum of responses. Sometimes the text couldn’t be spoken properly or sounded erratic, depending on the settings we plugged in. I liked to play around with accents and long words to see how the AI would interpret them.
Eleven Labs was only part of what we did in this class. We also recorded our natural voices on our phones, reading the script we had created earlier with ChatGPT, and then put the recording into Reduct to transcribe it. This process was funny. My group tried to make the script sound as natural as possible, and as a result we found ourselves doing a lot of improvisation.
(ChatGPT Version)
Embarking on an odyssey through Eleven Labs to delve into the realm of AI voice generation proved to be a profoundly captivating experience that harmoniously echoed the insights shared by Simon Willison during his discourse, “Catching up on the weird world of LLMs,” at the distinguished North Bay Python event in 2023. Much like Willison’s elucidation of the nuanced and idiosyncratic behaviors of Large Language Models (LLMs), our academic cohort wholeheartedly embraced the enigmatic domain of AI-generated vocalizations.
Initially, I engaged with the Blip outers script, but the allure of further experimentation beckoned, prompting me to imbue the AI with cherished childhood tales by Dr. Seuss—an endeavor that promised a distinctive creative encounter.
Within the framework of Eleven Labs, our engagements unveiled an expansive spectrum of responses. Occasional phonetic dissonance and fluctuating coherency arose, influenced by the intricate amalgamation of parameters we invoked. Intriguingly, my fascination propelled me to manipulate accents and lexicon complexity, thereby unraveling the AI’s intricate interpretation process.
The Eleven Labs dimension, while pivotal, merely constituted a segment of our pedagogical voyage. We ventured further by capturing our organic vocal cadences through mobile devices, reciting scripts authored earlier via ChatGPT. Subsequently, immersing the recordings within Reduct’s purview, we navigated the whimsical tapestry of transcriptions—an endeavor that induced hearty amusement.
Particularly within my collaborative group, an affinity for authenticity galvanized us to cultivate a narrative that mirrored natural discourse. This pursuit not only engendered spontaneous improvisations but also kindled a mosaic of comical moments, rendering the learning process both insightful and enjoyable.