Python, Code and Cats

I absolutely fell in love with the sound of the companions in [Stray](https://store.steampowered.com/app/1332010/Stray). They have a super interesting voice, with a system I wanted to try to emulate here!

Note: I did NO work on this title, and I have no idea if this is the way they did it, but I highly encourage you to check out the sound designers:

Music Composer/Sound Designer : Yann Van Der Cruyssen @Morusque
Additional Sound Designer : Raphaël Monnin (Twitter handle not found)

Use this script to your heart's content, and generate some whacky companions! In addition, play the incredible work of the team on Stray.

You’ll find the output of all of this in a colab file here!
https://colab.research.google.com/drive/1q6xfaA5jiuKULcrf2dmVWNau_0zD7Aw6?usp=sharing
To contact me, hit up david@weaveraudio.com or go to weaveraudio.com

The Process

I thought I’d take you (that’s you!) through some of the thought processes of the creation of this file.

For now, it’ll probably be a collection of screenshots. I’ll update this page with more of my thoughts as I find the time.

The pip installations you’ll need in order to run the script.

The pip installations were not INSANE here, and if you’re unfamiliar with colab, you can prefix a line with ! to run terminal commands in the code blocks. For more information on the package installer for python (pip), see the pip documentation.

A collection of the other libraries we’ll use to create this. There are…a few.

There are a number of additional libraries that are used for the following:

nltk - the Natural Language Toolkit, which provides the word corpora and thesaurus content
wordnet - a lexical database loaded through nltk, for dictionary features and the like
gtts - Google’s text to speech library
os - operating system interface for loading and creating files
warnings - the oopsie library
playsound - you know…to play sounds
eng-to-ipa - a library for getting pronunciations
random - for random number gen
librosa - advanced audio features library for pitch shifting and time stretching
numpy - for working with data structures commonly used in librosa
soundfile - for creating sound files
ipywidgets - for making pretty widgets
IPython.display - for showing sliders and other cool things

There are a few, but hopefully they all make sense!
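If you want to check which of these are already installed before running anything, here is a minimal sketch using only the standard library. The `missing_modules` helper and the `REQUIRED` list are my own placeholders, not something from the colab, and the names are the import names (so eng-to-ipa becomes `eng_to_ipa`).

```python
import importlib.util

# Import names for the libraries listed above (assumed, not copied from the colab).
# wordnet is loaded through nltk, so it has no entry of its own.
REQUIRED = ["nltk", "gtts", "playsound", "eng_to_ipa", "librosa",
            "numpy", "soundfile", "ipywidgets", "IPython"]

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

print("Missing:", missing_modules(REQUIRED) or "nothing, you're good to go")
```

Anything that shows up as missing is a candidate for a `!pip install` line in colab.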

The code

1 - Counting Words

I’m breaking the code section down into 9 sections. They’re all available on the colab and were used for far more demonstrative purposes than anything else, so there are some optimizations and such on the table. But hey, we’re in python, so relax on the speed there cowfriend.

This first code block is getting the word count of a string. Why do we want this? Well, we’ll be doing quite a bit of substitution here for sentences that are long enough, and in order to know that, we need to count the words! Here we’re iterating through the list of words, adding them up, and then returning the words. This also helps to separate out the words to ensure we can swap some of them, remove them, and work with our sentence in a more modular way.
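As a rough idea of what a word counter like this could look like, here is a minimal sketch. The name `count_words` is a placeholder, and it assumes splitting on whitespace is good enough, which may not match the colab exactly.

```python
def count_words(sentence):
    """Split a sentence on whitespace and return (word count, list of words)."""
    words = sentence.split()
    count = 0
    for _ in words:  # iterate and add up, as described above
        count += 1
    return count, words

count, words = count_words("the cat sat on the mat")
print(count, words)
```

Returning the list alongside the count means the later steps can work on individual words without splitting the sentence again.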

2 - Getting Syllables

This handy dandy little feature gives us the IPA pronunciation of syllables for a string, so we can mess with the funky sounding figures!
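A minimal sketch of the pronunciation step, assuming the eng-to-ipa package (imported as `eng_to_ipa`) and its `convert` function. The `pronounce` wrapper and its `converter` parameter are hypothetical, added so you can swap in something else for testing.

```python
def pronounce(words, converter=None):
    """Return a pronunciation string for each word.

    By default this uses eng_to_ipa.convert; the import is done lazily so the
    function can also be tried with any stand-in converter.
    """
    if converter is None:
        import eng_to_ipa  # assumption: installed via `pip install eng-to-ipa`
        converter = eng_to_ipa.convert
    return [converter(word) for word in words]

# Stand-in converter so the example runs without the library installed.
print(pronounce(["stray", "cat"], converter=str.upper))
```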

3 - Reversing & Garbling

Our next block is all about reversing the words! Why do we do this? Well, with the newly added crappy pronunciation, we’re really starting to garble the words, which is WHAT WE WANT (remember?).
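One simple way to read "reversing the words" is flipping the letters inside each word. That interpretation is an assumption on my part, and `reverse_words` is a placeholder name.

```python
def reverse_words(words):
    """Reverse the characters of every word, keeping the word order intact."""
    return [word[::-1] for word in words]

print(reverse_words(["hello", "little", "robot"]))
```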

4 - Synonyms

So we’ve got our words separated from earlier, and it’s time to swap some of those out, so that we have a little more distance between the word that we’ve written, and the one that we hear. This is less important for the reader, but it’s nice to get something different enough back. We essentially search for synonyms for words, and if we replace one, we add it to the list of pre-garbled nonsense that we’re writing all this code for, so that we DON’T garble it further later on.
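Here is a hedged sketch of that synonym swap. It assumes NLTK's WordNet corpus for the lookup (which needs `nltk.download('wordnet')` first); the function names, the `chance` parameter, and the idea of returning the swapped positions as a set are all my own choices for illustration.

```python
import random

def wordnet_synonyms(word):
    """Look up synonyms through NLTK's WordNet interface."""
    from nltk.corpus import wordnet  # lazy import; requires the wordnet corpus
    names = {lemma.name().replace("_", " ")
             for synset in wordnet.synsets(word)
             for lemma in synset.lemmas()}
    names.discard(word)
    return sorted(names)

def swap_synonyms(words, lookup=wordnet_synonyms, chance=0.5, rng=None):
    """Randomly replace words with synonyms.

    Returns the new word list plus the set of positions that were swapped,
    so later garbling steps can leave those words alone.
    """
    rng = rng or random.Random()
    result, swapped = [], set()
    for index, word in enumerate(words):
        options = lookup(word) if rng.random() < chance else []
        if options:
            result.append(rng.choice(options))
            swapped.add(index)
        else:
            result.append(word)
    return result, swapped

# Stand-in lookup so the example runs without nltk installed.
fake_thesaurus = {"cat": ["feline"], "big": ["huge", "large"]}
print(swap_synonyms(["big", "cat", "sat"], lookup=lambda w: fake_thesaurus.get(w, [])))
```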

5 - Deleting Random Parts

Still sounding too normal? Probably so! But not if we start absent-mindedly dropping parts of the words as if they were commit messages on a Friday afternoon! Here we check out some of the words in the sentence and just hey presto delete some.
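A minimal sketch of the dropping step, assuming we remove one random character from some of the longer words. The `drop_parts` name, the `chance` value, and the "leave very short words alone" rule are assumptions rather than the colab's exact behaviour.

```python
import random

def drop_parts(words, chance=0.3, rng=None):
    """Randomly delete one character from some words (words of 1-2 letters are kept)."""
    rng = rng or random.Random()
    result = []
    for word in words:
        if len(word) > 2 and rng.random() < chance:
            cut = rng.randrange(len(word))       # position of the character to drop
            word = word[:cut] + word[cut + 1:]
        result.append(word)
    return result

print(drop_parts(["companion", "robot", "is", "here"], chance=0.5))
```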

6 - Pitch and Time!

Everything we’ve done so far has ENTIRELY been about swapping the content that we’re working with. Here, we’ll take the value (given later by the user) of an amount of pitch and time shifting, remove some silence from parts of the file, and shift it around by different amounts. This is where we start getting somewhere in the ballpark of the Stray-sounding voiceover from the companions!
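Something along these lines can be sketched with librosa's `effects` module (`trim`, `pitch_shift`, and `time_stretch`). The helper names, the `top_db` threshold, and the way the random amount is picked are assumptions for illustration, and `samples` is expected to be a float array like the one `librosa.load` returns.

```python
import random

def pick_shift(max_semitones, rng=None):
    """Pick a random pitch shift between -max_semitones and +max_semitones."""
    rng = rng or random.Random()
    return rng.uniform(-max_semitones, max_semitones)

def shift_clip(samples, sample_rate, n_steps, rate, top_db=30):
    """Trim leading/trailing silence, then pitch shift and time stretch a clip."""
    import librosa  # lazy import so pick_shift works without librosa installed
    trimmed, _ = librosa.effects.trim(samples, top_db=top_db)
    shifted = librosa.effects.pitch_shift(trimmed, sr=sample_rate, n_steps=n_steps)
    return librosa.effects.time_stretch(shifted, rate=rate)

# Each word can get its own amount, which is where the wobble comes from.
print([round(pick_shift(4), 2) for _ in range(3)])
```

A `rate` above 1 speeds the clip up and below 1 slows it down, while `n_steps` is measured in semitones.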

7 - Variations

8 - The Logic Flow

Here’s essentially the whole thing in action!

We count the words
Get some pronunciation
Rearrange the sentence
Delete some words
Send each word off for audio!
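The steps above can be sketched as a small pipeline. This is a simplified outline under my own assumptions: `run_pipeline` and `speak_with_gtts` are placeholder names, each step is just a function that takes and returns a list of words, and the audio step uses gTTS's `gTTS(...).save(...)`, which needs network access.

```python
import os

def speak_with_gtts(index, word, lang="en", folder="."):
    """Render one word to an MP3 with gTTS and return the file path."""
    from gtts import gTTS  # lazy import so the pipeline can run with a stand-in
    path = os.path.join(folder, f"word_{index}.mp3")
    gTTS(text=word, lang=lang).save(path)
    return path

def run_pipeline(sentence, steps, speak=speak_with_gtts):
    """Split the sentence, apply each garbling step in order, then voice each word."""
    words = sentence.split()
    for step in steps:
        words = step(words)
    words = [word for word in words if word]  # skip anything garbled down to nothing
    return [speak(index, word) for index, word in enumerate(words)]

# Stand-in steps and speaker so the example runs offline.
reverse = lambda words: [word[::-1] for word in words]
drop_last = lambda words: words[:-1]
print(run_pipeline("hello big cat", [reverse, drop_last],
                   speak=lambda index, word: f"{index}:{word}"))
```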

9 - The Audio Files!

Stitching each created word together, pitch shifting it a bit more, and creating all the necessary output audio files!
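To show the stitching idea without any extra installs, here is a minimal sketch that uses the standard library `wave` module as a stand-in for soundfile. It assumes each word is already a list of mono 16-bit samples at the same sample rate; `stitch_clips` and the default rate are placeholders.

```python
import array
import sys
import wave

def stitch_clips(clips, path, sample_rate=22050):
    """Concatenate mono 16-bit clips and write them out as a single WAV file.

    Returns the total number of samples written.
    """
    stitched = array.array("h")  # signed 16-bit integers
    for clip in clips:
        stitched.extend(clip)
    if sys.byteorder == "big":
        stitched.byteswap()      # WAV data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(stitched.tobytes())
    return len(stitched)
```

With numpy arrays from librosa, the same idea would be a concatenate followed by a soundfile write, but the principle is identical: join the per-word clips end to end, then save once.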

I’d like to get down into the robotic sounding vocoder elements at some point, but I found a LOT of use out of this part, and I really enjoyed the logic puzzle of garbling some voiceover! I also went ahead and gave the user a little GUI (in the colab file anyway) that allowed you to create a number of variations, enter some text, and pick a text to speech language, which gives some really fun results!

Stray Companion Voiceover Generator