Aeiou is a digital musical instrument that I designed for Northwestern University's COMP SCI 497: Digital Musical Instrument Design. As the name suggests, it's an instrument that uses vowels to bring a more human-like feeling to a digital instrument.
I was really inspired by an article that I read during the class. Overholt (2009) stated that "one of the primary factors in the evolution of acoustic instruments promoted instruments that were better at expressing human emotion". I thought, "Well, what better way is there to convey human emotion than to mimic a certain quality of the human voice?"
Aeiou consists of two main components: the physical interface (the Arduino) and the digital interface (Max). The two sensors currently used are an ultrasonic sensor, which controls the overall pitch, and an accelerometer/gyroscope, which controls the vowel shape of the pitch being played. Although only 5 vowels can be used at once, as seen in the video below, you can adjust certain settings of the instrument, such as which vowels you want to use and how much vibrato you want to add. You can also set the key or scale you want to play in on the Max interface (not shown in the video). The Arduino interface is connected to my laptop via a USB cable, and the audio comes out of my computer after being processed by my Max code.
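To make the pitch control a little more concrete, here is a minimal Python sketch of how a distance reading could be snapped to a note in a chosen scale. It is not the actual Max patch: the function names, the 5 to 60 cm playing range, the two-octave span, and the C major default are all assumptions made for illustration.

```python
# Hypothetical mapping from an ultrasonic distance reading to a scale note.
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets within one octave


def distance_to_midi(distance_cm, root_midi=60, scale=MAJOR_SCALE,
                     min_cm=5.0, max_cm=60.0, octaves=2):
    """Map a distance reading onto a note of the chosen scale (ranges assumed)."""
    # Clamp the reading so out-of-range values stick to the nearest end.
    clamped = max(min_cm, min(max_cm, distance_cm))
    position = (clamped - min_cm) / (max_cm - min_cm)  # 0.0 .. 1.0
    steps = len(scale) * octaves  # index of the highest reachable scale step
    step = round(position * steps)
    octave, degree = divmod(step, len(scale))
    return root_midi + 12 * octave + scale[degree]


def midi_to_hz(note):
    """Equal-temperament conversion (MIDI note 69 = A4 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)
```

Changing `root_midi` or passing a different list of semitone offsets as `scale` is the sketch's equivalent of picking a new key or scale on the Max interface.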
Vowels sound the way they do because of the acoustic resonance that results from changing the space within your mouth or, more accurately, your vocal tract. To achieve a similar effect digitally, a resonant bandpass filter is applied to a sound wave that is rich in harmonics (e.g. a square, triangle, or sawtooth wave). A resonant bandpass filter passes a chosen center frequency along with some of the surrounding frequencies, depending on the specified bandwidth. To model the vowel sounds, a combination of 3 formant frequencies was used; these can be found in the Wikipedia article on formants as well as on this webpage created by Kevin Russell, a linguistics professor at the University of Manitoba.
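The idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the Max code behind Aeiou: the bandpass is a standard two-pole (biquad) design, the Q value is a guess, and the formant frequencies in `FORMANTS` are rough textbook-style values chosen for the example. The sources mentioned above are the place to look for proper tables.

```python
import math

SAMPLE_RATE = 44100


def sawtooth(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Naive (non-band-limited) sawtooth in the range -1..1."""
    return [2.0 * ((i * freq_hz / sample_rate) % 1.0) - 1.0
            for i in range(n_samples)]


def bandpass(signal, center_hz, q, sample_rate=SAMPLE_RATE):
    """Two-pole resonant bandpass with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


# Assumed, approximate formant frequencies in Hz (illustration only).
FORMANTS = {"a": (730, 1090, 2440), "i": (270, 2290, 3010), "u": (300, 870, 2240)}


def vowel(freq_hz, name, n_samples, q=10.0):
    """Filter one sawtooth through three parallel bandpasses and mix them."""
    source = sawtooth(freq_hz, n_samples)
    bands = [bandpass(source, f, q) for f in FORMANTS[name]]
    return [sum(samples) / len(bands) for samples in zip(*bands)]


def rms(signal):
    """Root-mean-square level, handy for comparing filter outputs."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))
```

Calling `vowel(220, "a", SAMPLE_RATE)` gives one second of samples. A higher `q` narrows each band and makes the vowel colour more pronounced, at the cost of a more "ringing" sound.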
For my performance, I played a song called Sapiliepah a radiw, which is an Amis/Pangcah folk song.
As you can see in the video, despite practicing for a few days, my performance was not perfect. However, I was still pleasantly surprised by how well the instrument was able to mimic the human voice. In fact, there were times when it sounded so similar that I was surprised how easy it was to achieve using mainly 3 formant frequencies.
Who are the Amis or Pangcah? You've probably never heard of them before, but you've probably heard their singing, especially if you've heard Enigma's Return to Innocence. The Amis or Pangcah people are an Austronesian ethnic group indigenous to Taiwan. Although they only make up roughly 2% of the modern-day Taiwanese population, their culture and presence have been significant in forming Taiwanese culture and identity. In fact, one of Asia's most famous superstars, A-Mei (張惠妹), is a singer from the Puyuma ethnic group.
As a Taiwanese American, I spent many long summers in Taiwan during my childhood. Although I did not interact with the Taiwanese indigenous people frequently, their beautiful songs have made a lasting impact on me. Since one of the most striking features of arguably one of their most famous songs is the use of vowels, I thought I would take this opportunity to demonstrate the capabilities of my instrument as well as showcase one of their heartwarming songs.
Initially, modeling the vowel sounds was difficult to wrap my head around: although I have an interest in linguistics, some of the technical terms and concepts were a bit advanced for me. Thankfully, a lot of work has already been done on modeling vowel sounds by studying vowel formant frequencies, so I was able to find some existing Max code that could lay the foundations of my instrument.
At the same time, I wanted to avoid completely mimicking and modeling the human voice: my intention was not to make a vocaloid, and I would like to avoid the uncanny valley. Unexpectedly, the vowel sounds at times seemed a bit too realistic.
Another thing I originally wanted to do was implement the instrument so that it responds slightly differently in the low register compared to the high register, which is a phenomenon that we see with acoustic instruments in general. However, as you probably noticed earlier, this was not part of the instrument design due to the time it took to work out issues with interpreting signals from the Arduino. This brings us to the next challenge that I faced.
Since I wanted to have 5 immediately accessible vowels on my instrument at a time, I had to map out regions on the accelerometer's range of motion in a way that was intuitive to me (and hopefully others) and reasonably ergonomic to perform with. Since the extreme values that the accelerometer sends are when it is fully tilted right, left, forward, and back, I decided to use these as the vowel regions. "But wait, that's only 4 regions!" And you're right, so I decided to make the 5th region where the accelerometer is flat, or at least as flat as possible. Little did I know, this also became the biggest source of Aeiou's glitchiness.
To understand why it's "glitchy" in the first place, I should give you some background on how these regions are meant to work together. Since I wanted the regions to transition into each other smoothly, I defined values where each vowel would be at its strongest and where it would be at its quietest. This worked well for all four of the outer regions, but not for the middle one, since I had to combine two different ranges back to back (0 to 1 and 1 to 0) in order for the middle value to be the loudest. As a result, when you cross the middle point, the sudden change in value and break in continuity results in a "click" in the audio output. Of course, there might be a way to fix this in Max, but as a beginner, I haven't found the solution yet. If you happen to know, feel free to reach out and let me know!
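For readers who think better in code, the region weighting described above might look something like the following Python sketch. It is a guess at the general shape of the mapping, not a port of the Max patch: the region names, the normalized -1..1 tilt inputs, and the `smooth` helper are all assumptions. The middle region uses a single "tent" expression instead of two ranges joined back to back, so its value has no jump at the center, and the smoothing step limits how fast any weight can change. Whether this would remove the click in the real patch is untested.

```python
def vowel_weights(tilt_x, tilt_y):
    """Return a loudness weight (0..1) for each of the five regions.

    tilt_x and tilt_y are assumed to be normalized to -1..1, where 0 is flat.
    """
    x = max(-1.0, min(1.0, tilt_x))
    y = max(-1.0, min(1.0, tilt_y))
    return {
        "right": max(0.0, x),
        "left": max(0.0, -x),
        "forward": max(0.0, y),
        "back": max(0.0, -y),
        # Tent shape: 1 when flat, falling to 0 at full tilt in any direction.
        "flat": 1.0 - max(abs(x), abs(y)),
    }


def smooth(previous, target, amount=0.2):
    """One-pole smoothing: move a fraction of the way toward the target."""
    return previous + amount * (target - previous)
```

In use, each new sensor reading would produce target weights from `vowel_weights`, and the weights actually applied to the audio would be updated with `smooth` so they glide rather than jump.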
Lastly, soldering was a bit more difficult than I thought. Although I would say I have pretty good fine motor skills and I do quite a bit of crafting, solder behaves unlike anything I have worked with before.
Most notably, the instrument's interface has not been fully fleshed out yet. While the overall interaction mechanism has mostly been decided, there is still room for improvement in terms of how the sensors should be held and the casing that should house them. I would also like to eventually add a button that allows for the user to start and stop the audio output easily to allow for the instrument to "play" rests.
In addition, as mentioned previously in the "Challenges & Obstacles" section, I would eventually like to achieve a sound that is less human but still retains its vowel-like qualities. I would also like to introduce some unique character to Aeiou by implementing something that causes the instrument to behave a little differently at high and low registers.
Finally, it would be nice to find a way to remove the clicking that occurs when switching from one vowel to another to make the overall sound smoother.
Of course you do! Well, look no further: I have provided both my Arduino and Max code below. Unfortunately, you'll have to assemble your own Arduino interface, but you should be able to easily obtain the parts online if you have a budget of $100 USD.
To download the files, right click and select "Save Link As...". If you are new to Max, in order to use the Max code, select and copy the whole thing, then from the "File" menu in Max, select "New From Clipboard". As for the auxiliary file, just download it and save it in the same folder/directory as your newly created Max file. The Arduino code should open without any problems as long as you have the Arduino software downloaded and installed on your computer.
Download Max Code Download Max Auxiliary File Download Arduino Code