Thesis audiobook aka. Thesis grenade

Thesis grenade, almost finished
Thesis audiobook, contents missing

Idea

A good friend of mine, Ville Kotimäki (who is also an excellent photographer, btw), just finished his PhD, so figuring out a present to give was in order. I got the idea to machine a really solid piece out of aluminum and engrave it with text. I also wanted it to play something, as that seems to be a recurring theme for me.

The initial idea was to build a fire alarm siren inside an aluminum container, which would be welded shut and would have only an ON button and no means to turn it off. The idea was quickly scrapped because I would have to listen to it too. The second iteration was to make the container read out the thesis like an audiobook.

Text

Fortunately I managed to recruit my friend Heikki to do all the laborious tasks. We started by downloading the thesis PDF. The PDF was converted to text, and Heikki did the most annoying part of the project: cleaning the text files of badly converted items such as equations, picture captions and tables. In the meantime I tried a few different speech synthesis programs. I would have liked to use open source software, but most of them sounded like Stephen Hawking on a bad day.

The commercial Nuance engine was far superior to everything else I tried. Nobody seemed to have a license for it, but OS X has a nifty feature: its text-to-speech support is provided by Nuance, and you can even download additional Nuance voices from the System Preferences menu. Heikki proceeded to write a script that splits the text file into single pages. These pages are then converted to speech with the “say” command in OS X, and the resulting AIFF files are converted to WAV files suitable for the WAV library we used on the Arduino.
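The conversion pipeline could look roughly like the sketch below. This is not Heikki's actual script. It assumes that the PDF-to-text step marks page breaks with a form feed character, that `afconvert` (which ships with OS X) does the AIFF to WAV step, and that 8-bit unsigned mono PCM is the format the Arduino library wants. The helper names, the `out` directory and the voice name in the usage note are made up for illustration. The functions only build the command lines, so nothing here runs `say` by itself.

```python
FORM_FEED = "\f"  # assumption: the converter marks page breaks with a form feed


def split_pages(text):
    """Split the cleaned thesis text into pages, dropping empty ones."""
    pages = [page.strip() for page in text.split(FORM_FEED)]
    return [page for page in pages if page]


def commands_for_page(page_no, voice_no, voice_name, workdir="out", sample_rate=31250):
    """Build the two command lines (as argument lists) for one page and one voice.

    Output files follow the <page>-<voice>.wav pattern used on the SD card.
    """
    txt = f"{workdir}/{page_no}.txt"  # the page text, written out beforehand
    aiff = f"{workdir}/{page_no}-{voice_no}.aiff"
    wav = f"{workdir}/{page_no}-{voice_no}.wav"
    # say: -v picks the voice, -o the output file, -f the input text file
    say = ["say", "-v", voice_name, "-o", aiff, "-f", txt]
    # assumption: 8-bit unsigned mono PCM at the given rate suits the player
    convert = ["afconvert", "-f", "WAVE", "-d", f"UI8@{sample_rate}", "-c", "1", aiff, wav]
    return say, convert
```

On a Mac, each page would first be written to its text file and the two lists then passed to `subprocess.run(cmd, check=True)`, for example with `commands_for_page(1, 1, "Alex")`. Running `say -v '?'` lists the voices that are actually installed.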

Container

The container itself was first drafted on a piece of paper. The measurements were dictated by the speaker we used (diameter 66 mm) and by the fact that the speaker, the 9 V battery and the power switch sit on top of each other inside. My brother machined the container shape from solid aluminum on a manual lathe, working from the drawing. The engraving, the machining of the legs and the drilling of the bottom to let the sound out were designed with Alphacam software and machined on a Haas UMC-750 5-axis machining center at my company, G-Tronic.

I thought the engraving would be an easy job with Alphacam, but it turned out that the post processor, which generates code for the machining center from the CAD drawing, had a few nasty bugs that occasionally made the tool go through the workpiece. Eventually I had to resort to the help of the professional machinists at the company, but (or because of that) the finished product looks great! The black effect on the text is achieved by coloring it over with a black Sharpie and wiping it with a tissue dipped in acetone.

Electronics

The schematic for the device is really simple. The SD card is connected in parallel with the ISP interface of the ATmega328P. Since SD cards operate at 3.3 V, an LM328 regulator drops the voltage from the 9 V battery to 3.3 V. A BC547 transistor acts as an extremely simple audio amplifier, switching the 9 V supply to the speaker under the control of the processor's PWM output. Not an audiophile solution, but it works surprisingly well in this case. Since Heikki wanted to learn some electronics, I just drew the schematic in Eagle and left him the job of figuring out the parts arrangement on the stripboard and soldering the parts together.

We had a lot of trouble with the first version. We tried to use an LD1117V33 fixed voltage regulator. When measured, it output a solid 3.3 V, but the ATmega just would not start executing the code. The circuit worked without problems when powered from a PSU. The only explanation I can think of is that the regulator started to oscillate, but when we checked the output with an oscilloscope, nothing obvious was visible. In the end we swapped the regulator for the trusty old LM328 and the problems went away. The circuit draws about 170 mA, which translates to roughly 3 h of use from a 9 V battery, but we figured that no sane person wants to listen to this for more than a couple of minutes at a time. Note that the pinout of connector SV2 is not the same as the SD card pinout! For an example of how to connect the SD card to an AVR, see here.
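The runtime estimate is just capacity divided by current. As a quick check, assuming a capacity of about 500 mAh (a ballpark figure for a 9 V block, not something we measured):

```python
def battery_life_hours(capacity_mah, current_ma):
    """Ideal runtime: battery capacity divided by a constant current draw."""
    return capacity_mah / current_ma


# assumption: about 500 mAh of usable capacity in the 9 V battery
hours = battery_life_hours(500, 170)
print(f"{hours:.1f} h")  # prints "2.9 h"
```

The usable capacity of a small 9 V battery tends to shrink at higher currents, so this is better read as an upper bound.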

Firmware

The firmware itself is quite straightforward. The only difficulty was the user interface, since there is only a power button and we did not want the device to start at the beginning of the book each time. We also wanted a couple of different voices to read the book. On the SD card, audio files are saved with the file name pattern <page (number)>-<voice (number)>.wav. When the device is turned on, it randomly picks one of the three voices and starts playing from the page saved in EEPROM. Each time it finishes playing one page (one file), it saves the next page number to EEPROM. If the device is turned off before it finishes playing back one page, the counter resets to the beginning of the book. There’s even a nifty page turn sound saved in every other file.
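One way to get this behaviour from a single power switch is sketched below, in Python for readability rather than as the actual Arduino code. It is one possible reading of the description above, not the real firmware: the saved page is read at power-on and the stored value is immediately reset, so it only moves forward again once a whole page has been played. The `Player` class, the dict standing in for the EEPROM, numbering pages and voices from 1, and wrapping around at the end of the book are all assumptions.

```python
import random

FIRST_PAGE = 1  # assumption: pages are numbered from 1
NUM_VOICES = 3  # three voices, numbered 1..3 (numbering is an assumption)


def file_name(page, voice):
    """File name pattern on the SD card: <page>-<voice>.wav."""
    return f"{page}-{voice}.wav"


class Player:
    """Models the resume logic; `eeprom` is a dict standing in for the AVR EEPROM."""

    def __init__(self, eeprom, last_page, rng=random):
        self.eeprom = eeprom
        self.last_page = last_page
        self.rng = rng
        self.page = FIRST_PAGE
        self.voice = 1

    def power_on(self):
        # Resume from the stored page, then reset the stored value right away:
        # switching off before one full page has played restarts the book.
        self.page = self.eeprom.get("page", FIRST_PAGE)
        self.eeprom["page"] = FIRST_PAGE
        self.voice = self.rng.randint(1, NUM_VOICES)
        return file_name(self.page, self.voice)

    def page_finished(self):
        # A whole page was played: store the next page and keep going.
        # Wrapping around after the last page is an assumption.
        self.page = self.page + 1 if self.page < self.last_page else FIRST_PAGE
        self.eeprom["page"] = self.page
        return file_name(self.page, self.voice)
```

With this scheme, a quick off-and-on acts as "back to the beginning", while switching off after at least one full page resumes from the page that was interrupted.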

When we initially tested the setup on an Arduino UNO board, we used the TMRpcm library. It worked well, but on the final hardware we used the internal 8 MHz oscillator instead of a 16 MHz crystal and found out that the library does not support clock speeds other than 16 MHz. We switched to the superior SimpleSDAudio library, which enabled us to use a 31.25 kHz sampling rate at 8 MHz. The library also seemed to have a smaller compiled size, but that was irrelevant to us since we are only using 10k of the 32k code space.
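The 31.25 kHz figure is exactly what you get when one sample is output per period of an 8-bit PWM counter clocked straight from the CPU clock. That the library derives its rate this way is an assumption here, not something checked against its source. The same arithmetic gives a feel for how much SD card space the audio takes, assuming 8-bit mono samples:

```python
PWM_STEPS = 256  # one period of an 8-bit PWM counter


def pwm_sample_rate_hz(cpu_hz, prescaler=1):
    """Sample rate if one sample is output per 8-bit PWM period."""
    return cpu_hz / (prescaler * PWM_STEPS)


def bytes_per_minute(sample_rate_hz, bytes_per_sample=1, channels=1):
    """Raw audio data per minute (assumption: 8-bit mono samples)."""
    return int(sample_rate_hz * bytes_per_sample * channels * 60)


rate = pwm_sample_rate_hz(8_000_000)
print(rate)                    # 31250.0
print(bytes_per_minute(rate))  # 1875000, a bit under 2 MB per minute
```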

Result

I was really happy with the finished device. One of Ville’s friends dubbed the device the “Thesis Grenade”, which is quite accurate by the looks of it =) The video below is a little bit repetitive: I tried to demonstrate the different voices used, but the random generator decided to pick the same voice many times in a row.