Specialization

Process

I started by trying to find a good library for recording and playing back audio. I settled on FMod, since it's a library I have some experience with and it would also simplify the proximity part. The goal was then to extract the raw data from the sound, in the form of a char pointer, and send it to another client, which would recreate the sound.

The Code used for recording

I started by looking at FMod's recording example. In it, they create a sound and get the index of an input device, then use these to start recording into that specific sound. It's also important to note that they create the sound with a circular buffer one second long (calculated from the microphone's sample rate in hertz). Although their recording code did what I needed, the playback would have to change, since I can't just play the sound directly like they do.

The code I used for recording was pretty much lifted from the example above. The biggest changes were letting the user choose which input device to use and running all of this code on a separate thread. I also limited the sample rate of the audio to 16000 Hz, since that kept the size of the data within a certain limit (under 512 bytes).
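To get a feel for how the sample rate affects the amount of data, the arithmetic can be sketched as below. It assumes uncompressed 16-bit mono PCM, which is an assumption made for this illustration. The function names are hypothetical and not part of FMod.

```cpp
#include <cassert>
#include <cstddef>

// Bytes produced by one second of uncompressed PCM audio.
// This is also the size of a one-second circular buffer.
std::size_t bytesPerSecond(std::size_t sampleRate,
                           std::size_t channels,
                           std::size_t bytesPerSample) {
    return sampleRate * channels * bytesPerSample;
}

// How many milliseconds of audio fit into a chunk of maxBytes.
double millisecondsInChunk(std::size_t maxBytes,
                           std::size_t sampleRate,
                           std::size_t channels,
                           std::size_t bytesPerSample) {
    const std::size_t perSecond = bytesPerSecond(sampleRate, channels, bytesPerSample);
    assert(perSecond > 0);  // guard against division by zero
    return 1000.0 * static_cast<double>(maxBytes) / static_cast<double>(perSecond);
}
```

Under these assumptions, 16000 Hz gives 32000 bytes per second, so a 512-byte chunk holds 16 ms of audio. At 48000 Hz the same 512 bytes would hold only about 5.3 ms.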

To then extract the data, I lock and unlock the sound. Since the sound has a circular buffer, I just need to add the length of the last bit of data to the offset, and then take the current position in the sound minus the offset to get the new length. I then take the char array and its size and use winsock2 to send them to the server. All the server then does is forward the data to all other clients.
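The offset and length bookkeeping for a circular buffer can be sketched as follows. This is a minimal illustration and not the code from the download: the names `Region`, `nextRegion` and `advanceOffset` are made up, and all values are assumed to be in bytes. One thing a plain "position minus offset" does not cover is the moment the record position wraps around to the start of the buffer, so the sketch handles that case explicitly.

```cpp
#include <cassert>
#include <cstddef>

// A span of newly recorded data inside the circular buffer (in bytes).
struct Region {
    std::size_t offset;  // where the new data starts
    std::size_t length;  // how many bytes are new
};

// Work out which part of the buffer was written since the last read.
// lastOffset: where the previous read ended.
// recordPos:  where the recorder is writing now.
Region nextRegion(std::size_t lastOffset,
                  std::size_t recordPos,
                  std::size_t bufferLength) {
    assert(bufferLength > 0);
    assert(lastOffset < bufferLength && recordPos < bufferLength);
    const std::size_t length =
        (recordPos >= lastOffset)
            ? recordPos - lastOffset                  // no wrap-around
            : bufferLength - lastOffset + recordPos;  // wrapped past the end
    return Region{lastOffset, length};
}

// Move the read offset forward, wrapping at the end of the buffer.
std::size_t advanceOffset(std::size_t lastOffset,
                          std::size_t length,
                          std::size_t bufferLength) {
    assert(bufferLength > 0);
    return (lastOffset + length) % bufferLength;
}
```

As far as the FMOD documentation describes it, `System::getRecordPosition` reports the position in PCM samples while `Sound::lock` takes its offset and length in bytes, so a conversion between the two would be needed before using numbers like these. When the locked region wraps, `Sound::lock` hands back two pointer and length pairs, and both parts would have to be copied.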

For playback on the other clients, all I had to do, in theory, was recreate the sound by passing the char array in the name slot and using the FMOD_OPENMEMORY setting. After that I would just need to play the sound.
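Raw data taken from a locked sound is just a run of samples with no file header, so the receiving side has to know the sample format and rate to interpret it. The sketch below shows what that interpretation looks like, assuming 16-bit little-endian mono PCM. The helper names are hypothetical and this is not FMod code. FMOD's documentation describes FMOD_OPENRAW, together with the format, channel count and default frequency in FMOD_CREATESOUNDEXINFO, as the way to open headerless PCM, which may be worth checking against the FMOD_OPENMEMORY approach described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interpret raw bytes as 16-bit little-endian PCM samples.
// A trailing odd byte, if any, is ignored.
std::vector<std::int16_t> decodePcm16(const std::vector<unsigned char>& bytes) {
    std::vector<std::int16_t> samples;
    samples.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const std::uint16_t raw = static_cast<std::uint16_t>(
            bytes[i] | (static_cast<unsigned>(bytes[i + 1]) << 8));
        samples.push_back(static_cast<std::int16_t>(raw));
    }
    return samples;
}

// Playback time of sampleCount mono samples at the given playback rate.
double durationMs(std::size_t sampleCount, std::size_t playbackRate) {
    assert(playbackRate > 0);
    return 1000.0 * static_cast<double>(sampleCount) / static_cast<double>(playbackRate);
}
```

The second helper shows why the rate matters on the receiving side: the same samples last longer when played at a lower rate than they were recorded at. For example, 48000 samples take 1000 ms at 48000 Hz but 3000 ms at 16000 Hz.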

Result

The result was, sadly, subpar. While I am able to get the sound data and send it through winsock for playback, the sound that plays back is not right, even if I skip the network part and play it on the same device. From what I can hear, the playback is slower and more robotic, and since the audio is slower, it also gets cut off before the whole sound has played.

I tried increasing the frequency of the sound, but that only made my voice higher pitched, which was an expected byproduct.

The code I wrote can be downloaded here. It is not finished, but if someone who is better at FMod than I am had a look at it, they might be able to solve it. That's why I'm making this code "open source", meaning that anyone can use and edit it however they want without crediting me.

Link To Download

Reflection

Even though I've written quite a lot of FMod code recently, I am still unsure why this code produced the results that it did. My latest theory is that locking the sound to extract the data also pauses the recording. The solution to that would be to create the sound using FMOD_OPENMEMORY_POINT and then feed in my own buffer to read from, but I haven't had a spare moment to test that.