
WAV files: How I design myself to a standstill, and how to get around that.

So, I need to write some code that writes Windows .WAV audio files, and I’ve been looking into the format. I am sure there are libraries out there for doing this, but since I can never use open-source code on Nintendo projects, I thought it would be good to write my own.

While I am working on writing the files, I might as well add code to read the files as well, which is where it gets a little more tricky.

As it turns out, a .WAV file is actually a “RIFF” container, a format that can hold just about any type of data. There is some fairly standard stuff that an audio file needs, but there is also a bunch of other stuff that MIGHT be there, and you have to at least skip the parts you don’t need properly if you want any hope of reading many of the .WAV files out there.

Fortunately, the format is fairly well designed: each of these “chunks” starts with a 4-letter ID code followed by a 4-byte length that covers the rest of the data in that chunk. So you can pretty much look at the ID to see if it is useful, read the length, and then either read or seek past the entire chunk. One catch: a chunk with an odd length is followed by a pad byte that the length does not count, so the skip has to round up to an even size. Otherwise, that is pretty easy.
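Here is a rough sketch of that read-or-skip loop in Python, just to show how little there is to it. The function name `iter_chunks` is made up for illustration, and it assumes a seekable stream with little-endian lengths, as RIFF uses.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (chunk_id, data_offset, size) for each top-level chunk in a RIFF/WAVE stream."""
    start = stream.tell()
    riff_id, riff_size, form_type = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or form_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    # The RIFF size counts everything after the first 8 bytes of the file.
    end = start + 8 + riff_size
    while stream.tell() + 8 <= end:
        header = stream.read(8)
        if len(header) < 8:
            break  # truncated file, stop quietly
        chunk_id, size = struct.unpack("<4sI", header)
        data_offset = stream.tell()
        yield chunk_id, data_offset, size
        # Seek to the next chunk; odd-sized chunks carry one uncounted pad byte.
        stream.seek(data_offset + size + (size & 1))
```

The caller decides per chunk whether to read `size` bytes at `data_offset` or do nothing at all; because the loop seeks to an absolute position afterwards, either choice is safe.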

Now, how to arrange all this in memory so that an arbitrary block of code can find what it needs in all of this is the tricky bit.

One could build up some separate data structures to store all the chunk data, and write routines to search through it for what you want later, but that is an awful lot of work.

On the other hand, it might be easier just to read the whole block, and then just traverse all the header stuff looking for what you want when you are ready to use it. But this also has drawbacks, namely that you are possibly keeping a lot of extra data around, and when you need to access information about that data you lose time searching for the headers you need.

Or, I could write my code as a toolbox for parsing the .WAV file on your own terms so that you can get what you want, but that assumes future programmers who might use it would understand enough about the .WAV file format to know how and why to use said toolbox. Will anybody besides me ever use this?

I suppose I could have it parse the important bits I know that I will need, and then give the option to give a list of chunk IDs to not throw away and a mechanism to find them again? Are they really that important?

And suddenly, I realize that I have become paralyzed by the design decisions, trying to make a perfect system that I can use forever and never write again, which will only make sure that I make something so complicated that I will never finish it, and will never WANT to use it again!

So I step back. What do I need it to do, minimum?

  • Write a stereo .WAV audio file at 44.1kHz.
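For scale, the minimum goal fits in a few lines. This is a sketch only: `build_wav` is a hypothetical name, and 16-bit PCM is my assumption, since the goal above only fixes stereo and 44.1kHz.

```python
import struct

def build_wav(samples, sample_rate=44100, channels=2):
    """Return the bytes of a 16-bit PCM WAV file.

    samples: interleaved signed 16-bit integers (L, R, L, R, ... for stereo).
    """
    if len(samples) % channels:
        raise ValueError("sample count must be a multiple of the channel count")
    bits = 16                               # assumed bit depth
    block_align = channels * bits // 8      # bytes per frame
    data = struct.pack("<%dh" % len(samples), *samples)
    # fmt chunk body: format tag 1 (PCM), channels, rate, byte rate, block align, bits.
    fmt = struct.pack("<HHIIHH", 1, channels, sample_rate,
                      sample_rate * block_align, block_align, bits)
    # RIFF size = "WAVE" + fmt chunk (header + body) + data chunk (header + body).
    riff_size = 4 + (8 + len(fmt)) + (8 + len(data))
    return (b"RIFF" + struct.pack("<I", riff_size) + b"WAVE"
            + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", len(data)) + data)
```

Calling `build_wav([0, 0, 1000, -1000])` gives a two-frame stereo file: a 44-byte header followed by 8 bytes of sample data.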

What would be secondary goals?

  • Allow the sampling rate to be adjusted and be able to read the Rhythm Core Alpha 1 & 2 sounds, which are mono and contain loop data.

Anything else?

  • I want to be able to write the file as I go, rather than build it all in RAM, so that if I port to smaller platforms later, I won’t need so much memory.
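One way to write as you go is to put placeholder sizes in the header, stream the samples out, and then seek back to patch the two size fields at the end. A minimal sketch, assuming a seekable output that starts at offset 0 and 16-bit PCM; the class and method names are made up for illustration.

```python
import io
import struct

class StreamingWavWriter:
    """Writes a 16-bit PCM WAV incrementally, patching the sizes on close()."""

    def __init__(self, stream, sample_rate=44100, channels=2):
        self._stream = stream
        self._data_bytes = 0
        block_align = channels * 2  # 2 bytes per 16-bit sample
        # Both size fields are written as 0 for now and fixed up in close().
        stream.write(b"RIFF" + struct.pack("<I", 0) + b"WAVE")
        stream.write(b"fmt " + struct.pack("<IHHIIHH", 16, 1, channels, sample_rate,
                                           sample_rate * block_align, block_align, 16))
        stream.write(b"data" + struct.pack("<I", 0))

    def write_frames(self, pcm_bytes):
        """Append already-packed PCM bytes (whole frames) to the data chunk."""
        self._stream.write(pcm_bytes)
        self._data_bytes += len(pcm_bytes)

    def close(self):
        s = self._stream
        s.seek(4)    # RIFF size: the 44-byte header minus 8, plus the data
        s.write(struct.pack("<I", 36 + self._data_bytes))
        s.seek(40)   # data chunk size field
        s.write(struct.pack("<I", self._data_bytes))
        s.seek(0, io.SEEK_END)

# Usage: only one small buffer of samples is in memory at a time.
buf = io.BytesIO()
writer = StreamingWavWriter(buf)
writer.write_frames(bytes(8))   # two silent stereo frames
writer.write_frames(bytes(4))   # one more
writer.close()
```

The trade-off is that the output has to be seekable, which is fine for files but rules out pipes.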

Is there anything I can do to make future expansion easier?

  • Structure the chunk reading loop so that there is a good place to put processing for additional chunks.
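A “good place to put processing for additional chunks” could be as simple as a table that maps chunk IDs to handler functions, with everything else skipped. This is only a sketch; `read_wav`, `HANDLERS`, and the handler names are hypothetical.

```python
import io
import struct

def parse_fmt(data, info):
    # First 16 bytes of the fmt chunk: tag, channels, rate, byte rate, block align, bits.
    tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from("<HHIIHH", data)
    info.update(format_tag=tag, channels=channels, sample_rate=rate, bits=bits)

def parse_data(data, info):
    info["pcm"] = data  # raw sample bytes, left undecoded here

# Supporting a new chunk later means adding one entry to this table.
HANDLERS = {b"fmt ": parse_fmt, b"data": parse_data}

def read_wav(stream, handlers=HANDLERS):
    riff_id, _riff_size, form_type = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or form_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    info = {}
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file
        chunk_id, size = struct.unpack("<4sI", header)
        handler = handlers.get(chunk_id)
        if handler is None:
            stream.seek(size, io.SEEK_CUR)    # unknown chunk: skip it
        else:
            handler(stream.read(size), info)  # known chunk: hand over its bytes
        if size & 1:
            stream.seek(1, io.SEEK_CUR)       # skip the pad byte after odd sizes
    return info
```

A handler for loop points would be one more entry in the table. Loop data commonly lives in a `smpl` chunk, but that is an assumption on my part about these particular files.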

Ok, those are more reasonable goals. The system will write .WAV files based on a small number of options, and will read at least mono sound data and handle the chunk data for loop points.

Anything required beyond that will be written when it is needed. It is important not to get too hung up on the idea that you might need to rewrite or reengineer code sometimes. It will happen. That is easier than writing an everything machine from the start, and the result is probably easier to maintain as well.

In short, the moral is: Write today’s program. Get it done. Consider tomorrow’s program to a reasonable degree, but don’t let it hang you up or bog you down.