What's the difference between Wave, MIDI, MOD and MP3?

By Kees van der Velden

For sound you need an instrument and a 'musician' (in the broadest sense of these words). If you would like to hear a piano sound, you need to have a piano and someone playing it. If you love the sound of breaking glass, the neighbours' window will do fine as an instrument and their son, throwing a baseball very wide, could be the great musician to satisfy you.


Let's stick to the piano. When the pianist is playing the piano, little hammers strike the piano strings. The strings start to vibrate and make the air-molecules around them vibrate. The air-molecules pass this on to other air-molecules, until finally this vibration of air hits your eardrums. Sound is how you (your brain) interpret this vibration.

The continuous 'flow of vibrations' from the sound source (to your ears) is called the sound wave, which can be represented on paper (or on screen) as a wavy line, although for 'normal' sound this line would look very irregular.

When two or more instruments are played at the same time, all the vibrations coming from those instruments are mixed (in the air), so there will still be only ONE sound wave (an even more complex one than from the piano alone).

When you want to save sound, so you can hear it later, you can record the sound wave in several possible ways.

In the past people could only save sound in an analogue way. The sound wave was 'printed' on a tape or a (vinyl) disc, a bit like the way you would draw a shaky, wavy line on paper.

Nowadays it's possible to record sound in a digital way. To do this, the sound wave is 'cut' into thin slices, called samples. Each of these samples gets a value, depending on its position on the 'wavy line'. This way we 'convert' an analogue sound wave into a string of values. But don't forget, this long string of numbers (values) still represents the sound wave, the vibration of the air. No more, no less. This string of numbers, which can be stored on CD, hard disk or tape, is called a WAVE file.

Two things are important in this process: the sample rate and the sample value.

The sample rate tells you how many samples are taken per second of sound, i.e. into how many slices a second of sound wave is cut. More samples per second (thinner slices) mean a better preservation of sound quality. A typical sample rate for really good quality is 44,100 samples per second.

Then these samples will be given a value. To be able to make a good distinction between the various samples you need a broad range of numbers.

Think about the athletes who run the 100 metres. If we could only measure their time in full seconds, the numbers 1 through 16 would be sufficient. The good ones would all do the 100 metres in 10 seconds, ergo they would all be world champion. Since we don't want that, we measure their time in thousandths of a second, which gives us a broad range of 16,000 numbers (in 16 seconds) to make a good distinction between the athletes.

We need something of that same order when we assign values to samples. Since computers work with bytes and 1 byte (256 numbers) is not really enough for reasons of quality, we use 2 bytes per sample, which gives us 65,536 numbers to choose from.
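The slicing and numbering described above can be tried out in a few lines of Python. This is only a minimal sketch: the function name sample_tone and the pure 440 Hz sine tone are made up for the illustration (a real recording would come from a microphone), and it assumes signed 2-byte values running from -32,768 to 32,767, which together make up the 65,536 numbers.

```python
import math

SAMPLE_RATE = 44_100   # slices per second, as in the text
MAX_VALUE = 32_767     # largest value of a signed 2-byte sample (assumption: signed values)

def sample_tone(frequency_hz, seconds):
    """Cut a pure sine tone into slices and give each slice a whole-number value."""
    count = int(SAMPLE_RATE * seconds)
    samples = []
    for n in range(count):
        t = n / SAMPLE_RATE                                  # moment of this slice, in seconds
        height = math.sin(2 * math.pi * frequency_hz * t)    # position on the wavy line, -1 to 1
        samples.append(round(height * MAX_VALUE))            # nearest of the available numbers
    return samples

samples = sample_tone(440, 1.0)   # one second of a hypothetical 440 Hz tone
print(len(samples))               # how many slices one second was cut into
print(samples[:5])                # the first few values of the 'string of numbers'
```

With only 1 byte per sample, MAX_VALUE would be 127 and many different heights of the wavy line would end up with the same number, just like the athletes timed in full seconds.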

Now you also know why (quality) WAVE files are huge. For one second of sound you need 44,100 x 2 = 88,200 bytes and that is just one channel. For stereo you have to double that of course, which brings you to a total of 176,400 bytes for one second of sound. A minute of sound will cost you roughly 10.6 million bytes (176,400 x 60 = 10,584,000).
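If you want to check this arithmetic yourself, the sketch below does the multiplication and then writes one second of (silent) stereo sound with Python's standard wave module. The helper name wave_bytes is made up for this example, and the file is written to memory rather than to disk.

```python
import io
import wave

def wave_bytes(seconds, sample_rate=44_100, bytes_per_sample=2, channels=2):
    """Bytes of sound data needed for uncompressed sound of the given length."""
    return seconds * sample_rate * bytes_per_sample * channels

print(wave_bytes(1, channels=1))   # one second, one channel
print(wave_bytes(1))               # one second, stereo
print(wave_bytes(60))              # one minute, stereo

# Write one second of stereo silence as a real WAVE file (in memory).
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(2)         # stereo
    wav.setsampwidth(2)         # 2 bytes per sample
    wav.setframerate(44_100)    # samples per second
    wav.writeframes(bytes(wave_bytes(1)))   # all zeros = silence

file_size = len(buffer.getvalue())
print(file_size)   # the sound data plus a small header
```

The file comes out slightly larger than the calculated number, because a WAVE file starts with a short header that says how the samples should be read.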

MP3 explained

MP3 is the file extension for MPEG Audio compressed files. An .mp3 file holds the same kind of sound as a WAVE file, but compressed in a very special way. Maybe you have heard of file compression methods or maybe you even use a program like PKZIP or WINZIP to make .zip files yourself. MP3, however, uses a completely different compression method.

When you compress a file and turn it into a .zip file, nothing is left out. It's a method to save ALL data in a smart way using less space. There are lots of possibilities to do that, but let me give you one very simple example.

When there are 40 dashes in a standard file, they are written as: ---------------------------------------- taking 40 bytes of space.

Another way of writing these 40 dashes is: 40x- (40 times -) which only takes 4 bytes of space. The compression ratio in this example is 10:1, which is, as you will understand, quite exceptional and certainly not the average for a complete file.

The advantage is that ALL data is still there, although the file takes up less space. The downside is that a .zip file has to be 'unzipped' before you can use it, which means that (after 'unzipping' it) it will take up the same amount of space as it did before it was 'zipped'. In addition, 'zipping' a WAVE file will not gain you very much: a compression ratio of 2:1 at the most.

The compression method that is used to make .mp3 files is totally different. In this method some things are actually left out, but in a very smart way, so you won't notice (hear) it. Information that is not important will be stripped. Based on research into human perception, the encoder decides what information is important and what can be discarded.
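The dashes trick is known as run-length encoding, and it is easy to sketch in Python. Mind that this is just the simple example from above with made-up function names; it is not the method PKZIP or WINZIP actually use.

```python
def rle_encode(text):
    """Turn text into (count, character) pairs, one pair per run of equal characters."""
    pairs = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1                      # walk to the end of this run
        pairs.append((j - i, text[i]))
        i = j
    return pairs

def rle_decode(pairs):
    """Rebuild the original text from the pairs; nothing is lost."""
    return "".join(char * count for count, char in pairs)

def as_text(pairs):
    """Write the pairs the way the article does, e.g. 40x-"""
    return "".join(f"{count}x{char}" for count, char in pairs)

dashes = "-" * 40
packed = rle_encode(dashes)
print(as_text(packed))                 # 40x-
print(rle_decode(packed) == dashes)    # True: ALL data is still there
print(as_text(rle_encode("abc")))      # 1xa1xb1xc
```

The last line shows why 10:1 is exceptional: text without long runs of the same character does not shrink at all with this trick, it even grows.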

When a sound wave hits your eardrums, the incoming data is analyzed by your brain. The brain interprets the sound and filters out irrelevant information, which means you just don't hear everything that is in the sound wave.

Another simple example:

You're listening to the Rolling Stones using your headphones. Now turn off the Walkman. You can hear everything that's going on around you. The headphones over (or in ;-) your ears do not really block the sound that is coming from the 'outside'. Turn the Walkman back on and listen to the Stones again. This time you won't hear 'outside' sounds, although they're still there. The music on your headphones is so loud in comparison to the 'outside' sound, that this 'outside' sound is filtered out by your brain.

MPEG Audio compression does this job for you. It's called "perceptual coding." This is quite clever, because the information that would be stripped by your own "brain-filter" anyway, no longer needs to occupy hard disk space or internet bandwidth. You have to be a bit careful though, because if you encode at a very strong compression ratio, MPEG also strips information that is audible, but with 'light' compression (up to a ratio of approximately 12:1) you won't hear the difference between the .mp3 file and the uncompressed original. Compression ratios of 12:1 without losing quality are pretty normal for MPEG Audio compression.

The disadvantage of MPEG Audio compression is that a lot of processing power is required to encode and play files.
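To get a feeling for the idea of leaving out what you would not hear anyway, here is a toy sketch in Python. It is emphatically NOT how an MP3 encoder works: a real encoder analyses the sound per frequency band with a model of human hearing. The function name drop_masked, the loudness figures and the 40 dB margin are all invented for this illustration.

```python
def drop_masked(sounds, margin_db=40):
    """Toy 'brain-filter': keep only sounds within margin_db of the loudest one.

    sounds is a list of (description, loudness in dB) pairs. The margin is an
    invented number, not a value taken from any real encoder.
    """
    loudest = max(level for _, level in sounds)
    return [(name, level) for name, level in sounds if loudest - level <= margin_db]

# Hypothetical loudness values, chosen only for the example.
sounds = [
    ("Rolling Stones on the headphones", 90),
    ("bass guitar in the same song", 80),
    ("traffic outside", 45),
    ("ticking clock", 30),
]

kept = drop_masked(sounds)
for name, level in kept:
    print(name, level)
```

Only the two loud sounds survive. Make the margin smaller (stronger compression) and at some point you start throwing away things you really would have heard, which is the risk mentioned above.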

MIDI explained

Let's go back to the pianist we met in the section about WAVE. We see him play the piano ('commanding' the piano) and we hear the sound. We already saw that we can record this sound (see WAVE).

Suppose I don't like the piano player and I want to get rid of him (for whatever reason), but I still like to hear that piano play the tunes. In that case I must record the actions ('commands') of the piano player and find a way to execute these 'commands' upon the piano. Well, they thought of a thing like this ages ago and developed the player piano, also called pianola. The 'commands' of the pianist were recorded on a roll of paper (the piano roll) by punching holes in the paper at exactly the right places. That way a 'smart' mechanism could play the piano. These piano rolls, representing a sequence of 'commands', are in a way the first MIDI files.

Today's techniques give us many more possibilities and we don't need the roll of paper anymore, but the idea is about the same. In a MIDI file we record (lay down) all the 'commands' of the musicians playing their instruments. So there is no sound in a MIDI file, there are only 'commands'. In MIDI these 'commands' are called messages or events.
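To make the idea of 'commands' a bit more concrete, the sketch below builds the two most common MIDI messages in Python: Note On (status byte 0x90 plus the channel) and Note Off (status byte 0x80 plus the channel), each followed by a note number and a velocity. The helper names and the little tune are made up, and the times are written in plain seconds to keep things simple; a real MIDI file stores the time between events in its own way.

```python
MIDDLE_C = 60   # MIDI note number of middle C

def note_on(note, velocity=64, channel=0):
    """'Press this key' command: status byte, note number, how hard it is struck."""
    return bytes([0x90 | channel, note, velocity])

def note_off(note, channel=0):
    """'Release this key' command."""
    return bytes([0x80 | channel, note, 0])

# A hypothetical three-note tune: (time in seconds, message).
commands = [
    (0.0, note_on(MIDDLE_C)),     (0.5, note_off(MIDDLE_C)),
    (0.5, note_on(MIDDLE_C + 2)), (1.0, note_off(MIDDLE_C + 2)),
    (1.0, note_on(MIDDLE_C + 4)), (1.5, note_off(MIDDLE_C + 4)),
]

for time, message in commands:
    print(time, message.hex())

total = sum(len(message) for _, message in commands)
print(total)   # number of bytes for all the messages together
```

Six messages of 3 bytes each describe one and a half seconds of music, where a stereo WAVE file of the same length needs 264,600 bytes. The price you pay is that the 'commands' still need an instrument to be played on.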

MOD explained

Now that you have a general idea of MIDI and WAVE files, we can move on to MOD files. A Module (MOD for short) is sort of a hybrid, a mixture of MIDI and WAVE. A MIDI file depends on the instruments that are on your sound card or in your external sound module. A MOD file has the sequencing information AND the instruments (in the form of 'samples') in it. These samples can be looked upon as short WAVE files of one note of an instrument. It's like a MIDI file with a soft-synth (software synthesizer) inside.

It is up to you, the MOD composer, what samples you wish to include in the MOD file. This way you're not dependent on the instruments of the sound card, which means that the song will sound the same on any computer and you're not limited to the instruments and effects that are built into the sound card. On the other hand, you are limited in the number of samples you can put in a MOD file and changes are less easy to make. When you buy a better sound card or sound module, all your MIDI files will sound better, without any (relevant) changes, whereas in a MOD file the quality is laid down 'forever'.

MOD files are also larger than MIDI files, because the wave samples are included and good samples take a lot of space.
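The 'hybrid' idea can be pictured with a small Python sketch: one object that carries both the samples and the sequence of 'commands', plus a tiny player that needs nothing else. This is not the real MOD file layout; the class name Module, its fields and the eight-number 'blip' sample are all invented for the illustration.

```python
from dataclasses import dataclass

@dataclass
class Module:
    samples: dict    # instrument name -> list of sample values (a very short 'WAVE')
    sequence: list   # list of (instrument name, step) 'commands'

def play(module):
    """Turn the sequence into one string of sample values, using only the module itself."""
    output = []
    for instrument, step in module.sequence:
        wave = module.samples[instrument]
        # step 1 plays the sample as recorded; step 2 takes every other value,
        # which makes the note sound an octave higher (and half as long).
        output.extend(wave[::step])
    return output

song = Module(
    samples={"blip": [0, 50, 100, 50, 0, -50, -100, -50]},   # one cycle of a made-up instrument
    sequence=[("blip", 1), ("blip", 2)],                     # same note, then an octave up
)

print(play(song))
```

Everything the player needs travels inside the song object, which is why a MOD sounds the same everywhere, and also why it is bigger than a MIDI file holding only the sequence.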

If you would like to learn more about MOD files, you can visit:

news:alt.music.mods or

"Some of this information was supplied by the official alt.music.midi FAQ created and maintained by Kees van der Velden. A HTML version and download of the complete FAQ can be found at the excellent web site MIDI Papa's - the MIDI FAQ by CC maintained by Bomi"
