About MP3*


*From the help section of MP3 Workshop


MP3 stands for MPEG Layer-3 (in full, MPEG-1 Audio Layer III). It is a file format for compressing audio data, which normally takes up loads of space, into a much smaller file. Provided you have the right software, you can play MP3s in their compressed form just as you would a normal wave file. If you are new to MP3s you may have a few questions about the format, and I hope this section will answer them.

Bitrates, Quality & Sampling Rate

The bitrate of a file is kind of like a measure of its audio quality. Technically, it is the amount of data used to store each second of audio. It is measured in kbps, which stands for kilobits per second (not kilobytes as many people think - a bit is 1/8th of a byte). The higher the bitrate, the better the quality of the sound and the more space the file will take up on your hard disk. A bitrate of 128kbps or 160kbps is normally the standard for audio on the Internet, although some people insist on bitrates of at least 192kbps - I personally think that 160kbps is fine.

A fixed bitrate like this was once the only option, but now there is also VBR (Variable Bit Rate), as opposed to CBR (Constant Bit Rate). VBR encoders analyse each frame of audio to be encoded and decide on the minimum bitrate that should be used to encode it. You can recognise VBR files because when you play them the kbps panel on your player constantly changes. In my opinion VBR is a very good idea because you don't waste space on silent portions or passages that don't really need a lot of data, so you can spend more space on sections that do.

The quality of the audio produced by an MP3 file also depends heavily on the encoder used to encode it. Some encoders produce better quality files than others, and some encoders specialise in lower or higher bitrates. For example, a file produced with Xing's encoder at 128kbps will not sound as good as a Fraunhofer-produced file at the same bitrate. The Xing encoder is used in several products including AudioCatalyst, and while it is fast it does not produce good quality files. The Blade encoder specialises in encoding at 160kbps while the Fraunhofer encoder works best at low bitrates, so if you're encoding to 160kbps it's better to use Blade, but if you're encoding to 64kbps you should probably use Fraunhofer. LAME seems to be quite good at most bitrates and is a good all-rounder due to its speed advantage over Blade.
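To put some numbers on the kilobits-versus-kilobytes point, here is a rough sketch in Python of how bitrate turns into file size. The function names are made up for this illustration, the four-minute track length and the per-frame bitrates are invented, and the size ignores tags and header overhead, so treat the results as approximate.

```python
def mp3_size_bytes(bitrate_kbps, seconds):
    """Approximate size of a constant-bitrate stream, ignoring tags and headers."""
    bits = bitrate_kbps * 1000 * seconds  # "kilo" here means 1000 bits
    return bits // 8                      # 8 bits to the byte


def average_kbps(frame_kbps):
    """Average bitrate of a VBR stream, given the bitrate chosen for each frame.

    Every frame covers the same length of time, so a plain mean is enough.
    """
    return sum(frame_kbps) / len(frame_kbps)


if __name__ == "__main__":
    for rate in (128, 160, 192):
        size = mp3_size_bytes(rate, 4 * 60)
        print(f"{rate} kbps, 4 minutes: {size / 1_000_000:.2f} MB")

    # Invented VBR frames: two near-silent ones, then three busier ones.
    print(average_kbps([32, 32, 128, 192, 256]), "kbps on average")
```

So a four-minute song comes to about 3.84 MB at 128kbps and 4.80 MB at 160kbps, and the invented VBR example averages out at 128kbps even though its busiest frame gets 256kbps.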
Sampling rate is the rate at which the audio was sampled (measured and converted into a digital number representing the sound). CDs use a sampling rate of 44.1 kHz, which means that the audio is sampled 44,100 times every second.
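The sampling rate also lets you work out how much data uncompressed audio needs, which shows how much an MP3 is actually saving. The sketch below assumes standard CD audio, which uses 16 bits per sample and two channels; those two figures are not given above, and the function name is just for illustration.

```python
SAMPLE_RATE_HZ = 44_100  # CD sampling rate, as given in the text
BITS_PER_SAMPLE = 16     # assumption: standard CD sample size
CHANNELS = 2             # assumption: stereo


def pcm_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


if __name__ == "__main__":
    cd = pcm_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
    print(f"CD audio: {cd:.1f} kbps")
    for mp3 in (128, 160, 192):
        print(f"{mp3} kbps MP3 is about {cd / mp3:.1f} times smaller")
```

Under those assumptions CD audio runs at 1411.2kbps, so a 128kbps MP3 is roughly eleven times smaller than the same music stored uncompressed.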

Legality

Many people ask whether or not MP3s are illegal. Well, for a start, MP3s themselves aren't illegal - they are only a data format. It is what you do with them that is potentially illegal. MP3s are the same as CDs or tapes - you cannot make legal copies of what you don't legally own. Downloading MP3s of songs you don't already legally own is illegal, but if you already own the CD then you can make or download an MP3 of it legally.

Technical Stuff

So I suppose you're wondering how the format actually works and how it is able to compress the data down so much without too much loss of audio quality. Well, the answer is that MP3 encoders employ several techniques to achieve this. First of all, it's worth pointing out that there are two kinds of compression: lossless (like ZIP files) and lossy. Lossy compression means that it isn't important that the data is perfectly decompressed, just that it's fairly close to the original. This allows far greater compression than lossless, where the original is perfectly restored. Information from here on in was adapted from information at www.mp3-tech.org.

The methods encoders use to reduce the size of the file can be put in two classes: perceptual encoding and data compression. The perceptual encoding algorithms analyse the audio and snip very large portions of it out. The human ear is far from perfect and can't hear everything, so the data it can't hear is removed and not stored in the MP3 file. For a start, it can't hear sounds below the minimal audition threshold, a curve which dips to its lowest point between 2 kHz and 5 kHz (where the ear is most sensitive), so any audio which falls below this curve can be removed. You can't always hear all sounds above this threshold either, which is where the Masking Effect comes in. This means that louder sounds block out, or mask, quieter ones. Encoders use a psychoacoustic model mimicking the behaviour of the human ear to decide which sounds are being masked and therefore don't need to be stored.

The remaining methods of compression deal not with the audio and what can be removed, but with how the data can be stored so it takes up less space. Certain passages of audio simply cannot be encoded at a given bitrate without losing audio quality. For these a reservoir of bytes is used: passages which need less space to encode leave their spare bytes in the reservoir, and these give the extra space needed to encode the difficult passage. This sounds a bit like some kind of internal VBR to me.
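The reservoir idea can be sketched as a toy model in Python. This is only an illustration of the borrowing described above, not how a real encoder does its bookkeeping: the function name, the per-frame budget, the reservoir capacity and the "demand" figures are all invented, and the real format's limits are not modelled.

```python
def allocate_with_reservoir(demands, frame_budget, reservoir_cap):
    """Grant each frame its own budget plus whatever earlier frames saved.

    demands: the space each frame would like in order to be encoded cleanly.
    Returns the space actually granted to each frame.
    """
    reservoir = 0
    granted = []
    for need in demands:
        available = frame_budget + reservoir
        used = min(need, available)
        granted.append(used)
        # Whatever is left over is carried forward, up to the reservoir's capacity.
        reservoir = min(available - used, reservoir_cap)
    return granted


if __name__ == "__main__":
    demands = [600, 600, 1400, 1000]  # invented: two easy frames, then a hard one
    print(allocate_with_reservoir(demands, frame_budget=1000, reservoir_cap=500))
    print(allocate_with_reservoir(demands, frame_budget=1000, reservoir_cap=0))
```

With the reservoir, the difficult third frame gets the full 1400 it asks for because the two easy frames before it left space behind. With the capacity set to zero it is cut back to 1000, which is where quality would suffer.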
Another trick of the human ear is used to save yet more space, and it is called Joint Stereo. At low and high frequencies the human ear cannot locate exactly where the source of a sound is, so when such sounds are encountered they are stored as a mono signal with extra data which partially rebuilds the stereo signal from the mono one (this particular trick is known as Intensity Stereo). This saves a lot more space than encoding the two stereo channels separately.

Another stereo trick to save space is used when the left and right channels are similar, and it is called Mid/Side Stereo. The encoder encodes a middle channel and a side channel instead of left and right. The side channel uses far fewer bits than the middle channel, and again the decoder reconstructs the left and right stereo channels from this data.

Finally, the last thing done to an MP3 stream should be familiar to Computer Science students: Huffman encoding. This is a means of data compression whereby the symbols which occur most frequently use fewer bits to store than those which occur rarely. This saves around 20% of space on average. It is also the perfect partner to perceptual encoding: when perceptual encoding cannot save much space the Huffman algorithm works best, and vice versa. This is because the sounds which perceptual encoding finds difficult are 'pure' sounds, which contain many of the same symbols, so Huffman encoding can compress them well. Huffman encoding finds it hard when there are several individual sounds or tunes at once, because few of the symbols are identical, but thankfully when there are many individual sounds the masking effect can remove large portions of them.
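Two of these steps are small enough to sketch in Python. The first half is a simplified version of Mid/Side Stereo using a plain average and difference (an assumption made for readability, and real encoders work on frequency data rather than raw samples). The second half builds Huffman code lengths for a made-up list of symbols. All function names and sample values here are invented for the example.

```python
import heapq
from collections import Counter


def ms_encode(left, right):
    """Turn left/right samples into a mid (average) and side (difference) channel."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Rebuild the left and right channels from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (total count, tie-breaker, {symbol: depth in the tree so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)
        n2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encoded_bits(symbols):
    """Total bits needed to store symbols with their Huffman code."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)


if __name__ == "__main__":
    left, right = [100, 102, 98], [98, 101, 99]  # invented, nearly identical channels
    mid, side = ms_encode(left, right)
    print(side)                  # small values, so cheap to store
    print(ms_decode(mid, side))  # the original channels come back

    symbols = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2  # invented symbol stream
    print(huffman_code_lengths(symbols), encoded_bits(symbols))
```

When the two channels are similar the side values stay close to zero, which is why that channel needs so few bits. In the Huffman half, the most common symbol gets a 1-bit code and the whole stream takes 28 bits, against 32 bits if each of the four symbols used a fixed 2-bit code.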
