About MP3*
*From the help section of MP3 Workshop
MP3 stands for MPEG Audio Layer-3 and it is a file format used for compressing audio
data, which normally takes up loads of space, into a much smaller file. Provided you
have the right software, you can then play MP3 files in their compressed form just
as you would a normal wave file. If you are new to MP3s then you may have a
few questions about the format, so I hope this section will answer them.
Bitrates, Quality & Sampling Rate
The bitrate of a file is a rough measure of its audio quality. Technically,
it is the amount of data used to store each second of audio. It is measured in
kbps, which stands for kilobits per second (not kilobytes as many people think - a
bit is 1/8th of a byte). The higher the bitrate, the better the quality of the sound
and the more space the file will take up on your hard disk. A bitrate of 128kbps or
160kbps is normally the standard for audio on the Internet, although some people
insist on bitrates of at least 192kbps - I personally think that 160kbps is fine.
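To make the kilobits/kilobytes distinction concrete, here is a small Python sketch that estimates the size of a constant-bitrate file from its bitrate and length. The function name `mp3_size_bytes` and the four-minute song are made up for illustration, and the estimate ignores tags and other overhead.

```python
def mp3_size_bytes(bitrate_kbps, seconds):
    """Estimate the size of a constant-bitrate MP3 in bytes.

    kbps means 1000 bits per second, and a byte is 8 bits.
    """
    bits = bitrate_kbps * 1000 * seconds
    return bits // 8


# A hypothetical four-minute (240 second) song at a few common bitrates.
for kbps in (128, 160, 192):
    size = mp3_size_bytes(kbps, 240)
    print(f"{kbps}kbps: {size / 1_000_000:.2f} MB")
```

At 128kbps this works out to a little under 1 MB per minute of audio.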
A single fixed bitrate was once the only option, but now there is also VBR
(Variable Bit Rate) - as opposed to CBR (Constant Bit Rate). A VBR encoder
analyses each frame of audio to be encoded and decides the minimum bitrate
that should be used to encode it. You can recognise VBR files because
when you play them the kbps panel on your player constantly changes. In my opinion
VBR is a very good idea because you don't waste space on silent portions or
passages that don't really need a lot of bits, so you can spend more space on sections
that do. The quality of the audio produced by an MP3 file also depends heavily
on the encoder used to encode it. Some encoders produce better quality files
than others, and some encoders specialise in lower or higher bitrates. For
example, a file produced with Xing's encoder at 128kbps will not sound as good
as a Fraunhofer-produced file at the same bitrate. The Xing encoder is used
in several products including AudioCatalyst and while it is fast it does not
produce good quality files. The Blade encoder specialises in encoding at
160kbps while the Fraunhofer encoder works best at low bitrates, so if you're
encoding to 160kbps it's better to use Blade but if you're encoding to 64kbps
you should probably use Fraunhofer. LAME seems to be quite good at most
bitrates and is a good all-rounder due to its speed advantage over Blade.
The sampling rate is the rate at which the audio was sampled (measured and converted
into digital numbers representing the sound). CDs use a sampling rate of 44.1 kHz, which means that
the audio is sampled 44,100 times every second.
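The sampling rate also tells you how much data uncompressed CD audio needs, which gives a feel for how much an MP3 encoder has to squeeze out. The Python sketch below assumes standard CD audio (16 bits per sample, two channels), which the text above does not spell out, and the function name is made up.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed CD format: 44,100 samples per second, 16 bits per sample, stereo.
cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
print(f"CD audio: {cd_kbps} kbps")

for mp3_kbps in (128, 160, 192):
    print(f"{mp3_kbps}kbps MP3 is about {cd_kbps / mp3_kbps:.1f} times smaller")
```

Under those assumptions a 128kbps MP3 is roughly 11 times smaller than the same audio on CD.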
Legality
Many people ask about the legality of MP3s and whether or not they are
illegal. Well, for a start, MP3s themselves aren't illegal - they are only a data
format. It is what you do with them that is potentially illegal. MP3s are the same
as CDs or tapes - you cannot make legal copies of what you don't legally own.
Downloading MP3s of songs you don't already legally own is illegal but if you
already own the CD then you can make or download it legally as an MP3.
Technical Stuff
So I suppose you're wondering how the format actually works and how it is
able to compress the data down so much without too much loss of audio quality.
Well the answer is that MP3 encoders employ several techniques to achieve this.
First of all, it's worth pointing out that there are two kinds of compression -
lossless (like ZIP files) and lossy. Lossy compression means that it isn't
important that the data is perfectly restored - just that it's fairly close
to the original. This allows far greater compression than lossless, where the
original is perfectly restored. The information from here on was adapted from
www.mp3-tech.org. The methods encoders use to reduce the size
of the file can be put in two classes: perceptual encoding and data compression.
The perceptual encoding algorithms analyse the audio and snip very large
portions of it out. The human ear is far from perfect and it can't hear
everything, so the data it can't hear is removed and not stored in the MP3 file. For a start
it can't hear sounds below the minimal audition threshold, a curve which dips
lowest between 2 kHz and 5 kHz, so any audio which falls below this curve can be removed. You can't
always hear all sounds above this threshold either, which is where the masking
effect comes in. This means that louder sounds block out or mask
quieter ones. Encoders use a psychoacoustic model mimicking the behavior
of the human ear to decide which sounds are being masked and therefore don't
need to be stored. The remaining methods of compression deal not with the
audio and what can be removed but with how we can store the data so it takes
up less space. Certain passages of audio simply cannot be encoded at a given
bitrate without losing audio quality. For these a reservoir of bytes, saved
from passages which need less space to encode, gives the extra space needed to
encode the difficult passage. This sounds a bit like some kind of internal
VBR to me. Another trick of the human ear is used to save yet more space
and it is called Joint Stereo. At low and high frequencies the human ear
cannot locate exactly where the source of a sound
is, so when such sounds are encountered they are stored as a mono signal with
extra data which partially rebuilds the stereo signal from the mono one. This
saves a lot more space than encoding the two stereo channels separately. Another
stereo trick to save space is used when the Left & Right channels are similar
and is called Mid/Side Stereo. The encoder encodes a middle and a side channel.
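As a rough illustration of Mid/Side Stereo, the Python sketch below uses the usual sum-and-difference definition (middle = (L + R) / 2, side = (L - R) / 2). That formula is an assumption here, as the text doesn't give one. The sample values and function names are invented, and a real encoder works on frequency data rather than raw samples.

```python
def to_mid_side(left, right):
    """Convert Left/Right samples into middle and side channels."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Rebuild the Left/Right samples from middle and side channels."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Invented samples where the two channels are nearly identical.
left = [100, 220, -150, 40]
right = [98, 224, -148, 40]

mid, side = to_mid_side(left, right)
print("middle:", mid)
print("side:  ", side)

# The decoder gets the original channels back.
assert to_left_right(mid, side) == (left, right)
```

Note how small the side values are compared with the middle ones.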
The side channel uses far fewer bits than the middle channel and again the decoder
reconstructs the Left & Right stereo channels from this data. Finally, the last
thing done to an MP3 stream should be familiar to Computer Science students
- Huffman encoding. This is a means of data compression whereby symbols
which occur most frequently use fewer bits to store than those which don't. This
saves around 20% of space on average. It is also the perfect partner to perceptual
encoding. When perceptual encoding cannot save much space the Huffman
algorithm works best and vice-versa. This is because the sounds which
perceptual encoding finds difficult are 'pure' sounds which contain many of the
same symbols and so Huffman encoding can compress these well. It is when
there are very many individual sounds or tunes at once that few of the
symbols are identical and Huffman encoding finds it hard, but thankfully
when there are many individual sounds the masking effect can remove large
portions of them.
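To show the idea behind Huffman encoding, here is a minimal Python sketch that builds a code table for a short string. It is a textbook version for illustration only: MP3 itself uses predefined Huffman tables rather than building one per file, and the function name and sample string are made up.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each symbol to its Huffman bit string."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        # Only one distinct symbol: give it a single-bit code.
        return {next(iter(counts)): "0"}

    # Heap entries are (count, tie_breaker, {symbol: code_so_far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)

    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes.
        n1, _, codes1 = heapq.heappop(heap)
        n2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1

    return heap[0][2]


data = "aaaaaaaabbbccd"
codes = huffman_code(data)
encoded_bits = sum(len(codes[s]) for s in data)

for sym in sorted(codes, key=lambda s: len(codes[s])):
    print(sym, codes[sym])
print(f"{encoded_bits} bits instead of {len(data) * 8}")
```

The most frequent symbol ('a') ends up with the shortest code, so a string dominated by one symbol shrinks a lot, which matches the point above about 'pure' sounds.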