Last week, a client received an email from a fan who expressed surprise that my client’s new record was only available as a compressed download from iTunes.
The prevailing level of misunderstanding over the sound quality possible from a store like iTunes is perhaps best encapsulated by this excerpt from a review of Cameron Carpenter’s latest album by Mark Swed of the LA Times:
“…the real deal requires the real deal. The touring organ is a digital instrument, and on it Carpenter does his wowing best in the best digital sound, which isn’t bad on the CD (and is bad on restricted mp3 downloads on Amazon and iTunes or streaming sites). On studio master download from sites that handle high definition, though, the touring organ becomes a conveyor of psychedelic electronic music in a class of its own.”
Amazon does sell MP3 files, but iTunes uses the AAC codec instead. For a consumer, the exact distinction between these isn’t terribly important, but if you’re the music critic for the LA Times and you’ve taken it upon yourself to weigh in on audio quality, it’s something you really ought to understand. Streaming sites also employ a variety of types and degrees of compression. Beats uses MP3 and AAC. Spotify uses Ogg Vorbis at a variety of bitrates.
Lumping all these formats together (and dismissing them) just because they’re “compressed” makes about as much sense as equating the sound of 78s and LPs just because they’re both round. It’s really exactly that stupid, and yet here’s a respected music critic doing just that (and not for the first time).
I’m all for good audio quality, but the obsession with “lossless” is a distraction which has almost completely obfuscated any sensible discussion of useful improvements to the way normal people hear music.
It’s a common mistake to judge expected audio quality by bitrate. Intuitively, it seems as if more data will mean higher quality, but this isn’t always the case. The trouble with lossless codecs is that they’re very inefficient – even a compressed lossless format like FLAC or ALAC is generally encoding things that humans simply cannot hear.
It helps to consider the bitrate not as a measure of the merits of an encoding system, but as a measure of its cost. We’re commonly encouraged to treat bitrate as a proxy for quality, but really this is like measuring the performance of a car by looking at its fuel consumption. True, fast cars use a lot of petrol, but so do bad ones.
We might consider the amount of data required to transmit a page of text. As a text file, it might take up a few kilobytes. If we take a high resolution photograph of the page, it might yield a thousand times as much data, but when it is read aloud, it will sound exactly the same. We could use a microscope to photograph every fibre on the surface of the page, but if what we want to do is read the text, there’s a lot of data there we simply don’t need.
People don’t seem to have a problem with this when it comes to pictures. Nobody says “I won’t look at a website unless all the images are TIFF files”, because that’s plainly ridiculous. We’ve all seen badly compressed images on the Internet, and we’ve all seen beautiful ones too. We understand that “what it looks like” is the reliable measure of, well, what it looks like.
Eyes work differently to ears, though. Eyes are much harder to bamboozle with plausible-sounding pseudoscience. This is why there is no market for super-high-end TVs which reproduce infra-red and ultraviolet light. We all just accept that these are parts of the electromagnetic spectrum that we cannot see, and we leave it at that.
One of the (many) things my company does is to help broadcasters to encode audio for delivery to consumers. When they look into it, they almost always settle on AAC – and not because they’re too cheap to store something bigger.
The fact is that AAC is efficient. Bit for bit, it achieves higher audio quality than just about any other method of storing digital audio. AAC works at a variety of bitrates. It would theoretically be possible to use something like AAC at 1411kbps. If you did that, you’d achieve far higher quality than a CD can store.
Why isn’t this done? Well, AAC is a perceptual codec, which means that its success at reproducing a sound is measured by examining what users hear. When figuring out which bits of the sound to keep, the focus is on the bits that are audible. In rigorous double-blind tests published in peer-reviewed journals, nobody could find any point in encoding AAC at a higher bitrate than 256kbps. Without preconceptions to guide them, under test conditions, people simply couldn’t tell the difference between 256kbps VBR AAC and the highest quality studio masters. Consistently.
Of course, this doesn’t mean that all AAC files sound great. To make a 16-bit CD from a 24-bit studio master, you normally add a small amount of noise in a process called dithering, to improve the dynamic range.
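For readers who want to see what that dithering step looks like in practice, here is a minimal sketch in Python. It assumes triangular (TPDF) dither, which is one common choice, and the function name `requantize_24_to_16` and the test signal are made up for illustration; it is not the algorithm of any particular mastering tool.

```python
import random

def requantize_24_to_16(samples_24, dither=True, rng=None):
    """Reduce signed 24-bit integer samples to signed 16-bit integers.

    With dither=True, triangular (TPDF) noise spanning roughly +/-1
    16-bit step is added before rounding. Hypothetical helper, for
    illustration only.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    step = 1 << 8                  # one 16-bit step equals 256 24-bit steps
    out = []
    for s in samples_24:
        noise = 0.0
        if dither:
            # The difference of two uniform values is triangularly distributed.
            noise = (rng.random() - rng.random()) * step
        q = round((s + noise) / step)
        out.append(max(-32768, min(32767, q)))  # clamp to the 16-bit range
    return out

# A constant signal a quarter of one 16-bit step high (assumed test input).
quiet = [64] * 20000

plain = requantize_24_to_16(quiet, dither=False)
dithered = requantize_24_to_16(quiet, dither=True)

print("average without dither:", sum(plain) / len(plain))
print("average with dither:   ", sum(dithered) / len(dithered))
```

Without dither, the quiet signal rounds to zero on every sample and is simply lost. With dither, individual samples are noisy, but their average stays close to the original quarter-step level, so detail below the 16-bit step size survives as signal buried in noise.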
This noise is unnecessary in AAC encoding. By skipping this step, and by avoiding the very loudest signals that can cause distortion on decoding, it is possible to create an AAC from a studio master which more accurately reflects the audible portions of the original than is possible with a CD or uncompressed PCM WAV file.
This is how Mastered for iTunes works. Although not marketed very effectively, it’s really rather clever. Instead of saddling the user’s storage and bandwidth with inefficiently stored data and sounds they cannot hear, Apple has pushed the work back onto the producers. We do some extra work to make a more efficient master, and the consumer gets better sound with less than a fifth of the data.
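The "less than a fifth" figure is easy to check with some back-of-the-envelope arithmetic. The sketch below assumes standard CD parameters (44.1kHz, 16-bit, stereo), treats 256kbps as a nominal rate (VBR files wander around it), and uses a hypothetical four-minute track; the helper names are invented for this example.

```python
# Standard CD audio parameters: 44.1 kHz, 16 bits per sample, 2 channels.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM in kilobits per second (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000

def megabytes_for(bitrate_kbps, seconds):
    """Approximate audio payload in megabytes (10**6 bytes), ignoring container overhead."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000

cd_kbps = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
aac_kbps = 256            # nominal rate; an assumption for VBR files
ratio = aac_kbps / cd_kbps
track_seconds = 4 * 60    # hypothetical four-minute track

print(f"CD bitrate:  {cd_kbps:.1f} kbps")
print(f"AAC / CD:    {ratio:.3f}")
print(f"CD track:    {megabytes_for(cd_kbps, track_seconds):.1f} MB")
print(f"AAC track:   {megabytes_for(aac_kbps, track_seconds):.1f} MB")
```

This is where the 1411kbps figure for CD audio comes from, and 256 divided by 1411.2 is a little over 0.18, comfortably under one fifth.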
There are circumstances where it makes sense to record audio at a higher degree of fidelity than is perceptible to the human ear, but once a record is finished, there’s no harm in throwing out the parts nobody can hear. When you buy a Mastered for iTunes AAC, you’re getting less data, but you’re still getting all of the music.*
*Unless you’re a dog. If you’re a dog, SACD or 96kHz downloads will sound noticeably better than CDs**. Don’t buy anything over 192kHz, though. People who sell 384kHz downloads to dogs are ripping them off. That stuff is for bats. They are most discerning customers.
** Perhaps Mark Swed is getting his dog to write his reviews for him. It certainly is an alternative explanation. If I had a literate dog, “music critic” would not be the way I exploited it for financial gain.