Forgotten Audio Formats: MP3

The year was 1994.

Music was as popular as ever, with rock bands like Nirvana and The Smashing Pumpkins, pop artists like Ace of Base and Mariah Carey, and soul artists like Boyz II Men and Janet Jackson selling millions of albums.

The music industry was healthy and investing in new artists. Thousands of people were employed to record, catalog, distribute, market, and keep the books for successful recording artists.

This B-side collection by The Smashing Pumpkins sold 1,000,000 copies in America in just a few months to go certified platinum. That’s 1 million CDs sold, not YouTube views.

Music could be consumed on multiple formats and most people had a mixed bag for their own collection: analog vinyl LPs and cassettes along with digital CDs.

Other physical formats existed, like reel-to-reel, LaserDisc, and DAT, but were tiny markets. DSD was still years away.

File-only digital had just begun with the WAV format being released in 1991, but a CD held more data than most hard drives.


In the tech world a trend was accelerating that would forever change the music industry. Hard drive price per megabyte:

1988 – $16
1989 – $12
1990 – $9
1991 – $7
1992 – $4
1992 – $2
1993 – $0.95
1994 – $0.81
1995 – $0.68
1996 – $0.21

One CD’s worth of drive space would have cost about $10k in 1988!

By 1994 it was $526. By 1996 you would have spent around $135 for 650MB of HD space.
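The arithmetic behind those figures is simple enough to check. Here is a minimal sketch that multiplies the per-megabyte prices from the list above by a 650MB disc; the names `PRICE_PER_MB` and `cd_worth_of_disk` are made up for illustration.

```python
# Price per megabyte in USD for three of the years in the list above.
PRICE_PER_MB = {1988: 16.00, 1994: 0.81, 1996: 0.21}

# Capacity of one CD as used in the text.
CD_SIZE_MB = 650

def cd_worth_of_disk(year):
    """Cost in USD of enough hard drive space to hold one CD."""
    return PRICE_PER_MB[year] * CD_SIZE_MB

for year in sorted(PRICE_PER_MB):
    print(year, round(cd_worth_of_disk(year), 2))
```

The 1988 result is $10,400, which the text rounds to $10k, and the 1996 result is $136.50, which the text rounds to $135.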

But the 650MB CD cost pennies to manufacture and sold at retail for $20. Plus, CDs were proving to be pretty durable and CD-Rs were coming down in price. CD was the digital format of necessity unless and until something drastically changed with either the bandwidth needed or the bandwidth available.

Don’t forget: bandwidth = moving storage, aka storage = static bandwidth.


So the same software engineers who came up with lossy JPG image compression were called upon to investigate audio and video compression. Their goal: to get the file size small enough for 1990s bandwidth.

For music testing they used contemporary music (Suzanne Vega) and developed what they called perceptual coding.

Perceptual coding targeted all the parts of mixed music that were open to perception beyond the main focus of the song (melody and beat): things like transients, pan/placement, room and soundstage size, timbre of instruments, blending of sounds, that type of thing.

Remember hi-hats? MP3 crushed them into non-existence.

These audible cues are all present in mixed music but are unmeasurable. They are all nearly impossible to explain and communicate verbally or through written language.

You may know it when you hear it, but it’s not possible to explain further in a controlled, consistent, scientific way. No matter how descriptive you are, the next person will use completely different terms.

This listener confusion and lack of terminology made the engineers’ jobs far easier. They found that they could remove nearly 90% of the audio data before testers consistently identified a difference using their flawed testing methods.

A few 90’s mp3 engineers, not audio engineers.

 

This gave them the green light they needed. The MP3 specification was published and started to catch on. A 50MB WAV file was now a 5MB MP3 file and life was good!

It was true – at first listen, they almost sounded like the original. It took a more critical listen, or repeated listens, to pick out the degradation, and over time many came to hate the MP3 sound. Casual listeners didn’t care as much, but professionals, musicians, and audiophile-types rejected MP3 as lossy.

Sound quality was secondary though. Finally computers could play near-full quality music! Digital file-based convenience had arrived.

Finally modems and networks could send the files around! Finally bootlegging was convenient!


MP3 was quite popular in its time. Nearly every device made could play MP3 files, including phones, video games, TVs, and wireless speakers.

Early MP3 player

But MP3 had no artwork beyond a tiny cover. No lyrics. No credits. No booklet. No shout outs. Nothing to attach to. It was highly bootlegged and for some time, recorded music lost all value.

It also required almost no people to distribute or sell. Nothing to sell & nothing to move = nothing to promote. Nothing to invest in.

Bootlegging ran rampant and the music industry practically folded. Most musicians stopped making money from their music.

Limping along, MP3 got one quality improvement in 2009 (AAC), but it wasn’t going to help much. By 2014 streaming was stealing the download market.

Streaming takes everything bad about MP3s and extends it to the rental model.

Now you own nothing. You just pay a subscription to hear degraded versions of your favorite songs in between commercials. Don’t pay up? No music for you.


The current streaming business model is unsustainable for both the license holders and the license purchasers, but in this post-fact world it really doesn’t matter. Quality has been trumped.

Lossless formats like FLAC, around for years, finally took off around 2016, giving critical listeners an open format to rally around. Buying hi-res music from sites like HDTracks and ProStudioMasters was a thing again. Hi-res hi-fi DAPs finally emerged in many markets. 24bit FLAC continues to offer higher-resolution files with no DRM.

Bandwidth/storage is now available. I have 60+ full lossless albums on a card the size of my pinky nail. I have the bandwidth into the house to stream 24bit audio, if anyone offered it.

One can only hope that the MP3 era is the last time we accept such a massive downgrade in quality.

#SaveTheAudio

 

The Power of Labels

Degrade -d  

  • treat or regard with contempt or disrespect
  • lower the character or quality of
  • reduce to a lower rank, especially as a punishment

Synonyms: demean, debase, cheapen, devalue, shame, humiliate, mortify, abase, dishonor, dehumanize, brutalize, lossy

 


Original   

  • present or existing from the beginning; first or earliest
  • created directly and personally by a particular artist; not a copy or imitation

Synonyms: authentic, genuine, actual, true, bona fide, kosher, archetype, prototype, source, master, lossless

 


Do you think mp3 would be nearly as popular if it was called the devalued version or dehumanized version? 

Do you think lossless would be ignored by the masses if it was called the original version or the true version?

Of course not – this is the power of labels. Marketers and politicians understand this and use it against us. We must see through the subtle brainwashing, this trick of words.


TLmatched

This is not an audio wave but it caught your eye didn’t it?


Lossy sounds like a cool nickname on purpose. It’s all marketing. They figured out how to sell us less for the same and have been doing it for nearly 16 years now.

The various limitations requiring degradation of our fucking music have expired – leaving only greed.

dictionary-page

 

 

Lossy Is Hurting Us

 

Cedar_Point_beach_view_from_Sky_Ride_2013_resize

Summer fun in full resolution: Cedar Point, Ohio looking out over Lake Erie.

 

Windows Phone_20130621_02520130621194354

If you stream music or buy lossy files, here’s your version of summer fun. Close enough, right?

 

If you own a PonoPlayer or another fancy modern 24bit digital audio player, you can experience this. Full resolution for all the music you love will return you to the quality you deserve.

 


Note 1 – I bet your browser showed the compressed image first. That’s why data compression exists – to get the file to you faster. Once they are both loaded, was the wait worth it?

Note 2 – Image is not audio. Audio has more detail, more nuance, and packs far more emotional cues than visuals.

#SaveTheAudio

 

 

The Art of Recorded Music

Studer_A80

A canvas. A monitor. A block of clay.

Human imagination is more fertile and expansive than all of them.

Human imagination is where the soundstage of recorded music is rendered.

640px-Shilkret_directing_Bain_Collection_(edited)

 

Creating sound for a recording takes planning. Even a simple voice over requires quieting the room, writing a script, and doing a mic level check. Recording a band or larger unit requires extensive planning, both technical in nature and strategic from an artistic sense.

 

Foley_Room_at_the_Sound_Design_Campus_(cropped)

 

How many sounds are we trying to create? How many instruments, voices, microphones, and additional dubs? How many tracks per song? How many songs per album? These are artistic decisions mixed with lots of technical hum-drum (a million cables).

 

Eddie_Kramer

 

As the musicians and producer start to craft the songs they are already working on many layers.

The arrangement is one layer; actually, each part within the arrangement is a layer. The type and style of sounds emanating from the instruments are another layer.

 

640px-P_Kolbe-13_Stern-Trio-1965_01

The feel or tempo of the songs is another layer. The prominence of each instrument in the mix is another layer.

The amount of soloing is another layer. I could go on. Some bands do indeed go on and on!

640px-Mervin_Solomon

My point? There is complexity here that gets painted into the soundstage of the final product. These layers of creation are not only intentionally put there, but fretted over in emotionally draining recording sessions, hour after hour.

There are screaming battles, insults, and hurt feelings as the artist sweats and bleeds for their art.

Pure creativity is buried throughout the mix.

Artists layer the sounds in their heads while recording engineers massage and place those sounds through the recording system.

640px-Diana_Yukawa_at_Abbey_Road_in_Studio_1

 

The blend of the sounds is critical. Each sound works within, against, and through all other sounds. This is known overall as the mix. It’s a most precious thing.

No medium has more depth than sound. Nothing – NOTHING, including color, mixes like sound.

No other medium works by fully enveloping the participant.

IMAX? IMAX is actually about 20% of your surroundings fixed in space with visible framing. A simple head turn or eye close makes IMAX = no-max.

Sound has no equal. This is why I fight so strongly these days against the lossy crowd, against the “phones are fine for music, buy new headphones” crowd.

Even my own friends. I have to remind them that reducing our music is reducing our soul and we should be very careful with such things.

 

Youkill Audio Youtube

lossless-jpg

Lossless data on the left. The right side is a visual representation of what we’ve been listening to for 20 years now.


Deets on YouTube’s audio handling:

Audio is streamed at either 128k or 320k mp3.

Everything defaults to 128k. You can only get the 320k audio stream by selecting the HD video quality. Some videos start in HD but most don’t. It’s also hard to embed HD YouTube into other sites since it seems to default to the basic stream.

It appears there’s no FLAC streaming allowed and no lossless streaming of any kind.

The 320k MP3s can sound decent, especially coming from 128k, but once you go lossless you won’t want to listen to lossy anymore.


deubert_fig06

Which is better? Neither. The compression on the left appears to have slightly fewer artifacts but neither is close to the original.

The Danger of Perceptual Coding

Perceptual coding is responsible for data loss that is greatly misunderstood and perhaps even dangerous to society.

What is perceptual coding? It’s a data compression concept used in audio, video, and streaming technologies.

 


 

send-to-zip

ZIP is lossless compression, like FLAC: unpack the file and you get back exactly what went in. MP3 and AAC go further to shrink media: they use perceptual coding to rank the data by importance and permanently throw away what ranks lowest.
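The difference between the two kinds of compression is easy to show in a few lines. This is a minimal sketch using Python's built-in zlib module, which implements the same DEFLATE compression most ZIP files use. The "audio" is made-up byte data, and the lossy step is a crude stand-in (dropping the low bits of every byte), not MP3 or AAC.

```python
import zlib

# Made-up stand-in for audio data: 4096 bytes in a repeating pattern.
original = bytes((i * 7) % 256 for i in range(4096))

# Lossless: compress, decompress, and compare byte for byte.
packed = zlib.compress(original)
restored = zlib.decompress(packed)

# Crude lossy stand-in (NOT MP3): zero the low 4 bits of every byte.
# There is no way to get those bits back afterwards.
lossy = bytes(b & 0xF0 for b in original)

print("packed smaller than original:", len(packed) < len(original))
print("lossless round trip identical:", restored == original)
print("lossy copy identical:", lossy == original)
```

The lossless round trip returns every byte, while the lossy copy no longer matches the original and cannot be repaired from itself.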


 

Why does perceptual compression exist? Native media files tend to be large. In the 90s it was difficult to move these files around because they were too large for the network speed and storage prices of the time. Extreme data compression was needed.

A CD might hold 10 songs at 40MB each for a total of 400MB. How to get that 40MB song file small enough to fit through a dial-up modem and play on the other side in real-time?

The answer was perceptual coding, the trick behind lossy compression. It has been used for decades in voice transmission compression. You have to go inside the audio data and start throwing sound away.
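The size of the gap can be put in numbers. This is a rough sketch, assuming standard CD audio (44,100 samples per second, 16 bits, 2 channels) and a 56k modem running at its full rated speed, which is a best case for dial-up. The 40MB song comes from the text, and the variable names are made up for illustration.

```python
# CD audio: 44,100 samples per second, 16 bits per sample, 2 channels.
CD_BITS_PER_SECOND = 44_100 * 16 * 2

# Assumption: a 56k modem at its full rated speed.
MODEM_BITS_PER_SECOND = 56_000

# The 40MB song from the text, counted as 40 million bytes.
SONG_BITS = 40 * 1_000_000 * 8

def minutes(bits, bits_per_second):
    """How many minutes it takes to move `bits` at the given rate."""
    return bits / bits_per_second / 60

song_minutes = minutes(SONG_BITS, CD_BITS_PER_SECOND)
download_minutes = minutes(SONG_BITS, MODEM_BITS_PER_SECOND)
squeeze_needed = CD_BITS_PER_SECOND / MODEM_BITS_PER_SECOND

print(f"song length: {song_minutes:.1f} min")
print(f"download time: {download_minutes:.1f} min")
print(f"reduction needed for real-time play: {squeeze_needed:.1f}x")
```

Under those assumptions a song under four minutes long takes over an hour and a half to download, and the data would have to shrink by roughly 25 times to play in real-time over the modem.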

 


 

PerceptualCoding

PerceptualCoding.pdf


 

 

But what sounds can be thrown away? How do you go inside of a mixed piece of music and delete things? And how far can you go before people notice a quality drop?

Perceptual coding can’t do things like delete the 2nd guitar solo or reduce the backing vocals. That can only be done in the mix of the song.

Perceptual coding also can’t make the song acoustic or shorter in length. Those can only be done in the mixing stage.

What perceptual coding does do is analyze the sounds in the song and prioritize them. The programmers determined which sounds are more important on the scale.

First it locates the lead sounds – the main instruments/voices in the material.

There might be 5 primary sound makers in your song, let’s say drums, bass, guitar, keys, and voice. Perceptual coding manages to quarantine those and only removes small amounts of their identifying data.

This allows a listener to quickly ID the melody, the lyric, the artist, and the song since these primary elements are only slightly degraded.

 


 

lossy


 

But you can’t achieve 90% overall data reduction by only slightly degrading the material. Perceptual coding achieves the brunt of its loss from outside of the primary sounds.

This includes everything not inside the primary sound including the echoes and delays of the primary sounds. In fact all reverbs, delays and room sounds are attacked and removed. Other things outside the primary sound are timbre characteristics, breaths, string and instrument noise, room shape and activity, and soundstage timing cues. All of this is shorthanded to “the tone” and “the soundstage”.

By masking and/or deleting all kinds of sounds that they believe are unable to be reliably perceived* by listeners they achieve massive size decreases.
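The "prioritize, then discard" idea can be sketched with a toy. To be loud about it: this is not the MP3 psychoacoustic model, which uses far more elaborate masking rules. It is a made-up frame with one loud tone and one quiet tone, a plain DFT, and an assumed fixed rule that anything more than 30 dB below the loudest component gets deleted. Every name and number here is hypothetical.

```python
import cmath
import math

N = 64  # samples in the frame (arbitrary choice)

# Made-up frame: a loud tone plus a tone 40 dB quieter.
frame = [math.sin(2 * math.pi * 4 * n / N)
         + 0.01 * math.sin(2 * math.pi * 9 * n / N)
         for n in range(N)]

def dft(x):
    """Plain discrete Fourier transform (slow, but fine for 64 samples)."""
    size = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / size)
                for t in range(size))
            for k in range(size)]

def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    size = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / size)
                 for k in range(size)) / size).real
            for t in range(size)]

# Assumed rule: delete anything more than 30 dB below the loudest component.
THRESHOLD_DB = 30.0
spectrum = dft(frame)
peak = max(abs(c) for c in spectrum)
floor = peak * 10 ** (-THRESHOLD_DB / 20)
kept = [c if abs(c) >= floor else 0 for c in spectrum]

kept_count = sum(1 for c in kept if c != 0)
decoded = idft(kept)
removed = [a - b for a, b in zip(frame, decoded)]

print("components kept:", kept_count, "of", N)
print("largest removed sample:", round(max(abs(r) for r in removed), 4))
```

The loud tone survives untouched and the quiet tone is gone completely, which is the trade being described: the part you would name first stays, the rest is judged expendable.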

*What the smart DSP programmers behind perceptual coding understood is that while people can easily hear this loss in the music, most can’t identify it reliably and consistently using the same terminology, and good luck having any of this come out in the whacked-world of ABX listening tests.

If most can’t identify what is gone, but can identify the song and sing along, the codec is considered a success. And MP3 was and still is a huge success by those metrics.

But listen to Ghost in the MP3 to hear an idea of what perceptual coding takes away from your music.

 




 

The destruction of all of the natural movement, transients, and timing cues has a long lasting effect on our music, which has a long lasting effect on our psyche.

The things that perceptual coding deems unnecessary and inaudible are in fact the critical emotional elements of the music.

This amounts to a perceptual loss in all modern music and is the reason behind two trends: 1- robotic voices with fake instruments, and 2- hyper-fast switching of sounds from disparate sources with heavily active pan and audio limiter settings.

When your end result is forced to be artificial and limited in size and range, hip producers know to co-opt the weaknesses and make them strengths. The more artificial and huge you can sound the better.

No point in producing realism when there is none at the distribution.


 

256px-Lichtenstein_jpeg_difference

An approximation of lost data from this image after lossy compression.

Spotify Wants Your Profile For The Highest Bidder

While Pono makes news with its righteous promise to give you free content upgrades for life, Spotify is making news with an update to its privacy policy. Through a million words of legalese, it informs users of the service, particularly the free subscription tier, that they are agreeing to share their contacts, photo album, location data, browsing history, and Facebook profile in order to listen to music on the service.


Give your life away to hear rented 10% music files?  Haha yeah right.

Even previously happy Spotify customers are canceling subscriptions over this new (yet totally predictable) revenue stream.

Low-vs-High-Quality-Image

 

I’ve been saying for a couple of years that the streaming services aren’t going to make it. I know they continue to get more and more subscribers, and more listeners. More 10%’ers.


But they can’t sustain their business because there is no margin. They can barely pay the crazy-low royalties now, and they won’t be able to pay the increased royalties in the near future.  Advertisers will ruin the service trying to get those clicks.

 

 

You simply can’t give access to the world’s entire catalog of music for $0.30 a day, there’s no margin there. There’s too much good music out there with more being made every day. This model will not sustain.


 

Buy your music, people, whether it’s vinyl or digital download, and try to buy the highest quality you can get. The rental model is a disaster in progress.


Spend the $120/year that used to go to Spotify on buying legal retail music and trading with your actual friends and the music industry will survive and prosper.


Own your own music in full-quality, non-tracking files. Stop renting 10% versions for your digital sanity. Actual social media is enjoying music with other people.

Apple’s Upcoming Music Announcement

Will it have anything to do with sound quality?  I doubt it.

Apple likes to roll out new products with slick presentations touting all of the improvements in the product, or how the new product improves upon an existing solution.

This new rumored streaming audio service (a re-branded Beats Music service) looks like more of the same: random, computer generated playlists or hunt & peck streaming at a compressed rate, trying its damnedest to sell you that same compressed copy to own.

No one wants to buy those compressed little MP3s when you can stream them. If they were smart enough to offer an HD version of the song I bet people would buy more when streaming. I know I would.

A new walkman usually sounded better than the old one. What happened?

Since iPod shipped 14 years ago, I can recall one single upgrade to the sound quality in Apple’s iTunes ecosystem. This was around 2009, when they introduced the “Mastered for iTunes” program. It allowed you to deliver files in 24bit lossless, but they would not sell the HD version; they reduce it to 256k AAC (the lossy format Apple uses in place of MP3) and sell it for $1.29 a track instead of $.99.

All of this is why I have a PonoPlayer and haven’t looked back. iTunes was always a toy musically, and since they’ve made absolutely no effort to really improve sound quality in 15 years, it’s even more of a toy.

The sad thing is how popular it is, with millions of people listening to tinny, distorted audio devices playing horribly compressed files. None of it is necessary anymore but it lives on as “The modern way”.  A huge decrease in quality in the name of perceived convenience.

Breathophile

I love air. I really enjoy breathing, and I do it every day.

It’s what drives me and is perhaps the most important thing in my life.

I don’t want it constricted or carrying some odor of unfamiliarity.

Chiang Mai Open Sewer

I won’t accept known poison unless, of course, I like the way it feels.

This is why it’s important to keep it clean. This is why I am a breathophile.

You can accept poor smelly air or you can move to somewhere better.

800px-Relaxing


I love music. I really enjoy making it, and I play it every day.

It’s what drives me and is perhaps the most important thing in my life.

I don’t want it constricted or carrying some odor of unfamiliarity.

450px-Fredric_Effects_Harmonic_Percolator_-_front

I won’t accept known poison unless, of course, I like the way it feels.

This is why it’s important to keep it clean. This is why I am an audiophile music lover.

You can accept poor quality MP3s on phones or you can move to somewhere better.

pono-player-ces-2015-anewdomain-375x195

The Ghost in the MP3

Excellent work by Ryan letting you hear an approximation of what they are removing from MP3 files when doing “lossy” compression.

This is what the MP3 programmers deem unimportant in your music. You can play the video with its own lossy audio, or go here to hear the full version of what they pull from your music to make MP3 files.

Most of what is cut out is spatial: reverbs, room sound, delays, decays, fade outs, dynamics, lots of pre-delays, layering of sounds, attacks, breaths, etc.

This is the movement and the emotional content of the song. The interacting layers are the kind of data that computer programmers (and digital internet babies) can’t quite measure, so they disregard it. That’s scientific method at work – if you can’t measure or control it, disregard it.
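The basic move behind hearing "what was removed" is a null test: subtract the processed version from the original, sample by sample, and keep what is left. This is a minimal sketch of that subtraction only. It does not reproduce Ryan's actual method, and the processing step is a simple smoothing filter standing in for a codec, not MP3. The signal and all names are made up.

```python
import math

RATE = 44100  # samples per second

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def null_test(original, processed):
    """Sample-by-sample difference: whatever the processing changed."""
    return [a - b for a, b in zip(original, processed)]

def smooth(samples, width=5):
    """Stand-in for a lossy step (NOT MP3): a short moving average."""
    half = width // 2
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

# Made-up signal: a low tone plus a quieter high-frequency component.
original = [0.5 * math.sin(2 * math.pi * 220 * n / RATE)
            + 0.05 * math.sin(2 * math.pi * 9000 * n / RATE)
            for n in range(4410)]

processed = smooth(original)
ghost = null_test(original, processed)

print("original level (dB):", round(rms_db(original), 1))
print("removed level (dB):", round(rms_db(ghost), 1))
```

With a real codec you would decode the MP3 back to samples and line the two files up before subtracting. Write `ghost` out as audio and you can listen to the leftovers.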

This is important listening and will help you to understand that hearing music is more than frequencies.

I would love to see someone do this type of experiment with a 24bit mix and a 16bit mix of the same music.
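A bare-bones version of that experiment can at least be run on a test tone. This sketch assumes plain rounding to a 16bit or 24bit grid with no dither, then measures the level of what the rounding changed. A single sine wave is obviously not a mix, and the function names are made up for illustration.

```python
import math

RATE = 44100  # samples per second

def quantize(sample, bits):
    """Round a sample in [-1.0, 1.0] to the nearest step of a signed grid."""
    scale = 2 ** (bits - 1) - 1
    return round(sample * scale) / scale

def residual_db(samples, bits):
    """RMS level (dB re full scale) of what quantizing to `bits` changed."""
    err = [s - quantize(s, bits) for s in samples]
    rms = math.sqrt(sum(e * e for e in err) / len(err))
    return 20 * math.log10(rms)

# One second of a made-up 997 Hz test tone at half scale.
tone = [0.5 * math.sin(2 * math.pi * 997 * n / RATE) for n in range(RATE)]

print("16bit residual (dB):", round(residual_db(tone, 16), 1))
print("24bit residual (dB):", round(residual_db(tone, 24), 1))
```

The 24bit residual sits far below the 16bit one. Whether that gap is audible on real music is exactly what the experiment would have to show.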

 


Don’t take away my reverb and delays! The power of Bonzo is a result of decay, delay, and room sound.

40 Years of Recorded Music Distribution

Vintage baby

Vintage jams

Quick history lesson —

Digital audio made its public debut with the CD standard known as “RedBook”, started in 1978. A collaboration between Philips & Sony, the CD standard was originally going to be 14bit/40k and ship on a 115mm disc, but Sony pushed for 16bit/44.1k, and the finished standard includes error correction. A VP of Sony also pushed to increase the total run-time from 60 minutes to 74 minutes, warranting the disc be enlarged to 120mm, and ruining Philips’ early investment in a plant already printing the 115mm discs! Corporate intrigue for sure.
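Those numbers are enough to work out how much audio data the finished disc carries. A quick sketch, assuming the standard 44.1 kHz sample rate, 16 bits per sample, and two channels; the variable names are made up for illustration.

```python
SAMPLE_RATE = 44_100   # samples per second, per channel
BYTES_PER_SAMPLE = 2   # 16 bits
CHANNELS = 2           # stereo
MINUTES = 74           # the extended run-time

bytes_per_second = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS
total_bytes = bytes_per_second * MINUTES * 60

print("bytes per second:", bytes_per_second)
print("bytes on a 74 minute disc:", total_bytes)
print("in millions of bytes:", round(total_bytes / 1_000_000))
```

That comes to about 783 million bytes of raw audio. The familiar 650MB figure is the capacity of the same disc used for computer data, where part of each sector goes to extra error correction.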

The RedBook standard was finalized in 1980 and CD players started hitting the shelves by 1982. To this day RedBook is owned by Philips and costs a manufacturer over $300 to download the specifications. Why the name RedBook? The engineers compiling the specifications did so in a red binder. Engineers aren’t known for creativity ;-).

In the marketplace, the new digital CDs had numerous advantages over the two existing analog formats of vinyl albums and cassettes. To list a few: no dust problems, little heat warping, less vibration-induced skipping, couldn’t unwind or tangle, vertical storage no longer needed, no replaceable stylus, not magnetic, liquid-proof, instant auto cue. Also there’s the indefinite duplication with no loss in quality on the copy or the original – that’s a huge advantage for digital.

But CDs did not clearly “sound better” than vinyl when all the other issues were addressed. Most of those issues are considered interference or physical media issues. None of them address how the actual recorded music is presented. All music sounds best live, as the microphone is not able to recreate our auditory system. Did CDs actually sound “better” than analog once playback and media issues were addressed?

This has been a sticking point since the early 80s. Many of us could hear something missing from CDs, and it wasn’t just dust and motor noise from the turntable. It was the stuff that is nearly impossible to describe in words: reverbs and decays were different, the timbre of cymbals, voices, and stringed instruments was different, the mid-lows weren’t as warm or round, delays didn’t seem as present or accurate, the stereo-width wasn’t as obvious, the center was hard to find, the top was very pronounced and brittle, some complained of a boxy sound or a digital graininess.

The 1980s didn’t just bring CDs to market; they brought us personal computers and the early internet. By 1990 the same group that was working on the JPEG digital picture compression standard started working on a media compression format. MPEG was designed for squashing CD-quality audio files small enough to stream on dial-up modems. By the mid-90s the MPEG format was in use and competing with other early digital audio formats like RealAudio.

Now that the music could be squashed to an easily tradable size, piracy ran rampant. The late 1990s brought us MP3 (the audio layer of the MPEG-1 and MPEG-2 standards), Napster, peer-to-peer file sharing, and bad DRM attempts (security on audio files), and ultimately led to a rapid decline of the music industry. Everything was being stolen and fewer hard copies were selling. The new MP3 files were perfect for trading online, and the novelty of this new convenience outranked the decline in sound quality. “Good enough” became the standard for sound quality.

Into this disaster stepped Apple, wisely seeing an opportunity to re-invent the personal audio player like the Walkman/Discman (stealing that market from Sony) and re-invent the record store (taking that market from traditional retailers). First they launched the player line “iPod” with its easy loading from your computer, then they opened the new record store with legal $1 songs and no-hassle purchasing.

Apple bet right and it took off (I bought music from there for a few years). I kept thinking I was getting ripped off though — where’s the hard copy with artwork that I can love, lose, find, loan out, break and buy another (or not?). All gone. Instead of our society going “paperless”, we went “album less”, to our detriment. We have been buying and streaming low-quality audio for over a decade now, and not always because of technical limitations.

That’s the end of this lesson, kiddies. The point here is that if you grew up in the mp3 era, you were listening to a compromise built on top of a compromise. 24bit HD Audio should be a revelatory listen for you.