30 thoughts on “Why you’re doing audio levels wrong, and why it really does matter”

  1. I produce Snap Judgment. I have a great professional staff that mixes the show. They also produce music and live sound. We spend countless hours mixing with our ears, our meters, on speakers and in headphones. We aim for consistency throughout the course of the hour. We mix our show to -3 dB max. That level is music plus dialog. We are a sound-rich show that mixes hip-hop, electronica, jazz, Foley, and sound effects all to create our storytelling experience. In our show, the story is king. That means the speaker must be heard. If you lose the speaker, you lose the plot. It is also our aim to create an immersive experience. We do take full advantage of the spectrum, but we still leave headroom for fluctuations in volume.

    Let’s address the 6 dB change that people are calling the turnoff point. I try to make it so the entire show is balanced and consistent. That means when you turn on Snap Judgment, you set your level once, and you are good for an hour. If you play my show next to The Moth, which is all dialog, then yes, you should adjust the volume. But once. We try to make it so people can listen in their car, on their headphones, and while they are doing the dishes.

    Finally, go look at the waveforms for produced music these days. The loudness wars have long existed in music, and the standards have definitely given way to waveforms that look like a solid block (that means they are loud as hell, compressed beyond recognition). Not all compression is bad. Yes, we compress, but it is an art form. It’s done not just with meters, but with our ears. We have meetings all the time about whether a level is too hot or a song is distracting. We try to bring new and interesting music into our stories in a way that the audience has really responded to. We have a good following around the country and our sound is really working for us. We are very open to what people have to say about our show. We take feedback all the time. If you would like to reach us, just send an email to stations at snap judgment dot org.

    Mark Ristich
    Executive Producer
    Snap Judgment

    • Hey Mark, am I reading a defensive tone in your comments here? Have people been giving you static about your levels? I think you guys are one of the few truly pro-sounding shops out there, along with RadioLab.

        • Cool. Just in case it isn’t clear to anyone, my opinion (for what it’s worth), is that shows like Snap, OTM, and RL are right, and most others are too quiet.

    • Sorry I’m coming to this party so late…and I’m going to preface my comments with some background: when I listen to Snap Judgment, it’s usually in the car and my car is pretty noisy. It’s also usually on WELH 88.1 and our STL could stand to be a lot cleaner than it is. Also, we use a Broadcast Warehouse DSPX-FM-Mini-SE on the TALK2 preset. This is a “good” processor, but not a “great” one. Finally, I personally have trouble understanding people with accents unless and until I spend a few months interacting with them.

      Mark, I think the levels are *consistent* in Snap, but frankly, I often have a VERY hard time listening to Snap Judgment. It’s not that I have to fiddle with my volume knob a lot, it’s that Glynn speaks so incredibly fast that he’s hard for me to understand under the best of circumstances, and the effect is exacerbated by the strong presence of music and sound effects throughout the show.

      To my ears…and I want to emphasize that…part of the problem is that Glynn’s voice skews heavily towards bass. Or perhaps there’s some combination of vocal qualities that Glynn has that makes our processor skew his voice towards bass. And of course, more bass and less midrange = muddied-sounding audio.

      I don’t know that this is really a “problem” for SJ, much less something you can “fix”. It’s more just one report from one listener on one station (although when I listened to SJ on K272DT 102.3FM in Santa Barbara CA…aka KCLU…back in 2011, I recall I had the same problems). Take it for what you will.

      FWIW, other than the tonal issues, I do rather like Snap overall. The stories are great and the method of storytelling is entertaining as hell. :)

      • BTW, I should add that I also speak really, really fast. It’s something our pledge producers often yell at me about during pledge drives when I’m pitching. I’ve gotten better about using deliberate slow-downs to emphasize a particular point I’m making, but I’m still not very good about keeping my default talking speed at something below Warp 8.

        OTOH, a lot of New Englanders talk really damn fast, anyways, and we usually don’t have trouble understanding each other. Perhaps we’re all just a bunch of provincial hicks. :)

  2. This just might be the best audio-for-radio piece I’ve read since my college textbooks back who knows when. I must admit, before digital hit I loved mixing to the little speaker in the Studer A807 reel-to-reel decks. It was a good reference in its day.

    • Thanks! You know, Keefe mentioned to me in our interview (audio at the top) that he often listens on cue speakers to hear how his mix will sound lo-fi.

  3. I really appreciate this piece. Fresh Air is the WORST. My husband and I frequently listen to the podcast as we’re making dinner – wait, we USED to listen to the podcast. Between the sound of pots boiling, the stove fan, etc, I was constantly running over to the stereo to adjust the volume. So annoying that I now miss a lot of Fresh Air and listen, instead, to shows that I can hear over the sound of cooking. Sound Opinions comes to mind.

    When I was mixing, I ALWAYS took the extra step of listening through the board monitor with my eyes closed, which I reckoned was the equivalent of listening to my crappy car radio (which I usually listen to with my eyes open, BTW). I think part of it was that I was one of the last people in the universe to cut tape as I was being trained.

  4. This is really only a new issue because of podcasts. Stations can, do, and should manage their program levels to aim for whatever that station determines is their “standard level.” That’s why we use an audio console and not an audio router.

    Where it gets a bit more complicated is automated/HD-2/3/12 stations, where there’s often not a human being behind the feed. But again, if a show is consistently 6 dB too high, just automate a correction, and consider using a feature almost all systems have nowadays: automix.
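    The “consistently 6 dB too high, just automate” case really is just arithmetic: a fixed dB offset converts to a single linear gain multiplier. A minimal Python sketch (the function names here are illustrative, not from any automation product):

```python
import math

def db_to_gain(db: float) -> float:
    """Convert a decibel offset to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def apply_gain(samples, db: float):
    """Scale every sample by a fixed dB offset (negative = attenuate)."""
    gain = db_to_gain(db)
    return [s * gain for s in samples]

# A show that runs a consistent 6 dB hot gets one fixed trim:
trim = db_to_gain(-6.0)   # ~0.501, i.e. roughly half the amplitude
```

    Because the offset is constant, this is the one case a dumb automation pass handles perfectly; it is the program-to-program *inconsistency* that needs something smarter like automix.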

    • Sure, but this article is aimed at producers, not station engineers. I’m trying to help people create a product that’s idiot proof, will sound good anywhere, etc.

      • Totally, it’s just interesting because Fresh Air, for example, does have a professional engineer attached to their program. Mastering is always tricky, hopefully one day we’ll all settle on a smaller range.

  5. Hi all. I’m really happy that this article is out there and is continuing the conversation! This feels like a solvable problem.

    I want to clarify something that I think is muddying the waters a bit. The working group is focused on finding a solution to provide consistent levels in production and *delivery* to stations. The goal is to get distribution to a consistent place.

    The resulting consistency would then obviously carry over into podcasts, streams, segmentation, etc.

    So I don’t think you’ll see a spec or recommendation coming in relation to podcasts or streams as a result of this work – only distribution practices. But it will give content creators a consistent starting point to influence those other conversations.

    Consistency is key!

  6. Excellent article, Mr. Ragusea! And you’ve made a very good attempt at distinguishing the related but distinct concepts of level (energy level), loudness perception, and dynamic range.
    I find that content producers without audio-engineering training often confuse the idea of loudness management, which they should be enthusiastically embracing, with dynamic range compression, of which they should be justifiably wary (but certainly use, judiciously).

    To address Mr. Federa’s comment regarding automated stations – it’s easy to automate for level, but not for *loudness* – at least not in real time; loudness preprocessing should work fine, but that’s not necessarily practical for all stations (since files have to be processed before transmission). This is why producers need to raise their consciousness regarding loudness, and produce to loudness targets while maintaining reasonable dynamic range.
    And the loudness targets of broadcast and streaming may not be the same – so pre-processing at distribution points may be required.
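    The level-versus-loudness distinction above is easy to demonstrate numerically: two clips can share the exact same peak level while carrying very different amounts of energy. The sketch below uses plain RMS as a crude stand-in for loudness (a real loudness meter adds K-weighting and gating per ITU-R BS.1770, which this deliberately omits):

```python
import math

def peak_dbfs(samples):
    """Peak level: what a simple automation system can 'see'."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_dbfs(samples):
    """RMS energy: a rough proxy for perceived loudness."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

# Two tiny example "clips": a dense square wave and a lone transient.
dense  = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
sparse = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(peak_dbfs(dense), peak_dbfs(sparse))  # identical peaks (~-6 dBFS)
print(rms_dbfs(dense), rms_dbfs(sparse))    # dense is ~9 dB more energetic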

  7. I fear that Adam is advocating the idea that content should be mastered for the least-common-denominator of listening systems. Doesn’t it short-change the attentive listener with a nice, full-range stereo in a quiet room if we produce the content to be intelligible on a cell-phone speaker in a moving car? How do you master something to simultaneously be a compelling, immersive audio experience, and yet be easily ignored as audio wallpaper at the dentist’s office? Mastering too frequently targets a “typical” listening situation, which I argue doesn’t exist.

    The cinema industry recognized this problem over 20 years ago and came up with the concepts of dialog normalization and decoding-side compression. We need a distribution standard, as Rob Byers advocates. But also, we need pressure on the playback product manufacturers and developers to put more of the level accommodation for end-listener environments into the playback side, and we need the tools on the encoding side to add our target level and compression hints data. That way, the highest quality program can be distributed everywhere, and the local playback system (or radio station automation) can reduce that quality as local conditions dictate.

    Finally, even though, “…most of the audio production in public radio isn’t done by professional engineers anymore,” those doing the work still need to produce results as if they are. We need tools that let us LU-normalize our component clips before we mix. Such tools are expensive and hard to come by at the moment, though you can get part way there using RMS normalization if your clips are consistent internally. But if we demand LU-normalization tools and metering as features in our production environments, we’ll start to see them as standard equipment.
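    The RMS-normalization stopgap mentioned above is simple to script. A hedged sketch in plain Python: real LU normalization would use the K-weighted, gated measurement from BS.1770 instead of raw RMS, so this only approximates it on internally consistent material.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a clip (linear, 0..1)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_dbfs=-20.0):
    """Scale a clip so its RMS level hits target_dbfs.
    Plain RMS is a rough stand-in for LU normalization here."""
    current = rms(samples)
    if current == 0:
        return list(samples)           # silence: nothing to scale
    target_linear = 10 ** (target_dbfs / 20)
    gain = target_linear / current
    return [s * gain for s in samples]

clip = [0.05, -0.04, 0.06, -0.05]      # a quiet component clip
leveled = normalize_rms(clip, -20.0)
# The result's RMS is 0.1 (= -20 dBFS), ready to drop into the mix.
```

    Run on every component clip before mixing, this gets each element to a common starting level, which is most of what the comment is asking the fancier LU tools to do.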

    By all means, let’s raise the average quality of engineering. Let’s establish and adhere to average loudness standards. But let’s not master our content to best suit the worst listening environments.

    • How do you master something to simultaneously be a compelling, immersive audio experience, and yet be easily ignored as audio wallpaper at the dentist’s office?

      Honestly Steve, music producers have been doing just that for decades! They throw together a rough mix and listen to it on their expensive studio monitor set-ups, then they listen on headphones, then they listen to it in their cars, then they throw it on a mono shower radio, etc.

      They figure it’s their job to craft a mix that will sound good everywhere, just as it’s the job of a web developer to code a site such that it displays nicely on the latest, greatest version of Chrome as well as the crappy old version of IE that half of Americans have on their office computers and can’t update.

      I mean, I sure like your fantasy of a world in which there’s a market for super hi-fi radio meant to be consumed blindfolded in an anechoic chamber. I would like to live in that world, but I don’t think it exists.

      The closest thing we have to a show that demands/rewards that type of attentive listening is Radiolab. And yet, as I describe in my article, Radiolab is simultaneously the show that sounds best in a car on the interstate, because the mixes are so clear, consistent, and yes, LOUD.

      I think one of the big reasons radio has remained so resilient through these years of media upheaval is that the technology involved in consuming it is stupid simple, cheap, and ubiquitous. That includes both digital and terrestrial listening.

      • First, Adam, I need to say that I found your article well researched and well presented. The facts are all accurate and the argument is strong. I intend to use it as a reference for producers of modest technical background. However, you wrote…

        “Honestly Steve, music producers have been doing just that for decades!”

        Except, they haven’t. I offer as evidence exhibit #1 your own example of the music loudness wars. Exhibit #2 is the (admittedly niche) increasing popularity of vintage vinyl recordings, partially because of the minimal dynamic processing. Exhibit #3 is Neil Young’s Pono. :-)

        I mastered the podcast work I did in the previous decade as close to the top as possible, thinking exactly as you are in the closing of this article. However, then I got to see how the motion picture industry solved this problem, 20 years ago. And I see how they continue to solve it today in the face of wide ranging extremes of “3D surround” systems like Dolby Atmos, all of the way down to people watching movies on their tablets on airplanes. And they do this by distributing one, full-dynamic-range mix, and designing the players at the point of consumption to compensate.

        I’ll grant you that we’re not going to get there overnight in the public radio world. But it’s a better long-term goal than limiting and compressing our way toward appeasing listeners in the worst of situations.

        • Thanks for your kind words Steve. Back atcha.

          “It’s the very naive producer who works only on optimum systems,” or so said Brian Eno. And I could dig up a lot of similar quotes.

          I think you are conflating the loudness war with what I’m talking about. Records haven’t been getting louder so that they’ll work on hi-fi and lo-fi systems. They’ve been getting louder because producers are locked in an arms race, enabled by advancing technology, to make sure their track doesn’t sound wimpy when heard adjacent to the next guy’s.

          Vinyl records are indeed getting more popular, but only as a share of an overall recorded music market that’s shrinking rapidly. Also, I think you may overestimate the extent to which records are mastered differently for vinyl.

          Pono is both a dream and a rip-off. Not even the most persnickety audiophile could tell the difference between a high-bitrate MP3 and a WAV. I hereby offer the fabulous prize of a date with yours truly to anyone who can prove me wrong in a blind taste test.

          Anyway, I’m hardly advocating for radio shows to be compressed into brick walls. I’m saying that we, as an industry, need to get a little louder and more even. That’s hardly controversial, right?

  8. We use a “magic box” in our airchain for OTA FM and streaming. It’s called an Aphex Compellor. It’s been on the market for almost two decades. For the most part, it doesn’t corrupt the sound quality, yet it keeps audio within a normal, listenable range. Sometimes I think folks might overthink things a bit too much. When there’s an appropriate fix available, it would probably be best to just go get it and use it, rather than trying to reinvent the “wheel” by getting a herd of cats to march in line, asking content providers to “watch their levels.” Streaming is such a huge part of public radio listenership these days. A thousand dollars fixes a lot of things, including poor level control with a Compellor.

    • Ahh, the Compellor. Responsible for some of my favorite guitar sounds of the ’80s, notably Bad Religion’s “No Control” album.

  9. Old news . . . just now reaching NPR. Dolby Labs identified a problem with film soundtracks 20 years ago and developed a meter with which to grade films and trailers. I am not sure whether this is still being used, however. It has been used to check TV audio as well.

    • Indeed, as I discussed in the article, the LU standard is old and has been used in TV for a while. It’s new to American public radio, which is what the article is about.

  10. Just reading this linked from The Pub – super interesting for a techie geek and regular podcast listener. And the Loudness Units presentation video made my day. Thanks for sharing this stuff – some of us really do want to know how the sausage is made (and then squeezed into pieces that are then chopped up by local affiliates…). Thanks!

      • Sorry, it was a rebuttal to your Jedi reference early in the article.

        Very good article, by the way. I used to work at a station where the GM would have me ‘re-edit’ a program we acquired from Southern California (which was, admittedly, poorly produced) and adjust all the levels ‘up’ purely by visual reference of the waveforms. I would be editing and then loading the program segments into our automation moments before we aired the program; occasionally, I didn’t get a segment uploaded in time and a segment from the previous day’s show would air. After two weeks, I gave up and just started using the Audition Hard Limiter and let others at the station think I was doing it by the prescribed methods.
