From ccc4dd89ac13d7fee39f5ee778146b7dbdda8a39 Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson"
Date: Wed, 9 Nov 2016 01:58:55 +0100
Subject: [PATCH] Write more about audio.

---
 audio.rst | 117 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 110 insertions(+), 7 deletions(-)

diff --git a/audio.rst b/audio.rst
index 2aedd36..e1b8d19 100644
--- a/audio.rst
+++ b/audio.rst
@@ -45,13 +45,84 @@ can do. (In fact, simple mode constructs a multichannel setup
 behind-the-scenes and then runs the multichannel audio code.)
 
 
+Audio meters
+------------
+
+.. image:: images/level-meters.png
+
+When setting overall audio levels, there are two important goals:
+to keep a reasonable **perceived loudness**, and to **avoid clipping**.
+Both are more subtle to measure than one would initially assume,
+and there are many ways to misstep. In particular, pretty much any
+naïve way of measuring loudness will fail; human hearing is, for instance,
+much more sensitive in some frequencies than others.
+
+`EBU R128 `_ provides solid solutions
+to both problems. It specifies a precise algorithm to calculate
+both *momentary* loudness (over short and medium time intervals;
+Nageru uses the short measurement), and a *loudness range* over an
+arbitrary amount of time. The loudness is measured in LU (loudness
+units), which is a relative unit very much like decibels; there's
+also LUFS (loudness unit relative to full scale), which is the number of
+LU compared to a given reference.
+
+EBU R128 specifies a *target loudness* (0 LU) of -23 LUFS +/- 1 LU;
+if you keep your stream within this and don't have a huge range
+in general, it will have a reasonable loudness on most viewers'
+setups. The left meter shows the momentary loudness (over the short
+400 ms intervals), and the right meter shows the loudness range,
+with the target shown as a box. If you are within the target,
+the box turns green; otherwise, it is red.
+Both meters show 1 LU as one segment, with the highest value being +9 LU
+(compared to the reference level) and the lowest being -18 LU.
+
+Even if the overall loudness is correct, one needs to avoid clipping;
+if samples go outside the allowed range, it will sound like clicking
+or popping (or if many do, as extreme distortion). However,
+just measuring the value of every single sample is not good enough;
+since the client might do its own resampling and processing,
+we also need to account for *inter-sample peaks*. Nageru, in line
+with R128 recommendations, oversamples the audio by 4x and writes
+the highest peak (in dBFS) below the left meter. Anything above
+the R128 limit of -0.1 dBFS will make the meter turn red to alert
+the operator that clipping has occurred. (In practice, this should
+rarely happen due to the limiter; see the next section.)
+
+You can click the reset (RST) button to reset all the meters, including
+the peak measurement.
+
+Finally, the very top contains a **correlation meter** measuring
+the correlation between the left and right channel, which is
+useful for checking the stereo image. It goes from -1 at the very
+left (the channels are exact opposites of each other), via 0 in
+the middle (the channels are totally uncorrelated), to +1 at
+the very right (the channels are exactly the same). All of these
+are indications of common issues:
+
+  * A correlation meter that sits at exactly zero typically means
+    either the left or the right channel (or both) is silent.
+  * A correlation meter that sits at exactly +1 typically means
+    you are sending a mono stream. This could be intentional
+    (if you e.g. have only a single microphone), but if not,
+    it could indicate either a loose connector or stereo channels
+    panned wrong.
+  * Finally, a correlation meter that sits at negative values
+    for longer periods of time indicates that one of the channels
+    is inverted (the phase is wrong), and could sound odd on
+    speaker setups.
+    However, certain kinds of reverb or other effects could also cause this, so it could be benign.
+
+A healthy stereo stream will usually have a correlation somewhere
+around 0.7–0.8, and this section is marked in green.
+
+
 The audio strip
 ---------------
 
-.. image:: images/basic-ui.png
+.. image:: images/audio-strip.png
 
-The audio strip contains the processing chain for the audio from
-start to end. Note that by default, everything is enabled;
+The audio strip contains controls for the processing chain for the audio from
+start to end, left to right. Note that by default, everything is enabled;
 if you have a premade audio mix that you are confident that you want
 1:1 into the stream, you can start Nageru with the “--flat-audio” flag,
 that instead starts with everything disabled.
@@ -64,11 +135,43 @@ noise that is not related to the speaker's voice. (If you were producing
 music, you'd probably want it there to make room for music *under* it,
 but the you'd want it higher than the default 120 Hz.)
 
-(TODO: write more)
-
+Next comes a chain of no less than four compressors. They are
+based on the same basic structure, but have very different settings,
+and fill very different roles.
+
+The first compressor is the **gain staging**, or auto-leveler;
+it is very slow, with 500 ms attack time and 20 second release time.
+Its purpose is to set the overall level for the next compressor
+in the chain (so that it is slightly over its threshold);
+if you have a pretty consistent input signal, you can uncheck
+the “Auto” box and just set a static value manually.
+
+The second compressor is the **actual compressor**. It is much
+faster, with typical voice settings (5 ms attack, 40 ms release).
+It has the effect of making the voice sound a bit tighter,
+more level and overall better; if you have multiple things
+in the mix, it will also bring them somewhat closer together.
+(In general, a compressor gives the signal less dynamic range
+by making its loudest parts quieter, which allows you to apply more
+gain in a later stage, so that it can get louder overall. It's a bit
+paradoxical if you're not used to it.)
+
+You can adjust the threshold if you wish, or disable the compressor
+altogether if your signal is already mastered. Note that if the
+gain staging is not set so that this compressor gets an input signal
+that's loud enough, it won't do anything to it.
+
+At this point, the mastering section begins; for simple audio,
+the distinction won't matter, but for multichannel, the previous
+effects are separate per-bus and the remaining are applied
+after the mix. (More on this below.) The mastering section begins
+with a **limiter**, basically a compressor with a very high ratio.
+It's there as an emergency brake for really loud transients
+that got through the other compressors; a classic example is a
+speaker suddenly coughing, or a very loud bass drum. This prevents
+both clipping and blowing out the viewers' ears.
 
-Audio meters
-------------
+(TODO: write more)
 
 
 Multichannel mode
-- 
2.39.2