Fix wording about border colors

[nageru-docs] / hdmisdi.rst
diff --git a/hdmisdi.rst b/hdmisdi.rst

index 80a7a4c81e3c5f65b98552a86b1c5811972411e0..f6f032cf48a29df3c27cb9e6a4b8359a2242d6d2 100644
--- a/hdmisdi.rst
+++ b/hdmisdi.rst
@@ -17,7 +17,8 @@ Setting up HDMI/SDI output
  
  Turning on HDMI/SDI output is simple; just right-click on the live view and
  select the output card. (Equivalently, you can access the same functionality
-from the _Video_ menu in the regular menu bar.) Currently, this is supported
+from the *Video* menu in the regular menu bar, or you can give the
+*--output-card=* parameter on the command line.) Currently, this is supported
  for DeckLink cards only (PCI/Thunderbolt), as the precise output protocol for
  the Intensity Shuttle cards is still unknown. The stream and recording will
  keep running just as before.
@@ -68,11 +69,169 @@ This section aims to illuminate some of the sources of latency and how to deal
  with them. Note that often, latency is at odds with throughput, and so,
  tradeoffs must be made. The most important sources of latency are:
  
-- Jitter and queuing latency
-- Processing latency
-- Output latency
-- Audio latency
-
-TODO: Write something about them.
-
-(TODO: Write something about time codes here.)
+ - **Frame transmission latency:** Unlike computer networks, HDMI and SDI
+   transmit their frames pretty much in real time, i.e., sending one frame
+   takes one frame of time. For cut-through switching (which includes
+   HDMI → SDI conversion and the other way around), this doesn't really
+   matter, but Nageru has to receive the entire frame before it can start
+   processing it (and by extension, send the result frame out). Thus, you will
+   typically get one frame of latency just by having Nageru, or really any
+   switcher/mixer with digital effects, in the chain at all.
+
+ - **Jitter and queuing latency:** Unless you are lucky enough to have an
+   all-SDI setup where everything runs off of a shared reference clock,
+   frames on different devices, as well as on the output, will be at random
+   offsets from each other (and also drifting slowly, even if they are at
+   the same frame rate). Thus, some sort of *input queue* is needed for each
+   input card, and the time a frame spends in the queue before being picked
+   out for processing is by definition extra latency. (Note that this means
+   that latency is not a single number for the chain as a whole, but can
+   vary by input.)
+
+ - **Processing latency:** By definition, processing of each frame has to take
+   less than one frame's worth of time, or else the system can't keep up.
+   But if you have a fast GPU and/or do little processing, you can spend
+   significantly less. Thus, if you're after the lowest possible latency,
+   a faster GPU might help you shave off a fraction of a frame here.
+
+ - **Output latency:** Finally, cards have their own output queue,
+   and some will expect there to be multiple frames in it before outputting anything.
+   This is outside Nageru's control, unfortunately, but can easily add 2–3
+   frames of latency. If you want to avoid this, look for Blackmagic's “4K” series of
+   cards, which are of a newer, lower-latency design than the previous cards.
+   The 4K series in this context includes every card that has “4K” in its
+   name, plus the Mini Recorder, Duo 2 and Quad 2 devices.
+
+Controlling latency
+...................
+
+Of the different sources of latency outlined in the previous section,
+the only one that is really under your control (short of buying faster
+or better hardware) is the input queue latency. By default, Nageru
+attempts to strike a balance between reducing latency and having to
+drop frames due to jitter; by looking at each input queue's length
+history, it attempts to find a “safe queue limit”, above which it
+can drop frames without risking underrun (which requires duplicating
+frames). However, if latency is more important to you than 100% smooth
+motion, you can override this by using the *--max-input-queue-frames=*
+flag; this is a hard limit on the number of frames that can be kept
+in the queue, on top of Nageru's own heuristics. It cannot be set lower
+than 1, or else all incoming frames would immediately get dropped
+on arrival.
+
+However, even though the other factors are largely outside your control,
+you still have to *account* for them. Nageru needs to know when to begin
+processing a frame, and it cannot do this adaptively; you need to give
+Nageru a latency budget for processing and output queueing, which tells it when
+to start processing a frame (by picking out the input frames available at that
+time). If a frame isn't processed in time for the output card to pick it up,
+it will be dropped, which means its effort was wasted. (Nageru will tell you
+on the terminal if this happens.) The latency budget is set by
+*--output-buffer-frames=*, where the default is a pretty generous 6.0,
+or 100 ms at 60 fps; if you want lower latency, you probably want
+to adjust this value down to the point where Nageru starts complaining about
+dropped or late frames, and then a bit up again to get some margin.
+(But see the part about :ref:`audio latency <audio-latency>` below.) Note that
+the value can be fractional.
+
+As an exception to the above, Nageru also allows *slop*; if the frame is
+late but only a little (i.e., less than the slop), it will pass it on to the
+output card nevertheless and hope for forgiveness, which may or may not
+cause it to be displayed. The slop is set with *--output-slop-frames=*,
+where the default is 0.5 frames.
+
+
+.. _audio-latency:
+
+Audio latency
+.............
+
+Since Nageru does not require synchronized audio sources, neither to video
+nor to each other (which would require a common, locked reference clock for all
+capture and sound cards), it needs to *resample* incoming audio to match
+the rate of the master video clock. To avoid buffer underruns caused by
+uneven delivery of input audio, each card needs an audio input queue,
+just like the video input queue; by default, this is set to 100 ms, which then
+acts as a lower bound on your latency.
+
+If you want to reduce video latency, you will probably want to reduce audio
+latency correspondingly, or audio will arrive too late to be heard. You can
+adjust the audio latency with the *--audio-queue-length-ms=* flag, but notice
+that this value is in milliseconds, not in frames.
+
+Audio and video queue lengths do not need to match exactly; the two streams
+(audio and video) will be synchronized at playback, both for network streaming
+and for HDMI/SDI output.
+
+
+.. _measuring-latency:
+
+Measuring latency
+.................
+
+In order to optimize latency, it can be useful to measure it, but for most
+people, it's hard to measure delays precisely enough to distinguish reliably
+between e.g. 70 and 80 milliseconds by eye alone. Nageru gives you some simple
+tools that will help.
+
+The most direct is the flag *--print-video-latency*. This samples, for every
+100th frame, the latency of that frame through Nageru. More precisely,
+it measures the wall clock time from the point where the frame is received from
+the input card driver (and put into the input queue) to up to four different
+points:
+
+ * **Mixer latency:** The frame is done processing on the GPU.
+ * **Quick Sync latency:** The frame is through :ref:`VA-API H.264 encoding <digital-intermediate>`
+   and ready to be muxed to disk. (Note that the mux might still be waiting
+   for audio before actually outputting the frame.)
+ * **x264 latency:** The frame is through :ref:`x264 encoding <transcoded-streaming>`
+   and ready to be muxed to disk and/or the network. (Same caveat about the
+   mux as the previous point.)
+ * **DeckLink output latency:** The HDMI/SDI output card reports that it has
+   shown the frame.
+
+As every output frame can depend on multiple input frames, each with different
+input queue latencies, latencies will be measured for each of them, and the
+lowest and highest will be printed. Do note that the measurement is still done
+over a single *output* frame; it is *not* a measurement over the last 100
+output frames, even though the statistics are only printed every 100th.
+
+For more precise measurements, you can use Prometheus metrics to get percentiles
+for all of these points, which will measure over all frames (over a one-minute
+window). This yields more precise information than sampling every 100 frames,
+but setting up Prometheus and a graphing tool is a bit more work, and usually not
+worth it for simple measurement. For more information, see :doc:`monitoring`.
+
+Another trick that can be useful in some situations is *looping* your signal,
+i.e., connecting your output back into your input. This allows you to measure
+delays that don't happen within Nageru itself, like any external converters,
+delays in the input driver, etc. (It can also act as a sanity check to make
+sure your A/V chain passes the signal through without quality degradation,
+if you first set up a static picture as a signal and then switch to the loop
+input to verify that the signal stays stable without e.g. color shifts [#]_.
+See the section on :doc:`the frame analyzer <analyzer>` for other ways of
+debugging signal integrity.)
+
+For this, the *timecode output* is useful; you can turn it on from the Video
+menu, or through the command-line flag *--timecode-stream*. (You can also
+output it to standard output with the flag *--timecode-stdout*.) It contains
+some information about frame numbers and current time of day; if you activate
+it, switch to the loop input and then deactivate it while still holding the
+loop input active, the timecode will start repeating with roughly the
+same length as your latency. (It can't be an exact measurement, as delay is
+frequently fractional, and a loop length cannot be.) The easiest way to find
+the actual length is to look at the recorded video file by e.g. dumping each
+frame to an image file and looking at the sequence.
+
+In general, using Nageru's own latency measurement is both the simplest and
+the most precise. However, the timecode is a useful supplement, since it
+can also test external factors, such as network stream latency.
+
+.. [#] If you actually try this with Nageru, you will see some
+       dark “specks” slowly appear in the image. This is a consequence of
+       small roundoff errors accumulating over time, combined with Nageru's
+       static dither pattern that causes rounding to happen in the same
+       direction each time. The dithering used by Nageru is a tradeoff between
+       many factors, and overall helps image quality much more than it
+       hurts, but in the specific case of an ever-looping signal, it will
+       cause such artifacts.
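
A note on the arithmetic in the patch above: the text gives the default *--output-buffer-frames=* of 6.0 as 100 ms at 60 fps, states that *--audio-queue-length-ms=* is in milliseconds rather than frames, and describes slop as forgiving frames that are late by less than the slop value. The Python sketch below works through those conversions. All function names are hypothetical, and the lateness classification is only a simplified model of the behaviour the text describes, not Nageru's actual code.

```python
def frames_to_ms(frames: float, fps: float) -> float:
    """Convert a duration in frames to milliseconds at the given frame rate."""
    return frames * 1000.0 / fps


def ms_to_frames(ms: float, fps: float) -> float:
    """Convert a duration in milliseconds to (possibly fractional) frames."""
    return ms * fps / 1000.0


def classify_output_frame(lateness_frames: float, slop_frames: float = 0.5) -> str:
    """Simplified model (an assumption) of the slop rule described in the text.

    lateness_frames is how long after its deadline the frame was ready,
    measured in frames; zero or negative means it was ready in time.
    """
    if lateness_frames <= 0.0:
        return "on time"
    if lateness_frames < slop_frames:
        # Late, but by less than the slop: passed on to the card anyway.
        return "late, sent anyway"
    return "dropped"


if __name__ == "__main__":
    fps = 60.0
    # Default output buffer of 6.0 frames, expressed in milliseconds.
    print(frames_to_ms(6.0, fps))
    # Default 100 ms audio queue, expressed in frames at the same rate.
    print(ms_to_frames(100.0, fps))
    # A frame that misses its deadline by 0.3 frames with the default slop.
    print(classify_output_frame(0.3))
```

Doing the conversion explicitly is mostly useful when lowering both values together: a video budget given in frames has to be translated to milliseconds before it can be compared with the audio queue length.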