<chapter> <title> The complex multi-layer input </title>
The idea behind the input module is to handle packets without knowing
anything about what is in them. It only takes a packet,
reads its ID, and delivers it to the decoder at the time
indicated in the packet header (SCR and PCR fields in MPEG).
All the basic browsing operations are implemented without peeking at the
content of the elementary stream.
Thus it remains very generic. This also means you cannot do things like
"play 3 frames now", "move forward 10 frames" or "play as fast as you
can but play all frames". It does not even know what a "frame" is. There
is no privileged elementary stream, as the video one could be (for
the simple reason that, according to MPEG, a stream may contain
<sect1> <title> What happens to a file </title>
An input thread is spawned for every file read. Indeed, input structures
and decoders need to be reinitialized because the characteristics of
the stream may differ from one file to the next. <function> input_CreateThread </function>
is called by the interface thread (playlist module).
At first, an input plug-in capable of reading the playlist item is looked
for [this is inappropriate: we should first open the socket,
and then probe the beginning of the stream to see which plug-in can read
it]. The socket is opened by either <function> input_FileOpen</function>,
<function> input_NetworkOpen</function>, or <function>
input_DvdOpen</function>. Each of these functions sets two very important parameters:
<parameter> b_pace_control </parameter> and <parameter> b_seekable
</parameter> (see next section).
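<para> To illustrate how an opening function might derive these two flags
for a plain file descriptor, here is a minimal sketch. The type
stream_flags_t and the helper probe_stream_flags are hypothetical names
invented for this example, not VLC's actual code. Seekability is probed
with the POSIX lseek() call, while b_pace_control is simply stated by the
caller, since it cannot be detected reliably. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the two flags stored in p_input->stream. */
typedef struct
{
    int b_pace_control;   /* 1 if we read the stream at our own pace */
    int b_seekable;       /* 1 if lseek() works on the descriptor */
} stream_flags_t;

/* Hypothetical helper: fill in the flags for an already opened descriptor.
 * b_pace_control cannot be detected reliably, so the caller states it
 * (1 for files and pipes, 0 for UDP streaming). */
static stream_flags_t probe_stream_flags( int i_fd, int b_pace_control )
{
    stream_flags_t flags;

    assert( i_fd >= 0 );

    flags.b_pace_control = b_pace_control;

    /* lseek() fails (with ESPIPE) on pipes, FIFOs and sockets. */
    flags.b_seekable = ( lseek( i_fd, 0, SEEK_CUR ) != (off_t)-1 );

    /* A stream read at the server's pace can never be seekable. */
    if( !flags.b_pace_control )
    {
        flags.b_seekable = 0;
    }

    return flags;
}
```
]]></programlisting>
<para> With this sketch, a regular file yields b_pace_control = 1 and
b_seekable = 1, whereas the read end of a pipe yields b_pace_control = 1
and b_seekable = 0. </para>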
We could use so-called "access" plugins for this whole mechanism
of opening the input socket. We do not, because we
thought only those three methods would be needed at present,
and if we need others we can still build them in.
Now we can launch the input plugin's <function> pf_init </function>
function, and an endless loop doing <function> pf_read </function>
and <function> pf_demux</function>. The plugin is responsible
for initializing the stream structures
(<parameter>p_input->stream</parameter>), managing packet buffers,
reading packets and demultiplexing them. But in most tasks it will
be assisted by functions from the advanced input API (c). That is
what we will study in the coming sections!
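<para> The control flow described above can be sketched as follows. This
is a simplified illustration, not the real input thread: the structure
input_sketch_t, its b_die flag, the function run_input_loop and the toy
plugin are all hypothetical, and the real pf_init, pf_read and pf_demux
signatures differ. Unlike the endless loop of the real thread, the sketch
stops when the read function reports the end of the stream. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified plugin interface; real signatures differ. */
typedef struct input_sketch_s input_sketch_t;

struct input_sketch_s
{
    void (*pf_init)( input_sketch_t * );
    /* Returns 1 when a packet was read, 0 at end of stream, <0 on error. */
    int  (*pf_read)( input_sketch_t * );
    void (*pf_demux)( input_sketch_t * );
    int   b_die;            /* set to request termination */
    void *p_plugin_data;    /* private plugin state */
};

/* Body of the input thread: initialise once, then read and demultiplex
 * until the stream ends, an error occurs or termination is requested.
 * Returns the number of completed read/demux iterations. */
static int run_input_loop( input_sketch_t *p_input )
{
    int i_iterations = 0;

    assert( p_input != NULL );
    assert( p_input->pf_init && p_input->pf_read && p_input->pf_demux );

    p_input->pf_init( p_input );
    while( !p_input->b_die )
    {
        if( p_input->pf_read( p_input ) <= 0 )
        {
            break;
        }
        p_input->pf_demux( p_input );
        i_iterations++;
    }
    return i_iterations;
}

/* Toy plugin used for illustration: "reads" three packets, then stops. */
typedef struct
{
    int i_left;
    int i_demuxed;
} toy_data_t;

static void toy_init( input_sketch_t *p_input )
{
    toy_data_t *p_toy = p_input->p_plugin_data;
    p_toy->i_left = 3;
    p_toy->i_demuxed = 0;
}

static int toy_read( input_sketch_t *p_input )
{
    toy_data_t *p_toy = p_input->p_plugin_data;
    if( p_toy->i_left == 0 )
    {
        return 0;
    }
    p_toy->i_left--;
    return 1;
}

static void toy_demux( input_sketch_t *p_input )
{
    toy_data_t *p_toy = p_input->p_plugin_data;
    p_toy->i_demuxed++;
}
```
]]></programlisting>
<para> Keeping the three entry points as function pointers lets the loop
itself stay identical for every plugin; only the table of callbacks
changes. </para>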
<sect1> <title> Stream Management </title>
The function which has opened the input socket must specify two
<listitem> <para> <emphasis> p_input->stream.b_pace_control
</emphasis> : Whether or not the stream can be read at our own
pace (determined by the stream's frequency and
the host computer's system clock). For instance, a file or a pipe
(including TCP/IP connections) can be read at our pace: if we do not
read fast enough, the other end of the pipe will just block on a
<function> write() </function> operation. By contrast, UDP
streaming (such as the one used by VideoLAN Server) is done at
the server's pace, and if we do not read fast enough, packets will
simply be lost when the kernel's buffer is full. The drift
introduced by the server's clock must therefore be compensated regularly.
This property controls the clock management, and whether
or not fast forward and slow motion can be done.</para>
<note> <title> Subtleties in the clock management </title> <para>
With a UDP socket and a distant server, the drift is not
negligible because over a whole movie it can amount to
seconds if one of the clocks is slightly off. That means
that presentation dates given by the input thread may be
out of sync, to some extent, with the frequencies given in
every Elementary Stream. Output threads (and, incidentally,
decoder threads) must deal with it. </para>
<para> The same kind of problem may happen when reading from
a device (like video4linux's <filename> /dev/video </filename>)
connected, for instance, to a video encoding board.
There is no way to distinguish
it from a simple <command> cat foo.mpg | vlc - </command>, which
does not imply any clock problem. So the Right Thing (c) would be
to ask the user about the value of <parameter> b_pace_control
</parameter>, but nobody would understand what it means (you are
not the dumbest person on Earth, and obviously you have read this
paragraph several times to understand it :-). Anyway,
the drift should be negligible since the board would share the
same clock as the CPU, so we chose to neglect it. </para> </note>
<listitem> <para> <emphasis> p_input->stream.b_seekable
</emphasis> : Whether or not we can do <function> lseek() </function>
calls on the file descriptor. Basically, whether we can
jump anywhere in the stream (and thus display a scrollbar) or
can only read one byte after the other. This has less impact
on the stream management than the previous item, but it
is not redundant, because for instance
<command> cat foo.mpg | vlc - </command> is b_pace_control = 1
but b_seekable = 0. On the other hand, you cannot have
b_pace_control = 0 along with b_seekable = 1. If a stream is seekable,
<parameter> p_input->stream.p_selected_area->i_size </parameter>
must be set (in an arbitrary unit, for instance bytes, but it
must be the same unit as p_input->i_tell, which indicates the byte
we are currently reading from the stream).</para>
<note> <title> Offset to time conversions </title> <para>
Functions managing clocks are located in <filename>
src/input/input_clock.c</filename>. All we know about a file
is its start offset and its end offset
(<parameter>p_input->stream.p_selected_area->i_size</parameter>),
currently in bytes, but it could be plugin-dependent. So
how can we display a time in seconds in the interface?
Well, we cheat. PS streams have a <parameter> mux_rate </parameter>
property which indicates how many bytes we should read in
a second. This is subject to change at any time, but in practice
it is constant for all streams we know. So we use it to
determine time offsets. </para> </note> </listitem>
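<para> The offset to time "cheat" can be sketched as below. The helpers
offset_to_seconds and offset_to_time are hypothetical names written for
this example and are not the functions found in input_clock.c. The sketch
assumes a constant multiplex rate and assumes that the value is expressed
in units of 50 bytes per second, which is how the MPEG-2 pack header
encodes its mux rate field. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: number of seconds needed to reach i_offset at a
 * constant multiplex rate. i_mux_rate is assumed to be in units of
 * 50 bytes per second. Returns -1 when no conversion is possible. */
static int64_t offset_to_seconds( int64_t i_offset, uint32_t i_mux_rate )
{
    if( i_mux_rate == 0 || i_offset < 0 )
    {
        return -1;
    }
    return i_offset / ( (int64_t)i_mux_rate * 50 );
}

/* Hypothetical helper: format the result as "h:mm:ss" for the interface. */
static char *offset_to_time( char *psz_buffer, size_t i_size,
                             int64_t i_offset, uint32_t i_mux_rate )
{
    int64_t i_seconds = offset_to_seconds( i_offset, i_mux_rate );

    assert( psz_buffer != NULL && i_size > 0 );

    if( i_seconds < 0 )
    {
        /* Unknown rate: show a placeholder rather than a wrong time. */
        snprintf( psz_buffer, i_size, "-:--:--" );
        return psz_buffer;
    }

    snprintf( psz_buffer, i_size, "%d:%02d:%02d",
              (int)( i_seconds / 3600 ),
              (int)( i_seconds / 60 % 60 ),
              (int)( i_seconds % 60 ) );
    return psz_buffer;
}
```
]]></programlisting>
<para> Because the rate is treated as a constant, the result is only an
estimate; a stream whose rate really changes would need the clock
references themselves. </para>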
<sect1> <title> Structures exported to the interface </title>
Let's focus on the communication API between the input module and the
interface. The most important file is <filename> include/input_ext-intf.h</filename>,
which you should know almost by heart. This file defines
the input_thread_t structure, the stream_descriptor_t and all the program
and ES descriptors it includes (you can view it as a tree).
First, note that the input_thread_t structure features two <type> void *
</type> pointers, <parameter> p_method_data </parameter> and <parameter>
p_plugin_data</parameter>, which you can respectively use for buffer
management data and plugin data.
Second, a stream description is stored in a tree featuring program
descriptors, which themselves contain several elementary stream
descriptors. For those of you who do not know all MPEG concepts, an
elementary stream, aka ES, is a continuous stream of either video or
audio data (never both), directly readable by a decoder, without
This tree structure is illustrated by the following
figure, where one stream holds two programs.
In most cases there will only be one program (to my
knowledge only TS streams can carry several programs, for instance
a movie and a football game at the same time - this is suited
to satellite and cable broadcasting).
<imagedata fileref="stream.png" format="PNG" scalefit="1" scale="80"/>
<imagedata fileref="stream.gif" format="GIF" />
<phrase> The program tree </phrase>
<para> <emphasis> p_input->stream </emphasis> :
The stream, programs and elementary streams can be viewed as a tree.
For all modifications and accesses to the <parameter>p_input->stream
</parameter> structure, you <emphasis>must</emphasis> hold
the p_input->stream.stream_lock.
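<para> The locking discipline can be illustrated with the sketch below.
It uses a plain POSIX mutex and a reduced, hypothetical descriptor
(stream_sketch_t, stream_add_es and stream_count_es are names made up
for this example); the real code may use its own locking wrappers, and
the real descriptor holds many more fields. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, reduced stream descriptor: only the lock and one field. */
typedef struct
{
    pthread_mutex_t stream_lock;
    int             i_es_number;   /* number of elementary streams known */
} stream_sketch_t;

/* Modification: performed with stream_lock held. */
static int stream_add_es( stream_sketch_t *p_stream )
{
    int i_count;

    assert( p_stream != NULL );

    pthread_mutex_lock( &p_stream->stream_lock );
    i_count = ++p_stream->i_es_number;
    pthread_mutex_unlock( &p_stream->stream_lock );

    return i_count;
}

/* Plain access: the lock is needed for reads as well as for writes. */
static int stream_count_es( stream_sketch_t *p_stream )
{
    int i_count;

    assert( p_stream != NULL );

    pthread_mutex_lock( &p_stream->stream_lock );
    i_count = p_stream->i_es_number;
    pthread_mutex_unlock( &p_stream->stream_lock );

    return i_count;
}
```
]]></programlisting>
<para> Copying the value into a local variable before unlocking keeps the
critical section short and avoids returning a field that another thread
may already be changing. </para>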
Each ES is described by an ID (the ID the appropriate demultiplexer will
look for), a <parameter> stream_id </parameter> (the real MPEG stream
in ISO/IEC 13818-1 table 2-29) and a literal description. It also
contains context information for the demultiplexer, and decoder
information <parameter> p_decoder_fifo </parameter> that we will talk
about in the next chapter. If the stream you want to read is not an
MPEG system layer (for instance AVI or RTP), a specific demultiplexer
will have to be written. In that case, if you need to carry additional
information, you can use <type> void * </type> <parameter> p_demux_data
</parameter> at your convenience. It will be automatically freed on
<note> <title> Why an ID rather than the plain MPEG <parameter>
stream_id </parameter> ? </title> <para>
When a packet (be it a TS packet, PS packet, or whatever) is read,
the appropriate demultiplexer will look for an ID in the packet, find the
relevant elementary stream, and demultiplex it if the user selected it.
In the case of TS packets, the only information we have is the
ES PID, so the reference ID we keep is the PID. PIDs do not exist
in PS streams, so we have to invent one. It is of course based on
the <parameter> stream_id </parameter> found in all PS packets,
but this is not enough, since private streams (i.e. AC3, SPU and
LPCM) all share the same <parameter> stream_id </parameter>
(<constant>0xBD</constant>). In that case the first byte of the
PES payload is a stream private ID, so we combine this with
the stream_id to get our ID (if you did not understand everything,
it isn't very important - just remember we used our brains
before writing the code :-).
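<para> One possible way of combining the two values is sketched below.
The bit layout (stream_id in the low byte, private ID in the next byte)
and the function name ps_es_id are assumptions chosen for illustration;
the text above only says that the two values are combined, not how. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PRIVATE_STREAM_1 0xBD   /* stream_id shared by all private streams */

/* Hypothetical ID layout: the stream_id in the low byte and, for private
 * streams only, the stream private ID in the next byte. */
static uint16_t ps_es_id( uint8_t i_stream_id,
                          const uint8_t *p_payload, size_t i_payload_size )
{
    if( i_stream_id != PRIVATE_STREAM_1 )
    {
        /* Ordinary audio or video stream: the stream_id is unique. */
        return i_stream_id;
    }

    /* Private stream: the first payload byte tells the streams apart. */
    assert( p_payload != NULL && i_payload_size > 0 );
    return (uint16_t)( ( p_payload[0] << 8 ) | i_stream_id );
}
```
]]></programlisting>
<para> With this layout, two private streams whose payloads start with
0x80 and 0x20 get the IDs 0x80BD and 0x20BD, while a stream with
stream_id 0xE0 simply keeps 0xE0. </para>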
The stream, program and ES structures are filled in by the plugin's
</function> using functions in <filename> src/input/input_programs.c</filename>,
but are subject to change at any time. The DVD plugin
parses .ifo files to know which ES are in the stream; the TS plugin
reads the PAT and PMT structures in the stream; the PS plugin can
either parse the PSM structure (but it is rarely present), or build
the tree "on the fly" by pre-parsing the first megabyte of data.
In most cases we need to pre-parse a PS stream (that is, read the first MB of data,
and go back to the beginning), because the PSM (Program
Stream Map) structure is almost never present. This is not ideal,
but we have no choice. A few problems will arise. First,
non-seekable streams cannot be pre-parsed, so the ES tree will be
built on the fly. Second, if a new elementary stream starts after the
first MB of data (for instance a subtitle track won't show up
during the credits), it won't appear in the menu before we encounter
its first packet. We cannot pre-parse the entire stream because it
would take hours (even without decoding it).
It is currently the responsibility of the input plugin to spawn the necessary
decoder threads. It must call <function> input_SelectES </function>
<parameter>( input_thread_t * p_input, es_descriptor_t * p_es )
</parameter> on the selected ES.
The stream descriptor also contains a list of areas. Areas are logical
discontinuities in the stream, for instance chapters and titles in a
DVD. There is only one area in TS and PS streams, though we could
use them when the PSM (or PAT/PMT) version changes. The goal is that
when you seek to another area, the input plugin loads the new stream
descriptor tree (otherwise the selected ID may be wrong).
<sect1> <title> Methods used by the interface </title>
In addition, <filename> input_ext-intf.c </filename> provides a few functions
to control the reading of the stream:
<listitem> <para> <function> input_SetStatus </function>
<parameter> ( input_thread_t * p_input, int i_mode ) </parameter> :
Changes the pace of reading. <parameter> i_mode </parameter> can
be one of <constant> INPUT_STATUS_END, INPUT_STATUS_PLAY,
INPUT_STATUS_PAUSE, INPUT_STATUS_FASTER, INPUT_STATUS_SLOWER.
<note> <para> Internally, the pace of reading is determined
by the variable <parameter>
p_input->stream.control.i_rate</parameter>. The default
value is <constant> DEFAULT_RATE</constant>. The lower the
value, the faster the pace. Rate changes are taken into account
in <function> input_ClockManageRef</function>. Pause is
accomplished by simply stopping the input thread (it is
then woken up by a pthread signal). In that case, decoders
will be stopped too. Please remember this if you do statistics
on decoding times (like <filename> src/video_parser/vpar_synchro.c
</filename> does). Do not call this function if <parameter>
p_input->stream.b_pace_control </parameter> == 0.</para> </note>
<listitem> <para> <function> input_Seek </function> <parameter>
( input_thread_t * p_input, off_t i_position ) </parameter> :
Changes the offset of reading. Used to jump to another place in a
file. You <emphasis>mustn't</emphasis> call this function if
<parameter> p_input->stream.b_seekable </parameter> == 0.
The position is a number (usually long long, depending on your
libc) between <parameter>p_input->p_selected_area->i_start
</parameter> and <parameter>p_input->p_selected_area->i_size
</parameter> (the current value is in <parameter>
p_input->p_selected_area->i_tell</parameter>). </para>
<note> <para> Multimedia files can be very large, especially
when we read a device like <filename> /dev/dvd</filename>, so
offsets must be 64 bits wide. On many systems, such as
FreeBSD, off_t is 64 bits wide by default, but this is not the
case under GNU libc 2.x. That is why we need to compile VLC
with -D_FILE_OFFSET_BITS=64 -D__USE_UNIX98. </para> </note>
<note> <title> Escaping stream discontinuities </title>
Changing the reading position at random can result in a
messed up stream, and the decoder which reads it may
segfault. To avoid this, we send several NULL packets
(i.e. packets containing nothing but zeros) before changing
the reading position. Indeed, in most video and audio
formats, a long enough stream of zeros is an escape sequence
and the decoder can exit cleanly.
<listitem> <para> <function> input_OffsetToTime </function>
<parameter> ( input_thread_t * p_input, char * psz_buffer,
off_t i_offset ) </parameter> : Converts an offset value to
a time coordinate (used for interface display).
[currently it is broken with MPEG-2 files]
<listitem> <para> <function> input_ChangeES </function>
<parameter> ( input_thread_t * p_input, es_descriptor_t * p_es,
u8 i_cat ) </parameter> : Unselects all elementary streams of
type <parameter> i_cat </parameter> and selects <parameter>
p_es</parameter>. Used for instance to change language or
<listitem> <para> <function> input_ToggleES </function>
<parameter> ( input_thread_t * p_input, es_descriptor_t * p_es,
boolean_t b_select ) </parameter> : This is the clean way to
select or unselect a particular elementary stream from the
<sect1 id="input_buff"> <title> Buffer management </title>
Input plugins must implement a way to allocate and deallocate packets
(whose structures will be described in the next chapter). We
basically need four functions:
<listitem> <para> <function> pf_new_packet </function>
<parameter> ( void * p_private_data, size_t i_buffer_size )
Allocates a new <type> data_packet_t </type> and an associated
buffer of i_buffer_size bytes.
<listitem> <para> <function> pf_new_pes </function>
<parameter> ( void * p_private_data ) </parameter> :
Allocates a new <type> pes_packet_t</type>.
<listitem> <para> <function> pf_delete_packet </function>
<parameter> ( void * p_private_data, data_packet_t * p_data )
Deallocates <parameter> p_data</parameter>.
<listitem> <para> <function> pf_delete_pes </function>
<parameter> ( void * p_private_data, pes_packet_t * p_pes )
Deallocates <parameter> p_pes</parameter>.
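<para> As a rough illustration of these four entry points, here is a
minimal libc-based sketch. The packet structures are reduced,
hypothetical versions of data_packet_t and pes_packet_t (the real ones
are described in the next chapter), the sketch_ prefixed function names
are made up for this example, and the alloc_stats_t structure is an
assumed use of the private data pointer. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced packet structures; the real ones hold more. */
typedef struct
{
    unsigned char *p_buffer;
    size_t         i_size;
} data_packet_t;

typedef struct
{
    data_packet_t *p_first;
} pes_packet_t;

/* Hypothetical bookkeeping reached through p_private_data. */
typedef struct
{
    long i_allocated;
    long i_freed;
} alloc_stats_t;

static data_packet_t *sketch_new_packet( void *p_private_data,
                                         size_t i_buffer_size )
{
    alloc_stats_t *p_stats = p_private_data;
    data_packet_t *p_data;

    assert( p_stats != NULL );

    p_data = malloc( sizeof( *p_data ) );
    if( p_data == NULL )
    {
        return NULL;
    }
    p_data->p_buffer = malloc( i_buffer_size );
    if( p_data->p_buffer == NULL )
    {
        free( p_data );
        return NULL;
    }
    p_data->i_size = i_buffer_size;
    p_stats->i_allocated++;
    return p_data;
}

static pes_packet_t *sketch_new_pes( void *p_private_data )
{
    alloc_stats_t *p_stats = p_private_data;
    pes_packet_t *p_pes;

    assert( p_stats != NULL );

    p_pes = malloc( sizeof( *p_pes ) );
    if( p_pes == NULL )
    {
        return NULL;
    }
    p_pes->p_first = NULL;
    p_stats->i_allocated++;
    return p_pes;
}

static void sketch_delete_packet( void *p_private_data,
                                  data_packet_t *p_data )
{
    alloc_stats_t *p_stats = p_private_data;

    assert( p_stats != NULL && p_data != NULL );

    free( p_data->p_buffer );
    free( p_data );
    p_stats->i_freed++;
}

static void sketch_delete_pes( void *p_private_data, pes_packet_t *p_pes )
{
    alloc_stats_t *p_stats = p_private_data;

    assert( p_stats != NULL && p_pes != NULL );

    free( p_pes );
    p_stats->i_freed++;
}
```
]]></programlisting>
<para> Comparing i_allocated with i_freed when the plugin shuts down is
one simple way to notice leaked packets. </para>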
All functions are given <parameter> p_input->p_method_data </parameter>
as first parameter, so that you can keep records of allocated and freed
<note> <title> Buffer management strategies </title>
<para> Buffer management can be done in three ways: </para>
<listitem> <para> <emphasis> Traditional libc allocation </emphasis> :
For a long time we have used in the PS plugin
</function> and <function> free() </function> every time
we needed to allocate or deallocate a packet. Contrary
to popular belief, it is not <emphasis>that</emphasis>
<listitem> <para> <emphasis> Netlist </emphasis> :
In this method we allocate a very big buffer at the
beginning of the program, and then manage a list of pointers
to free packets (the "netlist"). This only works well if
all packets have the same size. It has long been used for
the TS input. The DVD plugin also uses it, but adds a
<emphasis> refcount </emphasis> flag because buffers (2048
bytes) can be shared among several packets. It is now
deprecated and won't be documented.
<listitem> <para> <emphasis> Buffer cache </emphasis> :
We are currently developing a new method. It is
already in use in the PS plugin. The idea is to call
<function> malloc() </function> and <function> free()
</function> to absorb stream irregularities, but to re-use
all allocated buffers via a cache system. We are
extending it so that it can be used in any plugin without
a performance hit, but it is currently left undocumented.
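<para> Since the buffer cache is left undocumented, the following is
only a guess at the general idea, not a description of the PS plugin's
code: released buffers are kept in a small stack and handed out again
before falling back to malloc(). All names (buffer_cache_t, cache_get,
cache_put, cache_flush) and the cache depth are assumptions made for
this sketch. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_MAX 8   /* arbitrary cache depth chosen for this sketch */

typedef struct
{
    void  *pp_free[CACHE_MAX];  /* released buffers waiting for re-use */
    int    i_free;              /* number of valid entries in pp_free */
    size_t i_buffer_size;       /* every cached buffer has this size */
} buffer_cache_t;

static void cache_init( buffer_cache_t *p_cache, size_t i_buffer_size )
{
    assert( p_cache != NULL );
    p_cache->i_free = 0;
    p_cache->i_buffer_size = i_buffer_size;
}

/* Take a buffer from the cache, or call malloc() when it is empty. */
static void *cache_get( buffer_cache_t *p_cache )
{
    assert( p_cache != NULL );
    if( p_cache->i_free > 0 )
    {
        p_cache->i_free--;
        return p_cache->pp_free[p_cache->i_free];
    }
    return malloc( p_cache->i_buffer_size );
}

/* Give a buffer back: keep it for re-use, or free() it if the cache
 * is already full. */
static void cache_put( buffer_cache_t *p_cache, void *p_buffer )
{
    assert( p_cache != NULL );
    if( p_buffer == NULL )
    {
        return;
    }
    if( p_cache->i_free < CACHE_MAX )
    {
        p_cache->pp_free[p_cache->i_free] = p_buffer;
        p_cache->i_free++;
    }
    else
    {
        free( p_buffer );
    }
}

/* Release everything still held by the cache. */
static void cache_flush( buffer_cache_t *p_cache )
{
    assert( p_cache != NULL );
    while( p_cache->i_free > 0 )
    {
        p_cache->i_free--;
        free( p_cache->pp_free[p_cache->i_free] );
    }
}
```
]]></programlisting>
<para> In steady state almost every request is served from the cache, so
malloc() and free() are only called when the stream becomes irregular
and the number of buffers in flight changes. </para>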
<sect1> <title> Demultiplexing the stream </title>
After a packet has been read by <function> pf_read </function>, it is passed
to the demultiplexer function, for which your plugin must provide a function pointer. The demultiplexer
is responsible for parsing the packet, gathering PES, and feeding decoders.
Demultiplexers for standard MPEG structures (PS and TS) have already
been written. You just need to indicate <function> input_DemuxPS
</function> or <function> input_DemuxTS </function> for <function>
pf_demux</function>. You can also write your own demultiplexer.
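<para> The sketch below shows one way a demultiplexer could be chosen by
probing the first bytes of the stream. It relies on two facts from the
MPEG system layer: a program stream starts with the pack start code
0x000001BA, and transport stream packets are 188 bytes long and start
with the sync byte 0x47. Everything else is hypothetical: the type
demux_fn_t, the function select_demux, and the two stub functions, which
merely stand in for the real PS and TS demultiplexers and do not have
their signatures. </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TS_PACKET_SIZE 188
#define TS_SYNC_BYTE   0x47

/* Hypothetical demultiplexer signature, for illustration only. */
typedef int (*demux_fn_t)( const uint8_t *p_data, size_t i_size );

/* Stubs standing in for the real PS and TS demultiplexers. */
static int sketch_demux_ps( const uint8_t *p_data, size_t i_size )
{
    (void)p_data;
    (void)i_size;
    return 0;
}

static int sketch_demux_ts( const uint8_t *p_data, size_t i_size )
{
    (void)p_data;
    (void)i_size;
    return 0;
}

/* Pick a demultiplexer from the first bytes of the stream, or return
 * NULL when the format is not recognised. */
static demux_fn_t select_demux( const uint8_t *p_data, size_t i_size )
{
    assert( p_data != NULL || i_size == 0 );

    /* A program stream starts with the pack start code 0x000001BA. */
    if( i_size >= 4 && p_data[0] == 0x00 && p_data[1] == 0x00
         && p_data[2] == 0x01 && p_data[3] == 0xBA )
    {
        return sketch_demux_ps;
    }

    /* A transport stream repeats the sync byte every 188 bytes. */
    if( i_size > TS_PACKET_SIZE && p_data[0] == TS_SYNC_BYTE
         && p_data[TS_PACKET_SIZE] == TS_SYNC_BYTE )
    {
        return sketch_demux_ts;
    }

    return NULL;
}
```
]]></programlisting>
<para> The returned pointer is what a plugin would store as its demux
entry point; a NULL result means another plugin, or a custom
demultiplexer, is needed. </para>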
It is not the purpose of this document to describe the different levels
of encapsulation in an MPEG stream. Please refer to your MPEG specification