This document is an introductory tutorial for writing simple filters in
libavfilter.

Foreword: just like everything else in FFmpeg, libavfilter is monolithic, which
means that it is highly recommended that you submit your filters to the FFmpeg
development mailing-list and make sure that they are applied. Otherwise, your
filters are likely to have a very short lifetime because of the fairly regular
internal API changes, and they will get only limited distribution, review, and
testing.

Bootstrap
=========

Let's say you want to write a new simple video filter called "foobar" which
takes one frame as input, changes the pixels in whatever fashion you fancy, and
outputs the modified frame. The simplest way of doing this is to start from a
similar filter. We'll pick edgedetect, but any other should do. You can look
for others using the `./ffmpeg -v 0 -filters|grep ' V->V '` command.

 - sed 's/edgedetect/foobar/g;s/EdgeDetect/Foobar/g' libavfilter/vf_edgedetect.c > libavfilter/vf_foobar.c
 - edit libavfilter/Makefile, and add an entry for "foobar" following the
   pattern of the other filters.
 - edit libavfilter/allfilters.c, and add an entry for "foobar" following the
   pattern of the other filters.
 - ./configure ...
 - make -j<whatever> ffmpeg
 - ./ffmpeg -i http://samples.ffmpeg.org/image-samples/lena.pnm -vf foobar foobar.png
   Note here: you can obviously use any local image instead of a remote URL.

If everything went right, you should get a foobar.png with Lena edge-detected.

That's it, your new playground is ready.

Some details about what's going on:
libavfilter/allfilters.c: this file is parsed by the configure script, which in
turn will define variables for the build system and the C code:

    --- after running configure ---

    $ grep FOOBAR ffbuild/config.mak
    CONFIG_FOOBAR_FILTER=yes
    $ grep FOOBAR config.h
    #define CONFIG_FOOBAR_FILTER 1

CONFIG_FOOBAR_FILTER=yes from ffbuild/config.mak is later used to enable
the filter in libavfilter/Makefile, and CONFIG_FOOBAR_FILTER=1 from config.h
is used for registering the filter in libavfilter/allfilters.c.

Filter code layout
==================

You now need some theory about the general code layout of a filter. Open your
libavfilter/vf_foobar.c. This section will detail the important parts of the
code you need to understand before messing with it.

Copyright
---------

The first chunk is the copyright. Most filters are LGPL, and we are assuming
vf_foobar is as well. We are also assuming vf_foobar is not an edge detector
filter, so you can update the boilerplate with your credits.

Doxy
----

The next chunk is the Doxygen comment describing the file. See
https://ffmpeg.org/doxygen/trunk/. Detail here what the filter is and does, and
add some references if you feel like it.

Context
-------

Skip the headers and scroll down to the definition of FoobarContext. This is
your local state context. It is already filled with 0 when you get it, so do not
worry about uninitialized reads into this context. This is where you put all
"global" information that you need; typically the variables storing the user
options. You'll notice the first field "const AVClass *class"; it's the only
field you need to keep, assuming you have a context. There is some magic you
don't need to care about around this field, just let it be (in the first
position) for now.

Options
-------

Then comes the options array. This is what defines the user-accessible
options. For example, -vf foobar=mode=colormix:high=0.4:low=0.1. Most options
have the following pattern:
  name, description, offset, type, default value, minimum value, maximum value, flags

 - name is the option name, keep it simple and lowercase
 - description is short, in lowercase, without a period, and describes what the
   option does, for example "set the foo of the bar"
 - offset is the offset of the field in your local context, see the OFFSET()
   macro; the option parser will use that information to fill the fields
   according to the user input
 - type is any of AV_OPT_TYPE_* defined in libavutil/opt.h
 - default value is a union where you pick the appropriate type; "{.dbl=0.3}",
   "{.i64=0x234}", "{.str=NULL}", ...
 - min and max values define the range of available values, inclusive
 - flags are AVOption generic flags. See AV_OPT_FLAG_* definitions

When in doubt, just look at the other AVOption definitions all around the
codebase, there are tons of examples.
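
To make the role of the offset more concrete, here is a minimal standalone
sketch. It is not FFmpeg code: MiniContext, MiniOption, mini_set_defaults() and
mini_set() are hypothetical names, the default values are made up, and the real
AVOption additionally carries a type, a default value union and flags. The
sketch only shows how a table of (name, offset, range) entries lets a generic
parser fill fields of a context it knows nothing about.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical local context; the field names follow the "high" and "low"
 * options of the example option string. */
typedef struct MiniContext {
    double high;
    double low;
} MiniContext;

/* Simplified stand-in for AVOption: double-only, no type, no flags. */
typedef struct MiniOption {
    const char *name;
    const char *help;
    size_t      offset;       /* where the field lives inside the context */
    double      default_val;
    double      min, max;     /* inclusive range */
} MiniOption;

#define OFFSET(x) offsetof(MiniContext, x)

static const MiniOption mini_options[] = {
    { "high", "set high threshold", OFFSET(high), 0.4, 0.0, 1.0 },
    { "low",  "set low threshold",  OFFSET(low),  0.1, 0.0, 1.0 },
    { NULL } /* terminator */
};

/* Write every default value into the context through its offset. */
static void mini_set_defaults(MiniContext *ctx)
{
    for (const MiniOption *o = mini_options; o->name; o++)
        *(double *)((char *)ctx + o->offset) = o->default_val;
}

/* Set one option by name. Returns 0 on success, -1 if the name is unknown
 * or the value falls outside [min, max]. */
static int mini_set(MiniContext *ctx, const char *name, double val)
{
    for (const MiniOption *o = mini_options; o->name; o++) {
        if (strcmp(o->name, name))
            continue;
        if (val < o->min || val > o->max)
            return -1;
        *(double *)((char *)ctx + o->offset) = val;
        return 0;
    }
    return -1;
}
```

Neither helper ever names a field of MiniContext directly; everything goes
through the offsets stored in the table, which is why adding an option only
means adding a field and a table entry.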

Class
-----

AVFILTER_DEFINE_CLASS(foobar) will define a unique foobar_class with some kind
of signature referencing the options, etc. which will be referenced in the
definition of the AVFilter.

Filter definition
-----------------

At the end of the file, you will find foobar_inputs, foobar_outputs and
the AVFilter ff_vf_foobar. Don't forget to update the AVFilter.description with
a description of what the filter does, starting with a capitalized letter and
ending with a period. You'd better drop the AVFilter.flags entry for now, and
re-add it later depending on the capabilities of your filter.

Callbacks
---------

Let's now study the common callbacks. Before going into details, note that all
these callbacks are explained in detail in libavfilter/avfilter.h, so when in
doubt, refer to the doxy in that file.

init()
~~~~~~

The first one to be called is init(). It's flagged as cold because it is not
called often. Look for "cold" on
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html for more
information.

As the name suggests, init() is where you can initialize and allocate
your buffers, pre-compute your data, etc. Note that at this point, your local
context already has the user options initialized, but you still don't have any
clue about the kind of data input you will get, so this function is often
mainly used to sanitize the user options.

Some init()s will also define the number of inputs or outputs dynamically
according to the user options. A good example of this is the split filter, but
we won't cover this here since vf_foobar is just a simple 1:1 filter.
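
As an illustration of option sanitizing in init(), here is a standalone
sketch, assuming a filter with two hypothetical options, low and high, where
low must not exceed high. MiniContext and mini_init() are made up for this
example, and the negative errno return value only imitates the AVERROR()
convention.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical context holding two user options, as in the
 * "high=0.4:low=0.1" example. */
typedef struct MiniContext {
    double high;
    double low;
} MiniContext;

/* Stand-in for init(): only the option values are known at this point,
 * nothing about the frames yet, so all it can do is check that the options
 * are consistent with each other.
 * Returns 0 on success or a negative errno-style code. */
static int mini_init(MiniContext *s)
{
    if (s->low > s->high) {
        fprintf(stderr, "low (%g) must not be greater than high (%g)\n",
                s->low, s->high);
        return -EINVAL;
    }
    return 0;
}
```

Range checks on a single option are already covered by the min and max values
of the options array; init() is the place for constraints that involve several
options at once.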

uninit()
~~~~~~~~

Similarly, there is the uninit() callback, doing what the name suggests. Free
here everything you allocated.

query_formats()
~~~~~~~~~~~~~~~

This follows init() and is used for the format negotiation. Basically
you specify here what pixel format(s) (gray, rgb 32, yuv 4:2:0, ...) you accept
for your inputs, and what you can output. All pixel formats are defined in
libavutil/pixfmt.h. If you don't change the pixel format between the input and
the output, you just have to define a pixel formats array and call
ff_set_common_formats(). For more complex negotiation, you can refer to other
filters such as vf_scale.
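
The "pixel formats array" is a plain list closed by a terminator
(AV_PIX_FMT_NONE in FFmpeg). The standalone sketch below mimics that shape
with a hypothetical MiniPixFmt enum and a hypothetical helper,
mini_format_supported(); the actual negotiation done by libavfilter is more
involved than this lookup.

```c
/* Hypothetical miniature of the pixel format enum; the real one lives in
 * libavutil/pixfmt.h. */
enum MiniPixFmt {
    MINI_FMT_NONE = -1,   /* list terminator */
    MINI_FMT_GRAY8,
    MINI_FMT_RGB24,
    MINI_FMT_YUV420P
};

/* Formats this imaginary filter can both read and write. */
static const enum MiniPixFmt mini_pix_fmts[] = {
    MINI_FMT_GRAY8, MINI_FMT_RGB24, MINI_FMT_NONE
};

/* Returns 1 if fmt appears in the NONE-terminated list, 0 otherwise. */
static int mini_format_supported(const enum MiniPixFmt *list,
                                 enum MiniPixFmt fmt)
{
    for (; *list != MINI_FMT_NONE; list++)
        if (*list == fmt)
            return 1;
    return 0;
}
```

Because the list carries its own terminator, no separate length has to be
passed around, which is why forgetting the terminator is a classic mistake.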

config_props()
~~~~~~~~~~~~~~

This callback is not necessary, but you will probably have one or more
config_props() anyway. It's not a callback for the filter itself but for its
inputs or outputs (they're called "pads" - AVFilterPad - in libavfilter's
lexicon).

Inside the input config_props(), you are at a point where you know which pixel
format has been picked after query_formats(), and more information such as the
video width and height (inlink->{w,h}). So if you need to update your internal
context state depending on your input you can do it here. In edgedetect you can
see that this callback is used to allocate buffers depending on this
information. They will be destroyed in uninit().

Inside the output config_props(), you can define what you want to change in the
output. Typically, if your filter is going to double the size of the video, you
will update outlink->w and outlink->h.
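
The division of labour between the input config_props(), the output
config_props() and uninit() can be sketched as follows. This is a standalone
illustration, not FFmpeg code: MiniLink, MiniContext, tmpbuf and the three
functions are hypothetical stand-ins for AVFilterLink, the private context and
the real callbacks.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for AVFilterLink and the private context. */
typedef struct MiniLink { int w, h; } MiniLink;
typedef struct MiniContext { uint8_t *tmpbuf; } MiniContext;

/* Input side: the dimensions are known now, so size-dependent buffers can
 * be allocated. Returns 0 or a negative errno-style code. */
static int mini_config_input(MiniContext *s, const MiniLink *inlink)
{
    free(s->tmpbuf); /* free(NULL) is a no-op, so a repeated call is safe */
    s->tmpbuf = calloc((size_t)inlink->w * inlink->h, 1);
    return s->tmpbuf ? 0 : -ENOMEM;
}

/* Output side: announce what differs from the input, here twice the size. */
static void mini_config_output(MiniLink *outlink, const MiniLink *inlink)
{
    outlink->w = inlink->w * 2;
    outlink->h = inlink->h * 2;
}

/* Counterpart of uninit(): release what the input callback allocated. */
static void mini_uninit(MiniContext *s)
{
    free(s->tmpbuf);
    s->tmpbuf = NULL;
}
```

The context starts out zeroed, so mini_uninit() is safe to call even if the
input callback never ran or failed.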

filter_frame()
~~~~~~~~~~~~~~

This is the callback you have been waiting for from the beginning: it is where
you process the received frames. Along with the frame, you get the input link
the frame comes from.

    static int filter_frame(AVFilterLink *inlink, AVFrame *in) { ... }

You can get the filter context through that input link:

    AVFilterContext *ctx = inlink->dst;

Then access your internal state context:

    FoobarContext *foobar = ctx->priv;

And also the output link where you will send your frame when you are done:

    AVFilterLink *outlink = ctx->outputs[0];

Here, we are picking the first output. You can have several, but in our case we
only have one since we are in a 1:1 input-output situation.

If you want to define a simple pass-through filter, you can just do:

    return ff_filter_frame(outlink, in);

But of course, you probably want to change the data of that frame.

This can be done by accessing frame->data[] and frame->linesize[].  Important
note here: the width does NOT match the linesize. The linesize is always
greater than or equal to the width. The padding created should not be changed or
even read. Typically, keep in mind that a previous filter in your chain might
have altered the frame dimension but not the linesize. Imagine a crop filter
that halves the video size: the linesizes won't be changed, just the width.

    <-------------- linesize ------------------------>
    +-------------------------------+----------------+ ^
    |                               |                | |
    |                               |                | |
    |           picture             |    padding     | | height
    |                               |                | |
    |                               |                | |
    +-------------------------------+----------------+ v
    <----------- width ------------->
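
In code, the layout drawn above translates into "step from row to row by
linesize, but only touch width bytes per row". Here is a standalone sketch
for a single 8-bit plane; mini_invert_plane() is a hypothetical helper, and
the inversion is just a placeholder for whatever your filter does.

```c
#include <stddef.h>
#include <stdint.h>

/* Inverts an 8-bit single-plane picture in place. "linesize" is the distance
 * in bytes between the starts of two consecutive rows; only the first
 * "width" bytes of each row are picture data, the rest is padding and is
 * neither read nor written. */
static void mini_invert_plane(uint8_t *data, ptrdiff_t linesize,
                              int width, int height)
{
    for (int y = 0; y < height; y++) {
        uint8_t *row = data + y * linesize;
        for (int x = 0; x < width; x++)
            row[x] = 255 - row[x];
    }
}
```

Called on a plane whose linesize is larger than its width, the padding bytes
at the end of each row come out untouched.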

Before modifying the "in" frame, you have to make sure it is writable, or get a
new one. Multiple scenarios are possible here depending on the kind of
processing you are doing.

Let's say you want to change one pixel depending on multiple pixels (typically
the surrounding ones) of the input. In that case, you can't process the input
in place, so you will need to allocate a new frame, with the same
properties as the input one, and send that new frame to the next filter:

    AVFrame *out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
    if (!out) {
        av_frame_free(&in);
        return AVERROR(ENOMEM);
    }
    av_frame_copy_props(out, in);

    // out->data[...] = foobar(in->data[...])

    av_frame_free(&in);
    return ff_filter_frame(outlink, out);
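
To see why neighbour-dependent processing needs a separate output, consider
this standalone sketch of a horizontal 3-tap box blur on one 8-bit plane.
mini_blur_plane() is a hypothetical stand-in for the
"out->data[...] = foobar(in->data[...])" step; the blur itself is only an
example of a filter that reads surrounding pixels.

```c
#include <stddef.h>
#include <stdint.h>

/* Horizontal 3-tap box blur on an 8-bit plane: each output pixel is the
 * average of the input pixel and its left and right neighbours (edges are
 * clamped). Reading from src and writing to a separate dst keeps every
 * neighbour at its original value while the row is being processed. */
static void mini_blur_plane(uint8_t *dst, ptrdiff_t dst_linesize,
                            const uint8_t *src, ptrdiff_t src_linesize,
                            int width, int height)
{
    for (int y = 0; y < height; y++) {
        const uint8_t *s = src + y * src_linesize;
        uint8_t       *d = dst + y * dst_linesize;
        for (int x = 0; x < width; x++) {
            int l = s[x > 0 ? x - 1 : 0];
            int r = s[x < width - 1 ? x + 1 : width - 1];
            d[x] = (l + s[x] + r) / 3;
        }
    }
}
```

If dst and src were the same buffer, the pixel at x would be computed from a
left neighbour that has already been overwritten, and the result would no
longer be the blur of the original picture.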

In-place processing
~~~~~~~~~~~~~~~~~~~

If you can just alter the input frame, you probably just want to do that
instead:

    av_frame_make_writable(in);
    // in->data[...] = foobar(in->data[...])
    return ff_filter_frame(outlink, in);

Note that av_frame_make_writable() returns a negative error code on failure;
real code should check it and free the frame before returning the error.

You may wonder why a frame might not be writable. The answer is that, for
example, a previous filter might still own the frame data: imagine a filter
prior to yours in the filtergraph that needs to cache the frame. You must not
alter that frame, otherwise it will make that previous filter buggy. This is
where av_frame_make_writable() helps (it won't have any effect if the frame
is already writable).
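
The ownership idea can be modelled with a small reference-counted buffer.
The sketch below is a standalone simplification, not the libavutil
implementation: MiniBuf, MiniFrame and all mini_* functions are hypothetical,
there is a single buffer per frame, and the counter is not thread-safe. It
only illustrates the rule "writable means nobody else holds a reference".

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted buffer: several holders may point at it. */
typedef struct MiniBuf {
    int     refcount;
    size_t  size;
    uint8_t data[];
} MiniBuf;

/* A "frame" here is just one reference to a buffer. */
typedef struct MiniFrame { MiniBuf *buf; } MiniFrame;

static MiniBuf *mini_buf_alloc(size_t size)
{
    MiniBuf *b = calloc(1, sizeof(*b) + size);
    if (b) {
        b->refcount = 1;
        b->size     = size;
    }
    return b;
}

/* Take an extra reference, as a caching filter would. */
static MiniFrame mini_frame_ref(MiniFrame *f)
{
    f->buf->refcount++;
    return *f;
}

/* Drop one reference; the buffer is freed with the last one. */
static void mini_frame_unref(MiniFrame *f)
{
    if (f->buf && --f->buf->refcount == 0)
        free(f->buf);
    f->buf = NULL;
}

/* Writable means nobody else holds a reference. */
static int mini_frame_is_writable(const MiniFrame *f)
{
    return f->buf->refcount == 1;
}

/* No-op when already writable; otherwise copy the data into a private
 * buffer and drop the shared reference. Returns 0, or -1 on allocation
 * failure. */
static int mini_frame_make_writable(MiniFrame *f)
{
    MiniBuf *copy;

    if (mini_frame_is_writable(f))
        return 0;
    copy = mini_buf_alloc(f->buf->size);
    if (!copy)
        return -1;
    memcpy(copy->data, f->buf->data, f->buf->size);
    f->buf->refcount--;
    f->buf = copy;
    return 0;
}
```

After mini_frame_make_writable(), the caching holder still sees the original
data, and the caller is free to scribble over its private copy. The memcpy()
in that function is the cost of making a shared frame writable.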

The problem with using av_frame_make_writable() is that in the worst case it
will copy the whole input frame before you change it all over again with your
filter: if the frame is not writable, av_frame_make_writable() will allocate
new buffers, and copy the input frame data. You don't want that, and you can
avoid it by just allocating a new buffer if necessary, and processing from in
to out in your filter, saving the memcpy. Generally, this is done following
this scheme:

    int direct = 0;
    AVFrame *out;

    if (av_frame_is_writable(in)) {
        direct = 1;
        out = in;
    } else {
        out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
        if (!out) {
            av_frame_free(&in);
            return AVERROR(ENOMEM);
        }
        av_frame_copy_props(out, in);
    }

    // out->data[...] = foobar(in->data[...])

    if (!direct)
        av_frame_free(&in);
    return ff_filter_frame(outlink, out);

Of course, this will only work if you can do in-place processing. To test
whether your filter handles the permissions well, you can use the perms filter.
For example with:

    -vf perms=random,foobar

Make sure no automatic pixel conversion is inserted between perms and foobar,
otherwise the frame permissions might change again and the test will be
meaningless: add av_log(0,0,"direct=%d\n",direct) in your code to check that.
You can avoid the issue with something like:

    -vf format=rgb24,perms=random,foobar

...assuming your filter accepts rgb24 of course. This will make sure the
necessary conversion is inserted before the perms filter.

Timeline
~~~~~~~~

Adding timeline support
(http://ffmpeg.org/ffmpeg-filters.html#Timeline-editing) is often easy.
In the simplest case, you just have to add
AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC to the AVFilter.flags. You can typically
do this when your filter does not need to save the previous context frames, or
basically if your filter just alters whatever goes in and doesn't need
previous/future information. See for instance commit 86cb986ce that adds
timeline support to the fieldorder filter.

In some cases, you might need to reset your context somehow. This is handled by
the AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL flag, which is used if the filter
must not process the frames but still wants to keep track of the frames going
through (to keep them in cache for when it's enabled again). See for example
commit 69d72140a that adds timeline support to the phase filter.

Threading
~~~~~~~~~

libavfilter does not yet support frame threading, but you can add slice
threading to your filters.

Let's say the foobar filter has the following frame processing function:

    dst = out->data[0];
    src = in ->data[0];

    for (y = 0; y < inlink->h; y++) {
        for (x = 0; x < inlink->w; x++)
            dst[x] = foobar(src[x]);
        dst += out->linesize[0];
        src += in ->linesize[0];
    }

The first thing to do is to make this function work on slices. The new code
will look like this:

    for (y = slice_start; y < slice_end; y++) {
        for (x = 0; x < inlink->w; x++)
            dst[x] = foobar(src[x]);
        dst += out->linesize[0];
        src += in ->linesize[0];
    }

The source and destination pointers, and slice_start/slice_end will be defined
according to the number of jobs. Generally, it looks like this:

    const int slice_start = (in->height *  jobnr   ) / nb_jobs;
    const int slice_end   = (in->height * (jobnr+1)) / nb_jobs;
    uint8_t       *dst = out->data[0] + slice_start * out->linesize[0];
    const uint8_t *src =  in->data[0] + slice_start *  in->linesize[0];

This new code will be isolated in a new filter_slice():

    static int filter_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs) { ... }

Note that we need our input and output frame to define slice_{start,end} and
dst/src, which are not available in that callback. They will be transmitted
through the opaque void *arg. You have to define a structure which contains
everything you need:

    typedef struct ThreadData {
        AVFrame *in, *out;
    } ThreadData;

If you need some more information from your local context, put it here.

In your filter_slice function, you access it like this:

    const ThreadData *td = arg;

Then in your filter_frame() callback, you need to call the threading
distributor with something like this:

    ThreadData td;

    // ...

    td.in  = in;
    td.out = out;
    ctx->internal->execute(ctx, filter_slice, &td, NULL, FFMIN(outlink->h, ff_filter_get_nb_threads(ctx)));

    // ...

    return ff_filter_frame(outlink, out);

The last step is to add the AVFILTER_FLAG_SLICE_THREADS flag to AVFilter.flags.

For more examples of slice threading additions, you can try to run git log -p
--grep 'slice threading' libavfilter/
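
The pieces above can be put together in a standalone sketch that compiles
without FFmpeg. The assumptions are as follows: MiniPlane is a hypothetical
stand-in for the AVFrame fields used here, filter_slice() drops the
AVFilterContext argument, the pixel inversion is a placeholder for foobar(),
and mini_execute() is a sequential stand-in for the threading distributor,
which would normally hand each job to a worker thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plane descriptor standing in for the parts of AVFrame used
 * here. */
typedef struct MiniPlane {
    uint8_t  *data;
    ptrdiff_t linesize;
    int       width, height;
} MiniPlane;

typedef struct ThreadData {
    const MiniPlane *in;
    MiniPlane       *out;
} ThreadData;

/* One job: process rows [slice_start, slice_end) only. */
static int filter_slice(void *arg, int jobnr, int nb_jobs)
{
    const ThreadData *td = arg;
    const MiniPlane *in  = td->in;
    MiniPlane       *out = td->out;
    const int slice_start = (in->height *  jobnr     ) / nb_jobs;
    const int slice_end   = (in->height * (jobnr + 1)) / nb_jobs;
    uint8_t       *dst = out->data + slice_start * out->linesize;
    const uint8_t *src =  in->data + slice_start *  in->linesize;

    for (int y = slice_start; y < slice_end; y++) {
        for (int x = 0; x < in->width; x++)
            dst[x] = 255 - src[x]; /* placeholder for foobar() */
        dst += out->linesize;
        src += in->linesize;
    }
    return 0;
}

/* Sequential stand-in for the threading distributor: runs every job in
 * turn on the calling thread. */
static void mini_execute(int (*func)(void *, int, int), void *arg, int nb_jobs)
{
    for (int jobnr = 0; jobnr < nb_jobs; jobnr++)
        func(arg, jobnr, nb_jobs);
}
```

Because the end of one slice is computed with the same formula as the start of
the next, the slices cover every row exactly once even when the height is not
a multiple of the number of jobs. Each job writes to its own rows only, so the
jobs need no locking between them.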

Finalization
~~~~~~~~~~~~

When your awesome filter is finished, you have a few more steps before you're
done:

 - write its documentation in doc/filters.texi, and test the output with make
   doc/ffmpeg-filters.html.
 - add a FATE test, generally by adding an entry in
   tests/fate/filter-video.mak, and running make fate-filter-foobar GEN=1 to
   generate the data.
 - add an entry in the Changelog
 - edit libavfilter/version.h and increase LIBAVFILTER_VERSION_MINOR by one
   (and reset LIBAVFILTER_VERSION_MICRO to 100)
 - git add ... && git commit -m "avfilter: add foobar filter." && git format-patch -1

When all of this is done, you can submit your patch to the ffmpeg-devel
mailing-list for review.  If you need any help, feel free to come on our IRC
channel, #ffmpeg-devel on irc.freenode.net.