This commit reorders the coding tools such that they're doing what
the decoder does in reverse order. The very first thing the decoder
does is to decode M/S stereo if that's signalled, then prediction,
IS, and finally TNS and PNS in another function.
adjust_frame_information()'s application of IS and M/S was taken
out into two separate functions since prediction doesn't expect
to get the raw coefficients but rathe the coefficients at that
part of the encoding process.
The results show a much better PSNR when any combination of
Intensity Stereo, Mid/Side stereo and Prediction is used, which
is a sign of an increased encoder efficiency as well as the fact
that the decoder gets what it expects.
Otherwise, with only IS, PNS or prediction there are neither
regressions nor improvements except in the case of IS, which
now by itself (or with PNS) is less prone to artifacts. Enabling
M/S (using stereo_mode) as well will also reduce stereo artifacts
induced by IS, so in the very near future M/S may be enabled
by default.
Andrew Stone [Tue, 1 Sep 2015 00:28:42 +0000 (20:28 -0400)]
rtmp: support the AMF_DATE tag
Instead of returning EINVAL, which can cause a stream to fail to load, this
allows the tag to be passed through to the flv demuxer, where it's summarily
ignored.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
aacenc: disable bandtype modifying extensions when coder != twoloop
If the selected coder isn't twoloop, this commit temporarily
disables IS and PNS.
The problem is in the encode_window_bands_info() being confused
and setting invalid band_types for non-marked (normal) bands.
IS and PNS increase quality a ton so as a result the PSNR changed.
Disable the extensions and keep the tests separate such that there
will be no red herrings if one test fails.
Since the changes made a few week ago (which were done more than a
month ago) the quality and stability of intensity stereo has been
notably good. There were some requests and wishes to have in on by
default and therefore it has been enabled. Should any regressions
arise changes will be made to preferably keep it operating rather
than just disabling it by default again.
aacenc: Enable Perceptual Noise Substitution by default
It has been in the current encoder in its current implementation
for quite some time now, so enable it by default. Will increase
quality at all bitrates, especially at low ones.
aacenc_tns: rework coefficient quantization and filter application
This commit reworks the TNS implementation to a hybrid between what
the specifications say, what the decoder does and what's the best
thing to do.
The filter application function was copied from the decoder and
modified such that it applies the inverse AR filter to the
coefficients. The LPC coefficients themselves are fed into the
same quantization expression that the specifications say should
be used however further processing is not done, instead they're
converted to the form that the decoder expects them to be in
and are sent off to the compute_lpc_coeffs function exactly the
way the decoder does. This function does all conversions and will
return the exact coefficients that the decoder will generate, which
are then applied to the coefficients.
Having the exact same coefficients on both the encoder and decoder
is a must since otherwise the entire sfb's over which the filter
is applied will be attenuated.
Despite this major rework, TNS might not work fine on some audio
types at very low bitrates (e.g. sub 90kbps) as it can attenuate
some coefficients too much. Users are advised to experiment with
TNS at higher bitrates if they wish to use this tool or simply
wait for the implementation to be improved.
aacenc: allocate a larger buffer for the TNS LPC context
Turns out autocorrelating more than 750 coefficients at once
will cause a segfault, despite there being enough space to
hold an entire frame of samples into the buffer.
This commit adds a function to get the reflection coefficients on
floating point samples. It's functionally identical to
ff_lpc_calc_ref_coefs() except it works on float samples and will
return the global prediction gain. The Welch window implementation
which is more optimized works only on int32_t samples so a slower
generic expression was used.
Fixes out of array access Fixes: 87196d8bbc633629fc9dd851fce73e70/asan_heap-oob_26f6853_862_cov_585961513_sonic3dblast_intro-partial.avi Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
swscale/utils: Split scaling if possible and yuv->yuv with different matrixes is requested
This uses a RGB intermediate, a more optimal solution would be to perform the rematrixing
directly in subsampled YUV, this is quite a bit more complicated though
Fixes Ticket4805
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This was copied from the decoder, but is unneeded for the encoder.
tns_max_bands is unused and set to zero which zeroed out start, end
and size and thus no filter was actually applied.
Since the coefficients are stepped up to order + 1 it was possible
that it went over TNS_MAX_ORDER. Also just return in case the only
coefficient is less than the threshold.
aacenc_tns: rework the way coefficients are calculated
This commit abandons the way the specifications state to
quantize the coefficients, makes use of the new LPC float
functions and is much better.
The original way of converting non-normalized float samples
to int32_t which out LPC system expects was wrong and it was
wrong to assume the coefficients that are generated are also
valid. It was essentially a full garbage-in, garbage-out
system and it definitely shows when looking at spectrals
and listening. The high frequencies were very overattenuated.
The new LPC function performs the analysis directly.
The specifications state to quantize the coefficients into
four bit index values using an asin() function which of course
had to have ugly ternary operators because the function turns
negative if the coefficients are negative which when encoding
causes invalid bitstream to get generated.
This deviates from this by using the direct TNS tables, which
are fairly small since you only have 4 bits at most for index
values. The LPC values are directly quantized against the tables
and are then used to perform filtering after the requantization,
which simply fetches the array values.
The end result is that TNS works much better now and doesn't
attenuate anything but the actual signal, e.g. TNS removes
quantization errors and does it's job correctly now.
It might be enabled by default soon since it doesn't hurt and
helps reduce nastyness at low bitrates.
This commit removes the array which was made redundant with
the last commit. The current prediction system gets the
quantization error directly (and without the single-frame delay)
in the search_for_pred function.
This commit completely alters the algorithm of prediction.
The original commit which introduced prediction was completely
incorrect to even remotely care about what the actual coefficients
contain or whether any options were enabled. Not my actual fault.
This commit treats prediction the way the decoder does and expects
to do: like lossy encryption. Everything related to prediction now
happens at the very end but just before quantization and encoding
of coefficients. On the decoder side, prediction happens before
anything has had a chance to even access the coefficients.
Also the original implementation had problems because it actually
touched the band_type of special bands which already had their
scalefactor indices marked and it's a wonder the asserion wasn't
triggered when transmitting those.
Overall, this now drastically increases audio quality and you should
think about enabling it if you don't plan on playing anything encoded
on really old low power ultra-embedded devices since they might not
support decoding of prediction or AAC-Main. Though the specifications
were written ages ago and as times change so do the FLOPS.
lpc: create a simplified Levinson-Durbin LPC handling float samples
This commit simply duplicates the functionality of ff_lpc_calc_coefs()
for the case of a Levinson-Durbin LPC with the only difference being
that floating point samples are accepted and the resulting coefficients
are raw and unquantized.
The motivation behind doing this is the fact that the AAC encoder
requires LPC in TNS and LTP and converting non-normalized floating
point coefficients to int32_t using SWR and again back for the LPC
coefficients was very impractical.
The current LPC interfaces were designed for int32_t in mind possibly
because FLAC and ALAC use this type for most internal operations.
The mathematics in case of floats remains of course identical.
aac: move the TNS tables from aacdectab to the shared aactab
This commit simply moves the TNS tables to a more appropriate
aactab.h since then they can be accessed by both the decoder
and encoder.
The encoder _shouldn't_ normally need the tables since the
specs describe a specific quantization process, but the exact
reason for this can be seen in the TNS commit following.
Philip Langdale [Sat, 8 Aug 2015 21:59:09 +0000 (14:59 -0700)]
avcodec/vc1dec: Re-order init to avoid initting hwaccel too early
At least for vdpau, the hwaccel init code tries to check the video
profile and ensure that there is a matching vdpau profile available.
If it can't find a match, it will fail to initialise.
In the case of wmv3/vc1, I observed initialisation to fail all the
time. It turns out that this is due to the hwaccel being initialised
very early in the codec init, before the profile has been extracted
and set.
Conceptually, it's a simple fix to reorder the init code, but it gets
messy really fast because ff_get_format(), which is what implicitly
trigger hwaccel init, is called multiple times through various shared
init calls from h263, etc. It's incredibly hard to prove to my own
satisfaction that it's safe to move the vc1 specific init code
ahead of this generic code, but all the vc1 fate tests pass, and I've
visually inspected a couple of samples and things seem correct.
Signed-off-by: Philip Langdale <philipl@overt.org>
Ronald S. Bultje [Mon, 17 Aug 2015 16:25:39 +0000 (12:25 -0400)]
Put remaining pieces of CODEC_FLAG_EMU_EDGE under FF_API_EMU_EDGE.
The amv one probably looks suspicious, but since it's an intra-only
codec, I couldn't possibly imagine what it would use the edge for,
and the vsynth fate result doesn't change, so it's probably OK.
Donny Yang [Wed, 19 Aug 2015 06:41:23 +0000 (06:41 +0000)]
apng: Support inter-frame compression
The current algorithm is just "try all the combinations, and pick the best".
It's not very fast either, probably due to a lot of copying, but will do for
an initial implementation.
Signed-off-by: Donny Yang <work@kota.moe> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
lummax [Thu, 27 Aug 2015 10:40:30 +0000 (12:40 +0200)]
avcodec: Assert on codec->encode2 in encode_audio2
Assert on `avctx->codec->encode2` to avoid a SEGFAULT on the subsequent
function call.
avcodec_encode_video2() uses a similar assertion.
Calling the wrong function on a stream is a serious inconsistency
which could at other places be potentially dangerous and exploitable,
it is thus safer to stop execution and not continue with such
inconsistency after returning an error.
Commit-message-extended-by commiter Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>