avcodec/aacdec, sinewin: Move 120 and 960 point sine tables to aacdec
The floating point AAC decoder is the only user of these tables, so it
makes sense to move them there. Furthermore, initializing the ordinary
power-of-two sinetables is currently not thread-safe and if the 120- and
960-point sinetables were not moved, one would have to choose whether
to guard initializing these two tables with their own AVOnces or not.
Doing so would add unnecessary AVOnces as the AAC decoder already guards
initializing its static data by an AVOnce; not doing so would be fragile
if a second user of these tables were to be added.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpegaudio_tablegen: Make exponential LUT shared
Both the fixed as well as the floating point mpegaudio decoders use
LUTs of type int8_t and uint32_t with 32K entries each; these tables
are completely the same, yet they are not shared. This commit makes
them shared. When both fixed as well as floating point decoders are
enabled, this saves 160KiB from the bss segment for a normal build
(translating into 160KiB less memory usage if both a fixed as well as
a floating point decoder have actually been used) and 160KiB from the
binary for a build with hardcoded tables.
It also means that the code to create said LUTs is no longer duplicated
(for a normal build).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpegaudiodec: Hardcode tables to save space
The csa_tables (which always consist of 32 entries of four bytes each,
but the type depends upon whether the decoder is fixed or
floating-point) are currently initialized once during decoder
initialization; yet it turns out that this is actually no benefit: The
code used to initialize these tables takes up 153 (fixed point) and 122
(floating point) bytes when compiled with GCC 9.3 with -O3 on x64, so it
is better to just hardcode these tables.
Essentially the same applies to the is_tables: They have a size of 128B
each and the code to initialize them occupies 149 (fixed point) resp.
140 (floating point) bytes. So hardcode them, too.
To make the origin of the tables clear, references to the code used to
create them have been added.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpegaudio_tablegen: Don't inappropriately use static array
Each invocation of this function is only entered once, so using a static
array makes no sense (and given that the whole array is reinitialized at
the beginning of this function, it wouldn't even make sense if the
function were called multiple times).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The mpegaudio_tablegen header contains code to initialize several
tables; it is included in both the fixed as well as the floating point
mpegaudio decoders and some of these tables are only used by the fixed
resp. floating point decoders; yet both types are always initialized,
leaving the compiler to figure out that one of them is unused.
GCC 9.3 fails at this (even with -O3):
$ readelf -s mpegaudiodec_fixed.o|grep _float
28: 0000000000001660 32768 OBJECT LOCAL DEFAULT 4 expval_table_float
An actually unused table (expval_table_fixed/float) of size 32KiB is kept
and initialized (the reason for this is probably that this table is read
from, namely to initialize another table: exp_table_fixed/float; of course
the float resp. fixed tables are not used in the fixed resp. floating point
decoder).
Therefore #ifdef the unneeded tables away.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpegaudiodec: Combine tables used to initialize VLCs
Up until now, there were several individual tables which were accessed
via pointers to them; by combining the tables, one can avoid said
pointers, saving space.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpegaudiodec: Reduce the size of tables used to initialize VLCs
By switching from ff_init_vlc_sparse() to ff_init_vlc_from_lengths() one
can replace tables of codes of type uint16_t by tables of symbols of
type uint8_t; this saves about 1.3KB for both the fixed and floating
point decoders (if enabled).
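
To illustrate why a codes table becomes redundant, here is a minimal
sketch (not FFmpeg's actual implementation) of how codes can be derived
from the lengths alone once the entries are ordered from left to right
in the tree. The name codes_from_lengths() and its return convention are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Derive prefix codes from code lengths that are ordered from left to
 * right in the tree. Returns 0 on success and -1 if the lengths do not
 * describe a valid tree in this order. */
static int codes_from_lengths(const uint8_t *lens, int nb_codes,
                              uint32_t *codes)
{
    uint64_t code = 0; /* next free code, left-aligned in 32 bits */

    for (int i = 0; i < nb_codes; i++) {
        int len = lens[i];
        assert(len >= 1 && len <= 32);

        if (code >> 32)                              /* tree is already full */
            return -1;
        if (code & ((UINT64_C(1) << (32 - len)) - 1)) /* not a node of depth len */
            return -1;

        codes[i] = (uint32_t)(code >> (32 - len));
        code    += UINT64_C(1) << (32 - len);
    }
    return 0;
}
```

For the lengths 1, 2, 3, 3 this yields the codes 0, 10, 110, 111 (in
binary), so only lengths and symbols need to be stored.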
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mv30: Reduce the size of tables used to initialize VLCs
By switching from ff_init_vlc_sparse() to ff_init_vlc_from_lengths() one
can remove the array of codes of type uint16_t here; given that the
symbols are the default ones (0,1,2,...), no explicit symbols table
needs to be added.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Expressions like array[get_vlc2()] can be optimized by using a symbols
table if the array is always the same for a given VLC. This requirement
is fulfilled for several VLCs used by the AAC decoders.
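
A minimal sketch of the idea, with hypothetical names and made-up table
contents: the array that used to be indexed by the VLC's result is
folded into the symbols table once at initialization, so that reading
the VLC yields the final value directly.

```c
#include <assert.h>
#include <stdint.h>

#define NB_CODES 4

/* Hypothetical mapping that was applied after every VLC read. */
static const int8_t value_of_index[NB_CODES] = { 0, -1, 1, 2 };

/* Before: the VLC returns a plain index; every read pays for a lookup. */
static int read_value_before(int vlc_index)
{
    assert(vlc_index >= 0 && vlc_index < NB_CODES);
    return value_of_index[vlc_index];
}

/* After: the mapping is stored as the VLC's symbols once at init time,
 * so the value returned by the VLC needs no further lookup. */
static void build_symbols(int16_t symbols[NB_CODES])
{
    for (int i = 0; i < NB_CODES; i++)
        symbols[i] = value_of_index[i];
}
```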
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/qdmc: Make tables used to initialize VLCs smaller
This is achieved by switching to ff_init_vlc_from_lengths(), which allows
replacing tables of codes of type uint16_t or uint32_t with tables of
symbols of type uint8_t; where symbols tables already existed, the
savings are even bigger.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
MagicYUV transmits its Huffman trees by providing the length of the code
corresponding to each symbol; then the decoder has to assemble the table
in such a way that (i) longer codes are to the left of the tree and (ii)
for codes of the same length the symbols are ascending from left to right.
Up until now the decoder did this as follows: It counted the number of
codes of each length and derived the first code of a given length via
(ii). Then the array of lengths is traversed a second time to create
the codes; there is one running counter for each length to do so. This
process creates a default symbol table (that is omitted).
This commit changes this as follows: Everything is indexed by the
position in the tree (with codes to the left first); given (i), we can
calculate the ranges occupied by the codes of each length; and with (ii)
we can derive the actual symbols of each code; the running counters for
each length are now used for the symbols and not for the codes.
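
The new scheme can be sketched as follows. This is an illustration under
the stated properties (i) and (ii), not the decoder's actual code; the
function name, the limit MAX_LEN and the assumption that every symbol
has a length in 1..MAX_LEN are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEN 32

/* Produce (length, symbol) pairs indexed by position in the tree:
 * longer codes come first (i) and symbols ascend within a length (ii). */
static void order_by_tree_position(const uint8_t *len_of_sym, int nb_syms,
                                   uint8_t *out_len, uint16_t *out_sym)
{
    int pos_of_len[MAX_LEN + 1] = { 0 };
    int pos = 0;

    /* First pass: count the codes of each length. */
    for (int sym = 0; sym < nb_syms; sym++) {
        assert(len_of_sym[sym] >= 1 && len_of_sym[sym] <= MAX_LEN);
        pos_of_len[len_of_sym[sym]]++;
    }
    /* Turn the counts into the first position of each length's range. */
    for (int len = MAX_LEN; len >= 1; len--) {
        int count = pos_of_len[len];
        pos_of_len[len] = pos;
        pos += count;
    }
    /* Second pass: one running counter per length hands out positions. */
    for (int sym = 0; sym < nb_syms; sym++) {
        int len = len_of_sym[sym];
        int p   = pos_of_len[len]++;
        out_len[p] = (uint8_t)len;
        out_sym[p] = (uint16_t)sym;
    }
}
```

The resulting arrays are already sorted by tree position, so no codes
have to be computed or sorted by the caller.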
Doing so allows us to switch to ff_init_vlc_from_lengths(); this has the
advantage that the codes table needs only be traversed once and that the
codes need not be sorted any more (right now, the codes that are so long
that they will be put into subtables need to be sorted so that codes
that end up in the same subtable are contiguous).
For a sample produced by our encoder (natural content, 4000 frames,
YUV420p, ten iterations, GCC 9.3) this decreased the amount of
decicycles for each call to build_huffman() from 1336049 to 1309401.
Notice that our encoder restricts the code lengths to 12 and our decoder
only uses subtables when the code is longer than 12 bits, so the sorting
that can be avoided does not happen at the moment. If one reduces the
decoder's tables to nine bits, the performance improvement becomes more
apparent: The amount of decicycles for build_huffman() decreased from
1165210 to 654055.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/utvideodec: Avoid implicit qsort when creating Huffman tables
The Huffman trees used by Ut Video have two important characteristics:
(i) Longer codes are on the left of the tree and (ii) for codes of the
same length, the symbol is descending from left to right in the tree.
Therefore all the information that needs to be transmitted is how long
the code corresponding to a given symbol is; and this is also all that
is transmitted.
Before 341914495e5c2f60bc920b0c6f660e5948a47f5a, the decoder used qsort
to sort the (length, symbol) pairs by ascending length and for equal
lengths by ascending symbol. Since said commit, the decoder uses
a first pass over the lengths table to count how many symbols of each
length there are; with (i) one can then easily calculate the code of
the left-most code with a given length in the tree and from there one
can calculate the codes for all entries, using one running counter for
each possible length. This eliminated the explicit qsort in
build_huff().
Yet ff_init_vlc_sparse() sorts the table itself as it has to ensure that
all the entries that will be placed in the same subtable are contiguous.
The tables created now are non-contiguous (they are ordered by symbol
and codes of different length aren't ordered at all; only codes of the
same length are ordered according to (ii)).
This commit therefore modifies the algorithm used to automatically create
tables whose codes are sorted from left to right in the tree. The key to
do so is the observation that the counts obtained in the first pass can
be used to determine the range of the codes of each length in the second
pass: If counts[i] is the count of codes with length i, then the first
counts[32] codes are of length 32, the next counts[31] codes are of
length 31 etc. So one knows the index of the lowest symbol whose code
has length 32 (if any): It is counts[32] - 1 due to (ii), whereas the
index of the lowest symbol whose code has length 31 (if any) is
counts[32] + counts[31] - 1; the index of the second-to-lowest symbol of
length 32 (if existing) is counts[32] - 2 etc.
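
The index computation described above can be sketched like this. It is
an illustration only; the function name is hypothetical and it assumes
at most 256 symbols, each with a length in 1..32.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEN 32

/* Order (length, symbol) pairs from left to right in the tree: longer
 * codes first (i), symbols descending within the same length (ii). */
static void order_descending(const uint8_t *len_of_sym, int nb_syms,
                             uint8_t *out_len, uint8_t *out_sym)
{
    int end_of_len[MAX_LEN + 1] = { 0 };
    int end = 0;

    assert(nb_syms >= 0 && nb_syms <= 256);

    /* First pass: counts[len]. */
    for (int sym = 0; sym < nb_syms; sym++) {
        assert(len_of_sym[sym] >= 1 && len_of_sym[sym] <= MAX_LEN);
        end_of_len[len_of_sym[sym]]++;
    }
    /* end_of_len[len] becomes one past the last index of that length:
     * counts[32], then counts[32] + counts[31], and so on. */
    for (int len = MAX_LEN; len >= 1; len--) {
        end += end_of_len[len];
        end_of_len[len] = end;
    }
    /* Second pass: the lowest symbol of a length gets the last index of
     * the range, the second-to-lowest the one before it, etc. */
    for (int sym = 0; sym < nb_syms; sym++) {
        int len = len_of_sym[sym];
        int p   = --end_of_len[len];
        out_len[p] = (uint8_t)len;
        out_sym[p] = (uint8_t)sym;
    }
}
```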
If one follows the algorithm outlined above, one can switch to
ff_init_vlc_from_lengths() which has no implicit qsort; it also means
that one can offload the computation of the codes.
This turned out to be beneficial for performance: For the sample from
ticket #4044 it decreased the decicycles spent on one call to
build_huff() from 508480 to 340688 (GCC 9.3, looping 10 times over the
file to get enough runs and then repeating this ten times); for another
sample (YUV420p, natural content, 5500 frames, also ten iterations)
the time went down from 382346 to 275533 decicycles.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Expressions like array[get_vlc2()] can be optimized by using a symbols
table if the array is always the same for a given VLC. This requirement
is fulfilled for several VLCs used by ATRAC3, therefore this commit
implements this. This comes without any additional costs when using
ff_init_vlc_from_lengths() as one can then remove the codes tables.
While at it, remove the arrays of pointers to the individual arrays and
put all lengths+symbol pairs in one big array.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/intrax8: Reduce the size of tables used to initialize VLCs
By switching from ff_init_vlc_sparse() to ff_init_vlc_from_lengths() one
can replace an array of codes of type uint16_t with an array of symbols
of type uint8_t, saving space.
Also remove some more code duplication while at it.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/vp3: Use symbols table for VP3 motion vectors
Expressions like array[get_vlc2()] can be optimized by using a symbols
table if the array is always the same for a given VLC. This requirement
is fulfilled for the VLC used for VP3 motion vectors. The reason it
hasn't been done before is probably that the array in this case
contained entries in the range -31..31; but this is no problem with
ff_init_vlc_from_lengths(): Just apply an offset of 31 to the symbols
before storing them in the table used to initialize VP3 motion vectors
and apply an offset of -31 when initializing the actual VLC.
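
A minimal sketch of the offset trick with hypothetical helper names: the
signed values are biased into the uint8_t range for storage and the bias
is removed once, when the symbols of the actual VLC are produced.

```c
#include <assert.h>
#include <stdint.h>

#define MV_BIAS 31

/* Store a motion vector component from -31..31 as an unsigned byte. */
static uint8_t store_symbol(int mv)
{
    assert(mv >= -MV_BIAS && mv <= MV_BIAS);
    return (uint8_t)(mv + MV_BIAS);
}

/* Apply the (negative) offset once while creating the final symbols. */
static void apply_offset(const uint8_t *stored, int nb, int offset,
                         int16_t *symbols)
{
    for (int i = 0; i < nb; i++)
        symbols[i] = (int16_t)(stored[i] + offset);
}
```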
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/vp3: Make tables used to initialize VLCs smaller
This is possible by switching to ff_init_vlc_from_lengths() because it
allows replacing codes of type uint16_t with symbols of type uint8_t; in
some cases (like here) it also allows replacing explicitly coded
codes with implicitly coded symbols.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
By switching to ff_init_vlc_from_lengths() one can apply both positive
as well as negative offsets for free; in this case it even saves space
because one replaces codes tables that don't fit into an uint8_t by
symbols tables that fit into an uint8_t or can even be completely
avoided as they are trivial.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/atrac9dec: Don't create VLCs that are never used
The ATRAC9 decoder creates VLCs with parameters contained in
HuffmanCodebooks; some of these HuffmanCodebooks are empty and yet
VLCs (that were completely unused*) were created from them. Said VLC
contained a single table with 512 VLC_TYPE[2] entries, each of which
indicated that this is an invalid code. This commit stops creating said
VLCs.
*: read_coeffs_coarse() uses the HuffmanCodebook
at9_huffman_coeffs[cb][prec][cbi]. prec is c->precision_coarse[i] + 1
and every precision_coarse entry is in the 1..15 range after
calc_precision(), so prec is >= 2 (all codebooks with prec < 2 are
empty). The remaining empty codebooks are those with cb == 1 and cbi ==
0, yet this is impossible, too: cb is given by c->codebookset[i] and
this is always 0 if i < 8 (because those are never set to anything else
in calc_codebook_idx()) and cbi is given by at9_q_unit_to_codebookidx[i]
which is never zero if i >= 8.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/atrac9dec: Make tables used to initialize VLCs smaller
The ATRAC9 decoder uses VLCs which are currently initialized with
static length tables of type uint8_t and code tables of type uint16_t.
Furthermore, in one case the actually desired symbols are in the range
-16..15 and in order to achieve this an ad-hoc symbols table of type
int16_t is calculated.
This commit modifies this process by replacing the codes tables by
symbols tables and switching to ff_init_vlc_from_lengths(); the signed
symbols are stored in the table after having been shifted by 16 to fit
into an uint8_t and are shifted back when the VLC is created. This makes
all symbols fit into an uint8_t, saving space. Furthermore, the earlier
tables had holes in them (entries with length zero that were inserted
because the actually used symbols were not contiguous); these holes are
unnecessary in the new approach, leading to further saving.
Finally, given that now both lengths as well as symbols are of the same
type, they can be combined; this saves a pointer for each VLC.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/atrac3plus: Run-length encode length tables to make them smaller
This is very beneficial for the scale factor tables where 4*64+4*15
bytes of length information can be replaced by eight codebooks of 12
bytes each; furthermore the number of codes as well as the maximum
length of a code can be easily derived from said codebooks, making
tables containing said information superfluous. This and combining the
symbols into one big array also made an array of pointers to the tables
redundant.
For the wordlen and code table tables the benefits are not that big
(given these tables don't contain that many elements), but all in all
using codebooks is also advantageous for them. Therefore it has been
done.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/atrac3plus: Combine codebooks into one array
ATRAC3+ uses VLCs whose code lengths are ascending from left to right in
the tree; ergo it is possible (and done) to run-length encode the
lengths into so-called codebooks. These codebooks were variable-sized:
The first byte contained the minimum length of a code, the second the
maximum length; this was followed by max - min + 1 bytes containing the
actual numbers. The minimal min was 1, the maximal max 12.
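
A sketch of how such a variable-sized codebook could be expanded into a
plain lengths table, assuming that the "actual numbers" are the counts
of codes of each length; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* cb[0] = minimum length, cb[1] = maximum length, followed by
 * max - min + 1 counts (assumed: number of codes of each length).
 * Writes the ascending lengths to lens and returns the number of codes. */
static int expand_codebook(const uint8_t *cb, uint8_t *lens, int lens_size)
{
    int min = cb[0], max = cb[1];
    int nb_codes = 0;

    assert(min >= 1 && min <= max && max <= 12);

    for (int len = min; len <= max; len++) {
        int count = cb[2 + len - min];
        assert(nb_codes + count <= lens_size);
        for (int j = 0; j < count; j++)
            lens[nb_codes++] = (uint8_t)len;
    }
    return nb_codes;
}
```

Both the number of codes (the return value) and the maximum length
(cb[1]) fall out of the codebook itself.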
While one saves a few bytes by only containing the range that is
actually used, this is more than offset by the fact that there needs
to be a pointer to each of these codebooks.
Furthermore, since 5f8de7b74147e2a347481d7bc900ebecba6f340f the content
of the Atrac3pSpecCodeTab structure (containing data for spectrum
decoding) can be cleanly separated into fields that are only used during
initialization and fields used during actual decoding: The pointers to
the codebooks and the field indicating whether an earlier codebook should
be reused constitute the former category. Therefore the new codebooks are
not placed into the Atrac3pSpecCodeTab (which is now unused during
init), but in an array of its own. The information whether an earlier
codebook should be reused is encoded in the first number of each
spectrum codebook: If it is negative, an earlier codebook (given by the
number) should be reused.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
This allows removing lots of pointers (130) to small symbol tables;
it has the downside that some of the default tables must now be coded
explicitly, but this costs only 6 + 4 + 8 + 16 + 8 bytes and is therefore
dwarfed by the gains.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/atrac3plus: Make tables used to initialize VLCs smaller
The ATRAC3+ decoder currently uses ff_init_vlc_sparse() to initialize
several VLCs; sometimes a symbols table is used, sometimes not; some of
the codes tables are uint16_t, some are uint8_t. Because of these two
latter facts it makes sense to switch to ff_init_vlc_from_lengths()
because it allows to remove the codes at the cost of adding symbols
tables of type uint8_t in the cases where there were none before.
Notice that sometimes the same codes and lengths tables were reused with
two different symbols tables; this could have been preserved (meaning
one could use a lengths table twice), but hasn't, because this allows
to use only one pointer to both the symbols and lengths instead of two
pointers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
ff_init_vlc_from_lengths() can be used to offload the computation
of the codes; it also allows to omit the check whether the codes
are already properly ordered (they are). In this case, this also allows
to avoid the allocation of the buffer for the codes.
This improves performance: The amount of decicycles for one call to
tm2_build_huff_tables() when decoding tm20.avi from the FATE-suite
decreased from 46239 to 40035. This test consisted of looping 50 times
over the file and iterating the test ten times.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpeg4video: Make tables used to initialize VLCs smaller
Switching from ff_init_vlc_sparse() to ff_init_vlc_from_lengths()
allows to replace codes which are so long that they need to be stored
in an uint16_t by symbols which fit into an uint8_t; and even these can
be avoided in case of the sprite trajectory VLC.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/indeo2: Make tables used to initialize VLCs smaller
Switching from ff_init_vlc_sparse() to ff_init_vlc_from_lengths()
allows to replace codes which are so long that they need to be stored
in an uint16_t by symbols which fit into an uint8_t; furthermore, it is
also easily possible to already incorporate the offset (the real range
of Indeo 2 symbols starts at one, not zero) into the symbols.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
ff_init_vlc_from_lengths() can be used to offload the computation
of the codes; it also allows to omit the check whether the codes
are already properly ordered (they are).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
ff_init_vlc_from_lengths() can be used to offload the computation
of the codes; it also needn't check whether the codes are already
properly ordered (they are).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The lengths of the codes used by the mss4 decoder are ascending from
left to right and therefore the lengths can be run-length encoded and
the codes can be easily derived from them. And this is how it is indeed
done. Yet some things can nevertheless be improved:
a) The number of entries of the current VLC is implicitly contained in
the run-length table and needn't be externally prescribed.
b) The maximum length of a code is just the length of the last code
(given that the lengths are ascending), so there is no point in setting
max_bits in the loop itself.
c) One can offload the actual calculation of the codes to
ff_init_vlc_from_lengths().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/rv40: Avoid code duplication when initializing VLCs
Besides removing code duplication the method for determining the offset
of each VLC table in the VLC_TYPE buffer also has the advantage of not
wasting space for skipped AIC mode 1 VLCs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/rv40: Make better use of VLC symbols table
RealVideo 4.0 has a VLC that encodes two intra types per code; each
intra type is in the range 0..8 (inclusive) and up until now the VLC
used symbols in the range 0..80; one type was encoded as the remainder
when dividing the symbol by 9 whereas the other type was encoded as
symbol / 9. This is suboptimal; a better way would be to use the high
and low nibble to encode each symbol. But an even better way is to use
16bit symbols so that the two intra types can be directly written as
a 16bit value.
This commit implements this; in order to avoid huge tables the symbols
are stored as uint8_t with high and low nibbles encoding one type each;
they are only unpacked to uint16_t during initialization.
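
A sketch of the packing, with hypothetical helper names. Which of the
two intra types ends up in the high byte of the 16bit symbol is an
assumption made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Static table entry: one intra type per nibble (each in 0..8). */
static uint8_t pack_types(int first, int second)
{
    assert(first  >= 0 && first  <= 8);
    assert(second >= 0 && second <= 8);
    return (uint8_t)(first << 4 | second);
}

/* Done once during initialization: widen the nibbles to one byte each
 * so that the decoder can use the 16bit symbol without / or %. */
static uint16_t unpack_symbol(uint8_t packed)
{
    return (uint16_t)((packed >> 4) << 8 | (packed & 0xF));
}
```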
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/rv40: Make the tables used to initialize VLCs smaller
After permuting the codes, symbols and lengths tables used to initialize
the VLC so that the codes are ordered from left to right in the Huffman
tree, the codes become redundant as they can be easily computed from the
lengths at runtime; in this case one has to use explicit symbol tables,
but all the symbols used here fit into an uint8_t, whereas some codes
needed uint16_t. This saves about 1.6KB.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/qdm2: Make tables used to initialize VLCs smaller
After permuting the codes, symbols and lengths tables used to initialize
the VLCs so that the codes are ordered from left to right in the Huffman
tree, the codes become redundant as they can be easily computed from the
lengths at runtime (or at compile time with --enable-hardcoded-tables);
in this case one has to use explicit symbol tables, but all the symbols
used here fit into an uint8_t, whereas some codes needed uint16_t.
Furthermore, the codes had holes because the range of the symbols was not
contiguous; these have also been removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mobiclip: Avoid redundant codes table to initialize VLCs
If the codes, lengths and symbols tables are all ordered so that the codes
are sorted from left to right in the tree, the codes can be easily
derived from the lengths and therefore become redundant. This is
exploited in this commit to remove the codes tables for the mobiclip
decoder; notice that tables for the run-length VLC were already ordered
correctly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpc8: Avoid code duplication when initializing VLCs
Up until now, VLCs that were part of an array of VLCs were often not
initialized in a loop, but separately. The probable reason for this
was that these VLCs differed slightly in the parameters to be used for
them (i.e. the number of codes or the number of bits to be used
differs), so that one would have to provide these parameters e.g. via
arrays.
Yet these problems have actually largely been solved by now: The length
information is contained in a run-length encoded form that is the same
for all VLCs and both the number of codes as well as the number of bits
to use for each VLC can be easily derived from them.
There is just one problem to be solved: When the underlying tables have
a different number of elements, putting them into an array of arrays
would be wasteful; using an array of pointers to the arrays would
also be wasteful. Therefore this commit combines the tables into bigger
tables. (Given that all the length tables have the same layout this
applies only to the symbols tables.)
Finally, the array containing the offset of the VLC's buffer in the big
buffer has also been removed.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Several of the quantisation VLCs come in pairs and up until now the
number of bits used for each VLC was set to the same value for both VLCs
in such a pair even when one of the two required only a lower number.
This is a waste given that the get_vlc2() call is compatible with these
two VLCs using a different number of bits (it uses vlc->bits).
Given that the code lengths are descending it is easily possible to know
the length of the longest code for a given VLC: It is the length of the
first one. With this information one can easily use the least amount of
bits.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpc8: Reduce the size of the length tables to initialize VLCs
After permuting the length, code and symbol tables so that
the codes are ordered from left to right in the tree, it became apparent
that the length of the codes decreases from left to right. Therefore one
can run-length encode the lengths to save space. This commit implements
this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mpc8: Reduce size of tables used to initialize VLCs
By switching to ff_init_vlc_from_lengths() one can make a table of
codes of type uint8_t superfluous, saving space.
Other VLCs (those without dedicated symbols table and with codes of
type uint8_t) have been made to use ff_init_vlc_from_lengths(), too,
because it reduces codesize (ff_init_vlc_from_lengths() has two
parameters less than ff_init_vlc_sparse()) and because it allows to
use the offset parameter in future commits.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/mimic: Reduce size of tables used to initialize VLCs
By switching to ff_init_vlc_from_lengths() one can replace a table of
codes of type uint32_t with a table of symbols of type uint8_t, saving
space. The old tables also had holes in them (because of the symbols) which
are now superfluous, saving even more space.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/rv10: Simplify handling of skip VLC entries
The VLC tables to be used for parsing RealVideo 1.0 DC coefficients are
weird: The luma table contains a block of 2^11 codes beginning with the
same prefix and length that all have the same symbol (i.e. value only
depends upon the prefix); the same goes for the chroma block (except
it's only 2^9 codes). Up until now, these entries (which generally could
be parsed like ordinary entries with subtables) have been treated
specially: They have been treated like open ends of the tree, so that
get_vlc2() returned a value < 0 upon encountering them; afterwards it
was checked whether the right prefix was used and if so, the appropriate
number of bytes was skipped.
But there is actually an easy albeit slightly hacky way to support them
directly without pointless subtables: Just modify the VLC table so that
all the entries sharing the right prefix have a length that equals the
length of the whole entry.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
These tables were huge (14 bits) because one needed 14 bits in order to
find out whether a code is valid and in the VLC table or a valid code that
required hacky workarounds due to RealVideo 1.0 using multiple codes
for the same symbol and the code predating the introduction of symbols
tables for VLCs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
The RealVideo 1.0 decoder uses VLCs to parse DC coefficients. But the
values returned from get_vlc2() are not directly used; instead
-(val - 128) (which is in the range -127..128) is. This transformation
is unnecessary as it can effectively be done when initializing the VLC
by modifying the symbols table used. There is just one minor
complication: The chroma table is incomplete and in order to distinguish
an error from get_vlc2() (due to an invalid code) the ordinary return
range is modified to 0..255. This is possible because the only caller of
this function is (on success) only interested in the return value modulo
256.
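
The symbol remapping can be sketched as follows (the helper name is
hypothetical): the transformed value is reduced modulo 256, so every
valid symbol stays non-negative and a negative return from get_vlc2()
still unambiguously signals an invalid code.

```c
#include <assert.h>
#include <stdint.h>

/* Fold -(val - 128) into the symbol itself, reduced modulo 256.
 * The result is in 0..255 and equals -(val - 128) modulo 256. */
static uint8_t fold_symbol(int val)
{
    assert(val >= 0 && val <= 255);
    return (uint8_t)(128 - val);
}
```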
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/rv10: Reduce number of exceptions when reading VLC value
RealVideo 1.0 uses an insane way to encode DC coefficients: There are
several symbols that (for no good reason whatsoever) have multiple
encodings, leading to longer codes than necessary.
More specifically, the tree for the 256 luma symbols contains 255 codes
belonging to 255 different symbols on the left; going further right,
the tree consists of two blocks of 128 codes each of length 14 encoding
consecutive numbers (including two encodings for the symbol missing among
the 255 codes on the left); this is followed by two blocks of codes of
length 16 each containing 256 elements with consecutive symbols (i.e.
each of the blocks allows to encode all symbols). The rest of the tree
consists of 2^11 codes that all encode the same symbol.
The tree for the 256 chroma symbols is similar, but is missing the
blocks of length 256 and there are only 2^9 consecutive codes that
encode the same symbol; furthermore, the chroma tree is incomplete:
The right-most node has no right child.
All of this caused problems when parsing these codes; the reason is that
the code for this predates commit b613bacca9c256f1483c46847f713e47a0e9a5f6
which added support for explicit symbol tables and thereby removed the
requirement that different codes have different symbols. In order to
address this, the trees used for parsing were incomplete: They contained
the 255 codes on the left and one code for the remaining symbol. Whenever
a code not in these trees was encountered, it was dealt with in
special cases (one for each of the blocks mentioned above).
This commit reduces the number of special cases: Using a symbols table
allows to treat the blocks of consecutive symbols like ordinary codes;
only the blocks encoding a single symbol are still treated specially
(in order not to waste memory on tables for them).
In order not to increase the size of the tables used to initialize the
VLCs both the symbols as well as the lengths are now run-length encoded.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/rv10: Reduce the size of the tables used to initialize VLCs
This can be achieved by switching to ff_init_vlc_from_lengths() which
allows to replace two uint16_t tables for codes with uint8_t tables for
the symbols by permuting the tables so that the codes are ordered from
left to right in the tree in which case they can be easily computed from
the lengths at runtime.
And after doing so, it became apparent that the tables for the symbols
are actually the same for luma and chroma, so that one can even omit one
of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/cook: Avoid big length tables for VLC initialization
Permuting the tables used to initialize the Cook VLCs so that the code
tables are ordered from left to right in the tree revealed that the
lengths of the codes are ascending from left to right. Therefore one can
run-length encode them to avoid the big length tables; this saves a bit
more than 1KB.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
avcodec/cook: Make tables to initialize VLCs smaller
Up until now, the Cook decoder used tables for the lengths of codes and
tables of the codes themselves to initialize VLCs; the tables for the codes
were of type uint16_t because the codes were so long. It did not use
explicit symbol tables. This commit instead reorders the tables so that
the code tables are sorted from left to right in the tree. Then the
codes can be easily derived from the lengths and therefore be omitted.
This comes at the price of explicitly coding the symbols, but this is
nevertheless a net win because most of the symbols tables can be coded
on one byte. Furthermore, Cook actually does not use a contiguous range
of symbols for its main VLC tables and the old code compensated for that
by adding holes (codes of length zero) to the tables (that are skipped by
ff_init_vlc_sparse()). This is no longer necessary with the new
approach. All in all, this saves about 1.7KB.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>