git.sesse.net Git - narabu/log
Steinar H. Gunderson [Mon, 16 Oct 2017 19:27:50 +0000 (21:27 +0200)]
Make the encoder 100% GPU. Not working yet, though.
Steinar H. Gunderson [Thu, 12 Oct 2017 17:05:31 +0000 (19:05 +0200)]
Silence some Mesa warnings.
Steinar H. Gunderson [Thu, 12 Oct 2017 17:02:07 +0000 (19:02 +0200)]
Add rANS normalization to the encoder.
Steinar H. Gunderson [Tue, 10 Oct 2017 20:49:21 +0000 (22:49 +0200)]
Speed up the histogram counting immensely by adding via local memory.
Steinar H. Gunderson [Tue, 10 Oct 2017 16:07:23 +0000 (18:07 +0200)]
Make quant_matrix a bit more compact.
Steinar H. Gunderson [Tue, 10 Oct 2017 16:04:07 +0000 (18:04 +0200)]
Start trying to count the rANS distributions from the encoding shader.
Steinar H. Gunderson [Mon, 9 Oct 2017 21:13:05 +0000 (23:13 +0200)]
Minor whitespace fix.
Steinar H. Gunderson [Mon, 9 Oct 2017 21:12:55 +0000 (23:12 +0200)]
Add some more debugging.
Steinar H. Gunderson [Mon, 9 Oct 2017 21:11:44 +0000 (23:11 +0200)]
Fix the DCT scaling (I believe).
Steinar H. Gunderson [Sun, 8 Oct 2017 21:26:08 +0000 (23:26 +0200)]
Fix the upload type of the image.
Steinar H. Gunderson [Sun, 8 Oct 2017 13:05:37 +0000 (15:05 +0200)]
Update qdd with newer DC coefficient predictions.
Steinar H. Gunderson [Sat, 7 Oct 2017 23:08:25 +0000 (01:08 +0200)]
Fix an IDCT error.
Steinar H. Gunderson [Sat, 7 Oct 2017 22:54:43 +0000 (00:54 +0200)]
A sign fix in the FDCT.
Steinar H. Gunderson [Sat, 7 Oct 2017 21:22:14 +0000 (23:22 +0200)]
Add the beginnings of a GPU encoder.
It doesn't really work currently (too buggy), only does DCT
(not the rANS part), and only encodes luma.
Steinar H. Gunderson [Fri, 6 Oct 2017 18:08:46 +0000 (20:08 +0200)]
Add support for repeating blocks. About 2% size reduction.
Steinar H. Gunderson [Thu, 5 Oct 2017 18:30:19 +0000 (20:30 +0200)]
Add some code for calculating maximum coefficient ranges, for bit allocation.
Steinar H. Gunderson [Tue, 3 Oct 2017 22:44:41 +0000 (00:44 +0200)]
Revert "Switch to 64-bit rANS, although probably due for immediate revert (just want to preserve history)."
A bit larger files, no real speed gain, a few slight bugs.
This reverts commit
3fb87c6b953be3382cd216c74ff6aa025c8eaa2a .
Steinar H. Gunderson [Tue, 3 Oct 2017 22:38:36 +0000 (00:38 +0200)]
Switch to 64-bit rANS, although probably due for immediate revert (just want to preserve history).
Steinar H. Gunderson [Tue, 3 Oct 2017 22:30:02 +0000 (00:30 +0200)]
Don't print out the shader on failure, as it's not autogenerated.
Steinar H. Gunderson [Sun, 24 Sep 2017 19:10:36 +0000 (21:10 +0200)]
Reduce the spam level from qdc a little bit.
Steinar H. Gunderson [Sun, 24 Sep 2017 17:46:00 +0000 (19:46 +0200)]
Make the number of GPU iterations a named constant.
Steinar H. Gunderson [Sun, 24 Sep 2017 17:45:40 +0000 (19:45 +0200)]
Make the GPU decoder (finally) work with any resolution.
Steinar H. Gunderson [Sun, 24 Sep 2017 17:39:25 +0000 (19:39 +0200)]
Make blocks per stream a named constant.
Steinar H. Gunderson [Sun, 24 Sep 2017 17:28:07 +0000 (19:28 +0200)]
Get -Wall clean.
Steinar H. Gunderson [Sun, 24 Sep 2017 16:42:10 +0000 (18:42 +0200)]
Stop hardcoding blocks per row in the shader.
Steinar H. Gunderson [Sun, 24 Sep 2017 13:35:10 +0000 (15:35 +0200)]
Predict Y DC from 128 instead of 0; microscopic improvement.
Steinar H. Gunderson [Sun, 24 Sep 2017 13:29:44 +0000 (15:29 +0200)]
Predict DC across the entire slice instead of resetting each row. Makes it easier to have slices cross rows.
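A minimal sketch of what the two DC prediction changes above amount to, assuming the predictor is simply the previous block's DC value (the log does not say which predictor narabu actually uses). The function names and the `start` parameter are hypothetical; 128 is the luma seed mentioned in the commit above.

```python
def dc_residuals(dc_values, start=128):
    """Delta-code the DC values of one slice against the previous block.

    The predictor is carried across the whole slice and is only seeded
    once, with `start`, instead of being reset at the start of each row.
    """
    pred = start
    residuals = []
    for dc in dc_values:
        residuals.append(dc - pred)
        pred = dc  # assumption: the next block is predicted from this one
    return residuals


def dc_reconstruct(residuals, start=128):
    """Inverse of dc_residuals: rebuild the DC values from the residuals."""
    pred = start
    dc_values = []
    for r in residuals:
        pred += r
        dc_values.append(pred)
    return dc_values
```

With a seed close to the typical DC level, the first residual of each slice is small, which is presumably where the (microscopic) gain from 128 over 0 comes from.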
Steinar H. Gunderson [Sun, 24 Sep 2017 13:23:03 +0000 (15:23 +0200)]
Sanitize compile flags.
Steinar H. Gunderson [Thu, 21 Sep 2017 21:50:58 +0000 (23:50 +0200)]
Make num_blocks a uniform.
Steinar H. Gunderson [Wed, 20 Sep 2017 21:22:20 +0000 (23:22 +0200)]
Use WIDTH and HEIGHT in some places instead of 1280 and 720. narabu is still not ready for anything but 1280px wide, though.
Steinar H. Gunderson [Tue, 19 Sep 2017 23:01:19 +0000 (01:01 +0200)]
Prepare for more flexible slices.
They're now always 320 blocks long, but this will probably change in
the future. Note that changing it from “two rows” to “320 blocks”
made chroma blocks a lot longer, which saved 4.9% bitrate overall.
Steinar H. Gunderson [Sun, 17 Sep 2017 11:45:59 +0000 (13:45 +0200)]
DC predict chroma. ~1.5% lower bitrate.
Steinar H. Gunderson [Sun, 17 Sep 2017 09:53:21 +0000 (11:53 +0200)]
Symbolize NUM_SYMS a bit.
Steinar H. Gunderson [Sun, 17 Sep 2017 09:50:04 +0000 (11:50 +0200)]
Go down to 4 rANS streams instead of 8.
Costs approx. 0.8% bitrate, but reduces GPU cost from 1.3 to 1.2 ms
(~8%) due to less L1 cache pressure.
Steinar H. Gunderson [Sun, 17 Sep 2017 09:41:41 +0000 (11:41 +0200)]
Revert "k-means instead of k-medoids; doesn't work as well, so just keep it here to be immediately reverted."
This reverts commit
fb83fc30cf33cec1d155b3a63c338bbb64adb4e3 .
Steinar H. Gunderson [Sun, 17 Sep 2017 09:41:36 +0000 (11:41 +0200)]
k-means instead of k-medoids; doesn't work as well, so just keep it here to be immediately reverted.
Steinar H. Gunderson [Sun, 17 Sep 2017 09:06:32 +0000 (11:06 +0200)]
Add some code for (semi-)optimal assignment of rANS coefficients to streams.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:57:33 +0000 (15:57 +0200)]
Add a Makefile.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:57:24 +0000 (15:57 +0200)]
Add a .gitignore file.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:38:22 +0000 (15:38 +0200)]
Add some inactive debugging code to store the coefficients.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:38:11 +0000 (15:38 +0200)]
Add some parallel slicing code (not really a win).
Steinar H. Gunderson [Sat, 16 Sep 2017 13:36:56 +0000 (15:36 +0200)]
Remove some obsolete caching code.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:15:52 +0000 (15:15 +0200)]
Add a test image.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:11:24 +0000 (15:11 +0200)]
Add a PSNR measurement tool.
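For reference, PSNR is 10·log10(peak² / MSE). A minimal sketch, assuming 8-bit samples (peak 255) held in flat sequences; it is not the tool from the repository, and the function name is hypothetical.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```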
Steinar H. Gunderson [Sat, 16 Sep 2017 13:10:19 +0000 (15:10 +0200)]
Encode sign bit directly in rANS, using some symmetry trickery.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:19:13 +0000 (15:19 +0200)]
Add the GPU decoder itself.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:14:36 +0000 (15:14 +0200)]
Add the decoder.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:03:57 +0000 (15:03 +0200)]
Add color support.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:03:21 +0000 (15:03 +0200)]
Change quantization to MPEG-2, some other changes.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:02:15 +0000 (15:02 +0200)]
Revert "Encoder with 4x4 blocks (using TF switching)."
This reverts commit
13db6a93d746d4152eb30f2d7cc7035d441df8ba .
Steinar H. Gunderson [Sat, 16 Sep 2017 13:01:46 +0000 (15:01 +0200)]
Encoder with 4x4 blocks (using TF switching).
Steinar H. Gunderson [Sun, 27 Aug 2017 18:44:04 +0000 (20:44 +0200)]
Add support for optimal renormalization.
The current code for rounding probabilities down to a fixed resolution
is a bit too crude when resolution is low; whether it's optimal to round
up or down depends on the other frequencies, and the code for stealing
slots from other symbols also doesn't take this into account (as the
comment rightfully points out). These effects only really show up when
getting down to lower resolution, e.g. prob_bits = 10, but there, they
can be quite pronounced.
Add a function (0-clause BSD licensed for optimal usefulness, i.e. public
domain without potential difficulty about whether private persons can put
anything in the public domain) to calculate the optimal distribution of
encoding slots, basically through brute force plus some memoization.
For e.g. the basic example (main.cpp), the number of bytes for prob_bits = 10
goes down from 440895 to 437590 bytes. (At the default prob_bits = 14, the
difference is very low; from 435113 to 435093 bytes.)
For main_alias.cpp, which has prob_bits = 16, I've kept the existing code,
so that it's not lost to the mists of time in case someone wants something
that's nearly even cheaper in terms of startup cost.
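The idea in this commit message can be sketched as a small memoized search. This is an illustration only, not the function added in the commit. It assumes the objective is the total coded size, -sum(freq[i] * log2(slots[i] / total_slots)), and that every symbol that occurs needs at least one slot. The name `optimal_slots` and its signature are hypothetical.

```python
import math
from functools import lru_cache


def optimal_slots(freqs, total_slots):
    """Pick integer slot counts summing to total_slots that minimize coded size.

    Returns (slots, cost_in_bits). Symbols with zero frequency get zero slots.
    """
    syms = [i for i, f in enumerate(freqs) if f > 0]
    if len(syms) > total_slots:
        raise ValueError("not enough slots to give every used symbol one")

    @lru_cache(maxsize=None)
    def best(k, remaining):
        # Cheapest way to code symbols syms[k:] using exactly `remaining` slots.
        if k == len(syms):
            return (0.0, ()) if remaining == 0 else (math.inf, ())
        left = len(syms) - k - 1  # later symbols, each needing at least one slot
        result = (math.inf, ())
        for s in range(1, remaining - left + 1):
            cost = -freqs[syms[k]] * math.log2(s / total_slots)
            tail_cost, tail = best(k + 1, remaining - s)
            if cost + tail_cost < result[0]:
                result = (cost + tail_cost, (s,) + tail)
        return result

    cost, chosen = best(0, total_slots)
    slots = [0] * len(freqs)
    for i, s in zip(syms, chosen):
        slots[i] = s
    return slots, cost
```

For example, `optimal_slots([5, 3, 1, 1], 8)` searches every split of 8 slots over the four symbols. With prob_bits = 10 the same call would use `total_slots = 1 << 10`. The search is roughly quadratic in the slot count per symbol, so this sketch only suits small tables.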
Steinar H. Gunderson [Sat, 16 Sep 2017 13:12:35 +0000 (15:12 +0200)]
Add the missing DCT code.
Steinar H. Gunderson [Sat, 16 Sep 2017 13:05:19 +0000 (15:05 +0200)]
Embed ryg_rans (from https://github.com/rygorous/ryg_rans).
Steinar H. Gunderson [Sat, 16 Sep 2017 13:01:23 +0000 (15:01 +0200)]
Initial checkin.