git.sesse.net Git - movit/log

Steinar H. Gunderson [Sun, 26 Feb 2017 15:10:47 +0000 (16:10 +0100)]

Fix an issue where the last pass would have been rendered with the sRGB flag set, which confused Qt applications running in certain NVIDIA configurations.

Steinar H. Gunderson [Mon, 20 Feb 2017 20:44:33 +0000 (21:44 +0100)]

Fix compiling without C++11.

ABI break. Reported by Dan Dennedy.

Steinar H. Gunderson [Mon, 20 Feb 2017 18:56:53 +0000 (19:56 +0100)]

Fix a bad typo.

Steinar H. Gunderson [Sun, 19 Feb 2017 19:35:31 +0000 (20:35 +0100)]

Bump MOVIT_VERSION for all the YCbCr changes.

Steinar H. Gunderson [Sun, 19 Feb 2017 19:35:19 +0000 (20:35 +0100)]

Treat num_levels == 0 as 256, for the benefit of older applications.

Steinar H. Gunderson [Sun, 19 Feb 2017 15:55:44 +0000 (16:55 +0100)]

Cosmetic tweak in YCbCrInput.

Steinar H. Gunderson [Sun, 19 Feb 2017 15:55:33 +0000 (16:55 +0100)]

Fix a stack buffer overrun in ycbcr_input_test.

Steinar H. Gunderson [Sun, 19 Feb 2017 15:34:15 +0000 (16:34 +0100)]

Implement mipmap generation in YCbCrInput, now that we advertise single-texture.

Steinar H. Gunderson [Sun, 19 Feb 2017 10:02:09 +0000 (11:02 +0100)]

Loosen up some restrictions on YCbCrInput if we have interleaved mode.

Steinar H. Gunderson [Tue, 14 Feb 2017 21:26:28 +0000 (22:26 +0100)]

Add support for 10- and 12-bit planar Y'CbCr inputs.

This is mostly for completeness; at least for 10-bit, 10:10:10:2
should be a faster format. However, it's nice to allow direct
subsampled inputs _somehow_.
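
For reference, a minimal sketch of what a 10:10:10:2 word looks like. The bit layout below follows GL_UNSIGNED_INT_2_10_10_10_REV (first component in the lowest bits); the names `Packed10`, `pack_10_10_10_2` and `unpack_10_10_10_2` are hypothetical and not part of Movit, and which of Y', Cb and Cr goes into which field is left open here.

```cpp
#include <cstdint>

// Three 10-bit components plus a 2-bit field in one 32-bit word.
// Hypothetical helper type; component order is an assumption.
struct Packed10 {
    uint16_t c0, c1, c2;  // 10 bits each, 0..1023
    uint8_t a;            // 2 bits, 0..3
};

// Pack with c0 in bits 0..9, c1 in 10..19, c2 in 20..29, a in 30..31.
inline uint32_t pack_10_10_10_2(const Packed10 &p)
{
    return (uint32_t(p.c0) & 0x3ffu) |
           ((uint32_t(p.c1) & 0x3ffu) << 10) |
           ((uint32_t(p.c2) & 0x3ffu) << 20) |
           ((uint32_t(p.a) & 0x3u) << 30);
}

// Inverse of pack_10_10_10_2().
inline Packed10 unpack_10_10_10_2(uint32_t w)
{
    Packed10 p;
    p.c0 = uint16_t(w & 0x3ffu);
    p.c1 = uint16_t((w >> 10) & 0x3ffu);
    p.c2 = uint16_t((w >> 20) & 0x3ffu);
    p.a = uint8_t(w >> 30);
    return p;
}
```

One word per pixel means a single texture fetch yields all three components, which is presumably why the packed format tends to be the faster one.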

Steinar H. Gunderson [Tue, 14 Feb 2017 20:34:27 +0000 (21:34 +0100)]

Some minor comment fixes in ycbcr.h.

Steinar H. Gunderson [Tue, 14 Feb 2017 17:50:07 +0000 (18:50 +0100)]

Add input support for packed 10-bit Y'CbCr.

Steinar H. Gunderson [Mon, 13 Feb 2017 23:18:59 +0000 (00:18 +0100)]

Support interleaved (chunky) 4:4:4 in YCbCrInput.

Steinar H. Gunderson [Mon, 13 Feb 2017 23:10:10 +0000 (00:10 +0100)]

Fix some test breakage.

Steinar H. Gunderson [Sun, 12 Feb 2017 23:55:14 +0000 (00:55 +0100)]

Support 10- and 12-bit Y'CbCr output.

We don't have any input support yet; the constants are put in place,
but we also need some work on supporting (semi-)adequate input formats.

Steinar H. Gunderson [Sat, 11 Feb 2017 21:13:02 +0000 (22:13 +0100)]

Allow adjusting the output Y'CbCr coefficients after finalize.

Primarily useful for Nageru, which may have to switch output modes at runtime.
Pretty much the same speed (just a single extra branch on a boolean uniform),
as constants and uniforms are typically the same speed and we're generally
ALU-bound.

Steinar H. Gunderson [Sat, 5 Nov 2016 10:45:46 +0000 (11:45 +0100)]

Release Movit 1.4.0.

Steinar H. Gunderson [Sat, 5 Nov 2016 10:41:46 +0000 (11:41 +0100)]

Fix a typo.

Steinar H. Gunderson [Sat, 5 Nov 2016 10:32:56 +0000 (11:32 +0100)]

Fix an issue where a (cached) shader program could be used from multiple
threads at a time.

This isn't allowed, since uniforms belong to the program, not to the
context. Found by running Helgrind over Nageru.
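
One way to avoid sharing a program between threads is to hand out exclusive instances from a pool and link a new one when none is free. The sketch below only illustrates that idea and is not Movit's actual ResourcePool code: `ProgramInstancePool` is a hypothetical name, and the `link_new_instance` callback stands in for whatever compiles and links a fresh GL program object.

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hands out program instances so that no two callers hold the same one
// at the same time. Instances are plain unsigned handles here.
class ProgramInstancePool {
public:
    explicit ProgramInstancePool(std::function<unsigned()> link_new_instance)
        : link_new_instance_(std::move(link_new_instance)) {}

    // Returns an instance that nobody else currently holds.
    unsigned acquire()
    {
        std::lock_guard<std::mutex> lock(mu_);
        if (!free_.empty()) {
            unsigned id = free_.back();
            free_.pop_back();
            return id;
        }
        return link_new_instance_();  // no free instance: make another
    }

    // Gives an instance back so a later acquire() can reuse it.
    void release(unsigned id)
    {
        std::lock_guard<std::mutex> lock(mu_);
        free_.push_back(id);
    }

private:
    std::mutex mu_;
    std::vector<unsigned> free_;
    std::function<unsigned()> link_new_instance_;
};
```

Each thread sets uniforms only on the instance it acquired, so uniform state is never written concurrently.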

Steinar H. Gunderson [Tue, 26 Jul 2016 13:30:56 +0000 (15:30 +0200)]

Make the error printed on check_error() slightly friendlier: Include the enum if possible, and print it to stderr instead of stdout.

Steinar H. Gunderson [Thu, 21 Jul 2016 22:40:50 +0000 (00:40 +0200)]

Be more defensive about width/height/pitch given to FlatInput and YCbCrInput.

Steinar H. Gunderson [Fri, 1 Apr 2016 21:51:28 +0000 (23:51 +0200)]

Remove a very old and outdated comment.

Steinar H. Gunderson [Fri, 1 Apr 2016 21:47:30 +0000 (23:47 +0200)]

Add some clarifying comments about the intermediate formats.

Steinar H. Gunderson [Tue, 1 Mar 2016 00:00:51 +0000 (01:00 +0100)]

Fix some comment formatting.

Steinar H. Gunderson [Mon, 29 Feb 2016 23:59:37 +0000 (00:59 +0100)]

Rework PaddingEffect alpha handling, which also fixes a long-standing assertion failure.

Steinar H. Gunderson [Sun, 28 Feb 2016 00:53:55 +0000 (01:53 +0100)]

Allow storing values in intermediate framebuffers as sqrt(x).

Together with GL_RGB10_A2, this would seem to be an even better tradeoff for
many chains than GL_SRGB8_ALPHA8 is, as long as you don't need intermediate
alpha. (We verify its accuracy with a unit test.)

This changes the API for specifying intermediate framebuffers, but that API
was never in a release, so it should be fine.

Also document a rather obscure problem where, if you can actually hold on to
non-linear values across a bounce buffer, you don't really want to store them
in sRGB encoding. (The square-root version actually avoids this problem.
I guess we could snoop on the type and do a similar thing if we see it's a
GL_SRGB* encoding, but it seems so obscure that we can ignore it for now.)
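
To see why sqrt(x) storage helps with a 10-bit channel, here is a small CPU-side sketch that quantizes a linear-light value both ways. The function names are hypothetical and the real encoding happens in the shaders; this only models the rounding.

```cpp
#include <cmath>
#include <cstdint>

inline float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

// Store the linear value directly in 10 bits.
inline uint16_t encode_linear10(float x)
{
    return uint16_t(std::lround(clamp01(x) * 1023.0f));
}

inline float decode_linear10(uint16_t v)
{
    return v / 1023.0f;
}

// Store sqrt(x) in 10 bits; square on the way out.
inline uint16_t encode_sqrt10(float x)
{
    return uint16_t(std::lround(std::sqrt(clamp01(x)) * 1023.0f));
}

inline float decode_sqrt10(uint16_t v)
{
    float s = v / 1023.0f;
    return s * s;
}
```

For a dark value such as 0.0004, the direct encoding rounds to code 0 and loses the value entirely, while the square-root encoding lands on code 20 and decodes to roughly 0.00038. The price is coarser steps near 1.0, which is usually the better trade for video.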

Steinar H. Gunderson [Sun, 28 Feb 2016 00:46:15 +0000 (01:46 +0100)]

Hard-assert on something that has bitten me too many times now.

Steinar H. Gunderson [Sun, 28 Feb 2016 00:26:47 +0000 (01:26 +0100)]

Fix a typo.

Steinar H. Gunderson [Sat, 27 Feb 2016 23:37:01 +0000 (00:37 +0100)]

Add support for 10-bit RGB framebuffers.

Steinar H. Gunderson [Sat, 27 Feb 2016 23:28:31 +0000 (00:28 +0100)]

Add some sRGB conversion functions to test_util.{h,cpp}.

Steinar H. Gunderson [Sat, 27 Feb 2016 23:24:04 +0000 (00:24 +0100)]

Bump version after new format support.

Steinar H. Gunderson [Wed, 24 Feb 2016 22:18:41 +0000 (23:18 +0100)]

Add support for some of the more esoteric minifloat formats to ResourcePool.

Steinar H. Gunderson [Wed, 24 Feb 2016 00:32:49 +0000 (01:32 +0100)]

Do not send NULL to glTexSubImage2D if there is no input data set; it is illegal, and NVIDIA's drivers crash on it.

Steinar H. Gunderson [Tue, 23 Feb 2016 21:48:26 +0000 (22:48 +0100)]

Merge branch '1.3.x-release'

Steinar H. Gunderson [Tue, 23 Feb 2016 21:47:05 +0000 (22:47 +0100)]

Release Movit 1.3.2. (From a branch, since I do not want to break ABI compatibility at this point.)

Steinar H. Gunderson [Tue, 23 Feb 2016 21:41:33 +0000 (22:41 +0100)]

Make the sRGB intermediate test slightly more stringent (so that e.g. 0.0 will not work).

Steinar H. Gunderson [Mon, 22 Feb 2016 20:34:05 +0000 (21:34 +0100)]

Make sRGBIntermediate test less sensitive to the exact sRGB choices; fixes unit test on NVIDIA.

Steinar H. Gunderson [Sat, 20 Feb 2016 16:28:54 +0000 (17:28 +0100)]

Require OpenGL 3.0 unconditionally; this is a no-op, since we already required GLSL 1.30 (part of OpenGL 3.0).

Steinar H. Gunderson [Sat, 20 Feb 2016 16:13:48 +0000 (17:13 +0100)]

Allow setting the intermediate texture format; useful for reducing bandwidth at the expense of quality, and possibly future GLES2 support.

Steinar H. Gunderson [Sat, 20 Feb 2016 14:42:19 +0000 (15:42 +0100)]

Make timer query objects polled asynchronously, so that the CPU blocks less on the GPU when doing timing. ABI break.

Steinar H. Gunderson [Sat, 20 Feb 2016 13:19:58 +0000 (14:19 +0100)]

Remove an extern definition that no longer exists.

Steinar H. Gunderson [Sat, 20 Feb 2016 13:19:19 +0000 (14:19 +0100)]

Correct/update a comment on rounding.

Steinar H. Gunderson [Sat, 20 Feb 2016 13:11:35 +0000 (14:11 +0100)]

Make gamma polynomial constants into an array; slightly fewer uniforms to set, and it makes sense overall, since they belong so much together.

Jean-Baptiste Mardelle [Wed, 17 Feb 2016 22:20:48 +0000 (23:20 +0100)]

Fix initialisation on locale with comma as numerical separator
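
The general hazard: under a locale such as nb_NO or fr_FR, C-style number parsing and formatting use a comma as the decimal separator, so strings like "3.30" are misread. A minimal sketch of locale-independent helpers follows; the function names are hypothetical, and this is not necessarily how the commit itself fixes the problem.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Parse a float using '.' as the decimal separator, regardless of the
// global locale.
inline float parse_float_c_locale(const std::string &s)
{
    std::istringstream in(s);
    in.imbue(std::locale::classic());
    float v = 0.0f;
    in >> v;
    return v;
}

// Format a float using '.' as the decimal separator, regardless of the
// global locale (e.g. for numbers written into shader source).
inline std::string format_float_c_locale(float v)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << v;
    return out.str();
}
```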

Steinar H. Gunderson [Tue, 16 Feb 2016 01:40:02 +0000 (17:40 -0800)]

Release Movit 1.3.1.

Steinar H. Gunderson [Tue, 9 Feb 2016 13:20:09 +0000 (05:20 -0800)]

Add deinterlace_effect_test to .gitignore. Reported by Christophe Thommeret.

Steinar H. Gunderson [Tue, 9 Feb 2016 04:26:37 +0000 (05:26 +0100)]

2016 README updates.

Steinar H. Gunderson [Sun, 7 Feb 2016 12:05:39 +0000 (13:05 +0100)]

Remove GL_GLEXT_PROTOTYPES from some files, since it is irrelevant with epoxy.

Steinar H. Gunderson [Sun, 7 Feb 2016 11:15:07 +0000 (12:15 +0100)]

Remove the check for movit_shader_rounding_supported, as we now demand 1.30 unconditionally.

Steinar H. Gunderson [Sun, 7 Feb 2016 11:12:03 +0000 (12:12 +0100)]

Do not bother with unbinding vertex attributes; that is automatically done when we unbind the VAO.

Steinar H. Gunderson [Sun, 7 Feb 2016 01:18:48 +0000 (02:18 +0100)]

Remove a few unneeded shader program switches.

Steinar H. Gunderson [Sun, 7 Feb 2016 01:12:14 +0000 (02:12 +0100)]

Optimize VAO/VBO usage for minimal state changes.

This is similar to what we had earlier to just reuse the VAO,
but now with correct bindings no matter what vertex attributes
are assigned to what index, so that the (new) test passes.

This is actually slightly more efficient than what we had before,
since we don't look up the attributes by text anymore, and don't
reupload the VBO for each frame anymore. In practice, the effects
should be small.

Steinar H. Gunderson [Sat, 6 Feb 2016 23:11:51 +0000 (00:11 +0100)]

Revert "Reuse the VAO across all phases."

The patch triggers a bug where, if the first phase doesn't need texture
coordinates, the rest of the phases don't get them either. (Or more generally,
if the vertex shader varying indices are not predictable, the patch does
the wrong thing.) Add a unit test and revert it for now; in time, we'll find a
way that's both low-overhead (the patch fixes a real problem) _and_ correct in
these cases.

This reverts patch 5e34f7a8969f4afc169f034d34fb908019b3a389.

Reported by Christophe Thommeret.

Steinar H. Gunderson [Sun, 31 Jan 2016 12:37:24 +0000 (13:37 +0100)]

Release Movit 1.3.0.

Steinar H. Gunderson [Sun, 31 Jan 2016 12:35:38 +0000 (13:35 +0100)]

In OverlayEffect, add support for swapping the inputs.

Steinar H. Gunderson [Mon, 25 Jan 2016 21:51:09 +0000 (22:51 +0100)]

Use ryg's much faster fp16 conversion code.

Steinar H. Gunderson [Fri, 15 Jan 2016 23:57:00 +0000 (00:57 +0100)]

Make all fp16 routines work with fp32 as input instead of fp64, since that is what hardware supports anyway.

Steinar H. Gunderson [Thu, 24 Dec 2015 21:31:47 +0000 (22:31 +0100)]

Unbreak some effects that were broken by 0c821b2e.

Steinar H. Gunderson [Thu, 24 Dec 2015 15:43:23 +0000 (16:43 +0100)]

Make shader generation more deterministic by removing a sort of pointers.
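
The underlying problem is general: pointer values differ between runs, so any ordering derived from them (a sort, or iteration over a std::set of pointers) changes the generated output from run to run. A minimal sketch of one alternative, which keeps first-seen order instead, is shown below. `Node` and `unique_in_insertion_order` are hypothetical names, and this is not necessarily what the commit does.

```cpp
#include <unordered_set>
#include <vector>

struct Node {
    int id;
};

// Deduplicate while preserving the order in which nodes were first seen.
// The hash set is only used for membership tests, never for iteration,
// so pointer values do not influence the result order.
inline std::vector<const Node *> unique_in_insertion_order(
    const std::vector<const Node *> &nodes)
{
    std::unordered_set<const Node *> seen;
    std::vector<const Node *> out;
    for (const Node *n : nodes) {
        if (seen.insert(n).second) {
            out.push_back(n);
        }
    }
    return out;
}
```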

Steinar H. Gunderson [Thu, 24 Dec 2015 12:31:38 +0000 (13:31 +0100)]

Increase version number after YADIF.

Steinar H. Gunderson [Sun, 22 Nov 2015 17:34:26 +0000 (18:34 +0100)]

Add a deinterlacer based on YADIF.

I tried a few different things before I finally settled on this, in particular
Weston's 3-field deinterlacer (w3fdif). It's not perfect (see .h comments),
but it works overall pretty well.

Steinar H. Gunderson [Mon, 21 Dec 2015 13:52:34 +0000 (14:52 +0100)]

Add even longer convenience overloads for add_effect().

Steinar H. Gunderson [Mon, 21 Dec 2015 12:46:23 +0000 (13:46 +0100)]

Make register_int call register_uniform_int, now that we require GLSL 1.30.

Steinar H. Gunderson [Mon, 14 Dec 2015 19:54:41 +0000 (20:54 +0100)]

Revert "Add a hack to use #version 110 but keep using 130 features, for the benefit of OS X."

This turned out not to work well on OS X after all.

This reverts commit e0811ddf51aeb50575fb5f7d9c6e32b92a6bac0d.

Steinar H. Gunderson [Mon, 14 Dec 2015 01:10:00 +0000 (02:10 +0100)]

Fix a crash in a unit test if nb_NO.UTF-8 is not available.

Steinar H. Gunderson [Mon, 14 Dec 2015 00:45:46 +0000 (01:45 +0100)]

Use libpng instead of the older libpng12; seemingly fixes compilation on some systems.

Steinar H. Gunderson [Sun, 13 Dec 2015 14:21:12 +0000 (15:21 +0100)]

Fix some stack overflows in unit tests; found with asan.

Steinar H. Gunderson [Sun, 13 Dec 2015 13:56:15 +0000 (14:56 +0100)]

Fix a stack overflow in ResampleEffectTest.

Steinar H. Gunderson [Sun, 13 Dec 2015 13:13:32 +0000 (14:13 +0100)]

Work around a rounding precision issue that would cause spurious test failures on AMD.

Steinar H. Gunderson [Sun, 13 Dec 2015 12:26:01 +0000 (13:26 +0100)]

Fix a double scaling issue in Y'CbCr conversion.

We multiplied by 224/219 one time too many, causing some small accuracy issues.
Furthermore, we also did this for full-range Y'CbCr, which obviously is wrong.
The issue was so small that the unit tests kept on passing (its investigation
was prompted by a test that failed on AMD cards, which is a separate issue).

After this, the Rec. 601 matrices match Wikipedia exactly, both for limited
and full range. Added unit tests for this.

Steinar H. Gunderson [Sat, 12 Dec 2015 13:31:40 +0000 (14:31 +0100)]

Explicitly bind fragment shader outputs in order.

Evidently ATI drivers use the freedom the standard gives them to assign
these in another order than they are specified in the shader source,
so we need to explicitly bind them, or YCbCrConversionEffectTest will fail
in the multi-output tests.

Originally reported by Iwan Gabovitch.

Steinar H. Gunderson [Sat, 12 Dec 2015 11:27:21 +0000 (12:27 +0100)]

Add a hack to use #version 110 but keep using 130 features, for the benefit of OS X.

Steinar H. Gunderson [Sun, 22 Nov 2015 22:13:03 +0000 (23:13 +0100)]

Stop linking widgets.o into the shared library.

This was never intended to be there, and we don't install headers for it
(so no API/ABI break); it is actively harmful because it has a static
ResourcePool, which we attempt to destroy during shutdown (which causes
use of uninitialized memory as we try to get the current context).

Steinar H. Gunderson [Sun, 22 Nov 2015 13:34:56 +0000 (14:34 +0100)]

Add the missing two array uniform types.

Steinar H. Gunderson [Sat, 21 Nov 2015 20:32:29 +0000 (21:32 +0100)]

Allow setting width/height on FlatInput and YCbCrInput after instantiation.

Steinar H. Gunderson [Sun, 1 Nov 2015 15:08:49 +0000 (16:08 +0100)]

Forgot to increment version.h for bounce override; doing so now.

Steinar H. Gunderson [Sun, 1 Nov 2015 01:09:56 +0000 (02:09 +0100)]

Add a function to let non-input effects override texture bounce.

Definitely read the comment before using; it is not for the faint
of heart. Also make ResampleEffect tolerate this kind of abuse.

Steinar H. Gunderson [Sun, 1 Nov 2015 01:02:10 +0000 (02:02 +0100)]

Add some earlier check_error() calls so that we do not get confusing behavior if there is already an error on entry to render_to_fbo().

Steinar H. Gunderson [Thu, 8 Oct 2015 19:13:44 +0000 (21:13 +0200)]

Install identity.frag; it is needed for ResizeEffect.

Steinar H. Gunderson [Wed, 7 Oct 2015 19:03:24 +0000 (21:03 +0200)]

Fix another #if issue, this time in dither_effect.frag. Reported by Dan Dennedy.

Steinar H. Gunderson [Wed, 7 Oct 2015 18:21:08 +0000 (20:21 +0200)]

Install the new GLSL 1.50 shaders.

Steinar H. Gunderson [Wed, 7 Oct 2015 18:12:25 +0000 (20:12 +0200)]

Make the demo program run with core contexts.

Also, if SDL2 is in use, actually _ask_ for a core context.

Steinar H. Gunderson [Wed, 7 Oct 2015 17:24:19 +0000 (19:24 +0200)]

Add separate shaders for GLSL 1.50.

Seemingly, Apple's drivers (in OS X) do not support GLSL 1.30 in
core contexts, only 1.50. Reported by Dan Dennedy.

Steinar H. Gunderson [Tue, 6 Oct 2015 18:08:05 +0000 (20:08 +0200)]

Fix GLSL compilation errors on some drivers.

Evidently #if FOO is illegal in GLSL if FOO is not defined,
unlike in C++. Reported by Dan Dennedy.
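
One way to stay clear of this on the C++ side is to emit an explicit #define, with value 0 or 1, for every flag the shader tests, so that #if FOO never sees an undefined name. The sketch below illustrates that idea only; `make_glsl_defines` and the flag names are hypothetical, and the commit may well have fixed it differently (for instance by using #ifdef in the shaders).

```cpp
#include <map>
#include <string>

// Build a block of "#define NAME 0|1" lines to prepend to shader source,
// one line per known flag, so every flag is always defined.
inline std::string make_glsl_defines(const std::map<std::string, bool> &flags)
{
    std::string out;
    for (const auto &kv : flags) {
        out += "#define " + kv.first + (kv.second ? " 1\n" : " 0\n");
    }
    return out;
}
```

Since std::map iterates in key order, the generated preamble is also the same from run to run.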

Steinar H. Gunderson [Mon, 5 Oct 2015 23:24:26 +0000 (01:24 +0200)]

Make get_current_context_identifier() understand EGL.

If we're using EGL and not GLX (typically because we're using GLES,
but also increasingly with desktop GL), we'd always return NULL.
This could cause FBOs to be confused between contexts.

Steinar H. Gunderson [Mon, 5 Oct 2015 22:06:14 +0000 (00:06 +0200)]

Disable dither explicitly per frame; fixes some weird artifacts I found.

Steinar H. Gunderson [Mon, 5 Oct 2015 21:08:35 +0000 (23:08 +0200)]

Call init_lanczos_table() once instead of checking for it all the time.

Steinar H. Gunderson [Mon, 5 Oct 2015 20:43:59 +0000 (22:43 +0200)]

Get rid of a bunch of STL inefficiencies in FBO freelist handling.

Steinar H. Gunderson [Mon, 5 Oct 2015 19:04:29 +0000 (21:04 +0200)]

Support GL_RGB565 targets.

Steinar H. Gunderson [Mon, 5 Oct 2015 18:49:10 +0000 (20:49 +0200)]

Bump version number after support for external OpenGL textures (it was forgotten).

Steinar H. Gunderson [Mon, 5 Oct 2015 17:49:51 +0000 (19:49 +0200)]

Make FlatInput and YCbCrInput support taking in external OpenGL textures.

Steinar H. Gunderson [Sun, 4 Oct 2015 00:44:13 +0000 (02:44 +0200)]

Unbreak make install after the last changes.

Steinar H. Gunderson [Sun, 4 Oct 2015 00:43:02 +0000 (02:43 +0200)]

Some small cleanups after we got rid of GLSL 1.10; we can now unify 1.30 and ES 3.00 in some places.

Steinar H. Gunderson [Sun, 4 Oct 2015 00:37:56 +0000 (02:37 +0200)]

Allow dual Y'CbCr/RGBA outputs.

The intended use case is to have Y'CbCr for encoding output but keep
RGBA around for easier preview. This causes a few effects to need to
send arrays around; it's a bit ugly to special-case them like this,
but I'm concerned about how well various shader compilers would
optimize if we went fully generic (multi-model) everywhere
(though I haven't tested this).

ABI break due to changed EffectChain size.

Steinar H. Gunderson [Sun, 4 Oct 2015 00:05:33 +0000 (02:05 +0200)]

Remove support for GLSL 1.10.

In practice, we haven't _actually_ supported this since we used integers
in ResampleEffect (and ResampleEffect is a pretty central effect),
so let's be honest with ourselves. (Also, we will soon start using arrays
in some cases, which are cumbersome pre-1.30.) I don't know of any drivers
that support all the other stuff we want but not GLSL 1.30 anyway;
it came with OpenGL 3.0, in 2008.

This actually isn't an ABI break, at least not on the C++ level.

Steinar H. Gunderson [Fri, 25 Sep 2015 23:33:29 +0000 (01:33 +0200)]

In ycbcr_conversion_effect_test, use a non-float framebuffer.

This way, we let the card convert float-to-int, which we have reasonable
control over, as opposed to glReadPixels(), which is rather unpredictable.
Fixes unit test failures on Broadwell on Linux (Mesa 10.1).

Steinar H. Gunderson [Fri, 25 Sep 2015 23:29:40 +0000 (01:29 +0200)]

Fix a buffer overflow in ycbcr_conversion_effect_test.

Steinar H. Gunderson [Thu, 24 Sep 2015 16:44:01 +0000 (18:44 +0200)]

Release Movit 1.2.0.

Steinar H. Gunderson [Thu, 24 Sep 2015 00:12:40 +0000 (02:12 +0200)]

In ResampleEffect, precompute the Lanczos function into a table.

A 2048-element table (with linear interpolation between the elements)
is seemingly enough to get down to beyond float epsilon, and this
saves a lot of CPU time when computing large filter kernels.
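
A minimal sketch of the table approach, assuming a Lanczos radius of 3 and a table laid out over [0, radius]. The names, the table layout and the two padding entries are assumptions for illustration, not Movit's actual code.

```cpp
#include <cmath>
#include <vector>

namespace {
constexpr int kTableSize = 2048;  // number of intervals, as in the commit
constexpr double kRadius = 3.0;   // assumed Lanczos radius (a = 3)
constexpr double kPi = 3.14159265358979323846;
}  // namespace

// sin(pi x) / (pi x), with the removable singularity at 0 filled in.
inline double normalized_sinc(double x)
{
    if (std::fabs(x) < 1e-8) return 1.0;
    return std::sin(kPi * x) / (kPi * x);
}

// Direct evaluation: sinc(x) * sinc(x / a) inside the radius, 0 outside.
inline double lanczos_direct(double x)
{
    x = std::fabs(x);
    if (x >= kRadius) return 0.0;
    return normalized_sinc(x) * normalized_sinc(x / kRadius);
}

// Sample the function at kTableSize + 2 evenly spaced points; the extra
// entries let the lookup read index i + 1 without a bounds check.
inline std::vector<float> build_lanczos_table()
{
    std::vector<float> table(kTableSize + 2);
    for (int i = 0; i < kTableSize + 2; ++i) {
        table[i] = float(lanczos_direct(i * kRadius / kTableSize));
    }
    return table;
}

// Table lookup with linear interpolation between neighboring entries.
inline float lanczos_lookup(const std::vector<float> &table, float x)
{
    x = std::fabs(x);
    if (x >= float(kRadius)) return 0.0f;
    const float pos = x * float(kTableSize / kRadius);
    const int i = int(pos);
    const float frac = pos - float(i);
    return table[i] + frac * (table[i + 1] - table[i]);
}
```

Build the table once at startup and call `lanczos_lookup()` per filter tap; each weight then costs a multiply, two loads and an interpolation instead of two sin() calls and two divisions.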

Steinar H. Gunderson [Wed, 23 Sep 2015 23:59:47 +0000 (01:59 +0200)]

Fix a bug where combined fp16 weights would be horribly wrong.

Seemingly weights were always returned as float, and then cast
to fp16_int_t -- without proper conversion! And sum_sq_error
would be calculated based on the correct value, not the
incorrectly cast one.

It's a small miracle the unit tests didn't catch this; they didn't
until I started introducing small errors for another reason.
Most real-world testing seems to have hit fp32, and thus this
wasn't caught there either.

Also make fp16_int_t a struct so that it is not implicitly
convertible to/from numeric types, so this never ever can happen again.
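
The effect of the struct wrapper can be checked at compile time. In the sketch below, the field name `val` and the helper `make_fp16_bits` are assumptions for illustration; the point is only that a plain aggregate has no implicit conversions to or from numeric types.

```cpp
#include <cstdint>
#include <type_traits>

// Raw fp16 bits wrapped in a struct. With a bare typedef to an integer
// type, "fp16_int_t h = some_float;" would compile and silently truncate.
struct fp16_int_t {
    uint16_t val;
};

static_assert(!std::is_convertible<float, fp16_int_t>::value,
              "float must not convert implicitly to fp16_int_t");
static_assert(!std::is_convertible<int, fp16_int_t>::value,
              "int must not convert implicitly to fp16_int_t");
static_assert(!std::is_convertible<fp16_int_t, float>::value,
              "fp16_int_t must not convert implicitly to float");

// Bits have to be supplied deliberately.
inline fp16_int_t make_fp16_bits(uint16_t bits)
{
    fp16_int_t h;
    h.val = bits;
    return h;
}
```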

A library for high-quality, high-performance video filters.