git.sesse.net Git - movit/log
9 years ago: Drop setting the locale altogether.
Steinar H. Gunderson [Sat, 7 Mar 2015 01:01:45 +0000 (02:01 +0100)]
Drop setting the locale altogether.

Trying to use sprintf and floats right in a portable manner is seemingly
impossible (MinGW doesn't support the per-thread locale stuff), so simply
do it a different way; stop sprintf-ing floats and use std::stringstream
instead. I dislike the iostream interface a lot, but it can do per-stream
locales, which is exactly what we want here.
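
A minimal sketch of the per-stream locale approach described above. The helper name format_float_c_locale is hypothetical and not taken from Movit; it only illustrates imbuing a single stream with the classic "C" locale so that the decimal separator does not depend on the process-wide setting.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not from Movit): format a float with '.' as the
// decimal separator, regardless of the process-wide locale.
std::string format_float_c_locale(float value)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());  // "C" locale, for this stream only.
    out << value;
    return out.str();
}
```

Since the locale is attached to the stream object, no global or per-thread locale state has to be touched or restored.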

9 years ago: Fix build on OS X and MinGW.
Dan Dennedy [Thu, 5 Mar 2015 07:41:39 +0000 (23:41 -0800)]
Fix build on OS X and MinGW.

OS X requires the xlocale.h header to define locale_t:
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/newlocale.3.html

MinGW does not include implementations for newlocale() and uselocale().
Instead, use the previous approach using setlocale().

9 years ago: Use thread-local locale.
Steinar H. Gunderson [Tue, 3 Mar 2015 22:03:54 +0000 (23:03 +0100)]
Use thread-local locale.

setlocale() affects the whole process, not just the current thread
as I assumed; uselocale() (available since glibc 2.3, so basically
forever) is per-thread, and also conveniently seems to avoid the
issue of the returned pointer being destroyed (unless the driver
uses the return value of uselocale() as a base, which I really hope
it doesn't).

I'm slightly worried that since this overrides setlocale(), buggy drivers
might get confused when they try to do setlocale() and something else
overrides that precedence, but hopefully this shouldn't be the case.

Also add a unit test for locale handling while we're at it. It doesn't
test multi-threaded behavior, though, only the simple case.

Reported by Christophe Thommeret.

9 years ago: In ResampleEffect, ignore near-zero weights when combining.
Steinar H. Gunderson [Mon, 23 Feb 2015 19:41:45 +0000 (20:41 +0100)]
In ResampleEffect, ignore near-zero weights when combining.

9 years ago: Use the F16C instruction set when available.
Steinar H. Gunderson [Mon, 23 Feb 2015 19:17:49 +0000 (20:17 +0100)]
Use the F16C instruction set when available.

For most users, this is mostly theoretical, as it requires compiling
with -march=native or similar. And these are definitely meant for
vectorizing, although it's still 2-3x as fast to use them as our own
software fallback.

These are supported starting from Haswell, and also by some AMD CPUs.

9 years ago: Revert the optimization of the bilinear weights.
Steinar H. Gunderson [Mon, 23 Feb 2015 00:19:03 +0000 (01:19 +0100)]
Revert the optimization of the bilinear weights.

For the case where the resampling changed every frame (e.g. a zoom),
it just consumed too much CPU to be worth it, especially in memory
management; this is painful because it was an elegant solution to
a tricky problem, but it just has to go for now.

Also drop out to fp32 at the first sight of too-high error.

9 years ago: Update a comment that wasn't really wrong, but less relevant in this context.
Steinar H. Gunderson [Sun, 22 Feb 2015 23:42:24 +0000 (00:42 +0100)]
Update a comment that wasn't really wrong, but less relevant in this context.

9 years ago: Bring the variable names in optimize_sum_sq_error() closer to the comments.
Steinar H. Gunderson [Sun, 22 Feb 2015 23:30:35 +0000 (00:30 +0100)]
Bring the variable names in optimize_sum_sq_error() closer to the comments.

9 years ago: In ResampleEffect, optimize the bilinear weights on a global scale.
Steinar H. Gunderson [Sun, 22 Feb 2015 23:20:49 +0000 (00:20 +0100)]
In ResampleEffect, optimize the bilinear weights on a global scale.

In addition to the individual weight optimization we do when combining samples,
this technique optimizes the weights as a whole, through some linear algebra.
This means it can take into account effects such as multiple bilinear samples
influencing the same coefficient (which normally should not happen, but might
nevertheless due to imprecisions in the stored texture coordinates), or
non-combined sample positions that can't hit the exact middle of the texel.

In practical tests, this is extremely effective; it often reduces the computed
sum of squared coefficient errors by as much as a factor of 1000, although I
haven't verified how often it actually saves us from having to do fp32 fallback
with the rather tight error bounds that are in place.
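
For context, the per-pair combining that the global optimization builds on is commonly done as sketched below: two neighboring taps are replaced by one bilinear texture fetch placed between them. This is the generic technique, not Movit's actual code; the names BilinearSample and combine_two_taps are hypothetical, and the sketch assumes the two weights do not sum to zero.

```cpp
// One bilinear fetch: a sample position and the weight applied to the result.
struct BilinearSample {
    float pos;
    float weight;
};

// Replace two neighboring taps (pos1 with weight w1, pos2 with weight w2)
// by a single bilinear sample. The hardware returns
//   (1 - frac) * texel1 + frac * texel2
// at the chosen position, so multiplying by (w1 + w2) gives
//   w1 * texel1 + w2 * texel2.
// Assumes w1 + w2 != 0.
BilinearSample combine_two_taps(float pos1, float w1, float pos2, float w2)
{
    float total = w1 + w2;
    float frac = w2 / total;  // How far to move from pos1 toward pos2.
    return { pos1 + frac * (pos2 - pos1), total };
}
```

The stored position and weight are then rounded to the texture's precision, which is where the coefficient errors discussed above come from.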

9 years ago: Make ResampleEffect fall back to fp32 as needed.
Steinar H. Gunderson [Sat, 21 Feb 2015 17:54:56 +0000 (18:54 +0100)]
Make ResampleEffect fall back to fp32 as needed.

This should kill all precision issues when zooming. There are still
a few tricks we can do to improve fp16, but that's primarily a
performance issue.

9 years ago: Make combine_two_samples() into a template instead of having manual rounding checks.
Steinar H. Gunderson [Sat, 21 Feb 2015 14:52:54 +0000 (15:52 +0100)]
Make combine_two_samples() into a template instead of having manual rounding checks.

9 years ago: Fix combining in ResampleEffect again.
Steinar H. Gunderson [Sat, 21 Feb 2015 14:33:33 +0000 (15:33 +0100)]
Fix combining in ResampleEffect again.

It was completely broken after the last patch, so we'd effectively
never combine.

9 years ago: Add some fp16 conversion overloads, for making code that can be templatized across fp16 and fp32.
Steinar H. Gunderson [Sat, 21 Feb 2015 14:26:57 +0000 (15:26 +0100)]
Add some fp16 conversion overloads, for making code that can be templatized across fp16 and fp32.

9 years ago: When combining samples, take fp16 rounding into account.
Steinar H. Gunderson [Sat, 21 Feb 2015 01:27:14 +0000 (02:27 +0100)]
When combining samples, take fp16 rounding into account.

This makes us somewhat more conservative in combining samples;
when we are near the lower/right edges of the image, we are starting
to get close to 1.0, and fp16 just doesn't have enough precision
to give us the 6 or 8 bits of subpixel precision we want (it is
hardly enough to address individual pixels!). In particular, this
can affect zooming with ResampleEffect, as reported by Christophe
Thommeret.

This does not fix all cases (especially not non-power-of-two cases);
for that, we will probably need to be able to fall back to fp32
when we detect fp16 doesn't work well.
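
To make the precision argument concrete, here is a small sketch that computes the spacing between adjacent fp16 values and the resulting subpixel precision. It assumes normalized texture coordinates in [0, 1] and fp16's 10 explicit mantissa bits; the function names are hypothetical and only normal (non-denormal) values are handled.

```cpp
#include <cmath>

// Spacing between adjacent fp16 values around x (normal range only).
// fp16 has 10 explicit mantissa bits, so values in [0.5, 1.0) are
// spaced 2^-11 apart.
double fp16_spacing(double x)
{
    int exponent;
    std::frexp(x, &exponent);  // x = m * 2^exponent, with 0.5 <= |m| < 1.
    return std::ldexp(1.0, exponent - 11);
}

// Bits of subpixel precision available at normalized coordinate x
// in a texture that is `width` texels wide.
double subpixel_bits(double x, int width)
{
    double texel = 1.0 / width;
    return std::log2(texel / fp16_spacing(x));
}
```

With these definitions, a coordinate of 0.75 in a 1024-texel-wide texture gets one bit of subpixel precision, and in a 2048-texel-wide texture it gets none, far short of the 6 or 8 bits mentioned above.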

9 years ago: In ResampleEffect, use a struct instead of manually fiddling with the two elements ourselves.
Steinar H. Gunderson [Fri, 20 Feb 2015 22:24:55 +0000 (23:24 +0100)]
In ResampleEffect, use a struct instead of manually fiddling with the two elements ourselves.

9 years ago: Check for __APPLE__ instead of __DARWIN__.
Steinar H. Gunderson [Thu, 15 Jan 2015 21:47:29 +0000 (22:47 +0100)]
Check for __APPLE__ instead of __DARWIN__.

Fixes compile with recent epoxy. Bug report and suggestion
by Dan Dennedy.

9 years ago: Make number of BlurEffect taps configurable.
Steinar H. Gunderson [Mon, 22 Dec 2014 15:34:55 +0000 (16:34 +0100)]
Make number of BlurEffect taps configurable.

This can be useful if you are using blur as part of a larger effect
chain, where artifacts get masked by further processing.

Request and initial patch by Christophe Thommeret, although the patch
was redone from scratch.

10 years ago: Fix some typos that would cause the sampler number not to be incremented.
Steinar H. Gunderson [Thu, 16 Oct 2014 20:07:29 +0000 (22:07 +0200)]
Fix some typos that would cause the sampler number not to be incremented.

Found by Christophe Thommeret, who also noticed these are most likely
harmless since both effects with the bug are typically last in their
chain.

10 years ago: Release Movit 1.1.2. (ref: 1.1.2)
Steinar H. Gunderson [Tue, 12 Aug 2014 21:02:03 +0000 (23:02 +0200)]
Release Movit 1.1.2.

10 years ago: Correct the number of blur taps read.
Steinar H. Gunderson [Sat, 26 Jul 2014 23:17:12 +0000 (01:17 +0200)]
Correct the number of blur taps read.

We read about twice as many as we should have; the others were
probably just set to 0.0, which has no effect but still burns
arithmetic, unless your driver happens to optimize very aggressively
for this (which I don't think anyone does anymore).

Found by Christophe Thommeret.

10 years ago: Fix a typo in a comment.
Steinar H. Gunderson [Mon, 21 Jul 2014 09:40:12 +0000 (11:40 +0200)]
Fix a typo in a comment.

10 years ago: When the texture freelist is too large, cut from the back, not the front.
Steinar H. Gunderson [Tue, 17 Jun 2014 19:54:28 +0000 (21:54 +0200)]
When the texture freelist is too large, cut from the back, not the front.

All the other freelists had this right, but the texture freelist would
start pruning the _newest_ entries, which obviously gave poor performance.

Patch by Christophe Thommeret.
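
A minimal sketch of the pruning order described above (not the actual patch). It assumes that the most recently released texture sits at the front of the freelist; the class and member names are hypothetical.

```cpp
#include <cstddef>
#include <list>

// Hypothetical freelist of texture IDs. The most recently released
// texture is kept at the front; the oldest entry is at the back.
class TextureFreelist {
public:
    explicit TextureFreelist(std::size_t max_size) : max_size_(max_size) {}

    void release(unsigned texture_id)
    {
        entries_.push_front(texture_id);
        while (entries_.size() > max_size_) {
            // Prune the oldest entry. A real pool would also delete
            // the GL texture at this point.
            entries_.pop_back();
        }
    }

    const std::list<unsigned> &entries() const { return entries_; }

private:
    std::size_t max_size_;
    std::list<unsigned> entries_;
};
```

Cutting from the back keeps the entries most likely to be reused soon; cutting from the front would throw away exactly those.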

10 years ago: Do not export inlines from the shared library by default. Reduces the number of exports somewhat, and helps code generation a tiny bit.
Steinar H. Gunderson [Thu, 8 May 2014 04:17:50 +0000 (21:17 -0700)]
Do not export inlines from the shared library by default. Reduces the number of exports somewhat, and helps code generation a tiny bit.

10 years ago: Release Movit 1.1.1. (ref: 1.1.1)
Steinar H. Gunderson [Sat, 12 Apr 2014 12:24:52 +0000 (14:24 +0200)]
Release Movit 1.1.1.

10 years ago: Fix an issue where we could take an FBO off a freelist but not properly clean fbo_formats.
Steinar H. Gunderson [Sat, 12 Apr 2014 00:25:13 +0000 (02:25 +0200)]
Fix an issue where we could take an FBO off a freelist but not properly clean fbo_formats.

10 years ago: Release Movit 1.1. (ref: 1.1)
Steinar H. Gunderson [Wed, 9 Apr 2014 22:21:28 +0000 (00:21 +0200)]
Release Movit 1.1.

10 years ago: use Requires for the libs movit depends on
Dan Horák [Wed, 9 Apr 2014 12:26:36 +0000 (14:26 +0200)]
use Requires for the libs movit depends on

10 years ago: Properly restore the LC_NUMERIC locale after finalizing.
Steinar H. Gunderson [Sun, 6 Apr 2014 21:58:54 +0000 (23:58 +0200)]
Properly restore the LC_NUMERIC locale after finalizing.

There were two issues here:

 1. setlocale(LC_NUMERIC, "C") always returns C, not the previous
    locale.
 2. The return value of setlocale() may point into static storage,
    which may be corrupted when we call into libGL, if e.g.
    the shader compiler calls setlocale() on its own.

Patch by Jean-Baptiste Mardelle.
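
The two issues above can be illustrated with a short sketch (not the actual patch). The helper name with_c_numeric_locale is hypothetical; only standard setlocale() behavior is assumed.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: run fn() with LC_NUMERIC set to "C", then restore
// whatever was active before.
template <typename Fn>
void with_c_numeric_locale(Fn fn)
{
    // Issue 1: query with a null pointer to get the current locale;
    // setlocale(LC_NUMERIC, "C") would return the new locale ("C").
    // Issue 2: copy the result right away, since the returned pointer
    // may refer to static storage that a later setlocale() overwrites.
    const std::string saved = std::setlocale(LC_NUMERIC, nullptr);

    std::setlocale(LC_NUMERIC, "C");
    fn();
    std::setlocale(LC_NUMERIC, saved.c_str());
}
```

Note that this still changes the locale for the whole process while fn() runs, so it is only a sketch of the save/restore step.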

10 years ago: Fix a leak in DiffusionEffect in an edge case.
Steinar H. Gunderson [Sat, 5 Apr 2014 00:16:14 +0000 (02:16 +0200)]
Fix a leak in DiffusionEffect in an edge case.

Found by Coverity Scan.

10 years ago: Rewrite extension checking.
Steinar H. Gunderson [Thu, 3 Apr 2014 21:11:30 +0000 (23:11 +0200)]
Rewrite extension checking.

Two big changes:

 1. If you're missing some functionality, Movit will now tell you
    on stderr what you're missing. (We might suppress this later
    if it turns out that people want to init_movit() but are actually
    fine with it failing.)
 2. Use a table instead of repeated if-then logic, since this started
    to become a bit messy after we added OpenGL-version-equivalence
    checks.
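
A table-driven check along these lines might look roughly as follows. This is only a sketch; the struct layout, field names and message text are assumptions, and the real table contents are not shown in this log.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical table row: a feature counts as available if either the
// extension is present or the GL version makes it core functionality.
struct Requirement {
    const char *description;
    bool has_extension;
    bool core_version_sufficient;
};

// Walks the table, reports every missing feature on stderr, and
// returns the descriptions of the missing ones.
std::vector<std::string> missing_requirements(const std::vector<Requirement> &table)
{
    std::vector<std::string> missing;
    for (const Requirement &req : table) {
        if (!req.has_extension && !req.core_version_sufficient) {
            std::fprintf(stderr, "Missing functionality: %s\n", req.description);
            missing.push_back(req.description);
        }
    }
    return missing;
}
```

Adding a new requirement then means adding one row rather than another if-then branch.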

10 years ago: Loosen up the 0.499 vs. 0.501 subpixel resample test.
Steinar H. Gunderson [Thu, 3 Apr 2014 20:54:13 +0000 (22:54 +0200)]
Loosen up the 0.499 vs. 0.501 subpixel resample test.

Seemingly these limits were a bit too tight for something that's
actually supposed to be approximate.

10 years ago: Re-add resample kernel normalization, which was broken by accident.
Steinar H. Gunderson [Thu, 3 Apr 2014 20:40:23 +0000 (22:40 +0200)]
Re-add resample kernel normalization, which was broken by accident.

10 years ago: Fix a bug where having two DeconvolutionSharpenEffects in one chain would cause shader compile errors.
Steinar H. Gunderson [Thu, 3 Apr 2014 00:05:35 +0000 (02:05 +0200)]
Fix a bug where having two DeconvolutionSharpenEffects in one chain would cause shader compile errors.

10 years ago: Add zooming to ResampleEffect.
Steinar H. Gunderson [Tue, 1 Apr 2014 00:21:00 +0000 (02:21 +0200)]
Add zooming to ResampleEffect.

Same rationale as with the offset; we need resampling for proper zoom.

The look at heavy zoom isn't _quite_ what I had hoped for (although it's OK),
and there's a hint of shimmering in the zoom center if there's high-contrast
material there. For now, I'll write off the latter as Lanczos ringing;
I'll need to see what it does to video eventually (only tested with stills).

10 years ago: Fix a bug when scaling and doing offset at the same time. (At least one more remains.)
Steinar H. Gunderson [Sun, 30 Mar 2014 17:12:22 +0000 (19:12 +0200)]
Fix a bug when scaling and doing offset at the same time. (At least one more remains.)

10 years ago: Add support for offsets in ResampleEffect.
Steinar H. Gunderson [Sat, 29 Mar 2014 23:33:52 +0000 (00:33 +0100)]
Add support for offsets in ResampleEffect.

This enables smooth (subpixel) panning that people frequently want for stills
and titles, but that you couldn't do in a subpixel fashion before (PaddingEffect
could only do integer pixel offsets).

The placement (ResampleEffect) might seem a bit off at first, but subpixel
offset needs resampling, and ResampleEffect already has all the logic in place
for that. We could have used the GPU's built-in bilinear resampling, of course,
but it doesn't look all that good for high-contrast situations (although working
in linear light should help some).

10 years ago: Add some asserts.
Steinar H. Gunderson [Sat, 29 Mar 2014 22:47:08 +0000 (23:47 +0100)]
Add some asserts.

10 years ago: Merge branch 'epoxy'
Steinar H. Gunderson [Fri, 28 Mar 2014 20:15:05 +0000 (21:15 +0100)]
Merge branch 'epoxy'

Conflicts:
effect_chain.cpp
resource_pool.cpp

10 years ago: Fix typo in luma_mix shader.
Dan Dennedy [Thu, 27 Mar 2014 03:17:12 +0000 (20:17 -0700)]
Fix typo in luma_mix shader.

10 years ago: Add an inverse flag to LumaMixEffect.
Steinar H. Gunderson [Thu, 27 Mar 2014 00:38:43 +0000 (01:38 +0100)]
Add an inverse flag to LumaMixEffect.

This is mainly a convenience so that you can change e.g. a left-to-right
wipe into a right-to-left wipe without having to add a separate inverting
effect to the luma. Suggested by Dan Dennedy.
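
Conceptually, the flag amounts to flipping the luma map before it is used, as in this sketch. The function name is hypothetical and the actual shader may implement the inversion differently; luma is assumed to lie in [0, 1].

```cpp
// Hypothetical helper: with inverse set, a left-to-right wipe map
// behaves like a right-to-left one.
float effective_luma(float luma, bool inverse)
{
    return inverse ? 1.0f - luma : luma;
}
```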

10 years ago: Make the ResourcePool hold FBOs as a per-context resource.
Steinar H. Gunderson [Wed, 26 Mar 2014 00:02:20 +0000 (01:02 +0100)]
Make the ResourcePool hold FBOs as a per-context resource.

This is an attempt to get out of the FBO sharability mess (unfortunately
we can't just stop having persistent FBOs, due to NVidia performance).
We now require the client to tell us whenever a context is going away,
and we try to be more careful about not deleting them in the wrong context.

Also, we assumed FBO names were globally unique, which isn't necessarily
true, so re-key them.

For good measure, we were deleting FBOs off the freelist from the front,
not the back as we should have -- fixed.
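
The re-keying can be pictured as follows. This is a sketch under stated assumptions, not ResourcePool's actual data structure: ContextId stands in for whatever handle identifies a GL context, and FboRegistry and its members are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Stand-in for a GL context handle (hypothetical).
using ContextId = int;

// FBO names are only unique within a context, so the key is the
// (context, FBO name) pair rather than the name alone.
class FboRegistry {
public:
    void add(ContextId context, unsigned fbo_name, int format)
    {
        formats_[std::make_pair(context, fbo_name)] = format;
    }

    bool contains(ContextId context, unsigned fbo_name) const
    {
        return formats_.count(std::make_pair(context, fbo_name)) != 0;
    }

    // To be called when the client tells us a context is going away.
    // Returns the number of entries dropped.
    std::size_t forget_context(ContextId context)
    {
        std::size_t dropped = 0;
        for (auto it = formats_.begin(); it != formats_.end(); ) {
            if (it->first.first == context) {
                it = formats_.erase(it);
                ++dropped;
            } else {
                ++it;
            }
        }
        return dropped;
    }

private:
    std::map<std::pair<ContextId, unsigned>, int> formats_;
};
```

With this keying, the same FBO name in two different contexts no longer collides, and cleanup for one context leaves the other untouched.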

10 years ago: Hack around FBO/VAO sharability issues.
Steinar H. Gunderson [Tue, 25 Mar 2014 01:15:16 +0000 (02:15 +0100)]
Hack around FBO/VAO sharability issues.

We have a problem when trying to delete an EffectChain or ResourcePool;
we might have created FBOs or VAOs in the wrong context. Work around it
for now (unbreaking Kdenlive) by making VAOs non-persistent again,
and simply never deleting FBOs (leaking them).

A proper solution here will be hard, unfortunately, and will need some thought.

10 years ago: Add proper formats for RGB without alpha.
Steinar H. Gunderson [Tue, 25 Mar 2014 00:21:26 +0000 (01:21 +0100)]
Add proper formats for RGB without alpha.

10 years ago: Add proper formats for sRGB without alpha. (ref: epoxy)
Steinar H. Gunderson [Mon, 24 Mar 2014 23:45:08 +0000 (00:45 +0100)]
Add proper formats for sRGB without alpha.

10 years ago: Fix a typo in the make install target.
Steinar H. Gunderson [Mon, 24 Mar 2014 22:52:15 +0000 (23:52 +0100)]
Fix a typo in the make install target.

10 years ago: Merge branch 'epoxy' of ssh://pannekake.samfundet.no/srv/git.sesse.net/www/movit into epoxy
Steinar H. Gunderson [Sun, 23 Mar 2014 11:41:46 +0000 (12:41 +0100)]
Merge branch 'epoxy' of ssh://pannekake.samfundet.no/srv/git.sesse.net/www/movit into epoxy

10 years ago: We switched to #version 300 es shaders for GLES a while back, so we now have round().
Steinar H. Gunderson [Sun, 23 Mar 2014 11:38:17 +0000 (12:38 +0100)]
We switched to #version 300 es shaders for GLES a while back, so we now have round().

10 years ago: Add some skeleton code for using GL_ARB_debug_output (disabled by default).
Steinar H. Gunderson [Sun, 23 Mar 2014 11:17:42 +0000 (12:17 +0100)]
Add some skeleton code for using GL_ARB_debug_output (disabled by default).

10 years ago: Merge branch 'master'
Steinar H. Gunderson [Sun, 23 Mar 2014 01:35:22 +0000 (02:35 +0100)]
Merge branch 'master'

10 years ago: Improve macro hygiene in .frag files slightly.
Steinar H. Gunderson [Sun, 23 Mar 2014 01:27:39 +0000 (02:27 +0100)]
Improve macro hygiene in .frag files slightly.

10 years ago: Fix a small overallocation.
Steinar H. Gunderson [Sat, 22 Mar 2014 22:25:14 +0000 (23:25 +0100)]
Fix a small overallocation.

10 years ago: Cache the FFT support texture.
Steinar H. Gunderson [Sat, 22 Mar 2014 16:18:12 +0000 (17:18 +0100)]
Cache the FFT support texture.

Regenerating it every time is a waste of CPU, and also of GL
state changes.

10 years ago: Use a smaller support texture for the FFT.
Steinar H. Gunderson [Sat, 22 Mar 2014 16:09:36 +0000 (17:09 +0100)]
Use a smaller support texture for the FFT.

Many of the rows in the support texture are exactly the same,
so don't store the duplicates; gives a small performance boost.
In a sense, this is exactly the same property that GPUwave uses
with drawing multiple quads at the lower level.

10 years ago: Fix a tiny leak (that would cause an assertion failure on exit).
Steinar H. Gunderson [Sat, 22 Mar 2014 15:18:12 +0000 (16:18 +0100)]
Fix a tiny leak (that would cause an assertion failure on exit).

10 years ago: Stop the FFTPassEffect Repeat test after FFT size 128.
Steinar H. Gunderson [Sat, 22 Mar 2014 15:12:47 +0000 (16:12 +0100)]
Stop the FFTPassEffect Repeat test after FFT size 128.

The reason is that the 256 test uses texture sizes of 256*31=7936,
and above ~3900, some cards (at least both my Intel and NVidia card)
start having accuracy issues on some sizes. The test happens not to
die on this for semi-obscure reasons, but that's mostly by accident,
and in any case, requiring 8k textures for a unit test might be
a bit on the upper side.

10 years ago: Merge branch 'master' into epoxy
Steinar H. Gunderson [Sat, 22 Mar 2014 14:53:11 +0000 (15:53 +0100)]
Merge branch 'master' into epoxy

10 years ago: Factor out the actual phase execution into a function.
Steinar H. Gunderson [Sat, 22 Mar 2014 14:49:09 +0000 (15:49 +0100)]
Factor out the actual phase execution into a function.

10 years ago: Factor out RTT sampler setting in its own function.
Steinar H. Gunderson [Sat, 22 Mar 2014 14:34:07 +0000 (15:34 +0100)]
Factor out RTT sampler setting in its own function.

10 years ago: Merge branch 'master' into epoxy
Steinar H. Gunderson [Sat, 22 Mar 2014 13:57:53 +0000 (14:57 +0100)]
Merge branch 'master' into epoxy

Conflicts:
flat_input.cpp

10 years ago: Redo FBO association yet again, this time per-texture.
Steinar H. Gunderson [Sat, 22 Mar 2014 13:54:43 +0000 (14:54 +0100)]
Redo FBO association yet again, this time per-texture.

According to http://adrienb.fr/blog/wp-content/uploads/2013/04/PortingSourceToLinux.pdf,
you want an FBO per-texture, not just format. And indeed, I can measure a very slight
performance improvement on both NVidia and ATI for this.

10 years ago: Have separate FBOs per resolution and format.
Steinar H. Gunderson [Fri, 21 Mar 2014 23:29:21 +0000 (00:29 +0100)]
Have separate FBOs per resolution and format.

Seemingly this _also_ costs on NVidia; the demo app is down 0.9 ms/frame or so.
This rapidly started approaching complexity worthy of the ResourcePool,
so I moved the functionality in there even though it's not context-shareable.

10 years ago: Remove obsolete comment.
Steinar H. Gunderson [Fri, 21 Mar 2014 22:42:08 +0000 (23:42 +0100)]
Remove obsolete comment.

10 years ago: Fix a buffer overflow in MixEffectTest.
Steinar H. Gunderson [Thu, 20 Mar 2014 22:48:46 +0000 (23:48 +0100)]
Fix a buffer overflow in MixEffectTest.

10 years ago: Convert another glReadPixels() to RGBA.
Steinar H. Gunderson [Thu, 20 Mar 2014 22:15:23 +0000 (23:15 +0100)]
Convert another glReadPixels() to RGBA.

10 years ago: Properly ignore the sign bit when comparing NaNs.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:59:59 +0000 (22:59 +0100)]
Properly ignore the sign bit when comparing NaNs.

Fixes fp16_test test failure on Clang.
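
A sketch of a NaN-aware comparison on raw fp16 bit patterns, assuming the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits). The function names are hypothetical and this is not the code from fp16_test.

```cpp
#include <cstdint>

// NaN in binary16: all exponent bits set, mantissa nonzero.
bool fp16_is_nan(std::uint16_t bits)
{
    return (bits & 0x7c00) == 0x7c00 && (bits & 0x03ff) != 0;
}

// Compare two fp16 bit patterns for test purposes. If both are NaN,
// the sign bit is masked off before comparing; otherwise the bits
// must match exactly.
bool fp16_bits_equivalent(std::uint16_t a, std::uint16_t b)
{
    if (fp16_is_nan(a) && fp16_is_nan(b)) {
        return (a & 0x7fff) == (b & 0x7fff);
    }
    return a == b;
}
```

The sign of a NaN carries no numerical meaning, and different compilers or conversion paths may produce either sign, so a bit-exact comparison can fail spuriously.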

10 years ago: Ditch BGRA use in OverlayEffectTest.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:48:32 +0000 (22:48 +0100)]
Ditch BGRA use in OverlayEffectTest.

10 years ago: Fix non-float framebuffers in EffectChainTester.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:46:32 +0000 (21:46 +0100)]
Fix non-float framebuffers in EffectChainTester.

Again, GLES fix.

10 years ago: Remove unused private members from FFTConvolutionEffect.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:27:37 +0000 (21:27 +0100)]
Remove unused private members from FFTConvolutionEffect.

10 years ago: Use the right internal format for FORMAT_R non-float textures.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:15:13 +0000 (21:15 +0100)]
Use the right internal format for FORMAT_R non-float textures.

10 years ago: Add a few check_error() calls.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:14:22 +0000 (21:14 +0100)]
Add a few check_error() calls.

10 years ago: Don't use GL_RGBA32F/GL_RGBA16F with GL_UNSIGNED_BYTE.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:12:31 +0000 (21:12 +0100)]
Don't use GL_RGBA32F/GL_RGBA16F with GL_UNSIGNED_BYTE.

GLES doesn't like this, even with NULL data. Replace with GL_FLOAT.

10 years ago: Document that we can now run on core and ES contexts.
Steinar H. Gunderson [Fri, 21 Mar 2014 01:02:06 +0000 (02:02 +0100)]
Document that we can now run on core and ES contexts.

10 years ago: Add support for multiple shader models.
Steinar H. Gunderson [Fri, 21 Mar 2014 00:32:42 +0000 (01:32 +0100)]
Add support for multiple shader models.

We support 1.10 (for OpenGL 2.1 cards), 1.30 (for OpenGL 3.2 core contexts),
and 3.00 ES (for GLES3). There's some code duplication, but thankfully
not a whole lot.

With this, we compile in core contexts without any warning from ATI's driver,
and should also in theory be GLES3 compliant (tested on NVidia's desktop driver).
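
As an illustration of the three shader models, a chain could pick the GLSL version directive like this. The enum and function names are hypothetical; only the version numbers come from the message above.

```cpp
#include <string>

// Hypothetical enumeration of the three shader models listed above.
enum class ShaderModel {
    GLSL_110,     // OpenGL 2.1 cards.
    GLSL_130,     // OpenGL 3.2 core contexts.
    GLSL_300_ES,  // GLES3.
};

// Returns the #version line to put at the top of a shader.
std::string version_directive(ShaderModel model)
{
    switch (model) {
    case ShaderModel::GLSL_110:
        return "#version 110\n";
    case ShaderModel::GLSL_130:
        return "#version 130\n";
    case ShaderModel::GLSL_300_ES:
        return "#version 300 es\n";
    }
    return "";  // Not reached for valid enum values.
}
```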

10 years ago: Check for GLES in init_movit().
Steinar H. Gunderson [Thu, 20 Mar 2014 23:01:48 +0000 (00:01 +0100)]
Check for GLES in init_movit().

This sort of worked by accident already, since 30 was interpreted
as desktop OpenGL 3.0, but this is more proper.
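
One way to tell the two apart is to look at the GL_VERSION string, which on OpenGL ES 2.0 and later starts with "OpenGL ES " followed by the version number. The sketch below parses such a string; it is not the code in init_movit(), the names are hypothetical, and the older "OpenGL ES-CM" prefix of ES 1.x is deliberately not handled.

```cpp
#include <cstdio>
#include <cstring>

struct GlVersion {
    bool is_gles;
    int major;
    int minor;
};

// Parses a GL_VERSION string such as "3.2.0 NVIDIA 331.49" or
// "OpenGL ES 3.0 Mesa". Returns false if no version number was found.
bool parse_gl_version(const char *version, GlVersion *out)
{
    static const char es_prefix[] = "OpenGL ES ";
    const std::size_t prefix_len = sizeof(es_prefix) - 1;

    out->is_gles = std::strncmp(version, es_prefix, prefix_len) == 0;
    if (out->is_gles) {
        version += prefix_len;  // Skip the prefix before the number.
    }
    return std::sscanf(version, "%d.%d", &out->major, &out->minor) == 2;
}
```

Keeping is_gles separate from the numeric version avoids treating ES 3.0 as if it were desktop OpenGL 3.0.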

10 years ago: Fix a buffer overflow in MixEffectTest.
Steinar H. Gunderson [Thu, 20 Mar 2014 22:48:46 +0000 (23:48 +0100)]
Fix a buffer overflow in MixEffectTest.

10 years ago: Make handling of non-RGBA sRGB textures more consistent.
Steinar H. Gunderson [Thu, 20 Mar 2014 22:35:59 +0000 (23:35 +0100)]
Make handling of non-RGBA sRGB textures more consistent.

Previously, we'd ask the driver to convert these to RGBA, which maybe
isn't ideal, and certainly doesn't work with GLES. Now we send in
the right format for RGB and RGBA, and refuse hardware conversions with
single-channel (which GLES doesn't accept). I don't think this is optimal,
but finding a use-case for sRGB single-channel is a bit tricky anyway,
and the fallback is fast, too.

10 years ago: Convert another glReadPixels() to RGBA.
Steinar H. Gunderson [Thu, 20 Mar 2014 22:15:23 +0000 (23:15 +0100)]
Convert another glReadPixels() to RGBA.

10 years ago: Fix a few signed/unsigned warnings.
Steinar H. Gunderson [Thu, 20 Mar 2014 22:10:24 +0000 (23:10 +0100)]
Fix a few signed/unsigned warnings.

10 years ago: Properly ignore the sign bit when comparing NaNs.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:59:59 +0000 (22:59 +0100)]
Properly ignore the sign bit when comparing NaNs.

Fixes fp16_test test failure on Clang.

10 years ago: Emulate glReadPixels of GL_ALPHA.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:51:48 +0000 (22:51 +0100)]
Emulate glReadPixels of GL_ALPHA.

10 years ago: Ditch BGRA use in OverlayEffectTest.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:48:32 +0000 (22:48 +0100)]
Ditch BGRA use in OverlayEffectTest.

10 years ago: Emulate glReadPixels of GL_BLUE.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:46:37 +0000 (22:46 +0100)]
Emulate glReadPixels of GL_BLUE.

10 years ago: Stop using BGR, BGRA and grayscale formats.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:42:53 +0000 (22:42 +0100)]
Stop using BGR, BGRA and grayscale formats.

None of these are properly supported by GLES3, so just pass in
more standard, boring formats, and then do the swizzles in the shader.

10 years ago: Do not store RGB textures with RGBA internal format.
Steinar H. Gunderson [Thu, 20 Mar 2014 21:05:33 +0000 (22:05 +0100)]
Do not store RGB textures with RGBA internal format.

Once again, GLES doesn't like this, even though the GPU probably does it
internally nevertheless.

10 years ago: Fix non-float framebuffers in EffectChainTester.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:46:32 +0000 (21:46 +0100)]
Fix non-float framebuffers in EffectChainTester.

Again, GLES fix.

10 years ago: Add some precision statements to make GLES slightly happier.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:45:57 +0000 (21:45 +0100)]
Add some precision statements to make GLES slightly happier.

10 years ago: Remove unused private members from FFTConvolutionEffect.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:27:37 +0000 (21:27 +0100)]
Remove unused private members from FFTConvolutionEffect.

10 years ago: Some GLES fixes in ResourcePool::create_2d_texture().
Steinar H. Gunderson [Thu, 20 Mar 2014 20:19:27 +0000 (21:19 +0100)]
Some GLES fixes in ResourcePool::create_2d_texture().

In particular, some of the type strictness needed to be fixed here too,
and there are some fixes for some of the more obscure formats.

10 years ago: Do not glReadPixels() with type GL_RED.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:16:49 +0000 (21:16 +0100)]
Do not glReadPixels() with type GL_RED.

GLES can only read RGBA pixels. Downconvert ourselves when we need to.
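
The downconversion can be done on the CPU after an RGBA readback, roughly as sketched here. The function name is hypothetical, and 8-bit interleaved RGBA data is assumed; the actual test code may differ.

```cpp
#include <cstddef>
#include <vector>

// Extracts one channel (0 = R, 1 = G, 2 = B, 3 = A) from interleaved
// 8-bit RGBA data, e.g. the result of reading back with GL_RGBA.
std::vector<unsigned char> extract_channel(const std::vector<unsigned char> &rgba,
                                           int channel)
{
    std::vector<unsigned char> out(rgba.size() / 4);
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = rgba[i * 4 + channel];
    }
    return out;
}
```

The same helper covers reads of a single red, blue or alpha channel by changing the channel index.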

10 years ago: Use the right internal format for FORMAT_R non-float textures.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:15:13 +0000 (21:15 +0100)]
Use the right internal format for FORMAT_R non-float textures.

10 years ago: Add a few check_error() calls.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:14:22 +0000 (21:14 +0100)]
Add a few check_error() calls.

10 years ago: Don't use GL_RGBA32F/GL_RGBA16F with GL_UNSIGNED_BYTE.
Steinar H. Gunderson [Thu, 20 Mar 2014 20:12:31 +0000 (21:12 +0100)]
Don't use GL_RGBA32F/GL_RGBA16F with GL_UNSIGNED_BYTE.

GLES doesn't like this, even with NULL data. Replace with GL_FLOAT.

10 years ago: Add a temporary variable to reduce the amount of tedious typing.
Steinar H. Gunderson [Wed, 19 Mar 2014 20:06:17 +0000 (21:06 +0100)]
Add a temporary variable to reduce the amount of tedious typing.

10 years ago: Fix a typo.
Steinar H. Gunderson [Wed, 19 Mar 2014 23:42:02 +0000 (00:42 +0100)]
Fix a typo.

10 years ago: Merge branch 'epoxy' into epoxy
Steinar H. Gunderson [Tue, 18 Mar 2014 23:21:54 +0000 (00:21 +0100)]
Merge branch 'epoxy' into epoxy

Conflicts:
Makefile.in
movit.pc.in

10 years ago: Merge branch 'master' into epoxy
Steinar H. Gunderson [Tue, 18 Mar 2014 23:20:55 +0000 (00:20 +0100)]
Merge branch 'master' into epoxy

Conflicts:
Makefile.in
README
movit.pc.in

10 years ago: Reduce the amount of arithmetic in the BlurEffect shader a bit.
Steinar H. Gunderson [Tue, 18 Mar 2014 23:12:34 +0000 (00:12 +0100)]
Reduce the amount of arithmetic in the BlurEffect shader a bit.

We did additions and subtractions with zero, which is sort of a waste
on scalar architectures. Helps ever so slightly on the demo app on my NVidia
card (3–4%).

10 years ago: Make VAOs persistent.
Steinar H. Gunderson [Tue, 18 Mar 2014 22:18:56 +0000 (23:18 +0100)]
Make VAOs persistent.

Seemingly helps ~0.5 ms/frame (which is quite significant for small
resolutions) on the demo applications on my NVidia card.

10 years ago: Keep FBOs around in EffectChain again.
Steinar H. Gunderson [Tue, 18 Mar 2014 21:16:36 +0000 (22:16 +0100)]
Keep FBOs around in EffectChain again.

Seemingly creating and deleting them is crazy expensive on NVidia
(~3 ms for a create/delete pair), so 6dea8d2 caused a performance
regression at high frame rates. Now we instead keep one around per
context (they cannot be shared), which brings us basically back
to where we were performance-wise.

Reported by Christophe Thommeret.

10 years ago: Make Phase take other Phases as inputs, not Nodes.
Steinar H. Gunderson [Mon, 17 Mar 2014 23:57:53 +0000 (00:57 +0100)]
Make Phase take other Phases as inputs, not Nodes.

This was a refactoring I wanted to do for a while, but actually finding
the right structure was a bit tricky. In the process, the entire phase
generation logic was rewritten, but the separation between compilation
and Phase construction is much cleaner now, and the logic in general
is easier to follow with more use of explicit recursion.

I'm still not 100% happy about what might be overuse of output_node;
we still need to link Phase and Node (the link just goes the other way
now), but I'm not sure we need to use it in all the cases we currently do.