git.sesse.net Git - movit/log

Steinar H. Gunderson [Wed, 16 Sep 2015 18:02:30 +0000 (20:02 +0200)]

Add some check_error() for shaders miscompiling.

Steinar H. Gunderson [Sun, 13 Sep 2015 23:18:38 +0000 (01:18 +0200)]

Reuse the VAO across all phases.

Steinar H. Gunderson [Sun, 13 Sep 2015 23:12:43 +0000 (01:12 +0200)]

Help the compiler out a tiny bit.

Steinar H. Gunderson [Sun, 13 Sep 2015 21:15:37 +0000 (23:15 +0200)]

Reduce the boilerplate around uniforms a bit.

Steinar H. Gunderson [Sun, 13 Sep 2015 18:40:58 +0000 (20:40 +0200)]

Cleanup: Make uniforms for RTT samplers like all other uniforms.

This also removes an ugly special-casing where one single place
in the entire code would call glUniform1i directly.

Steinar H. Gunderson [Sun, 13 Sep 2015 18:15:29 +0000 (20:15 +0200)]

Handle sampler2D uniforms specially.

We're going to need this soon, since sampler uniforms are special
in that they cannot be in a uniform block.

Steinar H. Gunderson [Sun, 13 Sep 2015 15:44:58 +0000 (17:44 +0200)]

Rework uniform setting.

One would think something as mundane as setting a few uniforms wouldn't
really mean much for performance, but seemingly this is not always so --
I had a real-world shader that counted no less than 55 uniforms.
Of course, not all of these were actually used, but we still have to go
through looking up the name etc. for every single one, every single frame.

Thus, we introduce a new way of dealing with uniforms: Register them before
finalization time, and then EffectChain can store their numbers once and
for all, instead of this repeated lookup. The system is also set up such
that we can go to uniform buffer objects (UBOs) in the very near future.

It's a bit unfortunate that uniform declaration now is removed from the
.frag files, where it sat very nicely, but the alternative would be to
try to parse GLSL, which I'm a bit wary of right now. All effects are
converted, leaving the set_uniform_* functions without any users, but
they are kept around for now in case external effects want them.

This seems to bring 1–2% speedup for my use case; hopefully UBOs will
bring a tiny bit more.
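
The register-once, look-up-once idea described above can be sketched roughly as follows. This is an illustration only, not Movit's actual code: UniformRegistry, RegisteredUniform and the two callbacks are hypothetical names, with the callbacks standing in for glGetUniformLocation() and glUniform1f() so the sketch runs without a GL context.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: one registered uniform, pointing at a value the effect owns.
struct RegisteredUniform {
    std::string name;
    const float *value;  // Read every frame; the name is never consulted again.
    int location;        // Resolved once, at finalization time.
};

class UniformRegistry {
public:
    // Called by effects before finalization.
    void register_float(const std::string &name, const float *value) {
        assert(!finalized);
        uniforms.push_back(RegisteredUniform{name, value, -1});
    }

    // Called once; `lookup` stands in for glGetUniformLocation().
    void finalize(const std::function<int(const std::string &)> &lookup) {
        for (RegisteredUniform &u : uniforms) {
            u.location = lookup(u.name);
        }
        finalized = true;
    }

    // Called every frame; `upload` stands in for glUniform1f().
    // Uniforms the shader does not use (location -1) are skipped.
    void upload_all(const std::function<void(int, float)> &upload) const {
        assert(finalized);
        for (const RegisteredUniform &u : uniforms) {
            if (u.location != -1) {
                upload(u.location, *u.value);
            }
        }
    }

private:
    std::vector<RegisteredUniform> uniforms;
    bool finalized = false;
};
```

The per-frame path then touches only integers and pointers; the string lookup cost is paid once per chain instead of once per uniform per frame.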

Steinar H. Gunderson [Sun, 13 Sep 2015 15:39:52 +0000 (17:39 +0200)]

Add default constructors to Point2D/RGBTuple/RGBATuple, for convenience.

Steinar H. Gunderson [Sun, 13 Sep 2015 10:09:08 +0000 (12:09 +0200)]

Add a version header file to help clients that need to relate to multiple versions of Movit.

Steinar H. Gunderson [Wed, 9 Sep 2015 21:53:12 +0000 (23:53 +0200)]

Update README: It's now 2015. :-)

Needs to be done in time for 2016, of course.

Steinar H. Gunderson [Wed, 9 Sep 2015 21:51:48 +0000 (23:51 +0200)]

Add support for Y'CbCr output.

Currently only 8-bit and only 4:4:4 packed, but it should be a useful
building block.

Steinar H. Gunderson [Tue, 8 Sep 2015 23:28:40 +0000 (01:28 +0200)]

Prepare for better understanding of 10- and 12-bit Y'CbCr.

Seemingly there is trickiness in how to interpret the integer
values that is different from what you'll typically see in R'G'B'
(or just GPUs and TV standards differ on that point as well).
Add an explanatory comment, and add a data member to YCbCrFormat
to prepare for correct 10/12-bit level handling. We'll stay 8-bit
only for now, though, to avoid an API break for existing clients
for no good reason (there's no 10-bit input, really).

Steinar H. Gunderson [Sun, 6 Sep 2015 22:10:48 +0000 (00:10 +0200)]

Minor optimization in ResampleEffect: Set less GL state.

In particular, if we can avoid it, use glTexSubImage2D instead of glTexImage2D.
This actually has a real effect, at least on Intel/Linux, where the driver seems
to stall on some mappings.

Of course, this only really helps for things like pans, not zooms.

Steinar H. Gunderson [Sat, 5 Sep 2015 22:57:25 +0000 (00:57 +0200)]

Make the PaddingEffect border 1-pixel soft.

Note that this is an API break; PaddingEffect now behaves differently
from before when it comes to fractional offsets.
But I feel this is more useful; it allows PaddingEffect to be used
more efficiently for moving things smoothly around.

Also add a concept of border offset which moves the border around
without changing the pixels; useful if you want the subpixel placement
to be done by ResampleEffect (put the integral offset into top/left
and then move the border by the fractional amount it missed).

Steinar H. Gunderson [Sat, 5 Sep 2015 19:04:31 +0000 (21:04 +0200)]

Fix a comment.

Steinar H. Gunderson [Sat, 5 Sep 2015 14:37:47 +0000 (16:37 +0200)]

Mark ResampleEffect as not one-to-one sampling.

The assumption is broken whenever a non-integral top or left parameter
is specified. Instead, make an IntegralResampleEffect that enforces
these parameters to be integers, and then mark it as one-to-one sampling.

Steinar H. Gunderson [Wed, 2 Sep 2015 23:46:58 +0000 (01:46 +0200)]

Collapse passes more aggressively in the face of size changes.

The motivating chain for this change was a case where we had
a SinglePassResampleEffect (the second half of a ResampleEffect)
feeding into a PaddingEffect, feeding into an OverlayEffect.
Currently, since the two former change output size, we'd bounce
to a temporary texture twice (output size changes would always
cause bounces).

However, this is needlessly conservative. The reason for bouncing
when changing output size is really if you want to get rid of
data by downscaling and then later upsampling, e.g. for a blur.
(It could also be useful for cropping, but we don't really use
that right now; PaddingEffect, which does crop, explicitly checks
the borders anyway to set the border color manually.) But in this case,
we are not downscaling at all, so we could just drop the bounce,
saving tons of texture bandwidth.

Thus, we add yet more parameters that effects can specify; first,
that an effect uses _one-to-one_ sampling; that is, that it
will only use its input as-is without sampling
between texels or outside the border (so the different
interpolation and border behavior will be irrelevant).
(Actually, almost all of our effects fall into this category.)
Second, a flag saying that even if an effect changes size,
it doesn't use virtual sizes (otherwise even a one-to-one effect
would de-facto be sampling between texels). If these flags
are set on the input and the output respectively, we can avoid
the bounce, at least unless there's an effect that's _not_
one-to-one further up the chain.

For my motivating case, this folded eight phases into four,
changing ~16.0 ms into ~10.6 ms rendering time. Seemingly
memory bandwidth is a really precious resource on my laptop's
GPU.

Steinar H. Gunderson [Wed, 2 Sep 2015 22:00:46 +0000 (00:00 +0200)]

Convert an overly cut-and-pasted comment for AlphaDivisionEffect.

Steinar H. Gunderson [Wed, 2 Sep 2015 21:52:47 +0000 (23:52 +0200)]

Draw an oversized triangle instead of a quad.

This is mostly theoretical; I've never been able to measure any
sort of real change from this. But according to popular cargo-culting,
it might have an effect since there are fewer edge pixels to shade.
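
To illustrate why a single triangle can replace the quad, here is a small sketch. The vertex positions are an assumption (the commit message does not list them): with the screen treated as the unit square, a triangle with corners (0,0), (2,0) and (0,2) covers it completely, and the parts outside the square are clipped away. Vec2, edge() and covers() are hypothetical helpers for the check.

```cpp
// Hypothetical 2D point type for the illustration.
struct Vec2 {
    float x, y;
};

// Assumed vertex positions, in [0,1] screen coordinates; counter-clockwise.
const Vec2 oversized_triangle[3] = {
    {0.0f, 0.0f},
    {2.0f, 0.0f},
    {0.0f, 2.0f},
};

// Signed area test: non-negative when p lies to the left of (or on) the edge a->b.
static float edge(Vec2 a, Vec2 b, Vec2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// True if p is inside or on the border of a counter-clockwise triangle.
bool covers(const Vec2 tri[3], Vec2 p) {
    return edge(tri[0], tri[1], p) >= 0.0f &&
           edge(tri[1], tri[2], p) >= 0.0f &&
           edge(tri[2], tri[0], p) >= 0.0f;
}
```

All four corners of the unit square pass the covers() test, so every screen pixel is shaded, and there is no diagonal seam where two triangles of a quad would meet.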

Steinar H. Gunderson [Tue, 1 Sep 2015 23:02:38 +0000 (01:02 +0200)]

Propagate size correctly across effects that change output size.

When propagating size information between effects in a phase,
we'd forget to check if the effect wanted to change size
and use that information instead of our own heuristics.
Fix that.

This is currently a no-op, since right now we always break a phase
when an effect changes output size, but there are very real situations
where we'd be fine with not doing so, so this patch paves the way
for that.

Steinar H. Gunderson [Mon, 31 Aug 2015 23:56:42 +0000 (01:56 +0200)]

Fix broken YCbCr subpixel positioning. Caught by the unit tests.

Steinar H. Gunderson [Mon, 31 Aug 2015 17:11:00 +0000 (19:11 +0200)]

Support timer queries for phases.

This is useful for debugging slow chains; it can give information
about which phase takes the most time. Right now there seems to be
~5 ms in one of my test chains that disappear into nothing
(i.e., show up in the fps counter with vsync off, but not in any
phase), but hopefully we can eventually solve that discrepancy.

Note that this is an ABI break.

Steinar H. Gunderson [Thu, 30 Jul 2015 15:53:54 +0000 (17:53 +0200)]

Add ycbcr.h to HDRS.

Reported by Dan Dennedy.

Steinar H. Gunderson [Thu, 30 Jul 2015 11:35:20 +0000 (13:35 +0200)]

Do some IEEE trickery to help the shader optimizer remove a sub or two in some YCbCr cases.

Steinar H. Gunderson [Thu, 30 Jul 2015 11:09:12 +0000 (13:09 +0200)]

Use std::scientific when outputting floats, so we do not get issues with 0.0 being output as 0 (which is an int, which cannot always be implicitly converted to float in GLSL).
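
A minimal sketch of the difference, assuming a helper along these lines (glsl_float() and plain_float() are hypothetical names, not Movit functions): with the default stream formatting, 0.0f comes out as "0", while std::scientific always produces a decimal point and an exponent, which GLSL parses as a float literal.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: format a float so that GLSL always sees a float literal.
std::string glsl_float(float x) {
    std::ostringstream ss;
    ss << std::scientific << x;
    return ss.str();
}

// The default formatting, shown for comparison; 0.0f becomes "0" here.
std::string plain_float(float x) {
    std::ostringstream ss;
    ss << x;
    return ss.str();
}
```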

Steinar H. Gunderson [Thu, 30 Jul 2015 11:08:31 +0000 (13:08 +0200)]

If a shader fails to compile, output it for easier debugging.

Steinar H. Gunderson [Thu, 30 Jul 2015 10:48:40 +0000 (12:48 +0200)]

Add a missing entry in .gitignore.

Steinar H. Gunderson [Thu, 30 Jul 2015 10:40:36 +0000 (12:40 +0200)]

Add a unit test for luma interpolation in YCbCr422InterleavedInput.

Steinar H. Gunderson [Thu, 30 Jul 2015 10:40:11 +0000 (12:40 +0200)]

Add a small note on unit testing of ycbcr.cpp.

Steinar H. Gunderson [Thu, 30 Jul 2015 10:39:32 +0000 (12:39 +0200)]

Small whitespace fix.

Steinar H. Gunderson [Wed, 29 Jul 2015 23:38:38 +0000 (01:38 +0200)]

Add an effect for 4:2:2 interleaved YCbCr input (UYVY).

This is primarily motivated by the fact that DeckLink uses this format
natively.

Steinar H. Gunderson [Wed, 29 Jul 2015 23:28:24 +0000 (01:28 +0200)]

Rename the YCbCrInput test to YCbCrInputTest, for consistency.

Steinar H. Gunderson [Wed, 29 Jul 2015 11:53:59 +0000 (13:53 +0200)]

Small refactoring in YCbCrInput.

Steinar H. Gunderson [Wed, 29 Jul 2015 11:53:44 +0000 (13:53 +0200)]

Unbreak YCbCrInput (it needs to still support setting the "needs_mipmaps" int to zero).

Steinar H. Gunderson [Tue, 28 Jul 2015 23:28:30 +0000 (01:28 +0200)]

Allow inputs to say they cannot support mipmaps.

Really only FlatInput can easily support mipmaps; for things like YCbCrInput
that combine multiple inputs, it's hard (probably not downright impossible,
but at least not immediately obvious without thinking about it a bit) and for
FFTInput it makes no sense.

Thus, we allow an input to say that it can't do this, and then bounce it
to a texture if needed. Hopefully this should happen quite rarely.

Steinar H. Gunderson [Tue, 28 Jul 2015 12:01:45 +0000 (14:01 +0200)]

Save a mul in YCbCrInput by folding the scaling into the matrix.

Steinar H. Gunderson [Mon, 30 Mar 2015 20:54:21 +0000 (22:54 +0200)]

Fix a C++11-related warning.

Steinar H. Gunderson [Sun, 29 Mar 2015 00:06:58 +0000 (01:06 +0100)]

Release Movit 1.1.3.

Steinar H. Gunderson [Sat, 7 Mar 2015 01:06:29 +0000 (02:06 +0100)]

Make read_file() thread-safe.

This is long overdue, of course; I knew this function was a quick hack,
but didn't realize it was a problem until Christophe Thommeret reported
an issue that looked a lot like this.

Steinar H. Gunderson [Sat, 7 Mar 2015 01:01:45 +0000 (02:01 +0100)]

Drop setting the locale altogether.

Trying to use sprintf and floats right in a portable manner is seemingly
impossible (MinGW doesn't support the per-thread locale stuff), so simply
do it a different way; stop sprintf-ing floats and use std::stringstream
instead. I dislike the iostream interface a lot, but it can do per-stream
locales, which is exactly what we want here.
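
The per-stream locale approach can be sketched like this. The function name is hypothetical and this is not claimed to be Movit's exact code; the point is only that imbue() affects one stream object, so nothing process-wide or thread-wide has to be changed.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format a float with '.' as the decimal separator,
// regardless of the global locale.
std::string format_float_c_locale(float x) {
    std::ostringstream ss;
    ss.imbue(std::locale::classic());  // The "C" locale, for this stream only.
    ss << x;
    return ss.str();
}
```

Because the locale lives in the stream, the function is safe to call from several threads at once and never interferes with whatever setlocale() calls the GL driver makes.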

Dan Dennedy [Thu, 5 Mar 2015 07:41:39 +0000 (23:41 -0800)]

Fix build on OS X and MinGW.

OS X requires the xlocale.h header to define locale_t:
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/newlocale.3.html

MinGW does not include implementations for newlocale() and uselocale().
Instead, use the previous approach using setlocale().

Steinar H. Gunderson [Tue, 3 Mar 2015 22:03:54 +0000 (23:03 +0100)]

Use thread-local locale.

setlocale() affects the whole process, not just the current thread
as I assumed; uselocale() (available since glibc 2.3, so basically
forever) is per-thread, and also conveniently seems to avoid the
issue of the returned pointer being destroyed (unless the driver
uses the return value of uselocale() as a base, which I really hope
it doesn't).

I'm slightly worried that since this overrides setlocale(), buggy drivers
might get confused when they try to do setlocale() and something else
overrides that precedence, but hopefully this shouldn't be the case.

Also add a unit test for locale handling while we're at it. It doesn't
test multi-threaded behavior, though, only the simple case.

Reported by Christophe Thommeret.

Steinar H. Gunderson [Mon, 23 Feb 2015 19:41:45 +0000 (20:41 +0100)]

In ResampleEffect, ignore near-zero weights when combining.

Steinar H. Gunderson [Mon, 23 Feb 2015 19:17:49 +0000 (20:17 +0100)]

Use the F16C instruction set when available.

For most users, this is mostly theoretical, as it requires compiling
with -march=native or similar. And these are definitely meant for
vectorizing, although it's still 2-3x as fast to use them as our own
software fallback.

These are supported starting from Haswell, and also by some AMD CPUs.

Steinar H. Gunderson [Mon, 23 Feb 2015 00:19:03 +0000 (01:19 +0100)]

Revert the optimization of the bilinear weights.

For the case where the resampling changed every frame (e.g. a zoom),
it just consumed too much CPU to be worth it, especially in memory
management; this is painful because it was an elegant solution to
a tricky problem, but it just has to go for now.

Also drop out to fp32 at the first sight of too-high error.

Steinar H. Gunderson [Sun, 22 Feb 2015 23:42:24 +0000 (00:42 +0100)]

Update a comment that wasn't really wrong, but less relevant in this context.

Steinar H. Gunderson [Sun, 22 Feb 2015 23:30:35 +0000 (00:30 +0100)]

Bring the variable names in optimize_sum_sq_error() closer to the comments.

Steinar H. Gunderson [Sun, 22 Feb 2015 23:20:49 +0000 (00:20 +0100)]

In ResampleEffect, optimize the bilinear weights on a global scale.

In addition to the individual weight optimization we do when combining samples,
this technique optimizes the weights as a whole, through some linear algebra.
This means it can take into account effects such as multiple bilinear samples
influencing the same coefficient (which normally should not happen, but might
nevertheless due to imprecisions in the stored texture coordinates), or
non-combined sample positions that can't hit the exact middle of the texel.

In practical tests, this is extremely effective; it often reduces the computed
sum of squared coefficient errors by as much as a factor of 1000, although I
haven't verified how often it actually saves us from having to do fp32 fallback
with the rather tight error bounds that are in place.

Steinar H. Gunderson [Sat, 21 Feb 2015 17:54:56 +0000 (18:54 +0100)]

Make ResampleEffect fall back to fp32 as needed.

This should kill all precision issues when zooming. There are still
a few tricks we can do to improve fp16, but that's primarily a
performance issue.

Steinar H. Gunderson [Sat, 21 Feb 2015 14:52:54 +0000 (15:52 +0100)]

Make combine_two_samples() into a template instead of having manual rounding checks.

Steinar H. Gunderson [Sat, 21 Feb 2015 14:33:33 +0000 (15:33 +0100)]

Fix combining in ResampleEffect again.

It was completely broken after the last patch, so we'd effectively
never combine.

Steinar H. Gunderson [Sat, 21 Feb 2015 14:26:57 +0000 (15:26 +0100)]

Add some fp16 conversion overloads, for making code that can be templatized across fp16 and fp32.

Steinar H. Gunderson [Sat, 21 Feb 2015 01:27:14 +0000 (02:27 +0100)]

When combining samples, take fp16 rounding into account.

This makes us somewhat more conservative in combining samples;
when we are near the lower/right edges of the image, we are starting
to get close to 1.0, and fp16 just doesn't have enough precision
to give us the 6 or 8 bits of subpixel precision we want (it is
hardly enough to address individual pixels!). In particular, this
can affect zooming with ResampleEffect, as reported by Christophe
Thommeret.

This does not fix all cases (especially not non-power-of-two cases);
for that, we will probably need to be able to fall back to fp32
when we detect fp16 doesn't work well.
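
The precision problem can be made concrete with a little arithmetic. fp16 stores 10 explicit mantissa bits, so for values in [0.5, 1.0) adjacent representable numbers are 2^-11 apart. The sketch below computes that spacing and compares it with the size of one texel; the function names and the example widths used afterwards (1920 and 2048) are illustrative assumptions, not taken from Movit.

```cpp
#include <cassert>
#include <cmath>

// Spacing between adjacent fp16 values around x. Assumes x lies in fp16's
// normalized range; fp16 has 10 explicit mantissa bits.
double fp16_spacing(double x) {
    assert(x >= std::ldexp(1.0, -14) && x < 65536.0);
    int exp;
    std::frexp(x, &exp);               // x = m * 2^exp with 0.5 <= m < 1.
    return std::ldexp(1.0, exp - 11);  // One unit in the last mantissa bit.
}

// Number of distinct fp16 values per texel at normalized coordinate x,
// in a texture that is `width` texels wide.
double fp16_steps_per_texel(double x, int width) {
    return (1.0 / width) / fp16_spacing(x);
}
```

For a 2048-wide texture, fp16_steps_per_texel(0.75, 2048) is exactly 1: one representable coordinate per texel and no subpixel precision at all. Near the left edge, e.g. x = 0.01 at width 1920, there are still more than 64 steps per texel, i.e. roughly the 6 bits of subpixel precision mentioned above.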

Steinar H. Gunderson [Fri, 20 Feb 2015 22:24:55 +0000 (23:24 +0100)]

In ResampleEffect, use a struct instead of manually fiddling with the two elements ourselves.

Steinar H. Gunderson [Thu, 15 Jan 2015 21:47:29 +0000 (22:47 +0100)]

Check for __APPLE__ instead of __DARWIN__.

Fixes compile with recent epoxy. Bug report and suggestion
by Dan Dennedy.

Steinar H. Gunderson [Mon, 22 Dec 2014 15:34:55 +0000 (16:34 +0100)]

Make number of BlurEffect taps configurable.

This can be useful if you are using blur as part of a larger effect
chain, where artifacts get masked by further processing.

Request and initial patch by Christophe Thommeret, although the patch
was redone from scratch.

Steinar H. Gunderson [Thu, 16 Oct 2014 20:07:29 +0000 (22:07 +0200)]

Fix some typos that would cause the sampler number not to be incremented.

Found by Christophe Thommeret, who also noticed these are most likely
harmless since both effects with the bug are typically last in their
chain.

Steinar H. Gunderson [Tue, 12 Aug 2014 21:02:03 +0000 (23:02 +0200)]

Release Movit 1.1.2.

Steinar H. Gunderson [Sat, 26 Jul 2014 23:17:12 +0000 (01:17 +0200)]

Correct the number of blur taps read.

We read about twice as many as we should have; the others were
probably just set to 0.0, which has no effect but still burns
arithmetic, unless your driver happens to optimize very aggressively
for this (which I don't think anyone does anymore).

Found by Christophe Thommeret.

Steinar H. Gunderson [Mon, 21 Jul 2014 09:40:12 +0000 (11:40 +0200)]

Fix a typo in a comment.

Steinar H. Gunderson [Tue, 17 Jun 2014 19:54:28 +0000 (21:54 +0200)]

When the texture freelist is too large, cut from the back, not the front.

All the other freelists had this right, but the texture freelist would
start pruning the _newest_ entries, which obviously gave poor performance.

Patch by Christophe Thommeret.
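
A small sketch of the intended pruning order, assuming a freelist that keeps its newest entries at the front (the Freelist class here is hypothetical, not Movit's ResourcePool): when the list grows past its limit, the entry to drop is the one at the back, which has been unused the longest.

```cpp
#include <cstddef>
#include <list>

// Hypothetical freelist of texture names; newest at the front, oldest at the back.
class Freelist {
public:
    explicit Freelist(std::size_t max_length) : max_length(max_length) {}

    // Return a texture to the freelist, pruning the oldest entries if needed.
    void release(unsigned texture) {
        entries.push_front(texture);
        while (entries.size() > max_length) {
            entries.pop_back();  // Cut from the back, not the front.
        }
    }

    const std::list<unsigned> &contents() const { return entries; }

private:
    std::list<unsigned> entries;
    std::size_t max_length;
};
```

Pruning from the front instead would throw away the texture that was just released, which is the one most likely to be reused next.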

Steinar H. Gunderson [Thu, 8 May 2014 04:17:50 +0000 (21:17 -0700)]

Do not export inlines from the shared library by default. Reduces the number of exports somewhat, and helps code generation a tiny bit.

Steinar H. Gunderson [Sat, 12 Apr 2014 12:24:52 +0000 (14:24 +0200)]

Release Movit 1.1.1.

Steinar H. Gunderson [Sat, 12 Apr 2014 00:25:13 +0000 (02:25 +0200)]

Fix an issue where we could take an FBO off a freelist but not properly clean fbo_formats.

Steinar H. Gunderson [Wed, 9 Apr 2014 22:21:28 +0000 (00:21 +0200)]

Release Movit 1.1.

Dan Horák [Wed, 9 Apr 2014 12:26:36 +0000 (14:26 +0200)]

use Requires for the libs movit depends on

Steinar H. Gunderson [Sun, 6 Apr 2014 21:58:54 +0000 (23:58 +0200)]

Properly restore the LC_NUMERIC locale after finalizing.

There were two issues here:

1. setlocale(LC_NUMERIC, "C") always returns C, not the previous
    locale.
2. The return value of setlocale() may point into static storage,
    which may be corrupted when we call into libGL, if e.g.
    the shader compiler calls setlocale() on its own.

Patch by Jean-Baptiste Mardelle.
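
Both issues suggest the same pattern: query the current locale first, copy the returned string, and only then switch. The RAII helper below is a hypothetical sketch of that pattern, not the actual patch.

```cpp
#include <clocale>
#include <string>

// Hypothetical guard: switch LC_NUMERIC to "C" and restore it on destruction.
class ScopedNumericCLocale {
public:
    ScopedNumericCLocale() {
        // Passing a null pointer queries without changing anything. The result
        // may point into static storage, so copy it before any other call.
        const char *current = std::setlocale(LC_NUMERIC, nullptr);
        saved = (current != nullptr) ? current : "C";
        std::setlocale(LC_NUMERIC, "C");
    }

    ~ScopedNumericCLocale() {
        std::setlocale(LC_NUMERIC, saved.c_str());
    }

private:
    std::string saved;  // Our own copy; safe even if libGL calls setlocale().
};
```

Note that setlocale() still changes the locale for the whole process, so this guard is only appropriate where no other thread depends on LC_NUMERIC in the meantime.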

Steinar H. Gunderson [Sat, 5 Apr 2014 00:16:14 +0000 (02:16 +0200)]

Fix a leak in DiffusionEffect in an edge case.

Found by Coverity Scan.

Steinar H. Gunderson [Thu, 3 Apr 2014 21:11:30 +0000 (23:11 +0200)]

Rewrite extension checking.

Two big changes:

1. If you're missing some functionality, Movit will now tell you
    on stderr what you're missing. (We might suppress this later
    if it turns out that people want to init_movit() but are actually
    fine with it failing.)
2. Use a table instead of repeated if-then logic, since this started
    to become a bit messy after we added OpenGL-version-equivalence
    checks.

Steinar H. Gunderson [Thu, 3 Apr 2014 20:54:13 +0000 (22:54 +0200)]

Loosen up the 0.499 vs. 0.501 subpixel resample test.

Seemingly these limits were a bit too tight for something that's
actually supposed to be approximate.

Steinar H. Gunderson [Thu, 3 Apr 2014 20:40:23 +0000 (22:40 +0200)]

Re-add resample kernel normalization, which was broken by accident.

Steinar H. Gunderson [Thu, 3 Apr 2014 00:05:35 +0000 (02:05 +0200)]

Fix a bug where having two DeconvolutionSharpenEffects in one chain would cause shader compile errors.

Steinar H. Gunderson [Tue, 1 Apr 2014 00:21:00 +0000 (02:21 +0200)]

Add zooming to ResampleEffect.

Same rationale as with the offset; we need resampling for proper zoom.

The look at heavy zoom isn't _quite_ what I had hoped for (although it's OK),
and there's a hint of shimmering in the zoom center if there's high-contrast
material there. For now, I'll write off the latter as Lanczos ringing;
I'll need to see what it does to video eventually (only tested with stills).

Steinar H. Gunderson [Sun, 30 Mar 2014 17:12:22 +0000 (19:12 +0200)]

Fix a bug when scaling and doing offset at the same time. (At least one more remains.)

Steinar H. Gunderson [Sat, 29 Mar 2014 23:33:52 +0000 (00:33 +0100)]

Add support for offsets in ResampleEffect.

This enables smooth (subpixel) panning that people frequently want for stills
and titles, but that you couldn't do in a subpixel fashion before (PaddingEffect
could only do integer pixel offsets).

The placement (ResampleEffect) might seem a bit off at first, but subpixel
offset needs resampling, and ResampleEffect already has all the logic in place
for that. We could have used the GPU's built-in bilinear resampling, of course,
but it doesn't look all that good for high-contrast situations (although working
in linear light should help some).

Steinar H. Gunderson [Sat, 29 Mar 2014 22:47:08 +0000 (23:47 +0100)]

Add some asserts.

Steinar H. Gunderson [Fri, 28 Mar 2014 20:15:05 +0000 (21:15 +0100)]

Merge branch 'epoxy'

Conflicts:
effect_chain.cpp
resource_pool.cpp

Dan Dennedy [Thu, 27 Mar 2014 03:17:12 +0000 (20:17 -0700)]

Fix typo in luma_mix shader.

Steinar H. Gunderson [Thu, 27 Mar 2014 00:38:43 +0000 (01:38 +0100)]

Add an inverse flag to LumaMixEffect.

This is mainly a convenience so that you can change e.g. a left-to-right
wipe into a right-to-left wipe without having to add a separate inverting
effect to the luma. Suggested by Dan Dennedy.

Steinar H. Gunderson [Wed, 26 Mar 2014 00:02:20 +0000 (01:02 +0100)]

Make the ResourcePool hold FBOs as a per-context resource.

This is an attempt to get out of the FBO sharability mess (unfortunately
we can't just stop having persistent FBOs, due to NVidia performance).
We now require the client to tell us whenever a context is going away,
and we try to be more careful about not deleting them in the wrong context.

Also, we assumed FBO names were globally unique, which isn't necessarily
true, so re-key them.

For good measure, we were deleting FBOs off the freelist from the front,
not the back as we should have -- fixed.
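
The re-keying can be pictured as follows. This is a hypothetical sketch (FboTable, ContextId and FboInfo are invented for the illustration, with plain integers standing in for GL context handles and GLuint names): since two contexts may hand out the same FBO name, the lookup key has to be the pair of context and name.

```cpp
#include <cstddef>
#include <map>
#include <utility>

typedef int ContextId;     // Stand-in for a GL context handle.
typedef unsigned FboName;  // Stand-in for a GLuint FBO name.

// Hypothetical per-FBO bookkeeping.
struct FboInfo {
    unsigned texture;
};

class FboTable {
public:
    void add(ContextId ctx, FboName name, FboInfo info) {
        fbos[std::make_pair(ctx, name)] = info;
    }

    // Returns a null pointer if this context has no FBO with that name.
    const FboInfo *find(ContextId ctx, FboName name) const {
        auto it = fbos.find(std::make_pair(ctx, name));
        return (it == fbos.end()) ? nullptr : &it->second;
    }

    // The client tells us a context is going away; forget its FBOs.
    // Returns how many entries were dropped.
    std::size_t context_gone(ContextId ctx) {
        std::size_t dropped = 0;
        for (auto it = fbos.begin(); it != fbos.end(); ) {
            if (it->first.first == ctx) {
                it = fbos.erase(it);
                ++dropped;
            } else {
                ++it;
            }
        }
        return dropped;
    }

private:
    std::map<std::pair<ContextId, FboName>, FboInfo> fbos;
};
```

In real code the GL objects would also have to be deleted while the owning context is current; the sketch only shows the bookkeeping side.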

Steinar H. Gunderson [Tue, 25 Mar 2014 01:15:16 +0000 (02:15 +0100)]

Hack around FBO/VAO sharability issues.

We have a problem when trying to delete an EffectChain or ResourcePool;
we might have created FBOs or VAOs in the wrong context. Work around it
for now (unbreaking Kdenlive) by making VAOs non-persistent again,
and simply never deleting FBOs (leaking them).

A proper solution here will be hard, unfortunately, and will need some thought.

Steinar H. Gunderson [Tue, 25 Mar 2014 00:21:26 +0000 (01:21 +0100)]

Add proper formats for RGB without alpha.

Steinar H. Gunderson [Mon, 24 Mar 2014 23:45:08 +0000 (00:45 +0100)]

Add proper formats for sRGB without alpha.

Steinar H. Gunderson [Mon, 24 Mar 2014 22:52:15 +0000 (23:52 +0100)]

Fix a typo in the make install target.

Steinar H. Gunderson [Sun, 23 Mar 2014 11:41:46 +0000 (12:41 +0100)]

Merge branch 'epoxy' of ssh://pannekake.samfundet.no/srv/git.sesse.net/www/movit into epoxy

Steinar H. Gunderson [Sun, 23 Mar 2014 11:38:17 +0000 (12:38 +0100)]

We switched to #version 300 es shaders for GLES a while back, so we now have round().

Steinar H. Gunderson [Sun, 23 Mar 2014 11:17:42 +0000 (12:17 +0100)]

Add some skeleton code for using GL_ARB_debug_output (disabled by default).

Steinar H. Gunderson [Sun, 23 Mar 2014 01:35:22 +0000 (02:35 +0100)]

Merge branch 'master'

Steinar H. Gunderson [Sun, 23 Mar 2014 01:27:39 +0000 (02:27 +0100)]

Improve macro hygiene in .frag files slightly.

Steinar H. Gunderson [Sat, 22 Mar 2014 22:25:14 +0000 (23:25 +0100)]

Fix a small overallocation.

Steinar H. Gunderson [Sat, 22 Mar 2014 16:18:12 +0000 (17:18 +0100)]

Cache the FFT support texture.

Regenerating it every time is a waste of CPU, and also of GL
state changes.

Steinar H. Gunderson [Sat, 22 Mar 2014 16:09:36 +0000 (17:09 +0100)]

Use a smaller support texture for the FFT.

Many of the rows in the support texture are exactly the same,
so don't store the duplicates; gives a small performance boost.
In a sense, this is exactly the same property that GPUwave uses
with drawing multiple quads at the lower level.

Steinar H. Gunderson [Sat, 22 Mar 2014 15:18:12 +0000 (16:18 +0100)]

Fix a tiny leak (that would cause an assertion failure on exit).

Steinar H. Gunderson [Sat, 22 Mar 2014 15:12:47 +0000 (16:12 +0100)]

Stop the FFTPassEffect Repeat test after FFT size 128.

The reason is that the 256 test uses texture sizes of 256*31=7936,
and above ~3900, some cards (at least both my Intel and NVidia card)
start having accuracy issues on some sizes. The test happens not to
die on this for semi-obscure reasons, but that's mostly by accident,
and in any case, requiring 8k textures for a unit test might be
a bit on the upper side.

Steinar H. Gunderson [Sat, 22 Mar 2014 14:53:11 +0000 (15:53 +0100)]

Merge branch 'master' into epoxy

Steinar H. Gunderson [Sat, 22 Mar 2014 14:49:09 +0000 (15:49 +0100)]

Factor out the actual phase execution into a function.

Steinar H. Gunderson [Sat, 22 Mar 2014 14:34:07 +0000 (15:34 +0100)]

Factor out RTT sampler setting in its own function.

Steinar H. Gunderson [Sat, 22 Mar 2014 13:57:53 +0000 (14:57 +0100)]

Merge branch 'master' into epoxy

Conflicts:
flat_input.cpp

Steinar H. Gunderson [Sat, 22 Mar 2014 13:54:43 +0000 (14:54 +0100)]

Redo FBO association yet again, this time per-texture.

According to http://adrienb.fr/blog/wp-content/uploads/2013/04/PortingSourceToLinux.pdf,
you want an FBO per-texture, not just format. And indeed, I can measure a very slight
performance improvement on both NVidia and ATI for this.

Steinar H. Gunderson [Fri, 21 Mar 2014 23:29:21 +0000 (00:29 +0100)]

Have separate FBOs per resolution and format.

Seemingly this _also_ costs on NVidia; the demo app is down 0.9 ms/frame or so.
This rapidly started approaching complexity worthy of the ResourcePool,
so I moved the functionality in there even though it's not context-shareable.

A library for high-quality, high-performance video filters.