git.sesse.net Git - nageru/log
5 years ago Name some threads.
Steinar H. Gunderson [Sun, 16 Sep 2018 16:42:06 +0000 (18:42 +0200)]
Name some threads.

5 years ago Subsample chroma on the GPU instead of the CPU.
Steinar H. Gunderson [Sun, 16 Sep 2018 16:23:05 +0000 (18:23 +0200)]
Subsample chroma on the GPU instead of the CPU.

Faster, and also gets the subsampling right. The shaders come from Nageru,
but the support code is heavily tweaked to be more like flow.h.
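
For illustration only, here is a minimal CPU sketch of what 2:1 horizontal chroma subsampling computes. The commit does this in GPU shaders taken from Nageru, which are not shown in this log; the function name `subsample_chroma_2to1`, the box filter and the edge handling are all assumptions, and the real shaders may use different filtering and chroma siting.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: halve the horizontal resolution of one 8-bit chroma
// plane (Cb or Cr) by averaging each pair of neighboring samples.
// Assumption: a plain box filter with rounding; the last column is repeated
// when the width is odd.
std::vector<uint8_t> subsample_chroma_2to1(const std::vector<uint8_t> &plane,
                                           int width, int height)
{
    const int out_width = (width + 1) / 2;
    std::vector<uint8_t> out(size_t(out_width) * height);
    for (int y = 0; y < height; ++y) {
        for (int ox = 0; ox < out_width; ++ox) {
            const int x0 = 2 * ox;
            const int x1 = std::min(x0 + 1, width - 1);
            const int a = plane[size_t(y) * width + x0];
            const int b = plane[size_t(y) * width + x1];
            out[size_t(y) * out_width + ox] = uint8_t((a + b + 1) / 2);
        }
    }
    return out;
}
```

Where the chroma samples are considered to sit relative to the luma samples changes the correct filter weights, which is presumably what "gets the subsampling right" refers to; the box filter above ignores that question.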

5 years ago Change from operating point 3 to 2 (more laptop-friendly debugging).
Steinar H. Gunderson [Sun, 16 Sep 2018 15:36:36 +0000 (17:36 +0200)]
Change from operating point 3 to 2 (more laptop-friendly debugging).

5 years ago Do deinterleaving on the GPU (subsampling still remains).
Steinar H. Gunderson [Sun, 16 Sep 2018 15:35:45 +0000 (17:35 +0200)]
Do deinterleaving on the GPU (subsampling still remains).

5 years ago Do the interpolation in Y'CbCr instead of RGBA; saves some conversions back and forth...
Steinar H. Gunderson [Sun, 16 Sep 2018 13:35:29 +0000 (15:35 +0200)]
Do the interpolation in Y'CbCr instead of RGBA; saves some conversions back and forth. Subsampling is still done on the CPU (to be fixed).

5 years ago Fix so that make clean removes all objects.
Steinar H. Gunderson [Sun, 16 Sep 2018 13:34:31 +0000 (15:34 +0200)]
Fix so that make clean removes all objects.

5 years ago Give the VideoStream thread a name.
Steinar H. Gunderson [Sun, 16 Sep 2018 11:18:46 +0000 (13:18 +0200)]
Give the VideoStream thread a name.

5 years ago Make VideoStream capable of using the shared JPEG cache, saving lots of CPU.
Steinar H. Gunderson [Sat, 15 Sep 2018 20:19:12 +0000 (22:19 +0200)]
Make VideoStream capable of using the shared JPEG cache, saving lots of CPU.

5 years ago Release flow textures when we are done with them.
Steinar H. Gunderson [Wed, 12 Sep 2018 22:48:01 +0000 (00:48 +0200)]
Release flow textures when we are done with them.

5 years ago Make the output actually follow the input in an interpolated fashion.
Steinar H. Gunderson [Tue, 11 Sep 2018 21:56:13 +0000 (23:56 +0200)]
Make the output actually follow the input in an interpolated fashion.

5 years ago Read timebase from the input video.
Steinar H. Gunderson [Mon, 10 Sep 2018 21:57:00 +0000 (23:57 +0200)]
Read timebase from the input video.

5 years ago Fix some flickering due to YCbCr interpretation.
Steinar H. Gunderson [Sun, 9 Sep 2018 22:45:51 +0000 (00:45 +0200)]
Fix some flickering due to YCbCr interpretation.

5 years ago Encode JPEGs from the interpolated frames.
Steinar H. Gunderson [Sun, 9 Sep 2018 22:39:28 +0000 (00:39 +0200)]
Encode JPEGs from the interpolated frames.

5 years ago Start hacking in support for interpolated frames in the main application.
Steinar H. Gunderson [Sun, 9 Sep 2018 19:21:39 +0000 (21:21 +0200)]
Start hacking in support for interpolated frames in the main application.

5 years ago Fix an issue where interpolation would not work, since something (Qt?) turned off...
Steinar H. Gunderson [Sun, 9 Sep 2018 18:29:14 +0000 (20:29 +0200)]
Fix an issue where interpolation would not work, since something (Qt?) turned off depth writing.

5 years ago Editorial changes.
Steinar H. Gunderson [Fri, 24 Aug 2018 07:33:44 +0000 (09:33 +0200)]
Editorial changes.

5 years ago Split out the flow code from the example driver.
Steinar H. Gunderson [Thu, 23 Aug 2018 23:00:38 +0000 (01:00 +0200)]
Split out the flow code from the example driver.

5 years ago Make disabling variational refinement somewhat more efficient.
Steinar H. Gunderson [Thu, 23 Aug 2018 22:30:10 +0000 (00:30 +0200)]
Make disabling variational refinement somewhat more efficient.

OP2 minus variational refinement is now sub-millisecond for
1024x436 forward flow on my Haswell laptop. (Full OP2 forward+backward
with interpolation is easily realtime.)

5 years ago Parametrize patch size and number of iterations.
Steinar H. Gunderson [Thu, 23 Aug 2018 22:21:48 +0000 (00:21 +0200)]
Parametrize patch size and number of iterations.

5 years ago Start parametrizing the operating points for DIS.
Steinar H. Gunderson [Thu, 23 Aug 2018 22:12:50 +0000 (00:12 +0200)]
Start parametrizing the operating points for DIS.

5 years ago Move flow classes into a header file; first step on the way to making it accessible.
Steinar H. Gunderson [Mon, 20 Aug 2018 22:34:58 +0000 (00:34 +0200)]
Move flow classes into a header file; first step on the way to making it accessible.

5 years ago Move stream generation into a new class VideoStream, which will also soon deal with...
Steinar H. Gunderson [Mon, 20 Aug 2018 21:43:29 +0000 (23:43 +0200)]
Move stream generation into a new class VideoStream, which will also soon deal with the GPU.

5 years ago Actually send the MJPEG frames on to the HTTP stream.
Steinar H. Gunderson [Sat, 18 Aug 2018 22:03:11 +0000 (00:03 +0200)]
Actually send the MJPEG frames on to the HTTP stream.

5 years ago Import a bunch of http/mux code from Nageru.
Steinar H. Gunderson [Sat, 18 Aug 2018 19:48:39 +0000 (21:48 +0200)]
Import a bunch of http/mux code from Nageru.

5 years ago av_register_all() is deprecated, so do not call it anymore (no replacement needed).
Steinar H. Gunderson [Sat, 18 Aug 2018 17:55:47 +0000 (19:55 +0200)]
av_register_all() is deprecated, so do not call it anymore (no replacement needed).

5 years ago Support rendering forward and backward flow in parallel.
Steinar H. Gunderson [Tue, 7 Aug 2018 22:02:41 +0000 (00:02 +0200)]
Support rendering forward and backward flow in parallel.

~15% faster flow computation on GTX 950; the lower resolutions are
inherently so low-parallel that we get backward flow on those levels
essentially for free. Should be even more important on larger GPUs.

5 years ago 16-bit depth should be plenty.
Steinar H. Gunderson [Wed, 8 Aug 2018 19:08:04 +0000 (21:08 +0200)]
16-bit depth should be plenty.

5 years ago Use a renderbuffer instead of a depth texture; potentially faster.
Steinar H. Gunderson [Wed, 8 Aug 2018 18:13:58 +0000 (20:13 +0200)]
Use a renderbuffer instead of a depth texture; potentially faster.

5 years ago Rename “Total” to something more descriptive.
Steinar H. Gunderson [Tue, 7 Aug 2018 16:59:56 +0000 (18:59 +0200)]
Rename “Total” to something more descriptive.

5 years ago Fix a warning in motion_search.frag.
Steinar H. Gunderson [Tue, 7 Aug 2018 16:11:48 +0000 (18:11 +0200)]
Fix a warning in motion_search.frag.

5 years ago Fix patch placement. Again.
Steinar H. Gunderson [Mon, 6 Aug 2018 18:47:34 +0000 (20:47 +0200)]
Fix patch placement. Again.

5 years ago Add a warmup option to get somewhat more consistent timings.
Steinar H. Gunderson [Sat, 4 Aug 2018 20:37:41 +0000 (22:37 +0200)]
Add a warmup option to get somewhat more consistent timings.

5 years ago Fix a bug where the first black pass of SOR would read junk data.
Steinar H. Gunderson [Sat, 4 Aug 2018 20:35:43 +0000 (22:35 +0200)]
Fix a bug where the first black pass of SOR would read junk data.

5 years ago Rename du_dv_tex to diff_flow_tex, for consistency.
Steinar H. Gunderson [Sat, 4 Aug 2018 19:31:15 +0000 (21:31 +0200)]
Rename du_dv_tex to diff_flow_tex, for consistency.

5 years ago Update the SOR comment about twinned buffering.
Steinar H. Gunderson [Sat, 4 Aug 2018 19:20:26 +0000 (21:20 +0200)]
Update the SOR comment about twinned buffering.

5 years ago Fix an outdated comment.
Steinar H. Gunderson [Sat, 4 Aug 2018 13:33:29 +0000 (15:33 +0200)]
Fix an outdated comment.

5 years ago Microoptimization in the SOR fragment shader.
Steinar H. Gunderson [Sat, 4 Aug 2018 12:43:22 +0000 (14:43 +0200)]
Microoptimization in the SOR fragment shader.

5 years ago Split the equation texture in two, which speeds up SOR by ~30%.
Steinar H. Gunderson [Sat, 4 Aug 2018 12:24:11 +0000 (14:24 +0200)]
Split the equation texture in two, which speeds up SOR by ~30%.

5 years ago Fix a NaN issue on Intel.
Steinar H. Gunderson [Fri, 3 Aug 2018 19:02:34 +0000 (21:02 +0200)]
Fix a NaN issue on Intel.

5 years ago Pack the gradients and image together into a single 32-bit texture; seems to help...
Steinar H. Gunderson [Fri, 3 Aug 2018 18:53:36 +0000 (20:53 +0200)]
Pack the gradients and image together into a single 32-bit texture; seems to help ~1.5 ms for flow on NVIDIA.

5 years ago Remove some redundant glUseProgram() calls.
Steinar H. Gunderson [Fri, 3 Aug 2018 16:07:07 +0000 (18:07 +0200)]
Remove some redundant glUseProgram() calls.

5 years ago Remove an unused uniform.
Steinar H. Gunderson [Thu, 2 Aug 2018 22:19:55 +0000 (00:19 +0200)]
Remove an unused uniform.

5 years ago Properly release the flow texture; saves 1 ms (!) on FBO creation.
Steinar H. Gunderson [Thu, 2 Aug 2018 22:09:20 +0000 (00:09 +0200)]
Properly release the flow texture; saves 1 ms (!) on FBO creation.

5 years ago Compute diffusivity instead of smoothness, which saves a flow-size texture; shaves...
Steinar H. Gunderson [Thu, 2 Aug 2018 18:17:30 +0000 (20:17 +0200)]
Compute diffusivity instead of smoothness, which saves a flow-size texture; shaves about 0.2 ms off 720p flow on GTX 950.

5 years ago Remove some dead code.
Steinar H. Gunderson [Thu, 2 Aug 2018 17:31:38 +0000 (19:31 +0200)]
Remove some dead code.

5 years ago Microoptimization in splat.vert.
Steinar H. Gunderson [Thu, 2 Aug 2018 17:31:23 +0000 (19:31 +0200)]
Microoptimization in splat.vert.

5 years ago Share VAOs between all the passes. Much less code, less rebinding overhead.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:35:17 +0000 (17:35 +0200)]
Share VAOs between all the passes. Much less code, less rebinding overhead.

5 years ago Fix an issue where we would lose >1 ms for computing flow on NVIDIA, due to lack...
Steinar H. Gunderson [Thu, 2 Aug 2018 15:59:58 +0000 (17:59 +0200)]
Fix an issue where we would lose >1 ms for computing flow on NVIDIA, due to lack of fast clears.

5 years ago Make PersistentFBOSet handle depth.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:56:58 +0000 (17:56 +0200)]
Make PersistentFBOSet handle depth.

5 years ago Remove the rather pointless enable_if tests for now. And move to C++14.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:49:49 +0000 (17:49 +0200)]
Remove the rather pointless enable_if tests for now. And move to C++14.

5 years ago Fix a GLSL syntax error that tripped up NVIDIA.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:20:32 +0000 (17:20 +0200)]
Fix a GLSL syntax error that tripped up NVIDIA.

5 years ago Disable dither; we don't need it.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:20:24 +0000 (17:20 +0200)]
Disable dither; we don't need it.

5 years ago When timing a level, print the resolution.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:20:02 +0000 (17:20 +0200)]
When timing a level, print the resolution.

5 years ago Make a new flag --detailed-timing for microsecond measurements and more.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:19:49 +0000 (17:19 +0200)]
Make a new flag --detailed-timing for microsecond measurements and more.

5 years ago Move GPUTimers into its own file.
Steinar H. Gunderson [Thu, 2 Aug 2018 15:14:28 +0000 (17:14 +0200)]
Move GPUTimers into its own file.

5 years ago Remove an unused uniform.
Steinar H. Gunderson [Wed, 1 Aug 2018 23:15:30 +0000 (01:15 +0200)]
Remove an unused uniform.

5 years ago Use the same PBO readback system for interpolated images as flows.
Steinar H. Gunderson [Tue, 31 Jul 2018 23:09:43 +0000 (01:09 +0200)]
Use the same PBO readback system for interpolated images as flows.

5 years ago Speed up hole filling by ~10%.
Steinar H. Gunderson [Tue, 31 Jul 2018 15:06:19 +0000 (17:06 +0200)]
Speed up hole filling by ~10%.

5 years ago Implement hole filling.
Steinar H. Gunderson [Mon, 30 Jul 2018 23:10:15 +0000 (01:10 +0200)]
Implement hole filling.

5 years ago Put depth in 0..1; evidently even fp32 depth is clamped in the ARB version.
Steinar H. Gunderson [Mon, 30 Jul 2018 21:18:43 +0000 (23:18 +0200)]
Put depth in 0..1; evidently even fp32 depth is clamped in the ARB version.

5 years ago Start working on interpolation code.
Steinar H. Gunderson [Mon, 30 Jul 2018 16:08:23 +0000 (18:08 +0200)]
Start working on interpolation code.

5 years ago Do RGB -> grayscale conversion on the GPU.
Steinar H. Gunderson [Sat, 28 Jul 2018 18:53:46 +0000 (20:53 +0200)]
Do RGB -> grayscale conversion on the GPU.

5 years ago Split texture pooling out from DISComputeFlow into its own class.
Steinar H. Gunderson [Sat, 28 Jul 2018 14:43:51 +0000 (16:43 +0200)]
Split texture pooling out from DISComputeFlow into its own class.

5 years ago Hide the OpenGL window; it is rather annoying.
Steinar H. Gunderson [Sat, 28 Jul 2018 14:37:43 +0000 (16:37 +0200)]
Hide the OpenGL window; it is rather annoying.

5 years ago Remove a TODO.
Steinar H. Gunderson [Sat, 28 Jul 2018 14:01:18 +0000 (16:01 +0200)]
Remove a TODO.

Making the penalizer smaller (as one should if adjusting it towards
a smaller range, as there's effectively I² numerator and I denominator)
does not seem to have much effect; actually, increasing it to 0.01
seems to give better results on alley-2, but that's the “wrong way”.

5 years ago Stop leaking texture views (and by extension, textures).
Steinar H. Gunderson [Sat, 28 Jul 2018 13:56:16 +0000 (15:56 +0200)]
Stop leaking texture views (and by extension, textures).

5 years ago Small code cleanup.
Steinar H. Gunderson [Fri, 27 Jul 2018 14:10:23 +0000 (16:10 +0200)]
Small code cleanup.

5 years ago Use textureSize() instead of sending in uniforms manually. Same result, less code...
Steinar H. Gunderson [Fri, 27 Jul 2018 14:07:47 +0000 (16:07 +0200)]
Use textureSize() instead of sending in uniforms manually. Same result, less code, less error-prone.

5 years ago Remove a TODO; setting up equations is not where our time goes.
Steinar H. Gunderson [Fri, 27 Jul 2018 08:52:40 +0000 (10:52 +0200)]
Remove a TODO; setting up equations is not where our time goes.

5 years ago Halve the number of motion search iterations, to eight.
Steinar H. Gunderson [Thu, 26 Jul 2018 23:30:27 +0000 (01:30 +0200)]
Halve the number of motion search iterations, to eight.

The DIS code claims this is allowed after they changed their Sobel code;
for us, seemingly SOR was the breaking point. EPE is hardly moving
(<1% for most Sintel tests I've run), but speed goes up markedly.
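
EPE is the end-point error: the Euclidean distance between an estimated flow vector and the ground-truth one, averaged over all pixels. A minimal sketch of that metric follows, assuming dense flows stored as (du, dv) pairs. The `Flow` struct and `average_epe` are hypothetical names, not the eval tool's actual code, and any handling of pixels with unknown ground truth is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical per-pixel flow vector, in pixels.
struct Flow {
    float du, dv;
};

// Average end-point error between an estimated flow field and the ground
// truth. Both fields must have the same number of pixels.
double average_epe(const std::vector<Flow> &estimated,
                   const std::vector<Flow> &ground_truth)
{
    if (estimated.size() != ground_truth.size()) {
        throw std::invalid_argument("flow fields differ in size");
    }
    if (estimated.empty()) {
        return 0.0;
    }
    double sum = 0.0;
    for (size_t i = 0; i < estimated.size(); ++i) {
        const double ddu = double(estimated[i].du) - ground_truth[i].du;
        const double ddv = double(estimated[i].dv) - ground_truth[i].dv;
        sum += std::hypot(ddu, ddv);  // Length of the error vector.
    }
    return sum / double(estimated.size());
}
```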

5 years ago Finally get SOR working.
Steinar H. Gunderson [Thu, 26 Jul 2018 22:21:01 +0000 (00:21 +0200)]
Finally get SOR working.

The trick here was something I'd considered for a long time,
namely red-black SOR so that we update only half the values
every iteration. The implementation is annoyingly inefficient,
but convergence is so much better that it's worth it (a few
percent EPE improvement).
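
To make the red-black idea concrete, here is a minimal CPU sketch on a stand-in problem: a 5-point Poisson system rather than the variational refinement equations, which are not shown in this log. The real implementation is a GPU fragment shader; `red_black_sor_sweep`, the zero boundary and the choice of system are assumptions for illustration only.

```cpp
#include <cstddef>
#include <vector>

// One red-black SOR sweep for the system
//   4*u(x,y) - u(x-1,y) - u(x+1,y) - u(x,y-1) - u(x,y+1) = b(x,y)
// with u taken as zero outside the grid. Cells are colored like a
// checkerboard; a cell's four neighbors all have the opposite color, so every
// cell of one color can be updated independently (in parallel on a GPU).
// Each half-sweep touches only half the values.
void red_black_sor_sweep(std::vector<float> &u, const std::vector<float> &b,
                         int width, int height, float omega)
{
    auto at = [&](int x, int y) -> float {
        if (x < 0 || y < 0 || x >= width || y >= height) return 0.0f;
        return u[size_t(y) * width + x];
    };
    for (int color = 0; color < 2; ++color) {
        for (int y = 0; y < height; ++y) {
            for (int x = (y + color) % 2; x < width; x += 2) {
                const float neighbors =
                    at(x - 1, y) + at(x + 1, y) + at(x, y - 1) + at(x, y + 1);
                const float gauss_seidel =
                    (b[size_t(y) * width + x] + neighbors) / 4.0f;
                float &cur = u[size_t(y) * width + x];
                // Over-relax: step past the Gauss-Seidel value by omega.
                cur += omega * (gauss_seidel - cur);
            }
        }
    }
}
```

For this kind of system the iteration converges for 0 < omega < 2; omega = 1 reduces it to plain red-black Gauss-Seidel.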

5 years ago Fix a mixup in the variational refinement text.
Steinar H. Gunderson [Thu, 26 Jul 2018 21:29:28 +0000 (23:29 +0200)]
Fix a mixup in the variational refinement text.

5 years ago Close off a TODO.
Steinar H. Gunderson [Thu, 26 Jul 2018 10:45:50 +0000 (12:45 +0200)]
Close off a TODO.

5 years ago Rework patch placement. Finally inches our EPE just below the reference code, it...
Steinar H. Gunderson [Thu, 26 Jul 2018 10:19:15 +0000 (12:19 +0200)]
Rework patch placement. Finally inches our EPE just below the reference code, it seems.

5 years ago Tweak the default variational refinement weights (optimized on alley_2).
Steinar H. Gunderson [Wed, 25 Jul 2018 23:43:30 +0000 (01:43 +0200)]
Tweak the default variational refinement weights (optimized on alley_2).

5 years ago Give the variational refinement terms slightly less mysterious names.
Steinar H. Gunderson [Wed, 25 Jul 2018 23:39:47 +0000 (01:39 +0200)]
Give the variational refinement terms slightly less mysterious names.

5 years ago Fix a done TODO (gamma is for E_G, not E_S, and we multiply in the alpha in the smoot...
Steinar H. Gunderson [Wed, 25 Jul 2018 23:35:21 +0000 (01:35 +0200)]
Fix a done TODO (gamma is for E_G, not E_S, and we multiply in the alpha in the smoothness).

5 years ago Sample gradients as zero outside the image, instead of repeating them. Helps dramatica...
Steinar H. Gunderson [Wed, 25 Jul 2018 23:23:54 +0000 (01:23 +0200)]
Sample gradients as zero outside the image, instead of repeating them. Helps dramatically on edge patches. 20% EPE improvement on alley_2.
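
The two border behaviors can be sketched on the CPU as follows. This is only an illustration with hypothetical function names; in OpenGL terms they roughly correspond to the GL_CLAMP_TO_BORDER (with a zero border color) and GL_CLAMP_TO_EDGE wrap modes, but how the commit actually achieves this is not shown in this log.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns zero for any coordinate outside the image, so a patch hanging over
// the edge sees no gradient out there.
float sample_zero_outside(const std::vector<float> &img, int width, int height,
                          int x, int y)
{
    if (x < 0 || y < 0 || x >= width || y >= height) return 0.0f;
    return img[size_t(y) * width + x];
}

// Repeats the nearest edge value for coordinates outside the image.
float sample_clamp_to_edge(const std::vector<float> &img, int width, int height,
                           int x, int y)
{
    const int cx = std::min(std::max(x, 0), width - 1);
    const int cy = std::min(std::max(y, 0), height - 1);
    return img[size_t(cy) * width + cx];
}
```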

5 years ago Small syntactic tweak.
Steinar H. Gunderson [Wed, 25 Jul 2018 13:32:56 +0000 (15:32 +0200)]
Small syntactic tweak.

5 years ago Fix the patch out-of-bounds check in motion search (it was all broken).
Steinar H. Gunderson [Tue, 24 Jul 2018 21:37:16 +0000 (23:37 +0200)]
Fix the patch out-of-bounds check in motion search (it was all broken).

5 years ago Fix a problem with visualizing flow that goes exactly left.
Steinar H. Gunderson [Tue, 24 Jul 2018 20:56:11 +0000 (22:56 +0200)]
Fix a problem with visualizing flow that goes exactly left.

5 years ago Use asynchronous readback when doing many flows. Speeds up 50-frame jobs by about...
Steinar H. Gunderson [Tue, 24 Jul 2018 10:56:53 +0000 (12:56 +0200)]
Use asynchronous readback when doing many flows. Speeds up 50-frame jobs by about 30%; PNG reading is now the dominant cost (60% or so).

5 years ago Fix some uniforms not getting through to the motion search vertex shader.
Steinar H. Gunderson [Mon, 23 Jul 2018 23:29:02 +0000 (01:29 +0200)]
Fix some uniforms not getting through to the motion search vertex shader.

5 years ago Add a debugging flag to disable/ignore variational refinement.
Steinar H. Gunderson [Mon, 23 Jul 2018 23:22:25 +0000 (01:22 +0200)]
Add a debugging flag to disable/ignore variational refinement.

5 years ago Change the discard condition for motion search.
Steinar H. Gunderson [Mon, 23 Jul 2018 18:16:29 +0000 (20:16 +0200)]
Change the discard condition for motion search.

5 years ago Fix a typo.
Steinar H. Gunderson [Mon, 23 Jul 2018 16:35:32 +0000 (18:35 +0200)]
Fix a typo.

5 years ago Make motion search happen mostly in pixels; a bit less code that way.
Steinar H. Gunderson [Mon, 23 Jul 2018 16:31:41 +0000 (18:31 +0200)]
Make motion search happen mostly in pixels; a bit less code that way.

5 years ago Print EPE on stdout, since it is not an error.
Steinar H. Gunderson [Mon, 23 Jul 2018 13:52:49 +0000 (15:52 +0200)]
Print EPE on stdout, since it is not an error.

5 years ago Make flow writing a bit faster.
Steinar H. Gunderson [Mon, 23 Jul 2018 11:13:15 +0000 (13:13 +0200)]
Make flow writing a bit faster.

5 years ago Print out the first flow pair, too.
Steinar H. Gunderson [Mon, 23 Jul 2018 11:13:02 +0000 (13:13 +0200)]
Print out the first flow pair, too.

5 years ago Add a --disable-timing flag (less spew, less GPU waiting).
Steinar H. Gunderson [Mon, 23 Jul 2018 11:12:13 +0000 (13:12 +0200)]
Add a --disable-timing flag (less spew, less GPU waiting).

5 years ago Add support for computing many flows sequentially (reduces startup overhead).
Steinar H. Gunderson [Mon, 23 Jul 2018 11:00:05 +0000 (13:00 +0200)]
Add support for computing many flows sequentially (reduces startup overhead).

5 years ago Refactor the flow writing.
Steinar H. Gunderson [Mon, 23 Jul 2018 10:42:45 +0000 (12:42 +0200)]
Refactor the flow writing.

5 years ago Reuse textures between flow invocations.
Steinar H. Gunderson [Mon, 23 Jul 2018 10:33:22 +0000 (12:33 +0200)]
Reuse textures between flow invocations.

5 years ago Make a wrapper class for all the flow logic.
Steinar H. Gunderson [Mon, 23 Jul 2018 10:17:57 +0000 (12:17 +0200)]
Make a wrapper class for all the flow logic.

5 years ago Refactor FBO creation. A step on the way to persistent FBOs and temporary textures.
Steinar H. Gunderson [Mon, 23 Jul 2018 09:42:21 +0000 (11:42 +0200)]
Refactor FBO creation. A step on the way to persistent FBOs and temporary textures.

5 years ago Small cleanup.
Steinar H. Gunderson [Mon, 23 Jul 2018 08:56:30 +0000 (10:56 +0200)]
Small cleanup.

5 years ago Make the eval tool capable of taking the average over a series of flow files.
Steinar H. Gunderson [Sun, 22 Jul 2018 21:52:37 +0000 (23:52 +0200)]
Make the eval tool capable of taking the average over a series of flow files.

5 years ago Make it possible to set alpha/delta/gamma on the command line, for grid searches.
Steinar H. Gunderson [Sun, 22 Jul 2018 13:33:28 +0000 (15:33 +0200)]
Make it possible to set alpha/delta/gamma on the command line, for grid searches.

5 years ago Add in the relative weighting of the variational refinement terms.
Steinar H. Gunderson [Sun, 22 Jul 2018 13:16:43 +0000 (15:16 +0200)]
Add in the relative weighting of the variational refinement terms.