+
+ // For many resamplings (e.g. 640 -> 1280, where the pattern recurs every
+ // two output pixels), the same set of samples repeats periodically along
+ // the output. The number of repetitions is gcd(src_size, dst_size), so we
+ // compute only the first period and then ask the GPU to repeat the texture
+ // for us. This is both easier on the texture cache and somewhat lowers our
+ // CPU cost for generating the kernel.
+ num_loops = gcd(src_size, dst_size);
+ slice_height = 1.0f / num_loops;
+ unsigned dst_samples = dst_size / num_loops;
+