From ab3b838374200c9050ac57c53d3c183fbb58a7be Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson"
Date: Sat, 4 Aug 2018 21:20:26 +0200
Subject: [PATCH] Update the SOR comment about twinned buffering.

---
 sor.frag | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/sor.frag b/sor.frag
index e1f86bb..ef431d3 100644
--- a/sor.frag
+++ b/sor.frag
@@ -45,8 +45,9 @@ void main()
 	// just immediately throws away half of the warp, but it helps convergence
 	// a _lot_ (rough testing indicates that five iterations of SOR is as good
 	// as ~50 iterations of Jacobi). We could probably do better by reorganizing
-	// the data into two-values-per-pixel, so-called “twinning buffering”,
-	// but it makes for rather annoying code in the rest of the pipeline.
+	// the data into two-values-per-pixel, so-called “twinned buffering”;
+	// seemingly, it helps Haswell by ~15% on the SOR code, but GTX 950 not at all
+	// (at least not on 720p). Presumably the latter is already bandwidth bound.
 
 	int color = int(round(element_sum_idx)) & 1;
 	if (color != phase) discard;
-- 
2.39.2