Reducing number of divisions for polygon mappers - part 5 April 26, 2008
Posted by winden in coding, demoscene.1 comment so far
I had a revelation today. It may not work, or be totally irrelevant, but that’s not important for me at the moment. The idea itself is cool, and that is worth writing about.
When doing perspective correct mapping, you have to interpolate u/z and 1/z horizontally and perform (u/z)/(1/z) every so often. That division is usually done in a pipelined manner so that we continue performing mapping while it is calculated, so in effect it may be “free”.
But we can also “accumulate” all these divisions and perform them in one go, reusing the same technique we used on the first part to compute the 3d-to-2d projections for the three triangle vertices.
Let’s play an example in a scanline that has 4 control points:
- numerators: n0, n1, n2 ,n3
- denominators: d0, d1, d2, d3
We can calculate the values we need in the old way:
- u0 = n0 / d0
- u1 = n1 / d1
- u2 = n2 / d2
- u3 = n3 / d3
But we can also factor out the divisions like this:
- dddd = 1 / ( d0 * d1 * d2 * d3 )
- nddd = n0 * d1 * d2 * d3
- dndd = d0 * n1 * d2 * d3
- ddnd = d0 * d1 * n2 * d3
- dddn = d0 * d1 * d2 * n3
- u0 = nddd * dddd
- u1 = dndd * dddd
- u2 = ddnd * dddd
- u3 = dddn * dddd
The result, as usual, is trading divisions for many multiplies.
Now some parts of the audience will say “But I’m already pipelining my perspective divides with my texturing, what can I gain from this technique?”.
As I said at the start, you could gang all the usual divides you’d do in a scanline into just one, which would mean a lot of work into calculating all the different numerators. Other way would be to gang only 2 consecutive divides, which is easier on the code and be a very simple way to achieve 8 pixel steps instead of 16 pixel ones.
I feel there are a lot of things which can be improved by generalizing this technique.