Been reading some RGBA articles… August 27, 2005
Posted by winden in coding, demoscene.
I was reading some articles yesterday, which touched on some very interesting topics like mesh-storage optimization for intros. They were taking two basic approaches:
1. Store a basic quad-mesh and refine it with bicubic surfaces. What the articles didn’t say was whether the meshes were then stored pre-surfaced in memory or re-surfaced in realtime with vertex shaders while rendering. Of course this approach makes meshes much easier to pack, since you don’t need to store all the extra detail that is readily available through interpolation.
2. These basic meshes were then delta-encoded, most probably with one byte per coordinate, so that they could be packed better. This has been standard procedure for sound data (“The Player” mod-player could pack the 8bit samples with 8bit deltas, or even 4bit deltas for lossy compression), but is not so widely used for meshes… or is it?
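To make the delta idea concrete, here is a minimal C sketch. It assumes the coordinates have already been quantized to 8 bits each, which is my guess and not something the articles state. The names delta_encode and delta_decode are made up for illustration. Deltas wrap modulo 256, so decoding is exact even when a coordinate jumps a long way.

```c
#include <stdint.h>
#include <stddef.h>

/* Replace each 8-bit coordinate by its difference from the previous one.
   The first output byte is the first coordinate itself (previous = 0).
   Differences wrap modulo 256, so no information is lost. */
void delta_encode(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)(in[i] - prev);
        prev = in[i];
    }
}

/* Rebuild the coordinates by keeping a running sum of the deltas. */
void delta_decode(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc = (uint8_t)(acc + in[i]);
        out[i] = acc;
    }
}
```

On a smooth mesh, neighbouring coordinates are close together, so most deltas end up as small values clustered around zero, and that is what makes the stream friendlier to the executable packer.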
But my main point in commenting on the articles is to point out how much PC coding is being affected by PS2 coding, even if only indirectly, with the same tradeoffs being made for completely different purposes.
Take the delta coding I just wrote about: it’s in fact a hardware feature on PS2 that a DMA’d chunk of coordinates can come packed as byte-per-coord deltas and get unpacked to 32 bits before reaching the vector units that drive the graphics chips… but there it’s done to save bus bandwidth, because PS2 can draw many more polys than it could receive over the busses!
One of the first games which looked really good on PS2 was SSX, a snowboarding game which used realtime surfaces to render the tracks with dynamic level-of-detail, also due to the speed mismatch… in fact, generating these surfaces produced more vertexes and more detail without forcing any slowdown!
So there you have it. What may we expect when the new, heavily multiprocessing PS3 hits? I sure wish we all get cell-like processors to play with! ;)
Why is learning assembler-democoding easier than C-democoding? August 24, 2005
Posted by winden in coding, demoscene.
This may be a shock for some of you, but in fact I think that’s true. Let’s take a simple thing… loading a texture to use it for a rotozoom for example. If you were coding in C, there is no easy way (for a complete beginner) to load an image into memory.
The C-approved method would be to use a library that takes a filename and decodes the image into a memory area. But first you have to learn how to link to libraries, and then how to use said library to load the image. And it even depends on the operating system: you will end up using quicktime, sdl, datatypes.library or some windows thing.
Compare it to the asm-approved method: using raw data. Take your image, load it into a converter, save it as raw, and then use “incbin” to place the data into the executable. No libraries, no external files, no whatever. Just mere bytes, and only the bytes we really need: the image ones, without any extra header or footer data to confuse things. Just learn to save the raw data from the converter and there you go.
Simplicity is golden, even more so to a beginner.
Six years later… August 23, 2005
Posted by winden in coding, demoscene.add a comment
Today I was having a refresher on 68030 and 68060 optimizing, and found a nice detail: both the 127(a0,d0) and 127(pc,d0) addressing modes take the same time to execute! :) What does this mean?
A. Good things: In a texture mapping loop, we could address the texture using pc-relative addressing, which means that we get a free address register for other purposes.
B. Bad things: For each texture we want to use with a texturing loop, we have to copy the loop body just below the texture memory.
Some examples, first the old standard way:
loop:
        move.w  d0,d6
        move.b  d1,d6
        addx.l  d2,d0
        addx.l  d3,d1
        move.b  (a0,d6.l),(a1)+
        dbf     d7,loop
        rts
and now the “new and improved” way:
loop:
        move.w  d0,d6
        move.b  d1,d6
        addx.l  d2,d0
        addx.l  d3,d1
        move.b  (tex-*)(pc,d6.l),(a1)+
        dbf     d7,loop
        rts

        cnop    0,4
tex:
        incbin  texture.raw
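For those who don’t read 68k, here is a rough C model of what both loops compute. It is a sketch and not a translation: it assumes a 256x256 texture with one byte per texel and plain 8.8 fixed-point coordinates, while the asm above interleaves integer and fractional parts across registers to make the addx trick work. The name fill_span and its parameters are made up for illustration. The register saving itself can’t be shown in C, since the texture base is just a pointer there.

```c
#include <stdint.h>
#include <stddef.h>

/* Fill `count` pixels of a span from a 256x256, 8-bit texture.
   u, v and their per-pixel steps du, dv are 8.8 fixed point
   (assumed layout, simpler than the one the asm loop uses). */
void fill_span(uint8_t *dst, const uint8_t *tex, size_t count,
               uint16_t u, uint16_t v, uint16_t du, uint16_t dv)
{
    for (size_t i = 0; i < count; i++) {
        /* Integer part of v picks the row, integer part of u the column. */
        size_t offset = ((size_t)(v >> 8) << 8) | (size_t)(u >> 8);
        dst[i] = tex[offset];
        /* Step the coordinates; 16-bit wraparound tiles the texture. */
        u = (uint16_t)(u + du);
        v = (uint16_t)(v + dv);
    }
}
```

In the first asm loop `tex` corresponds to a0. In the second it is implied by where the code sits relative to the texture data, which is why each texture needs its own copy of the loop.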
UPDATE 24/08/2005:
Of course the times were always the same for address-register-relative and pc-relative addressing, but it was re-reading that specific piece, together with having worked on texturemappers for some hours, that led me to the “logical” result of instancing a new filler specifically for each texture.