So for all you non-programming types, here are the two pictures:
After Compression (Normal Packing):
See the difference? Good. I didn't think so :)
I don't usually post code, as I figure it would bore most of you, but since I'm mainly just doing behind-the-scenes code anyway, here is a post for all the game programmers out there. Ignore the rest if you are not of the code persuasion.
I ran into a bottleneck on my GeForce 6600 card in the shader bandwidth. I'm currently using 4 render targets and I wanted to see if I could get it down to 2. The answer is yes. I had tweeted a question about a way to pack bits on the GPU. I had done it on the CPU using the | (or) operator and knew that didn't work so well on a graphics card. Well, @paveltumik came to my rescue with an answer that should have been pretty obvious in hindsight. You can store one number in the fractional portion (the .001 part) and one in the non-fractional portion (the 100 part) of a single float. Here is the code if you are interested. It makes a few assumptions:
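1) You are compressing numbers between -1 and 1.
2) You only care about two digits of precision, which is fine since I am packing into a 16-bit float format.
3) I deal with the 1.0 case by mapping the second value into the range 0.0 - 0.8 instead of 0.0 - 1.0, so the fractional part never reaches 1.0 and spills over into the non-fractional part.
For normals, this was perfect for me.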
//Thanks Pavel Tumik! @paveltumik for the original code in comments
//pack: f1=(f1+1)*0.5; f2=(f2+1)*0.5; res=floor(f1*1000)+f2;
inline float PackFloat16bit2(float2 src)
{
    //x goes in the non-fractional part (0..100), y in the fractional part (0.0..0.8)
    return floor((src.x+1)*0.5f * 100.0f)+((src.y+1)*0.4f);
}

//unpack: f2=frac(res); f1=(res-f2)/1000; f1=(f1-0.5)*2;f2=(f2-0.5)*2;
inline float2 UnPackFloat16bit2(float src)
{
    float2 o;
    float fFrac = frac(src);
    o.y = (fFrac-0.4f)*2.5f;
    o.x = ((src-fFrac)/100.0f-0.5f)*2;
    return o;
}
Also, I left some of the constants uncombined just to make the code more readable. Hope this helps someone else.