Csillámtrace - my new graphics engine

Practical and theoretical implementation discussion.
Vilem Otte
Posts: 24
Joined: Sun Dec 25, 2011 12:42 am

Re: Csillámtrace - my new graphics engine

Post by Vilem Otte » Sat Apr 14, 2012 3:44 pm

Geri wrote: and i also don't want to use/rely on any third-party code/code snippets/solutions/principles
I understand this; all my code is written from scratch too ;) It feels better, not to mention when a bug occurs (tracking it down can be a bit nightmarish when you rely heavily on third-party code).
Geri wrote: i didn't know what a sponza is. http://hdri.cgtechniques.com/~sponza/files/ you meant this? ^^
Yup, that's the Sponza. There are some other well-known models to work with (Sibenik cathedral, Happy Buddha, Power Plant, etc.). Of course none of them will tell you exactly what you need, as your scenes will definitely differ (and a different geometry layout *can* make quite a difference).

I also ran the benchmark on a Core i7 (Linux, under Wine 1.4 ... which is probably why it doesn't report the correct system :mrgreen:)

Code:

          MMX: OK   3Dnow: x    SSE: OK   SSE2: OK   SSE3: OK   SSE4: OK   
SYS_init: Windows NT/XP/2k/Vista/7/8
CPU raw performance:
  Cache16k read:         1525 MByte/sec
  Cache128k read:        4746 MByte/sec
  Cache256k read:        4502 MByte/sec
  Cache512k read:        4808 MByte/sec
  Cache16k write:        2533 MByte/sec
  Cache128k write:       5808 MByte/sec
  Cache256k write:       6571 MByte/sec
  Cache512k write:       6774 MByte/sec
  MIPS (simple) SMP:     4963
  MIPS (simple):         4994
  MIPS SMP:              1203
  MIPS:                  1237
  CMP:                   1099 million ops/sec
FPU raw performance:
  FLOPS (simple) SMP:    1211 (megaflops)
  FLOPS (simple):        1181 (megaflops)
  FLOPS SMP:             353 (megaflops)
  FLOPS:                 337 (megaflops)
  FCMP:                  1162 million ops/sec
  FDIV:                  112 million ops/sec
CPU algorithmic performance:
  PI:                    56 million iteration/sec
  VEC3 normalize:        37 million/sec
  VEC3 normalize (SSE):  45 million/sec
  Ray-Sphere collision:  225 million/sec
  Ray-Tri collision:     9 million/sec
  TMU (+bilinear filter):11.2 million sample/sec
  Square Root (formula): 8.3 million res/sec
  Square Root (x87):     111.2 million res/sec
  Shell sort 1M:         6.2 result/sec
  8 queen:               3.7 million queen test/sec
  RSA gen/en/decrypt:    0.72 million/sec
SYSTEM performance:
  Memory Read SMP:       2560 MByte/sec
  Memory Read:           3800 MByte/sec
  Memory Write SMP:      1655 MByte/sec
  Memory Write:          2999 MByte/sec
  Memory Copy SMP:       1050 MByte/sec
  Memory Copy:           1050 MByte/sec
  Address translation:   825 million result/sec
  Library calls:         86 million call/sec
  malloc/free:           4.8 million pairs/sec
  Thread creation:       11249 thread/sec
Minibenchmark done.
I think I would get better numbers natively, but I'm currently running only Linux systems :ugeek: , so this ran under Wine 1.4.

Geri
Posts: 146
Joined: Fri Mar 02, 2012 7:01 pm

Re: Csillámtrace - my new graphics engine

Post by Geri » Sat Apr 14, 2012 5:12 pm

ty for running my benchmark.
8 queen: 3.7 million queen test/sec
RSA gen/en/decrypt: 0.72 million/sec

these two are broken in that version; there were some bugs, but i fixed them (ray-tri may also be broken in that version). strange that it doesn't dump the cpu type, because if it can't query it from the system, it should get it via cpuid. and yes, wine ruined the results: it couldn't get the number of cpus, so the SMP tests also ran without threads.
Into our bony hands, misery
carved the runes of a thousand years

Geri
Posts: 146
Joined: Fri Mar 02, 2012 7:01 pm

Re: Csillámtrace - my new graphics engine

Post by Geri » Sun Apr 15, 2012 10:41 pm

i created an algorithm that detects the ugly flickering of the voxels.

i'm fine-tuning it now; it seems good.

(activation is indicated in green)

Image

Geri
Posts: 146
Joined: Fri Mar 02, 2012 7:01 pm

Re: Csillámtrace - my new graphics engine

Post by Geri » Mon Apr 16, 2012 1:07 am

Image

native sky support, to avoid the unmanageable amount of voxelisation a skysphere would need.

Vilem Otte
Posts: 24
Joined: Sun Dec 25, 2011 12:42 am

Re: Csillámtrace - my new graphics engine

Post by Vilem Otte » Tue Apr 17, 2012 9:35 am

Is the sky handled through a sky dome (testing the ray against a sphere), a sky box (testing the ray against an AABB), or just through a "hack"?

Geri
Posts: 146
Joined: Fri Mar 02, 2012 7:01 pm

Re: Csillámtrace - my new graphics engine

Post by Geri » Tue Apr 17, 2012 12:49 pm

if (result == sky) {
    float u, v;
    dual_paraboloid_mapping(&u, &v, vector);  /* ray direction -> texture coordinates */
    TMU(result_rgba, u, v, sky_texid);        /* sample the sky texture */
    return result;
}
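For readers wondering what such a mapping call might compute, here is a minimal sketch of a dual-paraboloid lookup. It is not Geri's actual code: the name `dual_paraboloid_uv`, the choice of z as the paraboloid axis, and the side-by-side atlas layout (front half on the left, back half on the right) are all assumptions, and the direction is assumed to be unit length.

```c
#include <assert.h>

/* Hypothetical helper: maps a unit-length direction to (u, v) in a sky
 * texture that stores the front paraboloid (z >= 0) in its left half and
 * the back paraboloid (z < 0) in its right half. Layout is an assumption. */
void dual_paraboloid_uv(float *u, float *v, const float dir[3])
{
    float x = dir[0], y = dir[1], z = dir[2];
    float len2 = x * x + y * y + z * z;
    assert(len2 > 0.999f && len2 < 1.001f);   /* caller must normalize */

    int front = (z >= 0.0f);
    float abs_z = front ? z : -z;
    float k = 1.0f / (1.0f + abs_z);          /* paraboloid projection scale */
    float s = x * k;                          /* s and t lie in [-1, 1] */
    float t = y * k;

    *u = (front ? 0.25f : 0.75f) + 0.25f * s; /* pick the atlas half */
    *v = 0.5f + 0.5f * t;
}
```

A direction straight along +z lands in the centre of the left half, and one along -z in the centre of the right half. The appeal for a sky is that one texture fetch replaces any intersection test against sky geometry.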

Vilem Otte
Posts: 24
Joined: Sun Dec 25, 2011 12:42 am

Re: Csillámtrace - my new graphics engine

Post by Vilem Otte » Tue Apr 24, 2012 12:57 am

Maybe I'll spam a little bit here, but I think jbikker made a small mistake when computing MRays/s and cycles...
Let's start with the 550M = 24cycles:
At 3.4Ghz, 12 cores do 12 * 3.4 * 10^9 cycles = 4.8e10. Divided by 5.5e7 is 872.7, *per ray*, which seems reasonable.
We've got 12 * 3.4 * 10^9 = 40.8 * 10^9 = 4.08 * 10^10 cycles per second.
We've also got 550 MRays/s, that's 5.5 * 10^8 rays/s.

And according to WolframAlpha, 4.08 * 10^10 / (5.5 * 10^8) is approx. 74.2 cycles *per ray*.

So his code is actually about 10 times faster than he states: 74.2 cycles *per ray*. That is amazingly fast; with what I have now (Core i3 @ 2.27 GHz, single core, KD-tree) I get roughly 2.7 MRays/s, so I'm at some 840 cycles *per ray*, about 10 times slower than Arauna :o
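The corrected arithmetic is easy to check with a few lines of C. This is only a sketch: `cycles_per_ray` is a hypothetical helper, and it assumes every core runs at its nominal clock and is fully busy tracing rays.

```c
#include <assert.h>

/* Average CPU cycles spent per ray.
 * cores * clock_hz is the total cycle budget per second;
 * rays_per_second is the throughput in rays/s (not MRays/s). */
double cycles_per_ray(int cores, double clock_hz, double rays_per_second)
{
    assert(cores > 0 && clock_hz > 0.0 && rays_per_second > 0.0);
    return (double)cores * clock_hz / rays_per_second;
}
```

Calling `cycles_per_ray(12, 3.4e9, 550e6)` gives about 74.2, and `cycles_per_ray(1, 2.27e9, 2.7e6)` gives about 840, matching the two figures above.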

For a single triangle it's quite different... I get some 89 M ray-triangle tests per second for packets (on a 1920x1080 screen), that's some 25 cycles per ray-tri test :twisted: (100 cycles for a 2x2 ray packet vs. triangle test)! The assembly (compiler-generated, not hand-written!) is about 130 lines. A good optimization that Arauna uses (and I don't) is to first test the ray packet's frustum against the triangle to see whether it can hit at all, which sometimes gives a huge boost. I do some 14 ray-triangle tests per ray on average (interior "ball-in-a-stadium"-like scene), so 14 * 25 = 350 cycles *per ray* go to ray-triangle tests and the remaining 490 cycles go to searching the KD-tree.
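For reference, one common barycentric ray-triangle test is Möller-Trumbore. The scalar, single-ray sketch below is not the packet code measured above, nor Arauna's; the `vec3` type, the helper names, and the epsilon value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float x, y, z; } vec3;

static vec3 sub(vec3 a, vec3 b) { return (vec3){a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(vec3 a, vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static vec3 cross(vec3 a, vec3 b)
{
    return (vec3){a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

/* Moller-Trumbore test: returns true on a hit in front of the origin and
 * writes the distance t and the barycentric coordinates (u, v). */
bool ray_triangle(vec3 orig, vec3 dir, vec3 v0, vec3 v1, vec3 v2,
                  float *t, float *u, float *v)
{
    assert(t && u && v);
    const float eps = 1e-7f;                  /* assumed tolerance */
    vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (det > -eps && det < eps) return false;   /* ray parallel to triangle */
    float inv_det = 1.0f / det;

    vec3 s = sub(orig, v0);
    float bu = dot(s, p) * inv_det;
    if (bu < 0.0f || bu > 1.0f) return false;

    vec3 q = cross(s, e1);
    float bv = dot(dir, q) * inv_det;
    if (bv < 0.0f || bu + bv > 1.0f) return false;

    float dist = dot(e2, q) * inv_det;
    if (dist <= eps) return false;            /* hit is behind the origin */

    *t = dist; *u = bu; *v = bv;
    return true;
}
```

The early exits matter for the cycle count: most rays that miss are rejected after the first barycentric coordinate, before the second cross product is computed.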

Getting code as fast (or should I rather say slow) as mine is quite easy: I just have compact classes for everything and intrinsics functions optimized by trial and error. Getting to Arauna from here is a lot harder. In my opinion (from what I've gathered reading the Arauna code):
1.) Trade precision for speed - I'm still using (right now) a fairly precise ray-triangle test (the barycentric one); Arauna uses Plücker (imprecise, but a lot faster).
2.) Use ray packets and frustums -> frustums quickly tell whether the whole packet hits the node (or the object), and an SoA layout of ray packets gives you some 2x - 2.5x speedup over standard AoS code (I'm currently using just ray packets, not frustums).
3.) Do everything in SoA layout, it'll speed things up a bit.
4.) You still won't be there, but you'll be just 2 times or so slower; that'd be the time to use a magic wand! :mrgreen: Or you can buy four 16-core Opterons :ugeek:
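To make the AoS vs. SoA point in the list concrete, here is a small sketch of a 2x2 ray packet in both layouts. All type and function names are hypothetical, and the ray-plane distance is just a stand-in for a real traversal or intersection step; it is written as a plain loop, where each line of the body would correspond to a handful of 4-wide SIMD operations.

```c
#include <assert.h>

#define PACKET_SIZE 4   /* one 2x2 packet */

/* AoS: one struct per ray, components interleaved in memory. */
typedef struct { float ox, oy, oz, dx, dy, dz; } ray_aos;

/* SoA: one array per component, so four x values sit next to each other. */
typedef struct {
    float ox[PACKET_SIZE], oy[PACKET_SIZE], oz[PACKET_SIZE];
    float dx[PACKET_SIZE], dy[PACKET_SIZE], dz[PACKET_SIZE];
} ray_packet;

/* Transpose four AoS rays into one SoA packet. */
void packet_from_aos(ray_packet *p, const ray_aos rays[PACKET_SIZE])
{
    assert(p && rays);
    for (int i = 0; i < PACKET_SIZE; ++i) {
        p->ox[i] = rays[i].ox; p->oy[i] = rays[i].oy; p->oz[i] = rays[i].oz;
        p->dx[i] = rays[i].dx; p->dy[i] = rays[i].dy; p->dz[i] = rays[i].dz;
    }
}

/* Distance along each ray to the plane n . x = d.
 * The result is inf or NaN for rays parallel to the plane. */
void packet_plane_distance(const ray_packet *r, float nx, float ny, float nz,
                           float d, float out_t[PACKET_SIZE])
{
    assert(r && out_t);
    for (int i = 0; i < PACKET_SIZE; ++i) {
        float denom = nx * r->dx[i] + ny * r->dy[i] + nz * r->dz[i];
        float numer = d - (nx * r->ox[i] + ny * r->oy[i] + nz * r->oz[i]);
        out_t[i] = numer / denom;
    }
}
```

With the SoA layout every load in the loop body reads four consecutive floats, which is what lets either the compiler or hand-written intrinsics process the whole packet at once; the AoS layout would need a gather or shuffle for each component first.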

Geri
Posts: 146
Joined: Fri Mar 02, 2012 7:01 pm

Re: Csillámtrace - my new graphics engine

Post by Geri » Tue Apr 24, 2012 7:06 pm

Image
hdr (80% done)

Geri
Posts: 146
Joined: Fri Mar 02, 2012 7:01 pm

Re: Csillámtrace - my new graphics engine

Post by Geri » Tue May 01, 2012 12:12 pm

i am currently writing the key/mouse/joystick input system, because i am replacing that too, so nothing strictly ray tracing related is happening in the project these days. after i finish that, i will expand the scene rebuilding algorithm to handle up to 64 threads instead of the current maximum of 8, then i will refactor the camera and character handling from my old engine. i should also do a linux port of the platform api, which will be easy, since the platform parts are strictly separated from the algorithmic implementations. once that is done, i will fix bugs in the 2d handling and implement some missing effects in the ray tracing pipeline, and then the base engine will be fully functional, which i am really looking forward to, since i want to test it in some complex situations.

Geri
Posts: 146
Joined: Fri Mar 02, 2012 7:01 pm

Re: Csillámtrace - my new graphics engine

Post by Geri » Sun May 13, 2012 9:59 pm

an intelligent, self-working movement detector finds the possible fugly blocky areas on the screen and fixes the blockiness on bigger camera movements

(i am also surprised that this thing actually works)

Image
