SSE 4-Ray vs. Sphere?

Starting easy.
Posts: 8
Joined: Sun Dec 30, 2012 9:48 pm

Postby ironwallaby » Thu Aug 08, 2013 8:48 pm

I'm finally getting with the program and working on learning SSE, but I'm having trouble trying to build a 4 ray vs. 1 sphere intersection function. While it's trivial in single-ray land, when working with SIMD instructions I'm getting pretty confused about how one goes about handling branching.

Does anybody have any pointers to where I can read more about the topic, or (better yet) have any example code that I could dig into?


Posts: 156
Joined: Mon Nov 28, 2011 7:28 pm

Re: SSE 4-Ray vs. Sphere?

Postby graphicsMan » Thu Aug 08, 2013 8:52 pm

This appears to show how to do this: ... onal-code/

Posts: 8
Joined: Sun Dec 30, 2012 9:48 pm

Re: SSE 4-Ray vs. Sphere?

Postby ironwallaby » Fri Aug 09, 2013 7:33 pm

Thanks a ton, that was exactly the nudge I needed.

For example, here's a 4-ray vs. sphere intersection function:

static v4sf intersect(const ray4_t *ray) {
  /* Compute discriminant. */
  const v4sf b = -dot(&(ray->origin), &(ray->direction)),
             d = b * b - sq(&(ray->origin)) + V4SF_ONE,
             t = b - __builtin_ia32_sqrtps(d);

  /* Which rays hit the sphere? (Note that this works since any rays that
   * didn't hit the sphere will be NaN, because we had to sqrt a negative
   * number. Comparing that to zero will be false as a result.) */
  const v4si mask = (v4si) __builtin_ia32_cmpgeps(t, V4SF_ZERO);

  /* Return the rays that hit the sphere, or infinity otherwise. */
  return __builtin_ia32_orps(
    __builtin_ia32_andps((v4sf) mask, t),
    __builtin_ia32_andnps((v4sf) mask, V4SF_INFINITY)
  );
}

(Yeah, yeah, the sphere is always a unit sphere centered on the origin and we only check against the outside, but one thing at a time. :) I also don't know if it's faster to bail early if all of the discriminants are negative or not. I'll have to test.)
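
For anyone who wants to try the early bail mentioned above, here is a rough, self-contained sketch of the same unit-sphere test written with the portable `<xmmintrin.h>` intrinsics instead of the GCC builtins. The `ray_packet` struct and the `intersect_unit_sphere` name are made up for this example (they are not the types from the snippet above), and it assumes the directions are already normalized. Whether the early return is actually faster is untested here; it depends on how often a whole packet misses.

```c
#include <math.h>       /* INFINITY */
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_* */

/* Hypothetical structure-of-arrays packet: lane i of each register
 * belongs to ray i. Directions are assumed to be unit length. */
typedef struct {
    __m128 ox, oy, oz;  /* ray origins */
    __m128 dx, dy, dz;  /* ray directions */
} ray_packet;

/* Returns, per lane, the nearest hit distance against the unit sphere
 * centered on the origin, or +infinity for lanes that miss. */
static __m128 intersect_unit_sphere(const ray_packet *r)
{
    const __m128 zero = _mm_setzero_ps();
    const __m128 inf  = _mm_set1_ps(INFINITY);

    /* od = origin . direction, oo = origin . origin */
    const __m128 od = _mm_add_ps(
        _mm_add_ps(_mm_mul_ps(r->ox, r->dx), _mm_mul_ps(r->oy, r->dy)),
        _mm_mul_ps(r->oz, r->dz));
    const __m128 oo = _mm_add_ps(
        _mm_add_ps(_mm_mul_ps(r->ox, r->ox), _mm_mul_ps(r->oy, r->oy)),
        _mm_mul_ps(r->oz, r->oz));

    /* b = -od, discriminant = b*b - (oo - 1) */
    const __m128 b    = _mm_sub_ps(zero, od);
    const __m128 disc = _mm_sub_ps(_mm_mul_ps(b, b),
                                   _mm_sub_ps(oo, _mm_set1_ps(1.0f)));

    /* Optional early out: _mm_movemask_ps packs the top bit of each lane
     * into an int, so 0 means no lane has a non-negative discriminant. */
    if (_mm_movemask_ps(_mm_cmpge_ps(disc, zero)) == 0)
        return inf;

    /* Lanes with a negative discriminant become NaN here, and NaN >= 0
     * compares false, so those lanes fall through to infinity. */
    const __m128 t   = _mm_sub_ps(b, _mm_sqrt_ps(disc));
    const __m128 hit = _mm_cmpge_ps(t, zero);

    /* Branchless select: t where hit, infinity elsewhere. */
    return _mm_or_ps(_mm_and_ps(hit, t), _mm_andnot_ps(hit, inf));
}
```

The only real branch is the packet-wide one; the per-lane decision stays a mask-and-select, same as before. Dropping the `if` gives identical results, so it's easy to time both ways.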
