Fast, Branchless Ray/Bounding Box Intersections

(Update: part 2)

Axis-aligned bounding boxes (AABBs) are universally used to bound finite objects in ray-tracing. Ray/AABB intersections are usually faster to calculate than exact ray/object intersections, and allow the construction of bounding volume hierarchies (BVHs) which reduce the number of objects that need to be considered for each ray. (More on BVHs in a later post.) This means that a ray-tracer spends a lot of its time calculating ray/AABB intersections, and therefore this code ought to be highly optimised.

The fastest method for performing ray/AABB intersections is the slab method. The idea is to treat the box as the space inside of three pairs of parallel planes. The ray is clipped by each pair of parallel planes, and if any portion of the ray remains, it intersected the box.

[Slab method demonstration]

A simple implementation of this algorithm might look like this (in two dimensions for brevity):

bool intersection(box b, ray r) {
    double tmin = -INFINITY, tmax = INFINITY;
 
    if (ray.n.x != 0.0) {
        double tx1 = (b.min.x - r.x0.x)/r.n.x;
        double tx2 = (b.max.x - r.x0.x)/r.n.x;
 
        tmin = max(tmin, min(tx1, tx2));
        tmax = min(tmax, max(tx1, tx2));
    }
 
    if (ray.n.y != 0.0) {
        double ty1 = (b.min.y - r.x0.y)/r.n.y;
        double ty2 = (b.max.y - r.x0.y)/r.n.y;
 
        tmin = max(tmin, min(ty1, ty2));
        tmax = min(tmax, max(ty1, ty2));
    }
 
    return tmax >= tmin;
}

However, those divisions take quite a bit of time. Since when ray-tracing, the same ray is tested against many AABBs, it makes sense to pre-calculate the inverses of the direction components of the ray. If we can rely on the IEEE 754 floating-point properties, this also implicitly handles the edge case where a component of the direction is zero - the tx1 and tx2 values (for example) will be infinities of opposite sign if the ray is within the slabs, thus leaving tmin and tmax unchanged. If the ray is outside the slabs, tx1 and tx2 will be infinities with the same sign, thus making tmin == +inf or tmax == -inf, and causing the test to fail.

The final implementation would look like this:

bool intersection(box b, ray r) {
    double tx1 = (b.min.x - r.x0.x)*r.n_inv.x;
    double tx2 = (b.max.x - r.x0.x)*r.n_inv.x;
 
    double tmin = min(tx1, tx2);
    double tmax = max(tx1, tx2);
 
    double ty1 = (b.min.y - r.x0.y)*r.n_inv.y;
    double ty2 = (b.max.y - r.x0.y)*r.n_inv.y;
 
    tmin = max(tmin, min(ty1, ty2));
    tmax = min(tmax, max(ty1, ty2));
 
    return tmax >= tmin;
}

Since modern floating-point instruction sets can compute min and max without branches, this gives a ray/AABB intersection test with no branches or divisions.

My implementation of this in my ray-tracer Dimension can be seen here.

19 thoughts on “Fast, Branchless Ray/Bounding Box Intersections”

  1. This doesn't seem to work out for me for negative direction vectors. Looking forward, I can see a box that's there, but looking backwards, I again see the same box mirrored, even though in this direction it's not "there". Following the algorithm step by step manually for both a positive and a negative z-dir (with x-dir and y-dir set to 0) gives the same near and far planes in both directions:

    /*
    box = MIN 0,0,0  MAX 256,256,256
    ray POS 128,128,-512
    case 1: ray DIR 0,0,0.9 -- inverse: (inf,inf,1.1111)
    case 2: ray DIR 0,0,-0.9 -- inverse: (inf,inf,-1.1111)
    */
    picker.tx1, picker.tx2 = (me.Min.X - ray.Pos.X) * ray.invDir.X, (me.Max.X - ray.Pos.X) * ray.invDir.X	//	-inf,inf			-inf,inf
    picker.ty1, picker.ty2 = (me.Min.Y - ray.Pos.Y) * ray.invDir.Y, (me.Max.Y - ray.Pos.Y) * ray.invDir.Y	//	-inf,inf			-inf,inf
    picker.tz1, picker.tz2 = (me.Min.Z - ray.Pos.Z) * ray.invDir.Z, (me.Max.Z - ray.Pos.Z) * ray.invDir.Z	//	-142.22,142.22		142.22,-142.22
    picker.txn, picker.txf = math.Min(picker.tx1, picker.tx2), math.Max(picker.tx1, picker.tx2)				//	-inf,inf			-inf,inf
    picker.tyn, picker.tyf = math.Min(picker.ty1, picker.ty2), math.Max(picker.ty1, picker.ty2)				//	-inf,inf			-inf,inf
    picker.tzn, picker.tzf = math.Min(picker.tz1, picker.tz2), math.Max(picker.tz1, picker.tz2)				//	-142.22,142.22		-142.22,142.22
    picker.tnear = math.Max(picker.txn, math.Max(picker.tyn, picker.tzn))									//	-142.22				-142.22
    picker.tfar = math.Min(picker.txf, math.Min(picker.tyf, picker.tzf))									//	142.22				142.22
    if picker.tfar < picker.tnear {
    	return true
    }
  2. Right, because the test is only for whether the line intersects the box at all. The line extends both forwards and backwards. Just add a tmax >= 0 check. It's tmax, not tmin, since tmin will be < 0 if the ray originates inside the box.

  3. To get the actual intersection, would I just use tmin, and multiply it with ray dir, adding ray origin? And what if I'm just interested in which face was intersected? x+,x-,y+,y-,z+ or z-?

    1. Yes, that's what I'd do. Except if the ray origin is inside the box (tmin < 0), you need to use tmax instead.

      To see what face was intersected, there's a few different ways. You can keep track of which slab is intersecting in the above algorithm at all times, but that slows it down.

      For a cube centered at the origin, a neat trick is to take the component of the intersection point with the largest absolute value.

      1. So if I return a vec3 with tmin, tmax and a float hit = step(tmin,tmax)*step(0,tmax), I basically know that if hit > 0 then ro+rd*tmin is my lower bound intersection point (entry point) and ro+rd*tmax is my higher bound intersection point (exit point), right? However, if tmin < 0, I'm inside the bounding box, which means I don't need the entry point and I can just use the ray origin as my starting point.

          1. step(edge,A) returns 1 if A >= edge, 0 elsewise. So I know if it was a hit if tmax >= tmin AND tmax >= 0. Then we also know tmin is the closest intersection and tmax is the furthest.

            I implemented this in C++ and it works perfectly for any ray direction. However, in GLSL it seems to have problems with negative directions.

  4. Thanks Tavianator,

    Yeah, the cube-trick requires sorting.
    So for now I test proximity to face within an epsilon.
    If close enough to face, I assume that face was hit.
    I can live with the few false positives, as I shoot over 100M photons each frame anyway.

  5. Thanks for posting this (so long ago)! I used this in my own code, and wondered if it is possible to save 2 of the subtractions by taking advantage of the fact that min(x+a, y+a) = min(x,y)+a (for some constant a):

    r.offset = (r.n_inv.y * r.x0.y) - (r.n_inv.x * r.x0.x);
     
    bool
    intersection(box b, ray r)
    {
      double tx1 = b.min.x * r.n_inv.x;
      double tx2 = b.max.x * r.n_inv.x;
     
      double tmin = min(tx1, tx2) + r.offset;
      double tmax = max(tx1, tx2) + r.offset;
     
      double ty1 = b.min.y * r.n_inv.y;
      double ty2 = b.max.y * r.n_inv.y;
     
      tmin = max(tmin, min(ty1, ty2));
      tmax = min(tmax, max(ty1, ty2));
     
      return tmax >= tmin;
    }
    1. I think that should work, but there's a couple bugs in your example code. And actually for the test as written you can omit r.offset entirely since it doesn't affect the tmax >= tmin check at the end. But since you probably want a tmax >= 0.0 check too, this should work:

      r.offset = (r.n_inv.y * r.x0.y) + (r.n_inv.x * r.x0.x);
       
      bool
      intersection(box b, ray r)
      {
        double tx1 = b.min.x * r.n_inv.x;
        double tx2 = b.max.x * r.n_inv.x;
       
        double tmin = min(tx1, tx2);
        double tmax = max(tx1, tx2);
       
        double ty1 = b.min.y * r.n_inv.y;
        double ty2 = b.max.y * r.n_inv.y;
       
        tmin = max(tmin, min(ty1, ty2));
        tmax = min(tmax, max(ty1, ty2));
       
        return tmax >= max(tmin, r.offset);
      }

      I'll try it out and see how much faster it is, thanks!

  6. Very useful! faster the any method i tried. not because of the !(use of divisions), but because the implementation of the slab method is far simpler than others do. They use things(slow things) like square roots, DOT products and lots of checks with points. now my raycasting method has the speed i wanted. Thanks for that.

Leave a Reply

Your email address will not be published. Required fields are marked *