# A quick trick for faster naïve matrix multiplication

If you need to multiply some matrices together very quickly, usually it's best to use a highly optimized library like ATLAS. But sometimes adding such a dependency isn't worth it, if you're worried about portability, code size, etc. If you just need good performance, rather than the best possible performance, it can make sense to hand-roll your own matrix multiplication function. Continue reading A quick trick for faster naïve matrix multiplication

# bfs from the ground up, part 1: traversal

bfs is a tool I've been writing for about a year, with the goal of being a drop-in replacement for the UNIX find command. This series of posts will be deep technical explorations of its implementation, starting from the lower levels all the way up to the user interface. bfs is small (only about 3,500 lines of C code), which makes it possible to do a fairly complete analysis. But the codebase is fairly clean and highly optimized, which should make the analysis interesting. Continue reading bfs from the ground up, part 1: traversal

# The Approximating and Eliminating Search Algorithm

Nearest neighbour search is a very natural problem: given a target point and a set of candidates, find the closest candidate to the target. For points in the standard k-dimensional Euclidean space, k-d trees and related data structures offer a good solution. But we're not always so lucky.

# Java autoboxing performance

In Java, primitives (int, double, char) are not Objects. But since a lot of Java code requires Objects, the language provides boxed versions of all primitive types (Integer, Double, Character). Autoboxing allows you to write code like

# Fast, Branchless Ray/Bounding Box Intersections, Part 2: NaNs

In part 1, I outlined an algorithm for computing intersections between rays and axis-aligned bounding boxes. The idea to eliminate branches by relying on IEEE 754 floating point properties goes back to Brian Smits in [1], and the implementation was fleshed out by Amy Williams. et al. in [2].

# Efficient Integer Exponentiation in C

It's surprisingly difficult to find a good code snippet for this on Google, so here's an efficient computation of integer powers in C, using binary exponentiation:

# I confused the compiler

Clang is known for its great error messages, but I did manage to horribly confuse it:

# Standards-compliant* alloca()

The alloca() function in C is used to allocate a dynamic amount of memory on the stack. Despite its advantages in some situations, it is non-standard and will probably remain so forever.

# The Visitor Pattern in Python

The visitor pattern is tremendously useful when working with certain kinds of information like abstract syntax trees. It's basically a poor man's version of sum types for languages that don't natively support them. Unfortunately, they take advantage of function overloading, something which duck-typed languages like Python lack.

# Exact Bounding Boxes for Spheres/Ellipsoids

Finding the tightest axis-aligned bounding box for a sphere is trivial: the box extends from the center by the radius in all dimensions. But once the sphere is transformed, finding the minimal bounding box becomes trickier. Rotating a sphere, for example, shouldn't change its bounding box, but naïvely rotating the bounding box will expand it unnecessarily. Luckily there's a trick to computing minimal bounding boxes by representing the transformed sphere as a quadric surface.