If you need to multiply some matrices together very quickly, usually it's best to use a highly optimized library like ATLAS. But sometimes adding such a dependency isn't worth it, if you're worried about portability, code size, etc. If you just need *good* performance, rather than *the best possible* performance, it can make sense to hand-roll your own matrix multiplication function. Continue reading A quick trick for faster naïve matrix multiplication

# A quick trick for faster naïve matrix multiplication

# bfs from the ground up, part 1: traversal

`bfs`

is a tool I've been writing for about a year, with the goal of being a drop-in replacement for the UNIX `find`

command. This series of posts will be deep technical explorations of its implementation, starting from the lower levels all the way up to the user interface. `bfs`

is small (only about 3,500 lines of C code), which makes it possible to do a fairly complete analysis. But the codebase is fairly clean and highly optimized, which should make the analysis interesting. Continue reading bfs from the ground up, part 1: traversal

# The Approximating and Eliminating Search Algorithm

Nearest neighbour search is a very natural problem: given a target point and a set of candidates, find the closest candidate to the target. For points in the standard k-dimensional Euclidean space, k-d trees and related data structures offer a good solution. But we're not always so lucky.

Continue reading The Approximating and Eliminating Search Algorithm

# Java autoboxing performance

In Java, primitives (`int`

, `double`

, `char`

) are not `Object`

s. But since a lot of Java code requires `Object`

s, the language provides *boxed* versions of all primitive types (`Integer`

, `Double`

, `Character`

). Autoboxing allows you to write code like

# Fast, Branchless Ray/Bounding Box Intersections, Part 2: NaNs

In part 1, I outlined an algorithm for computing intersections between rays and axis-aligned bounding boxes. The idea to eliminate branches by relying on IEEE 754 floating point properties goes back to Brian Smits in [1], and the implementation was fleshed out by Amy Williams. et al. in [2].

Continue reading Fast, Branchless Ray/Bounding Box Intersections, Part 2: NaNs

# Efficient Integer Exponentiation in C

It's surprisingly difficult to find a good code snippet for this on Google, so here's an efficient computation of integer powers in C, using binary exponentiation:

# I confused the compiler

Clang is known for its great error messages, but I did manage to horribly confuse it:

# Standards-compliant* alloca()

The `alloca()`

function in C is used to allocate a dynamic amount of memory on the stack. Despite its advantages in some situations, it is non-standard and will probably remain so forever.

# The Visitor Pattern in Python

The visitor pattern is tremendously useful when working with certain kinds of information like abstract syntax trees. It's basically a poor man's version of sum types for languages that don't natively support them. Unfortunately, they take advantage of function overloading, something which duck-typed languages like Python lack.

# Exact Bounding Boxes for Spheres/Ellipsoids

Finding the tightest axis-aligned bounding box for a sphere is trivial: the box extends from the center by the radius in all dimensions. But once the sphere is transformed, finding the minimal bounding box becomes trickier. Rotating a sphere, for example, shouldn't change its bounding box, but naïvely rotating the bounding box will expand it unnecessarily. Luckily there's a trick to computing minimal bounding boxes by representing the transformed sphere as a quadric surface.

Continue reading Exact Bounding Boxes for Spheres/Ellipsoids