Today is the release of version 1.0 of
bfs, a fully-compatible* drop-in replacement for the UNIX
find command. I thought this would be a good occasion to write more about its implementation. This post will talk about how I parse the command line. Continue reading bfs from the ground up, part 2: parsing
If you need to multiply some matrices together very quickly, usually it's best to use a highly optimized library like ATLAS. But sometimes adding such a dependency isn't worth it, if you're worried about portability, code size, etc. If you just need good performance, rather than the best possible performance, it can make sense to hand-roll your own matrix multiplication function. Continue reading A quick trick for faster naïve matrix multiplication
bfs is a tool I've been writing for about a year, with the goal of being a drop-in replacement for the UNIX
find command. This series of posts will be deep technical explorations of its implementation, starting from the lower levels all the way up to the user interface.
bfs is small (only about 3,500 lines of C code), which makes it possible to do a fairly complete analysis. But the codebase is fairly clean and highly optimized, which should make the analysis interesting. Continue reading bfs from the ground up, part 1: traversal
In Java, primitives (
char) are not
Objects. But since a lot of Java code requires
Objects, the language provides boxed versions of all primitive types (
Character). Autoboxing allows you to write code like
Continue reading Java autoboxing performance
In part 1, I outlined an algorithm for computing intersections between rays and axis-aligned bounding boxes. The idea to eliminate branches by relying on IEEE 754 floating point properties goes back to Brian Smits in , and the implementation was fleshed out by Amy Williams. et al. in .
Continue reading Fast, Branchless Ray/Bounding Box Intersections, Part 2: NaNs
It's surprisingly difficult to find a good code snippet for this on Google, so here's an efficient computation of integer powers in C, using binary exponentiation:
Continue reading Efficient Integer Exponentiation in C
Clang is known for its great error messages, but I did manage to horribly confuse it:
Continue reading I confused the compiler
alloca() function in C is used to allocate a dynamic amount of memory on the stack. Despite its advantages in some situations, it is non-standard and will probably remain so forever.
Continue reading Standards-compliant* alloca()
The visitor pattern is tremendously useful when working with certain kinds of information like abstract syntax trees. It's basically a poor man's version of sum types for languages that don't natively support them. Unfortunately, they take advantage of function overloading, something which duck-typed languages like Python lack.
Continue reading The Visitor Pattern in Python
Never seen this one before:
malloc.c:2369: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 0)' failed.