Today marks the 10 year anniversary of the greatest day of my career.

On this day, 10 years ago, I removed thousands of lines of pointer arithmetic from a C code base.

1/11
The code implemented an algorithm that used a treap: https://en.m.wikipedia.org/wiki/Treap 

Each node in the treap had a parent, left, and right pointer, and the algorithm created a lot of nodes.

An enterprising engineer realized a lot of memory was being used to store the pointers.

2/11
So the engineer decided to save memory by doing one big malloc at the start of the algorithm and allocating all nodes from that block of memory.

If they did that, instead of keeping the actual pointer in left, right, and parent, they could keep the offset into the block.

3/11
This allowed them to capture the offset in 3 bytes instead of 4. Since they packed the struct, they actually got some savings here.

But there were some downsides...

4/11
1) They never freed any of the nodes, so if the algorithm used up the nodes in the preallocated block, it would fail. To "fix" this, they would retry the algorithm over and over changing configurations until it worked within the allocated block.

5/11
2) Debugging the treap was really annoying because you had to do the offset math to get the pointer to look at the content of any nodes.

6/11
My first project in the code base was to fix # 1 (memory exhaustion) without touching the overall approach. I had to implement a basic garbage collector on top of the block. It was not fun, but it fixed the fail/retry situation.

7/11
So what happened 10 years ago? Well, some executive said algorithm performance was the # 1 thing keeping us from being competitive. My boss asked if I knew of anything we could do to make the algorithm faster. I told him I'd go look at the performance profile.

8/11
Lo and behold, we were spending 15% of our time just trying to dereference these weird 3-byte pseudo-pointers. The garbage collector was also expensive.

I removed it all, ran all the tests, and benchmarked.

It was ~25% faster and only needed a bit more memory.

9/11
The absence of real memory impact wasn't because the author of the pseudo pointer code was wrong. It was because over time, more and more fields had been added to the nodes in the treap. The pointers were no longer the bulk of the memory footprint.

10/11
So the moral of the story is that initial assumptions about the things you need to optimize can be invalidated as your code evolves. Revisit past decisions.

Also, deleting code is fun. This is still one of my favorite projects 10 years later.

11/11
You can follow @michellebrush.
Tip: mention @twtextapp on a Twitter thread with the keyword “unroll” to get a link to it.

Latest Threads Unrolled: