Software optimizations, in a nutshell

I think pretty much all software optimizations can be summed up like this:

Don’t do more work than you need to.

Whether that means choosing a better algorithm, reducing memory allocations, or minimizing instruction counts, that’s really what it comes down to.

By choosing a better algorithm, you’re doing less work. By reducing memory allocations, you’re moving less data around in memory, which is less work. By minimizing instruction counts, you’re doing less work.

Let me give an example.  Earlier this week, a co-worker needed some way to check values in our production data.  This was previously possible through the UI, so there was a manual process of switching between two websites to check for a particular condition.  Because of some recent UI changes, this was no longer possible, so they needed another way of doing it.  I was able to put together a small Python script that did this.  The script made some HTTP calls against our production site.

The first version of the script I wrote worked, but it was not efficient. First, it made one call to fetch a list of items, which it then iterated over. For each of those items, it made another HTTP call to get some additional values. It worked, but it was slow, especially in cases where the first call returned a large number of items. Supposing that each secondary call took at least one second, if the first call returned 1000 items, you’d be waiting over 16 minutes for it to complete. Not cool. It was way too slow to be useful.

In the spirit of the mantra “Make it. Make it work. Make it work right. Make it work right, fast.”, the second version of the script made just two HTTP calls total. The calls were still fairly large, but the overall script was much faster. The run time on 1000 items went from that estimated 16-plus minutes down to under 10 seconds.
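To make the difference concrete, here is a rough sketch of the two call patterns. It is not the actual script: `FakeApi` and its methods are hypothetical stand-ins that count calls instead of touching the network, and the one-second cost per call is the same assumption used in the estimate above.

```python
class FakeApi:
    """Hypothetical stand-in for the production site; counts calls, no network."""

    def __init__(self, n_items):
        self.n_items = n_items
        self.calls = 0

    def list_items(self):
        # One call that returns every item id.
        self.calls += 1
        return list(range(self.n_items))

    def get_details(self, item_id):
        # One call per item: this is the expensive pattern.
        self.calls += 1
        return {"id": item_id, "value": item_id * 2}

    def list_all_details(self):
        # One bulk call that returns the details for every item.
        self.calls += 1
        return [{"id": i, "value": i * 2} for i in range(self.n_items)]


def slow_version(api):
    # One call for the list, then one call per item: 1 + N calls.
    return [api.get_details(item_id) for item_id in api.list_items()]


def fast_version(api):
    # Two calls total; the matching is done locally by scanning a list.
    items = api.list_items()
    details = api.list_all_details()
    return [d for d in details if d["id"] in items]


def estimated_minutes(n_calls, seconds_per_call=1.0):
    # Assumes every call costs the same fixed time.
    return n_calls * seconds_per_call / 60
```

With 1000 items, `slow_version` makes 1001 calls (a bit over 16 minutes at one second each), while `fast_version` makes 2 and returns the same data.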

I could have taken things further: rather than iterating over a list of 1000 items (several times), I could have swapped the list for a hash table. But at this point, it was already fast enough (and I had a large number of other tasks on my plate). We had already taken a manual process from 10-ish minutes a day down to ~30 seconds. (A user still had to go and check the CSV file generated by the script.) It would certainly have been possible to cut that 10-ish second run time down to just a second or two, but the law of diminishing returns kicks in, and at a certain point, it wasn’t worth optimizing further.
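For reference, the list-to-hash-table swap would look roughly like this. The record shape (`id`, `value`) is made up for illustration; it is not the script's real data.

```python
def find_by_scan(records, wanted_id):
    # Linear scan: checks records one by one until it finds a match.
    for record in records:
        if record["id"] == wanted_id:
            return record
    return None


def build_index(records):
    # One pass to build a dict keyed by id; later lookups are O(1) on average.
    return {record["id"]: record for record in records}


# Hypothetical data, standing in for the items returned by the HTTP calls.
records = [{"id": i, "value": i * 2} for i in range(1000)]
index = build_index(records)
```

After that, `index.get(some_id)` replaces `find_by_scan(records, some_id)`. The index costs one extra pass to build, so it only pays off when the same list is searched many times, which was the case here.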

Consider the amount of complexity that we add to systems: we add layers upon layers of abstractions. We add frameworks that call other frameworks that call microservices. Some of this does indeed help make a developer’s life easier. But at the same time, it also adds more work. I’d be curious to know how many large-scale systems could be sped up by simply doing less work.

The moral of the story here is:  Don’t do more work than you need to.

