Software optimizations, in a nutshell

I think pretty much all software optimizations can be summed up like this:

Don’t do more work than you need to.

Whether that means choosing a better algorithm, reducing memory allocations, or minimizing instruction counts, that’s really what it comes down to.

By choosing a better algorithm, you’re doing less work.  By reducing memory allocations, you’re moving things around less in memory, and are doing less work.  By minimizing instruction counts, you’re doing less work.

Let me give an example.  Earlier this week, a co-worker needed some way to check values in our production data.  This was previously possible through the UI, so there was a manual process of switching between two websites to check for a particular condition.  Because of some recent UI changes, this was no longer possible, so they needed another way of doing it.  I was able to put together a small Python script that did this.  The script made some HTTP calls against our production site.

The first version of the script worked, but was not efficient.  First, it made one call to fetch a list of items, which it then iterated over.  For each of those items, it would make another HTTP call to get some additional values.  It worked, but it was slow – especially in cases where the first call returned a large number of items.  Supposing that each secondary call took at least one second, if the first call returned 1000 items, you’d be waiting over 16 minutes for it to complete.  Not cool.  Yes, it worked, but it was way too slow to be useful.  In the spirit of the mantra “Make it.  Make it work.  Make it work right.  Make it work right, fast.”, the second version of the script made just two HTTP calls total.  The calls were still fairly large, but the overall script was much faster.  The run-time on 1000 items dropped to somewhere under 10 seconds.
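To make the difference concrete, here is a minimal sketch of the two shapes of script.  It is not the actual script: the function names and data are made up, the "HTTP calls" are plain functions that just count how often they are called, and it assumes the second call is some kind of bulk endpoint that returns details for every item at once.

```python
# Hypothetical stand-ins for the real HTTP endpoints.  Each "call" is
# counted instead of actually going over the network.
calls = {"count": 0}

# Made-up data: 1000 items, each with one additional value.
DETAILS = {item_id: {"id": item_id, "value": item_id * 2} for item_id in range(1000)}

def fetch_item_ids():
    """One call that returns the list of item ids."""
    calls["count"] += 1
    return list(DETAILS)

def fetch_details(item_id):
    """One call per item (the slow, first-version approach)."""
    calls["count"] += 1
    return DETAILS[item_id]

def fetch_all_details():
    """One bulk call that returns the details for every item."""
    calls["count"] += 1
    return list(DETAILS.values())

def slow_version():
    # 1 call for the list, then 1 call for each item in it.
    return [fetch_details(item_id) for item_id in fetch_item_ids()]

def fast_version():
    # 2 calls total; the matching up is done locally.
    wanted = set(fetch_item_ids())
    return [d for d in fetch_all_details() if d["id"] in wanted]

calls["count"] = 0
slow_result = slow_version()
slow_calls = calls["count"]

calls["count"] = 0
fast_result = fast_version()
fast_calls = calls["count"]

# At roughly one second per call, the slow version spends about
# slow_calls / 60 minutes just waiting on the network.
estimated_minutes = slow_calls / 60
```

Both versions produce the same result; the only difference is how many round trips it takes to get there.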

I could have taken things further: rather than iterating over a list of 1000 items (several times), I could have swapped the list for a hash table.  But at this point, it was already fast enough (and I had a large number of other tasks on my plate).  We had already taken a manual process from 10-ish minutes a day down to ~30 seconds.  (A user still had to go and check the CSV file generated by the script.)  It would certainly have been possible to cut that 10-ish second run time down to just a second or two, but the law of diminishing returns kicks in, and at a certain point, it wasn’t worth optimizing any further.
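For the curious, the list-to-hash-table swap would look something like the sketch below.  The item shape and function names are assumptions for illustration, not what the real script used.

```python
# Hypothetical rows, standing in for the items the script pulled down.
items = [{"id": i, "value": i * 2} for i in range(1000)]

def find_in_list(item_id):
    """Linear scan: may look at every item for a single lookup."""
    for item in items:
        if item["id"] == item_id:
            return item
    return None

# Build the hash table once; after that, each lookup is one dict access.
items_by_id = {item["id"]: item for item in items}

def find_in_dict(item_id):
    """Hash lookup: returns None when the id is not present."""
    return items_by_id.get(item_id)
```

With 1000 items, the gap between the two is small in absolute terms, which is exactly why it wasn't worth the time here.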

I’m struck by the amount of complexity that we add to systems: we add layers upon layers of abstractions.  We add frameworks that call other frameworks that call microservices.  Some of this does indeed help make a developer’s life easier.  But at the same time, it also adds more work.  I’d be curious to know how many large scale systems could be sped up by simply doing less work.

The moral of the story here is:  Don’t do more work than you need to.

Working from a queue?

There’s an idea that keeps floating around my mind, but I’m not quite sure how to describe it: What if everyone was to work out of a queue?

We’ve all got task lists.  In the particular office environment that I work in, I can get a request in any number of forms: an email, a JIRA ticket, a Slack message, or someone actually coming up to me in person to ask me something.  What if all these got funneled into a priority queue?  The reasons for doing this are:

  1. People making urgent requests would be able to see the impact of their request in relation to other pending tasks.
  2. The important items would float to the top more quickly.
  3. It’d be easier to concentrate on getting things done if I can turn off notifications in one spot.

Imagine if all email requests, Slack messages, JIRA tickets, and anything else (except in-person communication) got pushed into a queue.  As a user, you would have the ability to prioritize which items come first.  For example, my priorities would look something like:

  1. Outlook Calendar notifications
  2. Direct Slack messages
  3. Specific project channel related Slack messages
  4. Outlook emails
  5. GitHub pull request/comment notifications
  6. Less important Slack channels
  7. Emails sent from particular people
  8. RSS feeds
  9. Twitter, etc.

It would be possible to see the queue length, as well as “peek” at any given thing in the queue, but the general idea is that you are always working off the one end.  Messages are shoved into one end, and processed out the other.  The priority allows me to set the order that I want to see things in.  For example, if someone sends me a direct Slack message at the same time I have a meeting reminder, the meeting reminder takes precedence, and shows up on the top.  Once I’ve dismissed it, then I’d see the next thing in the queue – the direct message.
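The core of this is just a priority queue.  A rough sketch, using Python's heapq module, might look like the following.  The class name, source names, and priority numbers are all made up for illustration, and hooking it up to real email or chat systems is the hard part that this leaves out entirely.

```python
import heapq
import itertools

# Lower number = more important.  The sources and ranks are illustrative.
SOURCE_PRIORITY = {
    "calendar": 1,
    "slack_direct": 2,
    "slack_project": 3,
    "email": 4,
}

class WorkQueue:
    def __init__(self):
        self._heap = []
        # Tie-breaker so items with equal priority come out in arrival order.
        self._counter = itertools.count()

    def push(self, source, message):
        priority = SOURCE_PRIORITY[source]
        heapq.heappush(self._heap, (priority, next(self._counter), source, message))

    def peek(self):
        """Look at the next item without removing it."""
        return self._heap[0][3] if self._heap else None

    def pop(self):
        """Remove and return the next item to work on."""
        return heapq.heappop(self._heap)[3]

    def __len__(self):
        return len(self._heap)

queue = WorkQueue()
queue.push("slack_direct", "Got a minute?")
queue.push("calendar", "Stand-up in 5 minutes")
queue.push("email", "Quarterly report")
```

Even though the direct message arrived first, the meeting reminder comes out of the queue ahead of it, followed by the direct message and then the email.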

Users would be able to see the position of their requests in other users’ queues.  Privacy settings would allow you to make some queue items public (so anyone can see them) vs. private, so no one can actually see the contents of the queue messages.  (By default, queued items would be private, except perhaps the source – Outlook, Slack, GitHub, etc.)

Users would also be able to put their own items into their own queue.  For example, I could put in an item like “Send a Happy Father’s Day message to dad” into my queue.  Users would also have the ability to push things back into the queue, to a given depth.  For example, maybe I’m dealing with a few really important tasks right now, so I push a reminder 10 or 15 items deep into the queue.  It’d be similar to snoozing an alarm clock or meeting notification.

There’s also the idea of allowing a user to push their item to a higher priority; in doing so, anyone else whose requests are bumped by it would be able to see it.  That way, if there are several people waiting in the request line, and someone comes along with a “high priority” request and bumps everyone else down, those other people can then go contend with the person making the high priority request.  It’s no longer up to the person handling the requests to settle the fight over whose request is highest priority – the people waiting in line get to decide that, leaving the person handling requests alone, so they can actually get some work done.

I imagine that writing such a system would be pretty complicated – you’d need hooks into the various systems, and it would require everyone in a given office to be on board with it.  That being said, I think the idea sounds entertaining, if anything, at least for an experiment.

Pubescent Software

A co-worker of mine described the current state of a certain software company (paraphrased):

It’s no longer a start-up.  It’s bigger than that.  It’s not a mature company either.  It’s somewhere in between.  It’s pubescent.  It’s gangly, awkward, and clumsy.

I’d say that’s a pretty good description of the phase between start-up and mature company.  It’s no longer small, young, and agile, but it isn’t yet old and fixed in its ways.  The clumsiness comes from not knowing quite how to operate.  It can’t continue to operate in the way that it did (in “start-up mode”) because things no longer fit.  But it also doesn’t have down the efficient business processes that come with experience.

Do it right, or do it wrong, maintain it, then eventually do it again.

You may have heard the following saying:

“Do it right, or do it twice.”

In software development, it’s more like:

“Do it right, or do it wrong, then support that wrong, then eventually do it again.”

This is something that I keep thinking about with software development.  Why do we do things wrong the first time?

  • It’s done under the guise of being “agile”, because we want to satisfy customers as quickly as possible.
  • We lack the knowledge to do it right the first time.
  • We tell ourselves that we’ll go back and fix it later (but we rarely do).
  • We feel like we lack the time to do it right, right now.  A quick fix is quicker.

Far too often, though, such development ends up costing more time and more effort in the long run.  The band-aids and duct tape approach only leads to more work.  Instead of implementing a solution once, you build it the first time, deal with all the related maintenance headaches, then end up re-building it.

Do it.  Do it right.  Do it right the first time.

“I don’t care.”

When I hear someone say “I don’t care” regarding a decision in a meeting, it raises the hair on the back of my neck.  It causes the following things to come to mind:

  • I have no vested interest in the outcome one way or the other.
  • I don’t value whatever resources are involved in the decision being made.
  • I would rather be doing other things with my time.
  • Your suffering (or anyone else’s suffering) doesn’t matter to me.
  • I’m fine with making short-term gains in exchange for long-term pains because I’m not concerned about the long-term impact of this decision.

There’s often a better, more specific way of getting your point across:

  • “I’m not sure this decision has any impact on me, so I’m fine either way.”
  • “You choose whichever option you think is best.”
  • “I’m not sure I could recommend one option or the other… they both seem [good/bad].”
  • “Do what you need to do in order to make [the goal] happen.  I’ll support you in doing this.”
  • “Let’s go with the best long-term option.”

“Not sure why”

I just saw a comment like this in a piece of code written long ago:

“Do this particular thing because it behaves like this.  Not sure why.”

That, to me, is a code smell.  If you aren’t sure why a particular piece of code is behaving the way it is, it’s best to take the time to understand why it is behaving the way it is, rather than compensating for it in other places.  Compensating for unknown behaviour leads to messy code and unexpected bugs.  It makes maintenance harder, because you then have to update code in more than one place.  Duplicate code = duplicate bugs = more maintenance = it takes you longer to get new features in.

If you aren’t sure why – learn why!  An hour spent now saves ten hours spent later.

If you don’t add logging/auditing now, you’ll regret it later.

At some point in time, you will be given a task like “What happened to X?  Why can I no longer see it?”  or “Why does the date on X say ‘Dec. 31’ when it previously said ‘Jan. 1’?”  With any sort of complex system, these kinds of questions are inevitable.  A user changes something without telling another user, or a weird process messes with data, or worse yet, a bug happens (which is almost inevitable).

If you are properly logging an audit trail, these kinds of questions become really easy to answer.  Want to know who deleted X?  Check the logs.  The logs don’t even need to contain all the detail – just a general idea.  For example, “User ‘Gordon Freeman’ deleted X on Aug. 1, 2012” is detailed enough to tell you who did it, and when it happened.  It’s so much easier to add some basic logging in at the start than to spend time trying to track these kinds of things down after the fact.
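As a rough idea of how little it takes, here is a minimal sketch using Python's standard logging module.  The logger name, the delete_item function, and the data are all hypothetical; a real system would more likely write to a file or a database table than to the console.

```python
import logging

# "audit" is an arbitrary logger name chosen for this sketch.
audit_log = logging.getLogger("audit")
audit_log.setLevel(logging.INFO)

# StreamHandler writes to stderr; logging.FileHandler would write to a file.
handler = logging.StreamHandler()
# %(asctime)s adds the timestamp, so the message only needs the who and what.
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
audit_log.addHandler(handler)

def delete_item(items, item_id, user):
    """Delete an item and record who did it."""
    del items[item_id]
    audit_log.info("User %r deleted %s", user, item_id)

items = {"X": "some record"}
delete_item(items, "X", "Gordon Freeman")
```

That one extra line in delete_item is the difference between "check the logs" and an afternoon of detective work.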

Save yourself some time.  Add logging/auditing right now.