February 2012 Archives

Bytes Matter


I love to profile applications, because I always learn something that surprises me.

Initial Profiler Surprise: Client Side
Case in point, I was recently profiling our Android application, the Famigo Sandbox. This app sends a lot of data back and forth with our API, as we try to determine which of the apps on your phone are safe for your kids. I always assumed that, if app performance suffered during some of the chattier features, it was probably due to slow cell reception.

The profiler told me that I was wrong; the transfer time was almost always negligible. What wasn't negligible was the amount of CPU time it took to parse the JSON coming from the API into native types. (Note that I'm measuring JSON parse time across an average app session, not just for one call.)

Like most JSON decoders, ours parses everything, whether we use it or not. I took another look at our API responses and learned that our app actually didn't need half of what we were sending.

Now, we weren't doing anything too crazy on any individual API call. We consistently returned too much data everywhere, though, across many API calls. In aggregate, these bytes mattered. Once we learned this, we streamlined the data returned from our API and quickly saw our JSON parsing bottleneck go away.
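To make the trimming idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the field names, the sample record, and the `trim_response` helper are hypothetical, not the Famigo API. It keeps only the keys a client reads and compares the serialized sizes.

```python
import json

# Hypothetical set of fields the client actually reads.
CLIENT_FIELDS = {"package_name", "title", "is_kid_safe"}


def trim_response(record, wanted=CLIENT_FIELDS):
    """Return a copy of record containing only the wanted keys."""
    return {key: value for key, value in record.items() if key in wanted}


# Made-up example of an over-stuffed API record.
full_record = {
    "package_name": "com.example.game",
    "title": "Example Game",
    "is_kid_safe": True,
    "description": "A long description the client never displays. " * 20,
    "screenshots": ["https://example.com/%d.png" % i for i in range(10)],
    "changelog": ["Fixed bugs."] * 15,
}

trimmed_record = trim_response(full_record)

# Fewer bytes on the wire also means fewer bytes for the client to parse.
full_size = len(json.dumps(full_record))
trimmed_size = len(json.dumps(trimmed_record))
print("full: %d bytes, trimmed: %d bytes" % (full_size, trimmed_size))
```

The saving on one record looks small; the point of the post is that it repeats on every record of every call.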

Subsequent Profiler Surprise: Server Side
Here, I was profiling our website, which is essentially an app recommendation engine for families. We consistently saw some calls take a long time, and I assumed the cause was the complexity of the queries. For example, our queries to find and sort the best iPhone apps or free Android apps take into account a lot of disparate data from our own reviewers, the app stores, and all of our family users.

When I profiled these calls again, I was shocked. The queries were actually well-tuned (as the author of these queries, yes, this is shocking); the slowness was coming from the ORM (pedantic note: it's really an ODM - shakes TI85 threateningly) we use to turn our MongoDB documents into our lovely Python models.

This problem was actually very similar to the problem seen in our Android app. MongoDB documents are encoded in BSON, which is very similar to JSON, and our ORM is responsible for parsing that BSON into usable types. On almost all of these queries, we were asking our db drivers to parse the entire document when we really only needed a small subset (1/3 or 1/4) of the fields. That's hardly noticeable when you're dealing with a few documents, but it becomes quite a bottleneck with thousands of documents. Again, I realized that bytes matter.

Once I figured out the problem, the fix was easy. Instead of asking for every field on every document in the query, I simply specified the fields I wanted. When this change went live, the bottleneck disappeared and we got an easy 40% improvement in average render time.
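For a rough picture of what "specifying the fields" looks like, here is a sketch at the driver level. The post doesn't name the ODM, so this assumes a PyMongo-style collection whose `find(filter, projection)` accepts a projection document; the `platform` field and both helper functions are hypothetical.

```python
def build_projection(field_names):
    """Build a MongoDB projection document that includes only field_names."""
    projection = {name: 1 for name in field_names}
    # MongoDB returns _id by default; exclude it unless it was requested.
    if "_id" not in projection:
        projection["_id"] = 0
    return projection


def find_apps(collection, platform, field_names):
    """Fetch apps for a platform, asking the server for only the listed fields.

    `collection` is assumed to be a pymongo Collection (or anything with a
    compatible find method). The "platform" field is a made-up example.
    """
    query = {"platform": platform}
    return list(collection.find(query, build_projection(field_names)))
```

Called as `find_apps(db.apps, "android", ["title", "rating"])`, where `db.apps` is a hypothetical collection, this would send the projection `{"title": 1, "rating": 1, "_id": 0}`, so the driver only has to decode those fields for each document.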

Let Us Conclude
I don't think I need to restate this, but I will, because it's my website and we hammer points into the ground 'round these parts. The lesson is that the more data you return, the more you must process.

This is so basic that it's often easy to ignore entirely. However, once you have real users and real data, bytes matter, and they matter more and more as you scale. Use them wisely.

Understanding-Driven Development


I have a weird idea. What if, with every change we made to our codebase, we tried to increase our understanding of it a little bit?

Entropy Tries to Thwart Us
This is challenging because codebases always go in the opposite direction. As you make more changes and new people join the team, everybody understands less and less of what ought to be happening; the fact that the code works at all is nearly miraculous! Soon, everyone who touches the codebase adopts an "If it ain't broke, don't fix it" attitude.

Success depends on understanding, though. We have to understand the code to add new features, fix important bugs, refactor, and bring new teammates aboard. Not only that, but problems that are deeper than code, like architecture and scalability, can't be addressed without first understanding the code.

Understanding Must Be Widely Distributed
One person understanding isn't enough. After all, what happens if that one person gets eaten by a komodo dragon?

There are deeper problems than that, though. Imagine that your brain becomes tightly coupled with a bit of code. The first problem is that your brain is faulty, and you will forget. The second (scarier) problem is that, if you're the only person who understands a piece of code, you own it and you'll maintain it. Forever. It doesn't matter what else you progress to; when a problem arises with that code, it's your problem. That encourages context switching and lots of tiny, strange code silos.

How to Create Understanding
How do you increase understanding on a large scale, then? Let's go through a few approaches, none of which are earth-shattering.

  1. Automated tests. When you have simple, isolated tests that are run often, it means anyone can learn about the code, make a change, see the effects, and feel good about the work they just did; you are creating understanding. Unit tests, BDD-style tests, integration tests? All of these work.

  2. Refactoring. As you are adding features or fixing bugs, you can create understanding if you're constantly working to make the code as clear as possible. The great thing about these changes is that they can be trivial. One technique is just to revisit the names used in a chunk of old code. If a variable contains sales invoices and you change its name from temp to sales_invoices, you have succeeded. Make more changes like that!

  3. Documentation. Yes, documentation can create understanding, but only if it accurately reflects the current state of your code. The most effective way to do this is to generate it dynamically based on the code itself: method signatures, assertions, url routes, the requirements stated in your BDD tests.

  4. Environment automation. There are probably a lot of magical bits in your environment. Maybe your build process doesn't work unless this one particular directory is owned by this one particular user, or your CDN occasionally serves up old assets and you have to poke around in the Amazon Web Services dashboard to fix it. These weird workarounds are often simple, but you encounter them infrequently enough that no one remembers exactly what happened or why. Do your brains a favor: automate all of this. Once it's written, it can be understood.
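The renaming technique from the refactoring item can be sketched in a few lines of Python. The functions and the invoice fields (`amount`, `paid`) are made up for illustration; only the `temp` to `sales_invoices` rename comes from the text above.

```python
def total_unpaid_before(temp):
    # Before: "temp" and "x" say nothing about what the data is.
    x = [t for t in temp if not t["paid"]]
    return sum(t["amount"] for t in x)


def total_unpaid(sales_invoices):
    # After: identical behavior, but the names describe the contents.
    unpaid_invoices = [invoice for invoice in sales_invoices if not invoice["paid"]]
    return sum(invoice["amount"] for invoice in unpaid_invoices)


# Made-up sample data for a quick check that the rename changed nothing.
sample_invoices = [
    {"amount": 100, "paid": False},
    {"amount": 40, "paid": True},
    {"amount": 5, "paid": False},
]
print(total_unpaid_before(sample_invoices) == total_unpaid(sample_invoices))
```

A check like the last line is the smallest possible automated test: it lets the next person make the same kind of change and see right away that nothing broke.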

How to Create Misunderstanding
You can easily abuse all of the methods I just described, and actually use them to create misunderstanding.

  1. A test creates misunderstanding if it depends on data that's changed by other tests. If your tests don't repeatably succeed, regardless of order, you're causing confusion.

  2. Refactoring can create misunderstanding if you take well-understood code and change it dramatically, without also writing tests.

  3. Documentation often causes more harm than good. Think about the nearest gigantic, outdated Word doc, or the comments in your code you fail to revise as you refactor. At some point, someone will read that and get confused.

  4. Environment automation causes misunderstanding if it doesn't accurately reflect the state of your environment. Maybe you have some disaster recovery scripts lying around. Do they work, or would looking at them only give you misconceptions about the way your environment used to look?
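On the first point in this list, one common way to keep tests independent of order is to rebuild their data before every test. Here is a minimal sketch using Python's standard `unittest` module; the `InvoiceStore` class and the `run_in_order` helper are hypothetical stand-ins, not code from any real project.

```python
import unittest


class InvoiceStore:
    """Tiny in-memory stand-in for whatever the tests exercise."""

    def __init__(self):
        self.invoices = []

    def add(self, amount):
        self.invoices.append(amount)

    def total(self):
        return sum(self.invoices)


class InvoiceStoreTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so no test sees another test's data.
        self.store = InvoiceStore()

    def test_new_store_is_empty(self):
        self.assertEqual(self.store.total(), 0)

    def test_add_accumulates(self):
        self.store.add(10)
        self.store.add(5)
        self.assertEqual(self.store.total(), 15)


def run_in_order(test_names):
    """Run the named tests in the given order; return True if all pass."""
    suite = unittest.TestSuite(InvoiceStoreTest(name) for name in test_names)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


forward_ok = run_in_order(["test_new_store_is_empty", "test_add_accumulates"])
backward_ok = run_in_order(["test_add_accumulates", "test_new_store_is_empty"])
print(forward_ok, backward_ok)
```

If the store were a module-level object shared by both tests, the empty-store test would fail whenever it ran second, which is exactly the confusion described in point 1.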

Conclusion: Be Smarter.
Ultimately, software development is really, really hard. We have to think at every scale, from single bits to clusters of super-powered VMs. The best (only?) way to work effectively together and build great things is to constantly and collectively work towards a better understanding of our code.

About the Author

The Art of Delightful Software is written by Cody Powell. I'm a dev manager at Amazon where I work on the Instant Video, Mobile Clients team. Before that, I was CTO at Famigo, a venture-funded startup that helps families find and manage mobile content.

Are you interested in building great mobile apps for Amazon Instant Video? Email me!

Twitter: @codypo
LinkedIn: codypo's profile
Personal blog: Goulash
Email: firstname + firstname lastname dot com