We ran into a positively bewildering bug in our Android app last week. It was the perfect storm of bug-dom: it looked incredibly simple, but actually debugging it was horrendous due to several threads, API calls, and strangely-timed UI events. On top of everything, this bug didn't really affect any user; it just made the UI look slightly weird if you were paying a lot of attention.
If you look at the Famigo Sandbox app, you'll notice there's a scroller at the top where we list app recommendations. The issue we had was that the app scroller would occasionally list a blank app. Instead of an app title and an icon, we'd show "TextView" as the app name and no icon at all. You can scroll through hundreds of app recommendations, but you'd only get the blank app once. Sounds trivial, no?
Last week, we were readying a great new version of the app to push to Google Play. We had one known issue left: the blank spot in the app scroller. A few of us had actually spent a bit of time looking at this bug in the past, but no one had figured it out. We had a bit of time in our schedule before we needed to push the app, and I thought it'd be great to finally nail that bug. Fixing it would mean no known issues; hooray! Also, even though the bug was largely harmless, I worried that it'd be a broken window that might lead us towards sloppiness in the future.
I paired up with John, one of our developers, to squash that sucker. When we sat down, we fully expected to fix the bug within the hour; I think I actually said that out loud. (Sidenote: never do that on a weird bug. If you do say something like that, the software gods might overhear and punish you for your hubris.)
We began creating breakpoints, logging everything, and stepping through the code. Within a few minutes, I was baffled; this particular bit of functionality was far more complex than I thought. Due to the number of threads, it was even hard to figure out if it was an issue with the rendering logic or with the underlying data structure. Our joint mindset quickly went from "Haha, let's solve this silly bug" to "Hmm, interesting" to "I don't get it" to "Is there another line of work we're qualified for? Maybe garbage men, or is that a union deal?"
After several hours, we literally had no idea what the problem might be. The rest of the office seemed to really enjoy our sighs, profanity, and nonsensical ramblings as we talked through what might be happening.
It was tempting to stay at the office until we found the bug. I have tried that before, and I found that, past a certain point, my efforts become detrimental. I get tired, I mess things up even worse, and I spend the entire next day just working my way back to the original bug. We made a note of what we were last looking at, then we left the office, broken and dejected.
That night, I actually dreamed about the bug. Yep, I couldn't get away from it even in my sleep. (Even worse, in the dream, I was pair programming with Dog the Bounty Hunter. Let's all choose not to analyze that.)
Both John and I got back into the office early that next day. I imagine we looked like a couple of grizzled, old soldiers headed into battle, since no one dared to joke about the bug or even make eye contact with us. As we sat down together, I thought it made sense to put a time limit on our debugging. If we couldn't fix this ridiculous, silly bug in 90 minutes, then it was a sign from the cosmos that the app was destined to ship with one empty app in the app scroller.
We referred to the note we'd made the night before on where to pick back up, and we jumped back into the code. About 10 lines below that spot, we saw something very, very strange. If we got a certain response from the API, we would insert a blank app into the app scroller to keep the numbering even across our paged requests. "That's... weird," we both said, trying not to sound optimistic.
We deleted that chunk of code, tested the app, and saw that everything now worked great. We had 72 minutes left on the timer, and we had actually broken for 5 minutes for our standup! We had been looking right at that function the night before, and we missed the issue entirely.
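For the curious, here is a rough sketch of the kind of pattern that bit us. This is not our actual source: the class and method names, the page size of 3, and the idea that the "certain response" was a page shorter than expected are all assumptions made up for illustration. It only shows how padding a paged list with a placeholder produces exactly one blank entry, and how dropping the padding removes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScrollerPaging {
    /** Hypothetical model of one recommended app. */
    public static final class App {
        final String title;
        public App(String title) { this.title = title; }
        boolean isBlank() { return title == null || title.isEmpty(); }
    }

    // Assumed page size; the real value is not important here.
    static final int PAGE_SIZE = 3;

    /** Buggy variant: pads a short page with blank apps so every page has PAGE_SIZE entries. */
    public static void appendPagePadded(List<App> scroller, List<App> page) {
        scroller.addAll(page);
        for (int i = page.size(); i < PAGE_SIZE; i++) {
            scroller.add(new App("")); // placeholder that the UI renders as an empty slot
        }
    }

    /** Fixed variant: append only what the API actually returned. */
    public static void appendPage(List<App> scroller, List<App> page) {
        scroller.addAll(page);
    }

    /** Counts placeholder entries in the scroller's backing list. */
    public static int countBlanks(List<App> scroller) {
        int blanks = 0;
        for (App app : scroller) {
            if (app.isBlank()) {
                blanks++;
            }
        }
        return blanks;
    }

    public static void main(String[] args) {
        List<App> fullPage = Arrays.asList(new App("A"), new App("B"), new App("C"));
        List<App> shortPage = Arrays.asList(new App("D"), new App("E")); // one app short

        List<App> padded = new ArrayList<>();
        appendPagePadded(padded, fullPage);
        appendPagePadded(padded, shortPage);

        List<App> fixed = new ArrayList<>();
        appendPage(fixed, fullPage);
        appendPage(fixed, shortPage);

        System.out.println("padded: " + padded.size() + " entries, " + countBlanks(padded) + " blank");
        System.out.println("fixed: " + fixed.size() + " entries, " + countBlanks(fixed) + " blank");
    }
}
```

With these made-up pages, the padded list ends up with six entries, one of them blank, while the fixed list has five real apps and no blanks. That matches what we saw: hundreds of good recommendations and a single empty slot.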
The lesson? When you've stared at something for hours and still don't get it, go home. You'll see the problem with new eyes tomorrow.