Just a short note tonight, as I've got some computational bacon frying in the pan, if you get my drift. I've been reading a bit about GUIDs lately. GUIDs are globally unique identifiers, often used as primary keys in databases. Essentially, they're just really, really big numbers. While they're enormous, they're not guaranteed to be unique, and this causes some people to worry. "I use primary keys out the wazoo! Since GUIDs aren't necessarily unique, I could have duplicate keys, thus backing up my wazoo! My HMO won't cover a backed-up wazoo, no matter how sad the story is!"
I'm not a man to debate the woes of a backed-up wazoo. I am, however, a man to look at things mathematically. With that in mind, let's take a look at what GUIDs actually are.
GUIDs are 128-bit numbers. Knowing that, we can compute that there are 2^128 possible GUIDs, or about 3.403 * 10^38. We can all agree that's a big number. It's so big, however, it's almost hard to contemplate, much like the number of movies in the Police Academy series. Let's put that number into friendlier terms.
Assume we have a container that can hold 2^128 numbers. Let's continue assuming that we're generating numbers as quickly as possible, one hundred billion a second. Busting out the calculator, we can determine that works out to be 6 * 10^12 numbers a minute, 3.6 * 10^14 an hour, and 8.64 * 10^15 a day. Multiply that by 365, and we can do 3.1536 * 10^18 each year. That's a whole lot of numbers!
Now assume we can generate numbers like that for one hundred billion years. By the end of that, our container has got to be filled, right? In fact, we're so sure about it, we bet all of our money, our cars, and our teeth. Well, get ready to gum your food for a while because that container is nowhere near filled. Yeah, we've generated a bunch of numbers: 3.1536 * 10^29. However, we've filled less than a billionth of the container, which isn't even a tenth of 1% of 1% of 1% of 1%. When it comes to magnitude, we're a fly on the hippo's behind here. Consider the wazoo safe for another day.
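For anyone who'd rather not trust a pocket calculator with exponents this size, here's a rough sketch that redoes the arithmetic with exact integers. The class and method names (GuidMath, perYear, and so on) are made up for illustration, and it assumes 365-day years and the same hundred-billion-a-second rate as above.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

class GuidMath {
    // Every possible 128-bit value: 2^128 of them.
    static final BigInteger GUID_SPACE = BigInteger.valueOf(2).pow(128);

    // Assumed generation rate: one hundred billion numbers per second.
    static final BigInteger PER_SECOND = BigInteger.valueOf(100_000_000_000L);

    // 60 seconds * 60 minutes * 24 hours * 365 days (leap years ignored).
    static final BigInteger SECONDS_PER_YEAR = BigInteger.valueOf(60L * 60 * 24 * 365);

    // One hundred billion years of nonstop generating.
    static final BigInteger YEARS = BigInteger.valueOf(100_000_000_000L);

    static BigInteger perYear() {
        return PER_SECOND.multiply(SECONDS_PER_YEAR);
    }

    static BigInteger total() {
        return perYear().multiply(YEARS);
    }

    // Share of the container that has been filled, as a plain double.
    static double fractionFilled() {
        return new BigDecimal(total())
                .divide(new BigDecimal(GUID_SPACE), MathContext.DECIMAL64)
                .doubleValue();
    }

    public static void main(String[] args) {
        System.out.println("Numbers per year:  " + perYear());
        System.out.println("Numbers in total:  " + total());
        System.out.println("Fraction filled:   " + fractionFilled());
    }
}
```

BigInteger is overkill for the yearly figure, but the total and 2^128 both overflow a long, so it keeps the whole calculation honest.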
Recently, an acquaintance asked me if I'd be interested in doing some contract work for him. What he wanted was a database, along with a PHP front-end, that could store the hands dealt on an online poker website. I politely declined for two reasons. First, I'm so busy right now that I consider putting on pants a waste of my precious time. I've just been wearing potato sacks for like a week now, and I don't want to think what the next progression is there if I add in another activity. Second, I'm not sure how legit this activity is. The guys behind these sites probably have names like Guido the Lamprey, and I don't want any first-hand knowledge of where that moniker comes from. After I declined, I thought it over a little bit, how I'd design and implement something like this. It seems easy enough, but it took me a few tries before I got anything approaching decency.
If you want any level of detail whatsoever, you'd need the cards involved for each hand. What I slowly began to realize is that how you store the cards in the database determines a whole lot about your application. Let's say you decide to break the hand data into two separate tables: one table which contains data about the hand (like the date and site it was dealt), and then one table which contains the cards (we'll call this one HandCards). What does the HandCards table look like?
My first idea is that you could describe each card with a simple 2 character field. For each card, that field would contain something like '8C' for 8 of clubs or 'JD' for Jack of diamonds. Seems easy enough, but think for a second about figuring out the hands. For example, how would you write a method to determine if a hand in question is a straight? First, you'd have to dissect that card field so you're only looking at the first character. Second, you couldn't abstract it totally because you're not dealing with numerical values. The compiler has no way of knowing that K, Q, J, 10, 9 is a straight, so that data would have to be specified somewhere. Good glayvin, this is the path to madness and wearing sweatpants out in public!
How about, in HandCards, we add an additional field: numerical value. That way, if the user is dealt a Jack, you could stick an 11 in there to show that a Jack is one above a 10. There's a problem here, though: this design isn't normalized. Let's say an avid player uses this database. Over the course of time, they receive the Jack of Diamonds 100 times. In each one of these records, there'd be a 'JD' and then an eleven for the numerical value. Over time, you'd see a lot of repetition there.
Let's normalize this turkey. We create a new table called Card. We have fields for suit and card; each card in the deck has a record in this table. Notice that suit and card have been separated out so we don't have to do any more parsing. The only card description in HandCards then is a foreign key back to this Card table. To handle the numerical value of the face cards, we create a new table called CardValue with two columns: card (e.g., 8 or J) and numerical value. It would be possible to put that data in the Card table, but the numerical value depends only on the card and not on the suit, so every value would be repeated four times and the database wouldn't be normalized. This entire approach gives us the best of both worlds: we can abstract the way we determine our hands, and we do it without repeating a lot of data. After discovering this, you're completely within your rights to stand up on your desk and start dancing the robot.
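To see why the numerical value earns its keep, here's a small sketch of the straight check once that value is available. Everything in it is an assumption for illustration: an in-memory map stands in for the CardValue table, a tiny Card class stands in for a row of the Card table, and the names PokerHands and isStraight are invented. It also treats aces as high only, so A-2-3-4-5 isn't recognized.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PokerHands {
    // Stand-in for the CardValue table: card -> numerical value (J=11 ... A=14).
    static final Map<String, Integer> CARD_VALUE = new HashMap<>();
    static {
        String[] cards = {"2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"};
        for (int i = 0; i < cards.length; i++) {
            CARD_VALUE.put(cards[i], i + 2);
        }
    }

    // Stand-in for a row of the Card table: suit and card live in separate fields.
    static final class Card {
        final String suit;
        final String card;

        Card(String suit, String card) {
            this.suit = suit;
            this.card = card;
        }
    }

    // True when the hand holds five cards with consecutive numerical values.
    static boolean isStraight(List<Card> hand) {
        if (hand.size() != 5) {
            return false;
        }
        int[] values = hand.stream()
                .mapToInt(c -> CARD_VALUE.get(c.card))
                .sorted()
                .toArray();
        for (int i = 1; i < values.length; i++) {
            if (values[i] != values[i - 1] + 1) {
                return false;
            }
        }
        return true;
    }
}
```

No string dissection and no special-casing of K, Q, and J: the check only ever sees numbers, which is the whole point of looking the value up instead of parsing it.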
If I were Doogie Howser, I probably would've seen this solution right away. However, it took me 3 tries to get it correct; not only am I not Doogie Howser, I'm not even Vinnie. Trivial problems don't always have trivial solutions, and it's something I forget constantly. The only way I can approach success is to take my time and review everything with a critical eye. No one ever said being Doogie Howser was easy.
In an attempt to plague my life with anxiety, someone arranged it so that any tool I use at work must be the least user-friendly tool available. As soon as someone invents a text editor that shoots spears at the user at random intervals, there's no doubt I'd immediately have to start using it for my job. I don't know who to blame for this situation, so I will start with the members of the band Foreigner. Among the worst of these tools is Adobe Acrobat. How such a sluggish, crash-prone viewer has become a de facto standard both puzzles and enrages me. Once again, I blame the members of the band Foreigner.
As I was saying, I have plenty of usability complaints about Acrobat. Most of these stem from the fact that Acrobat is a document viewer that operates in a vastly different way from the world's most popular document viewer, the web browser. In a web browser, if you want to select a chunk of text, you can click and drag your mouse. In Acrobat, if you try this method, you scroll down the page. There are so many of these little vexations in Acrobat, I think an entirely new number called the Acrobat should be placed somewhere on the number line between a trillion and infinity. "How many bugs are in this code?" "Ohh, I'd say around Acrobat." "For the love of Pete, just how incompetent are you?!"
Ironically, one of the most infuriating of these usability issues may not even be Adobe's fault. It really, really pains me to say that, but I think it's true. As part of my job, I read a lot of development white papers (e.g., stuff off Microsoft's Patterns and Practices website). These are typically distributed in PDF format. Invariably, the pages of these papers are numbered, but the numbering is a few pages off from Acrobat's page numbering. Acrobat starts counting from the very start of the document, whereas the document doesn't start counting until after the title page and the table of contents.
What's the problem here? Well, imagine you're looking at the table of contents for a particular item. The table of contents says it can be found on page 45. In Acrobat, you enter the command to go to page 45. It works perfectly, except you don't go to the right page 45. Acrobat takes you to its idea of page 45, which is a few pages away from the document's idea of page 45. Do this a couple of times a month, and you'll soon find yourself nursing your very own irrational hatred for a 70s rock band.
Clearly, a mistake exists here. The document is numbered two separate ways, and there's no way to translate the numbering from one way to the other. A user cannot help but be confused over something like that. Whose fault is it, though?
It's hard to blame Adobe, since their numbering is completely consistent with their concept of a document. It's also hard to blame the organization responsible for the document, because their numbering is also completely consistent with their concept of a document. What gives then? The fault, I think, lies in the collaboration between the two.
When you view a PDF document, the document is a joint presentation by both Adobe and the organization behind the document. One side creates the document, the other side displays it. When the document gets converted into PDF, we get the disagreement with the numbering methods. Knowing absolutely nothing about the subject, it seems to me like there needs to be a way, in that conversion process, of adjusting one set of page numbers so it's consistent with the other set. That way, there's only one set of page numbers and no more user confusion. (I don't know the specifics of implementation, but I can think of a few different ideas here.)
I think this is an important usability idea: problems may arise from collaboration. It's entirely possible that each entity is completely consistent and easy to understand, but the moment it comes into contact with another, that can all change. It's unreasonable to expect the user to navigate these changes gracefully. No, that burden should fall on the collaborators' shoulders. The collaboration isn't complete until the interaction between entities is completely consistent and seamless. Or, if the fault isn't theirs, then perhaps the members of the band Foreigner are responsible. Whatever, I'm open to suggestions.
The last time we met, we got down and dirty with basic exception handling. And, to be frank, that stuff was pretty straightforward. If you just think about it for a second, it becomes clear how you should order your catch statements. What is less clear is why pizza sauce is on the condiment aisle at my local grocery store, not the pasta aisle. Come on, how on earth am I supposed to find that? Also less clear are some of the specifics of exception handling, namely checked versus unchecked exceptions.
I like to read a few different development forums, mainly to reinforce the idea that, while there are lots of better coders out there, few of them can use semicolons as well as I can. A recent topic I read dealt with technical interviews for a Java position, and how candidates were consistently stumped by a question on the differences between checked and unchecked exceptions. I'm familiar with the terms, but if I had to answer the question, I'd probably end up babbling about how unchecked exceptions are far less likely to come from Prague (okay, that one doesn't work as well when the words are spelled out). I know that you handle those checked and unchecked exceptions differently, but I think I'd have a hard time if someone put me on the spot. And so, as the grand duke of Vlogsylvania, now seems like the opportune time to attack this subject.
As far as I know, Java is the only language where you have to differentiate between checked and unchecked exceptions (here's an explanation on why the C# team elected to diverge from this). I've seen the difference between the two explained a few different ways, but here's the one that made the most sense to me: unchecked exceptions are subclasses of RuntimeException (or of Error), while checked exceptions are all the other subclasses of Exception. RuntimeException is itself a subclass of Exception, so the dividing line is whether RuntimeException sits anywhere in the family tree. With a little bit of deduction, we can figure out what the hell that actually means.
If unchecked exceptions are runtime exceptions, then the compiler stays out of it. You won't get any compile errors about not handling these exceptions, because the compiler doesn't make you account for them; no sir, that's a runtime issue. Conversely, checked exceptions are the compiler's business. Since they're compiler issues, you can't ignore them; there's going to be some code involved. Specifically, a method must declare each checked exception it can throw in its throws clause. Then, every caller of that method must either catch the exception or declare it in its own throws clause and pass it up the line. To put it simply, checked exceptions are the ones you anticipate and can do something about. Enough gobbledygook, let's hop aboard the Example Train.
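First, the declaring side. This is only a sketch: the names Alf, EatCat, and CatNotFoundException match the caller, but the class bodies, the catInHouse flag, and the second constructor are assumptions made up so there's something to compile.

```java
// A checked exception: it extends Exception and steers clear of RuntimeException.
class CatNotFoundException extends Exception {
    CatNotFoundException(String message) {
        super(message);
    }
}

class Alf {
    // Assumed flag, purely for illustration: is there a cat around to eat?
    private final boolean catInHouse;

    Alf() {
        this(false);
    }

    Alf(boolean catInHouse) {
        this.catInHouse = catInHouse;
    }

    // The throws clause is the declaration the compiler holds every caller to.
    public void EatCat() throws CatNotFoundException {
        if (!catInHouse) {
            throw new CatNotFoundException("No cat in the house");
        }
        System.out.println("Burp.");
    }
}
```

Delete the throws clause and the compiler refuses to build EatCat; keep it, and the compiler refuses to build any caller that neither catches CatNotFoundException nor declares it.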
//and now the caller.
public static void main(String[] args)
{
    Alf littlehairydude = new Alf();
    try
    {
        littlehairydude.EatCat();
    }
    catch (CatNotFoundException ex) //look at what we're catching.
    {
        System.out.println("Someone alert Willie!");
    }
}
From a micro perspective, that's very cool. By declaring the kind of exceptions the method can throw, you force the implementor to plan ahead and handle any errors in a graceful manner. And in small doses, I bet this works really well. In large systems, however, it gets very cumbersome. Not only must you deal with the fact that your coworker smells like rotten beets and continuously rambles on about who'd win in a fight between Uhura and Lando Calrissian, you have to code to handle every one of his stinking exceptions. Not only that, but all the while, you have to hope he's keeping his checked and unchecked exceptions properly segregated.
There are a lot of arguments against checked exceptions (the seminal piece here), but nevertheless, they represent some neat programming power. As Spider-Man taught us, with great power comes great responsibility. For some, that's no problem. For others, who man the frontlines of the great Uhura vs. Calrissian debate, it's not quite that easy. Like the road signs say, proceed with caution.
For the past couple of months, I've been thinking a lot about exceptions. Not exceptions as in "I hate everyone on The View but Star Jones", but the erroneous/exceptional situations that arise when I run some of my programs. For instance, in Java, if I'm trying to compute an average with integers and I inadvertently divide by 0, an ArithmeticException gets thrown. To abstract that situation, the exception is the object encapsulating that erroneous condition. Since they're objects, you can do a lot with them, and that makes exceptions very, very useful. So useful, in fact, that in my more fanciful moments, I've dreamed of making a pair of pants that could throw exceptions. "Uh oh, I've got to make a stop by the house; it seems the Levi's VM just raised a HoleInCrotch."
Sure, exceptions are useful, but how exactly do you make use of them? Once you're making use of them, how do you ensure you're doing it correctly? For that, I am here to investigate.
If you've ever looked at any Java or C# code, you've probably seen something like this:
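A minimal sketch of that familiar shape, built on the divide-by-zero average from earlier. The class and method names (AverageDemo, average, describeAverage) are invented for the example.

```java
class AverageDemo {
    // Integer division by zero makes Java throw an ArithmeticException.
    static int average(int total, int count) {
        return total / count;
    }

    static String describeAverage(int total, int count) {
        try {
            // The risky work goes in the try block...
            return "Average: " + average(total, count);
        } catch (ArithmeticException ex) {
            // ...and the cleanup for when it goes wrong goes in the catch block.
            return "Could not compute an average";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeAverage(10, 2));
        System.out.println(describeAverage(10, 0));
    }
}
```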
Being big-time, professional developers, when most of us write blocks of code, we're doing lots of stuff there. If an exceptional case does arise, it could be one of many different exceptions. Since these exceptions are different, you'd want to handle them differently. My pair of pants analogy illustrates this well. In addition to the HoleInCrotch exception, I'd also have a ZipperDown exception. Clearly, these are different cases, and they're corrected by doing different things. With one, I need to go home; with the other, I need to zip up. I can't just lump both together, because then I could either be going home to zip up my pants or attempting to zip the hole up. All of that, friends, is a bologna sandwich I don't care to eat.
If I want to address my wardrobe malfunctions appropriately, I need two separate catch statements like this:
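Something along these lines, anyway. This is a sketch under heavy assumptions: the exception names follow the pants analogy, while the Pants class, its two flags, and the wear and handle methods are invented so the example stands on its own.

```java
// Two distinct exception types for two distinct wardrobe malfunctions.
class HoleInCrotchException extends Exception {
}

class ZipperDownException extends Exception {
}

class Pants {
    private final boolean hole;
    private final boolean zipperDown;

    Pants(boolean hole, boolean zipperDown) {
        this.hole = hole;
        this.zipperDown = zipperDown;
    }

    // Either malfunction can surface here, so both are declared.
    void wear() throws HoleInCrotchException, ZipperDownException {
        if (hole) {
            throw new HoleInCrotchException();
        }
        if (zipperDown) {
            throw new ZipperDownException();
        }
    }
}

class WardrobeDemo {
    static String handle(Pants pants) {
        try {
            pants.wear();
            return "All clear";
        } catch (HoleInCrotchException ex) {
            // No fixing this one in the field.
            return "Go home and change";
        } catch (ZipperDownException ex) {
            // This one is a two-second repair.
            return "Zip up";
        }
    }
}
```

Each catch block gets its own remedy, so nobody ends up driving home to fix a zipper or trying to zip a hole shut.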
I left the vlog alone over the holidays, in the off chance that Santa Claus might need my help with some SQL statements. Again, it appears I got out my elf suit for nothing. Ahh well, that doesn't matter; I will continue to prepare for Santa's impending OUTER JOIN emergency.
I'm fortunate to work for a company willing to reimburse its employees' tuition for any classes they take related to their work duties. I didn't even know about it until I was looking through the employee handbook one day, searching for a way I could justify an iPod on my expense report. Right there, under the clause "Ridiculous electronics purchases shall be paid for solely by their ridiculous purchaser", I saw the bit about the tuition reimbursement. It was only about a paragraph, and it immediately set my mind to work.
Software development has an interesting relation to academics. If you want to be an accountant, you study accounting. If you want to be an electrical engineer, you study engineering. There exists a one-to-one relationship between profession and course of study. It would follow then that if you want to be a software developer, you study computer science. The problem is, there's a lot more to development than computer science, and there's a lot more to computer science than development. Studying only computer science and expecting to be a great developer is like studying only Happy Days and expecting to be the world's greatest Henry Winkler expert. Keep dreaming there, amigo.
To me, development is more like computer science + logic + writing + small group communication. Most people don't see that, though. They think of what we do, and they see us in front of a computer at all hours of the night, typing hurriedly and mumbling something about inverted matrices. For this misconception, I blame the movie WarGames, and Matthew Broderick in particular.
I think that most other developers would see the value in me taking a course from one of these other disciplines, but what about the folks on the other side of the building, who are responsible for approving these expenses? What would they say if I turned in a tuition reimbursement request for Symbolic Logic in the Philosophy department? I imagine it'd be something along the lines of, "Get back to work, Socrates."
So what do you do? How do you justify a worthwhile course for a developer to someone who doesn't understand what we do?
A good approach here is to find a powerful geek in your office, explain your position, and let them handle the rest. By powerful, I don't mean capable of ripping apart phonebooks or killing grizzly bears with their bare hands, but someone in management. That's what I did, and it's one of the many reasons why it's great for some hardcore technical people to be in the higher ranks of a company. If that didn't work, I was prepared to print out the course description, circle the whole thing, and write, "This is what I do." Hyperbole perhaps, but it'd get the point across.
The best solution would be for some institution to form a true Software Development department. They could offer courses in all of the applicable disciplines under one convenient header. Not only would it remedy this particular problem immediately, but it'd put millions of burgeoning geeks onto the right path. Should no one take the lead here, the University of Powell will soon be taking applicants. We'll be the fighting Henry Winklers.