October 05, 2005

Fairly Certain Should Be Includeds

The only thing worse than drafting requirements is maintaining requirements. Just trust me on this one. At the very least, you stand some chance of convincing someone to do requirements by telling them, "Look, YOU get to decide what goes into the system. Isn't that exciting?!" Lots of folks will fall for that one (case in point: me). There's not such a good chance that you can convince someone to maintain the requirements by telling them, "Look, YOU get to sit in on a bunch of reviews and then integrate all of the changes into the existing documents in a timely manner. Isn't that exciting?" That's not exciting; it's pretty awful. However, if you fail to maintain the requirements, they're no longer requirements. Allow me to explain.

Like I was saying, I got tricked into doing requirements for a big application about nine months ago. It didn't take long before I regretted that decision something fierce. I realized it was important, though, so I did it. As soon as I finished the requirements, I exhaled deeply. I spit in the dirt, wiped my hands on my overalls, and declared, "My work here is done!" Then my boss said, "What about reviews?" And I said, "Didn't you hear me? My work here is done!" Of course, it wasn't; someone had to change the requirements based on everyone's feedback. Since I wrote it, why shouldn't it be me?

My argument, a very cunning one, was that the requirements were good enough. We had 90% of the system covered, and we could figure out that 10% at a later date. I had a good reason for this: I was tired of doing requirements. I wanted to do real work. I hadn't compiled anything in weeks, and I feared my code fu was leaving me.

I continued to think about it, and slowly I began to realize that if SOMEONE didn't maintain the requirements, they were no longer requirements. A requirement says that a certain element of functionality is required to be in the system; no ambiguity exists there. However, if the documents didn't get updated, we lost that certainty. No longer could we absolutely say that the feature was required to be in there. We could say that we were fairly certain that the feature should be included, but there'd be no authority. No longer were they requirements; they would devolve into Fairly Certain Should Be Includeds. Trust me, I was not happy to convince myself of that point.

I think this applies to all documentation; if it no longer reflects the project, it's no longer useful. If the requirements change, update the requirements. If the design changes, update the design doc. If the code changes, update the comments. Otherwise, the only point of the documentation is to give you a starting point, a general idea of what's going on. You can get that from a lot of places, so what's the point in some elaborate document? Why not just take notes on cocktail napkins if that's all you're using your documentation for? Or why not just barge into your coworkers' offices and ask for a one-sentence explanation?

As much as I hate to admit it, documentation has a point. It exists to inform you, in a concise, easy-to-understand manner, about a project. If it doesn't do that, there's no point in having it. In order for documentation to fulfill that purpose, it must be maintained. All the scribbled cocktail napkins in the world can't get around that. Now, whether requirements maintenance has to be left to ME is another issue entirely.

Posted by Cody at 07:55 PM | Comments (0)

September 12, 2005

Field Names or Bust

I spent some time this weekend using Python's DB API. Boy, have I ever been spoiled. Using Python's DB API is kind of like going into a restaurant and ordering a cherry pie, only for the waitress to return to your table with a barrel of cherries and some frozen pie crust. It's a little rough around the edges, that's all I'm saying. I first realized this after I successfully ran my first query and tried to view a few columns. As far as I could tell then, it's impossible to refer to the fields in your result by field name. Instead, you have to refer to each field by its column position in the query (i.e., field[5]). Clearly, that ain't gonna cut the mustard.

So, since I had nothing better to do, I put together a simple way of allowing the user to access these fields. I did it by making the result a list of dictionaries instead of a tuple.

Here's the code!


from sys import exit
import MySQLdb

def SelectQuery(queryStr):
    if (queryStr.find("*") > -1):
        print "Error: you cannot select *, you must specify the field names."
        exit(1)
    try:
        #we make the string lower case, in order to make our string comparison easier.
        queryStr = queryStr.lower()

        #connect to db and run query
        db = MySQLdb.connect(host="localhost",user="user",passwd="pass",db="db")
        cursor=db.cursor()
        cursor.execute(queryStr)

        #return all rows.
        allRows = cursor.fetchall()
        cursor.close()
        db.close()

        #call our helper functions
        return ResultsIntoDictList(allRows, queryStr)
    except MySQLdb.Error, e:
        print "Error: %s" % e
        exit(1)

def FieldsFromQuery(queryStr):
    selectLoc = queryStr.find("select ")
    fromLoc = queryStr.find(" from ")

    #if we didn't find one of those terms, it's a problem
    if (selectLoc < 0 or fromLoc < 0):
        print "Error: didn't find select or from in the query."
        exit(1)

    #our fields are strung together by commas, so we split that into a list
    fieldList = queryStr[selectLoc + len("select "):fromLoc].split(",")
    for i in range(len(fieldList)):
        fieldList[i] = fieldList[i].strip()
    return fieldList

def ResultsIntoDictList(allRows, queryStr):
    rowList = []
    fieldList = FieldsFromQuery(queryStr)

    #iterate through all the rows
    for row in allRows:
        #create a new dictionary
        rowDict = {}

        #populate that dictionary according to field names
        for i in range(len(row)):
            rowDict[fieldList[i]] = row[i]
        #append dictionary to list
        rowList.append(rowDict)
    return rowList

(Partway through doing this, I learned such information could be gleaned easily from cursor.description. I share this code in case it ever comes in handy for anyone ever.)
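For anyone curious what the cursor.description route looks like: the DB API says each entry in cursor.description is a sequence whose first item is the column name, so the field names can come from the cursor itself instead of from parsing the query string. What follows is only a minimal sketch of that idea, not a drop-in replacement for the code above. It assumes Python 3 and swaps in the standard library's sqlite3 module with an in-memory database so it runs without a MySQL server; the function name rows_as_dicts and the people table are made up for illustration.

```python
import sqlite3

def rows_as_dicts(cursor):
    """Turn the rows of an already-executed cursor into a list of dicts."""
    # Per the DB API, each description entry starts with the column name.
    field_names = [entry[0] for entry in cursor.description]
    return [dict(zip(field_names, row)) for row in cursor.fetchall()]

# Hypothetical table and data, just to have something to query.
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE people (name TEXT, age INTEGER)")
cursor.execute("INSERT INTO people VALUES ('Ada', 36)")
cursor.execute("SELECT name, age FROM people")
rows = rows_as_dicts(cursor)
cursor.close()
db.close()

print(rows[0]["name"])
```

Because the names come from the cursor rather than the query text, this approach should not need the ban on select *.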

Posted by Cody at 06:59 PM | Comments (0)

July 13, 2005

Paging Optimus Prime, III

Okay, I didn't get quite enough time last night to write more about prime numbers. What I wanted to talk about was just how long it takes to try to factor a number that's millions of digits long. Here's a little Python script that makes my point for me.


import time

def Start():
   global timeStart
   timeStart = time.time()

def End():
   global timeEnd
   timeEnd = time.time()

def Seconds():
   return (timeEnd - timeStart)

def TimedDivide(bigNum, divisor):
   Start()
   print "answer: " + str(bigNum / divisor)
   End()
   print "length of time: " + str(Seconds()) + " seconds"

>> TimedDivide(pow(4, 10000000), 3)

I got something like 7 hours for one divide. How many numbers would you have to divide your prime by in order to prove it's prime? Uhhh, a lot (check out Eratosthenes' sieve for more info here, although I'm sure there are better techniques). Multiply that by the number of hours it takes to do each divide, and I'm pretty sure you come up with a number greater than the number of Vienna sausages in the world. Not just North America, the WORLD! I don't mean to be an alarmist, but it's a biggun.
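To put a rough shape on "a lot": the most naive approach, plain trial division, tries every candidate divisor from 2 up to the square root of the number. The sketch below counts how many divides that takes for small numbers. It is an illustration only, certainly not how anyone hunts for giant primes, and it assumes Python 3.8 or later (for math.isqrt); the function name trial_division is invented here.

```python
import math

def trial_division(n):
    """Return (is_prime, divisions_performed) for an integer n >= 2."""
    divisions = 0
    # A composite number always has a divisor no larger than its square root.
    for d in range(2, math.isqrt(n) + 1):
        divisions += 1
        if n % d == 0:
            return False, divisions
    return True, divisions

print(trial_division(97))
```

The count grows with the square root of the number, so for a number millions of digits long the candidate divisors alone would have hundreds of thousands of digits' worth of possibilities.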

Posted by Cody at 07:58 PM | Comments (0)

July 12, 2005

Paging Optimus Prime, II

Here's some Python code that will give us an idea of the scope of really big numbers. It's related to yesterday's entry on giganto prime numbers, and it tells us how many digits long a number is, as well as how many bits it'd take to store the number.


import math
def HowManyBits(bigNum):
    return int(math.log(bigNum, 2)) + 1

def HowManyDigits(bigNum):
    return int(math.log(bigNum, 10)) + 1

def ShowNumberInfo(bigNum):
    print "length in digits: " + str(HowManyDigits(bigNum))
    print "size in bits: " + str(HowManyBits(bigNum))

>> ShowNumberInfo(pow(3, 1000000))
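One caveat with the logarithm approach is that math.log works in floating point, so the answer can land one off when the number sits right next to a power of 2 or 10. Here is a hedged alternative sketch that sticks to exact integer checks. It assumes Python 3 and positive integers only; the lowercase function names are invented for this example.

```python
import math

def how_many_bits(big_num):
    # int.bit_length() counts bits exactly, with no floating point involved.
    return big_num.bit_length()

def how_many_digits(big_num):
    # Start from a floating-point estimate, then correct it with exact
    # integer comparisons in case the estimate is off by one.
    estimate = int(math.log10(big_num)) + 1
    if 10 ** (estimate - 1) > big_num:
        estimate -= 1
    elif 10 ** estimate <= big_num:
        estimate += 1
    return estimate

print("length in digits: " + str(how_many_digits(pow(3, 1000))))
print("size in bits: " + str(how_many_bits(pow(3, 1000))))
```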

Posted by Cody at 08:54 AM | Comments (0)

May 23, 2005

Pish Posh!

I'm starting work on a new project at home; I'm calling it Operation World Domination Through Utter Secrecy and Crappy Coding. The scope of the project is pretty impressive; I'll be creating both a website with a webservice, and a client-side application. The initial plan was to use Python for all of that. Unorthodox, sure, but everything I read suggested that Python could handle it. Well, before I signed my first born over to the dark lords of Python, I decided to do some proof of concept code. I wanted to create some of the objects I'd be manipulating, as well as a simple GUI in wxPython. It seemed like a reasonable plan for the weekend.

Err, scratch that plan. It turned out I didn't even get to the GUI because the OOP was so confusing. And by confusing, I mean that I couldn't figure out how to make a private method for a class. I did some searches in the newsgroups, and while I found plenty of hacky ways to do it with underscore prefixes and all kinds of baloney like that, it's just not a feature of the language. Someone said that was deliberate because "we're all adults here". Pish posh!
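For reference, the underscore workarounds look roughly like this. A single leading underscore is only a naming convention that says "internal, keep out." A double leading underscore triggers name mangling, which renames the attribute behind the scenes but doesn't truly hide it. This is a small sketch assuming Python 3; the Account class and its methods are invented for illustration.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance   # single underscore: "internal" by convention only

    def deposit(self, amount):
        self.__validate(amount)   # inside the class, the mangled name resolves normally
        self._balance += amount

    def balance(self):
        return self._balance

    def __validate(self, amount):
        # Double underscore: stored on the class as _Account__validate.
        if amount <= 0:
            raise ValueError("amount must be positive")

acct = Account(10)
acct.deposit(5)

try:
    acct.__validate(5)            # outside the class, this name does not exist
    reachable = True
except AttributeError:
    reachable = False

# ...but a determined caller can still get at it through the mangled name.
acct._Account__validate(5)
```

So the double underscore stops accidental calls, but nothing stops a deliberate one.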

To me, an object-oriented design is about managing complexity. Private methods are a large part of this, helping the class designer to encapsulate some of the complexity into parts of the class that the class implementer will never see. That way, one only needs to know the bare minimum in order to implement the class. I like that. If I want to implement the class, I shouldn't have to dig through a litany of methods related to the inner workings of the object in order to find the stuff I need. And if I'm designing a class, I don't want these stupid programmers messing things up by calling the methods that they have no business using. Whether we're all adults here or not, encapsulating that complexity is a good thing.

It's possible I'm completely wrong here. I do hope so. If not, I may have to approach Operation World Domination Through Utter Secrecy and Crappy Coding from another direction entirely. It'd be equally unorthodox, of course, maybe in LOGO or something. I can make that turtle move with the best of em.

Posted by Cody at 06:40 PM | Comments (0)

February 28, 2005

The Scripting Challenge

Last time, I wrote some about learning a scripting language. Right now, I just use a compiled language for everything, and that strikes me as an inefficient way of doing quick jobs, like sysadmin tasks. Would a scripting language make that any easier? The only way to figure it out, I determined, was with a test, much like George Costanza's candy line-up on Seinfeld. Find a representative quick and dirty task, and then try it in a scripting language versus a compiled language. Whichever language's version was the easiest to write would then get to ride shotgun in my development rocket car. Well, the test is complete, but I'm having a hard time coming to any conclusions.

The task I selected was building a connection string. It seemed like a good choice; it's something that must be done frequently, and it's usually a minor inconvenience. I found a good Python example of this in Chapter 2 of Dive Into Python. Stripping it down a little bit, it comes out to 6 lines. Most of the code is pretty simple, but one line is awfully involved. Anyway, once I located that example, I decided I'd try to rewrite that in C# and see how many lines it took. The line count? *Drumroll please* 59 lines (code available if you're interested).

Now, about those 59 lines. Some of those are automatically generated by the compiler. Furthermore, some of those are OOP-related, which isn't very useful in this instance but it's a constraint imposed upon me by the language. Even taking those into consideration along with my undying love for whitespace, there's way more C# code than Python code for this example.

However, even though there's more code with C#, it's a lot easier to follow. For instance, I mentioned that the Python code contained one complicated line. That line:

return ";".join(["%s=%s" % (k, v) for k, v in params.items()])

Ooookay, I can see what that's doing, but I think it'd take me a long time to actually come up with that line by myself. Furthermore, I don't like my chances of making any sort of meaningful change to that line without first beating the program to death like a pinata.

In contrast, the C# code for that functionality is:

foreach (DictionaryEntry keyPair in connVals)
{
    retVal += keyPair.Key + "=" + keyPair.Value + ";";
}

To me, it's a lot clearer that I'm dealing with a Hashtable there, and I'm joining together the keys and their values in a string. It's more code, but less complexity. I'm sure as I wrote more Python, I'd find its version easier to write, but I still think it'd take me a little while to parse that code mentally if I'm just reading through it later on.
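For comparison, the Python one-liner can be unrolled into a loop that reads much more like the C# version. The sketch below puts the two side by side. It assumes Python 3.7 or later (where dictionaries keep insertion order); the function names and the sample values in params are hypothetical.

```python
def build_connection_string(params):
    """The one-liner: a list comprehension fed straight into join."""
    return ";".join(["%s=%s" % (k, v) for k, v in params.items()])

def build_connection_string_unrolled(params):
    """The same result, one step at a time."""
    pairs = []
    for key, value in params.items():
        # Format each key/value pair as "key=value".
        pairs.append("%s=%s" % (key, value))
    # join puts the separator between items only.
    return ";".join(pairs)

# Hypothetical sample values.
params = {"server": "localhost", "database": "db", "uid": "user", "pwd": "pass"}
print(build_connection_string_unrolled(params))
```

One small difference from the C# loop: join puts the semicolon only between pairs, while the C# snippet leaves one trailing at the end.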

So, I present the big question: when doing tasks like these, is readability of code or writability of code more important? For regular code to which you often refer, readability definitely takes priority, or else maintenance becomes hell on earth. That's one reason why, I think, most major projects aren't written in scripting languages. For stuff like this though, just simple admin tasks that I prefer to automate, I don't really have to worry about maintenance. It's something I spend an hour on one afternoon, then maybe revisit 15 months later. There's really nothing to gain by combing through code and aiming for elegance; the point is to get it working and get it out there.

With that in mind, I declare scripting languages to be the winner here, meaning I need to actually start using something like Python for tasks like these. I don't particularly like that. If I had my druthers, I'd write everything in LOGO and Alf would be the #1 show on TV. Unfortunately, the real world is unwilling to go along with me on those. Tasks vary, and there's no language that fits them all perfectly. As developers, we have to be willing to be flexible with which tools we use. With our TV shows, that's another matter entirely.

Posted by Cody at 06:03 PM | Comments (0)

February 22, 2005

Going On Script

Different tools for different jobs: this idea makes sense to me. Just like I would be slightly terrified if my car mechanic wanted to work on my fuel injectors with only a sledgehammer, the folks at Microsoft would feel similarly if I proposed we redo Windows XP entirely in LOGO. While I understand this idea, I don't implement it very often; this is largely because I'm an idiot. At least once a month I'll be facing a quick and dirty sysadmin task, and rather than coding something simple in a lightweight scripting language, I pick C#. In retrospect, this is kind of like buying a Ferrari to haul manure around town. (This is not to say that Microsoft products are Ferraris and sys admins are professional poo-handlers.) For these messy jobs, I need a dump truck that can handle pretty much anything without complaining. I need a good scripting language.

Looking at it now, my refusal to use scripting languages is a self-fulfilling prophecy. Since I hardly use something like Perl or Python, I'm not too comfortable with it. Thus, when the time comes to write a script, I'm much more likely to fall back on something I use often, like C# or Java. The next time one of these situations arises, I'm even rustier with the scripting language, so I'm less likely still to use it then. You see where my relationship with my scripting language is headed. Eventually, I get to the point where one of my coworkers mentions Python, and I run out into the parking lot with a machete, looking for a reptile to kill. No one wants to see that.

You may ask what the big deal is, why I'd want to add another language if I'm getting my work done as-is. Well, it seems to me that for some easy tasks, higher-level languages can introduce too much complexity. If I'm just writing a simple app to parse a text file, why should I have to put everything in a class? If I'm just parsing a text file, why should I have to recompile if I want to make one little change? Why do I need to write so much code to do something so simple? As part of my relentless drive to do as little actual work as possible, I think I owe it to myself to see what my alternatives are.

So, where do I start? Well, before I get a tattoo of the Perl camel on my forehead, it seems to me that I need to verify my hypothesis that scripting languages are better for certain tasks. If I were wrong, not only would I then have the tattoo, but I'd be incredibly surly about it. People would come up to me on the street and ask about it, and I'd sputter out, "It was a mistake! And if I ever find the guy who wrote Perl, I'll karate chop him in the head until he's speaking in tongues!"

What I need then is to find a trivial task, write it in a scripting language, and then write it in a high-level language. The one that's the simplest to write will be the language best suited for these chores. I will stage, in other words, a programming version of Mad Max Beyond Thunderdome. Two languages enter, one language leaves! Much to everyone's dismay, I will not do all of this while dressed as Tina Turner. Check back next time for my exciting results.

Posted by Cody at 07:27 PM | Comments (0)