Continuing on the subject of threading, a coworker asked me an interesting question the other day. She has an application that's basically a packet-logger. She connects to some service, sends it a request, and if she gets anything back, she logs it to an XML file. Communicating with this service takes a while, so she has two threads going at once, each one doing both polling and logging. The constraint here is that the data has to be logged in real time. (Why real time? I don't know, maybe in case of a robot attack on the data center. You think a robot will give you an extra 5 ms to write to your log file?!) Because I like to act like I know what I'm talking about, she asked me for help.
If you have two threads trying to access the same physical resource at random times, something bad is going to happen. Sooner or later, both threads will try to write at the same time and something will blow up; all you can do is hope that this blown-up something is neither you nor your pants. How do you fix this brewing catastrophe? You fix it with locks. By locking your resource, you give one thread exclusive access; when it's done, it releases the lock and the other thread gets its turn (in .NET, threads holding a lock can also signal each other using Monitor.Wait and Monitor.Pulse).
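Here's a rough sketch of the lock-the-file approach. It's in Python rather than .NET, and everything in it (log_packet, poll_and_log, run_demo, the <packet> element) is a name I made up for illustration, not anything from her actual application. The polling is faked; the point is only that the two threads take turns at the file.

```python
import os
import tempfile
import threading

log_lock = threading.Lock()  # guards the shared log file


def log_packet(path, packet):
    # Only one thread at a time gets past this line; the other blocks here.
    with log_lock:
        with open(path, "a", encoding="utf-8") as f:
            f.write(f"<packet>{packet}</packet>\n")


def poll_and_log(path, name, count):
    # Stand-in for "talk to the slow service, then log whatever came back".
    for i in range(count):
        log_packet(path, f"{name}-{i}")


def run_demo(count=100):
    # Two threads, one temporary log file, as in the scenario above.
    fd, path = tempfile.mkstemp(suffix=".xml")
    os.close(fd)
    threads = [
        threading.Thread(target=poll_and_log, args=(path, name, count))
        for name in ("A", "B")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    os.remove(path)
    return lines
```

Every call to log_packet holds the lock for the whole open-write-close cycle, so whichever thread arrives second sits and waits on the disk.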
The problem here is that she said it HAS to be real-time. If you're locking something, you're not going to be logging in real-time. If one thread holds the lock on the file and the other thread wants to write, the second thread has to wait. It may not be much of a wait, but depending on how long the first thread spends writing to the disk while it holds the lock, it could be a while. So how would you get around it?
One idea I had was to divorce the logging part from the service communication part. What if you still had your two threads communicating with this slow, crappy service, and whenever they had something to log, they inserted it into a shared, thread-safe queue? Let's go further and say this is a smart queue where we've added some wizardry so that anytime an element is enqueued, we write it to the disk.
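A minimal sketch of that idea, again in Python and again with invented names (poller, writer, run_demo, the _STOP sentinel). One assumption to flag: I've implemented the "wizardry" as a single dedicated writer thread that drains the queue, which is just one way to do it. The sink here is an in-memory buffer standing in for the XML file.

```python
import io
import queue
import threading

_STOP = object()  # sentinel telling the writer thread to shut down


def writer(log_queue, sink):
    # The only thread that ever touches the sink, so the sink needs no lock.
    while True:
        item = log_queue.get()  # blocks until something is enqueued
        if item is _STOP:
            break
        sink.write(f"<packet>{item}</packet>\n")


def poller(log_queue, name, count):
    # Stand-in for a thread talking to the slow service.
    for i in range(count):
        log_queue.put(f"{name}-{i}")  # quick in-memory handoff, no disk I/O


def run_demo(count=100):
    log_queue = queue.Queue()  # queue.Queue does its own locking internally
    sink = io.StringIO()       # stand-in for the XML log file
    writer_thread = threading.Thread(target=writer, args=(log_queue, sink))
    pollers = [
        threading.Thread(target=poller, args=(log_queue, name, count))
        for name in ("A", "B")
    ]
    writer_thread.start()
    for t in pollers:
        t.start()
    for t in pollers:
        t.join()
    log_queue.put(_STOP)  # everything queued before this still gets written
    writer_thread.join()
    return sink.getvalue().splitlines()
```

The polling threads only ever wait on the queue's internal lock, which is held just long enough to append an item; the slow disk write happens on the writer thread's time instead of theirs.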
The problem is that even with the queue, if you want multiple threads to access it safely, you have to lock it. The good part is that you're locking an in-memory object, not a physical resource. I would assume this is faster, although I have no numbers to back that up. In the spirit of bloggers everywhere, I will just make up a number: this solution is 27.8% faster than locking the XML file.
It's not real-time, but it's thread-safe. You can have fast and dangerous, or you can have slightly less fast and much safer. I realize most programmers have to be clubbed over the head to renounce the first, but this might be a good situation to do so, simply because threads get weird.
Posted by Cody at February 27, 2006 06:49 PM