Tuesday, April 7, 2009

When Classes Instantiated as auto Vars on Stack are Evil

I have a threaded C++ app that uses message queues to pass data among threads like so (UML sequence diagram follows):
  Thread A (with response-Q) enqueues request in Thread B's input-Q
Thread A blocks on response-Q empty
Thread B wakes up [input-Q non-empty], dequeues request
Thread B munches on the request from A
Thread B enqueues result in A's response-Q
Thread B blocks on input-Q empty
Thread A wakes up [response-Q non-empty], dequeues response
Thread A goes its merry way.
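
The sequence above can be sketched roughly as follows. This is only an illustration: the post's real Queue class is not shown, so the Queue template, the Request struct, threadB() and roundTrip() are all hypothetical names, and the "munching" is reduced to doubling an int. Thread B here serves a single request and exits instead of going back to block on its input-Q.

```cpp
#include <pthread.h>
#include <cassert>
#include <deque>
#include <string>

// Hypothetical stand-in for the post's Queue class, whose source is not shown.
template <typename T>
class Queue {
public:
    explicit Queue(const std::string& name) : name_(name) {
        pthread_mutex_init(&mutex_, nullptr);
        pthread_cond_init(&nonEmpty_, nullptr);
    }
    ~Queue() {
        pthread_cond_destroy(&nonEmpty_);
        pthread_mutex_destroy(&mutex_);
    }
    Queue(const Queue&) = delete;             // a mutex must not be copied
    Queue& operator=(const Queue&) = delete;

    const std::string& name() const { return name_; }

    void enqueue(const T& item) {
        pthread_mutex_lock(&mutex_);
        items_.push_back(item);
        pthread_cond_signal(&nonEmpty_);      // wake a thread blocked in dequeue()
        pthread_mutex_unlock(&mutex_);
    }

    T dequeue() {                             // blocks while the queue is empty
        pthread_mutex_lock(&mutex_);
        while (items_.empty())
            pthread_cond_wait(&nonEmpty_, &mutex_);
        T item = items_.front();
        items_.pop_front();
        pthread_mutex_unlock(&mutex_);
        return item;
    }

private:
    std::string name_;
    std::deque<T> items_;
    pthread_mutex_t mutex_;
    pthread_cond_t nonEmpty_;
};

// A request carries its payload plus a pointer to the requester's response-Q.
struct Request {
    int payload;
    Queue<int>* replyTo;
};

// Thread B: wake up on a request, munch on it, enqueue the result for A.
void* threadB(void* arg) {
    Queue<Request>* inputQ = static_cast<Queue<Request>*>(arg);
    Request req = inputQ->dequeue();
    req.replyTo->enqueue(req.payload * 2);    // the "munching" is just doubling
    return nullptr;
}

// Thread A's side: enqueue a request, block on the response-Q, return the reply.
int roundTrip(int payload) {
    // Where the queues live is a choice of this sketch; both are on the heap.
    Queue<Request>* inputQ = new Queue<Request>("B's input Q");
    Queue<int>* respQ = new Queue<int>("A's response Q");
    pthread_t b;
    int rc = pthread_create(&b, nullptr, threadB, inputQ);
    assert(rc == 0);
    (void)rc;
    inputQ->enqueue(Request{payload, respQ});
    int result = respQ->dequeue();
    pthread_join(b, nullptr);                 // B is done before the queues go away
    delete respQ;
    delete inputQ;
    return result;
}
```

Calling roundTrip(21) from the main thread plays the part of Thread A and returns Thread B's reply.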

Nice and easy, eh? And it has worked OK for a while...

You know how it is: when one keeps adding code, the bugs get shifted around on the shelf until they rear their nasty heads. In my case, Thread A happened to be the main thread and its response-Q was declared as
Queue respQ("A's response Q");
Now this is an auto var that lives on Thread A's stack.

Thread B was doing its job, but when it wanted to enqueue the response in Thread A's respQ [it got the pointer &respQ via a param] it would block in
pthread_mutex_lock()
in libc.

Bummer! I spent three hours writing LOCK/UNLOCK macros in class Queue that would confess who called them and in which thread, and then matching up the results. (Yes, gdb was borked on that ppc target and was useless with info threads et al.) I really did see that Thread A's Queue instance was indeed blocking in
pthread_mutex_lock()
but nobody had locked that mutex before!!
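
For the curious, confessing macros of that kind might look something like the sketch below. It is not the code I actually wrote: tracedLock() and tracedUnlock() are hypothetical helpers, and printing pthread_self() as an unsigned long is an assumption that holds on Linux/glibc, since pthread_t is formally an opaque type.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdio>

// Hypothetical helper: log who is about to lock, lock, then log the outcome.
inline int tracedLock(pthread_mutex_t* m, const char* file, int line) {
    assert(m != nullptr);
    // Assumption: pthread_t converts to an integer (true on Linux/glibc).
    unsigned long self = (unsigned long)pthread_self();
    std::fprintf(stderr, "[thr %lu] LOCK wait %p at %s:%d\n",
                 self, static_cast<void*>(m), file, line);
    int rc = pthread_mutex_lock(m);           // a hang shows "wait" with no "got"
    std::fprintf(stderr, "[thr %lu] LOCK got  %p rc=%d\n",
                 self, static_cast<void*>(m), rc);
    return rc;
}

// Hypothetical helper: log who unlocks and what pthread_mutex_unlock() returned.
inline int tracedUnlock(pthread_mutex_t* m, const char* file, int line) {
    assert(m != nullptr);
    unsigned long self = (unsigned long)pthread_self();
    int rc = pthread_mutex_unlock(m);
    std::fprintf(stderr, "[thr %lu] UNLOCK    %p rc=%d at %s:%d\n",
                 self, static_cast<void*>(m), rc, file, line);
    return rc;
}

// The macros capture the call site so each log line names its caller.
#define LOCK(m)   tracedLock((m), __FILE__, __LINE__)
#define UNLOCK(m) tracedUnlock((m), __FILE__, __LINE__)
```

Matching the "wait" lines against the "got" and "UNLOCK" lines for the same mutex address shows which thread is stuck and whether anybody else ever held the lock.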

The funny part is that I had that mutex properly initialised in Queue::Queue(); I even changed its attribute to error-checking, but it just did not help! That mutex would behave as if it were uninitialised and contained garbage!
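
Setting up an error-checking mutex goes roughly like this. The function names initErrorCheckingMutex() and relockResult() are made up for this sketch, and the real constructor code is not shown in this post. With PTHREAD_MUTEX_ERRORCHECK, POSIX says a second lock by the thread that already owns the mutex fails with EDEADLK instead of hanging, which is what relockResult() demonstrates.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper: initialise *mutex as an error-checking mutex.
// Returns 0 on success or the pthread error code on failure.
int initErrorCheckingMutex(pthread_mutex_t* mutex) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);         // the attr is no longer needed
    return rc;
}

// Locks the same mutex twice from one thread and returns the second result.
int relockResult() {
    pthread_mutex_t m;
    int rc = initErrorCheckingMutex(&m);
    assert(rc == 0);
    (void)rc;
    pthread_mutex_lock(&m);
    int second = pthread_mutex_lock(&m);      // reports an error instead of hanging
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}
```

The checking only covers misuse of a healthy mutex such as relocking or unlocking from the wrong thread, which fits with it not helping here.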

After a while you get bored of this kind of debugging, so I changed Thread A's code to read
Queue* respQ = new Queue("A's response Q");
and everything went smoothly afterwards.

This yields the following article of faith:
Class objects declared as auto vars on the stack are evil.
-ulianov