On several occasions, while working on multi-threaded applications, I have experienced cases where a thread is waiting forever for a mutex owned by another thread.
Causes of this include deadlocks (where each thread attempts to lock a mutex held by the other, preventing the other from ever getting a chance to release it, resulting in both threads waiting forever) and issues as simple as forgetting to unlock the mutex.
In these cases, I’ve wished for a way to determine which thread holds the mutex. In the past, I’ve used various methods to discover this information, including:
- Adding wrappers around the lock and unlock functions to print each lock/release.
- Using GDB to view the owner field of the mutex and then checking each thread to see if it matches (when switching threads, GDB prints the thread ID).
- Giving up and getting drunk instead.
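As a rough illustration of the first method above, lock/unlock wrappers might look something like the sketch below. The names debug_lock, debug_unlock, LOCK and UNLOCK are made up for this example and are not taken from the downloadable code; the sketch assumes POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical wrappers: log every lock and unlock with its call site. */
static int debug_lock(pthread_mutex_t *m, const char *name,
                      const char *file, int line)
{
    assert(m != NULL);
    fprintf(stderr, "%s:%d: waiting for %s\n", file, line, name);
    int rc = pthread_mutex_lock(m);
    fprintf(stderr, "%s:%d: locked %s (rc=%d)\n", file, line, name, rc);
    return rc;
}

static int debug_unlock(pthread_mutex_t *m, const char *name,
                        const char *file, int line)
{
    assert(m != NULL);
    int rc = pthread_mutex_unlock(m);
    fprintf(stderr, "%s:%d: unlocked %s (rc=%d)\n", file, line, name, rc);
    return rc;
}

/* The macros capture the mutex variable's name and the call site. */
#define LOCK(m)   debug_lock(&(m), #m, __FILE__, __LINE__)
#define UNLOCK(m) debug_unlock(&(m), #m, __FILE__, __LINE__)
```

A "waiting for" line with no matching "locked" line points at the thread that is stuck, and the last "locked" line for that mutex points at the likely owner.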
Although each of these methods was successful, repeating them for every debugging session proved too time-consuming (on top of the need to either add the wrappers or remember to enable core dumps). A better, easier method was needed.
The problem is that the owner field of the mutex may mean any number of things, depending on the environment (whether threads are handled in the kernel or not), so the code below is not particularly portable. However, it works fine on both my desktop machine and ARM development board, so it’s good enough for my debugging purposes (and may or may not be of any use to you).
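To make the portability caveat concrete, here is a minimal sketch of reading the owner field. It assumes Linux with glibc (NPTL), where the private field __data.__owner holds the kernel thread ID of the owning thread, or 0 when the mutex is unlocked. Other C libraries lay the mutex out differently, so this is not guaranteed to compile or behave the same elsewhere. The helper names are made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: glibc/NPTL on Linux. __data.__owner is an internal field,
 * not a public API, and may change between library versions. */
static pid_t mutex_owner_tid(pthread_mutex_t *m)
{
    assert(m != NULL);
    return (pid_t)m->__data.__owner;
}

/* Kernel thread ID of the calling thread, comparable with the owner field. */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

Comparing mutex_owner_tid() against the value each thread gets from current_tid() is what makes it possible to put a name to the owner.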
A call is added at the start of each new thread to a function that stores the thread’s name and ID. This function must be called WITHIN the thread in order to get the thread’s ID.
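A thread registry along these lines could be sketched as follows. This is not the code from the download: sketch_register_thread, sketch_thread_name, the table size and the name length are all assumptions for illustration, and the thread ID is taken from gettid on Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_THREADS 32          /* hypothetical fixed table size */

struct thread_entry {
    pid_t tid;                  /* kernel thread ID */
    char name[32];              /* human-readable name */
};

static struct thread_entry thread_table[MAX_THREADS];
static int thread_count;
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Must run inside the thread being registered: gettid reports the ID of
 * the caller. Returns 0 on success, -1 if the table is full. */
static int sketch_register_thread(const char *name)
{
    assert(name != NULL);
    int rc = -1;
    pthread_mutex_lock(&table_lock);
    if (thread_count < MAX_THREADS) {
        struct thread_entry *e = &thread_table[thread_count];
        e->tid = (pid_t)syscall(SYS_gettid);
        snprintf(e->name, sizeof e->name, "%s", name);
        thread_count++;         /* publish the entry once it is filled in */
        rc = 0;
    }
    pthread_mutex_unlock(&table_lock);
    return rc;
}

/* Lookup takes no lock so it can be called from a signal handler;
 * returns NULL when the thread ID was never registered. */
static const char *sketch_thread_name(pid_t tid)
{
    for (int i = 0; i < thread_count; i++)
        if (thread_table[i].tid == tid)
            return thread_table[i].name;
    return NULL;
}
```

Each thread's start routine would call sketch_register_thread("worker") or similar as its first statement.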
A signal handler is installed for SIGINT and SIGUSR1. It examines each mutex and, if the mutex is locked, checks the thread list to determine the owner and prints the owner’s name and lock count. For SIGINT, the handler then calls abort() to create a core dump for further debugging; for SIGUSR1, it just reinstalls itself.
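The signal handling could be sketched roughly as below. Note that this sketch uses sigaction() rather than reinstalling the handler by hand: a handler installed with sigaction() (without SA_RESETHAND) stays installed, so the SIGUSR1 branch simply returns. The function names and the stand-in dump routine are assumptions for this example, not the downloadable code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t dump_count;   /* number of dumps so far */

/* Hypothetical stand-in for the routine that walks the mutexes and
 * prints their owners; write() is used because it is signal-safe. */
static void sketch_dump_mutexes(void)
{
    static const char msg[] = "dumping mutex owners\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
}

static void on_debug_signal(int signo)
{
    sketch_dump_mutexes();
    dump_count++;
    if (signo == SIGINT)
        abort();        /* produce a core dump for further debugging */
    /* SIGUSR1: return and let the program keep running */
}

/* Install the same handler for both signals; 0 on success, -1 on error. */
static int sketch_install_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_debug_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGUSR1, &sa, NULL);
}
```

With this in place, `kill -USR1 <pid>` from a shell triggers a dump without stopping the program, while Ctrl-C dumps and then aborts.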
The code may be downloaded here: http://www.sassan.me.uk/projects/print-mutex.tar.gz
- As mentioned before, not portable (but good enough for my purposes).
- Requires a function to be called at the start of each thread (I cannot find any way to automate this with macros, etc).
- Only a fixed number of threads can be tracked (I’ve never been in the situation of not being able to guess, to within an order of magnitude, the number of threads my application will use. It wouldn’t be hard to make the allocation dynamic, but it would be unlikely to be useful).
- There’s currently no way to add mutexes dynamically; you need to modify the source (mutex-debug.c:dump_mutexes()).
Summary: hackily put together, unportable and completely unsuitable for production code. Provided in the hope that it might be useful to you for debugging purposes.
- Download the source and build mutex-debug.c into your project. Include mutex-debug.h.
- Add calls to register_thread to the initial function of each thread.
- You can use dump_mutex() to show the status of a particular mutex at any given time.
- Alternatively, modify the dump_mutexes function to dump each of your mutexes (example included) and call mutex_debug_init() to implement the most common use case.
The code is poor; don’t bother telling me so.
But if you’d like to provide patches to make it better, feel free.