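Why does this thread management pattern result in a deadlock?

I'm using a common base class, has_threads, to manage any type that should be allowed to instantiate a boost::thread. Each has_threads instance owns a set of threads (to support waitAll and interruptAll functions, which I have not included below) and should automatically invoke removeThread (named releaseThread in the code below) when a thread terminates, to maintain the integrity of this set.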
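In my program I have just one of these. A thread is created every 10 seconds, and each one performs a database lookup. When the lookup completes, the thread runs to completion and removeThread is invoked; with the mutex held, the thread object is removed from the internal tracking set. I can see this working properly from the output ABC.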
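Once in a while, though, the mechanisms collide: removeThread is perhaps executed twice concurrently. What I can't figure out is why this results in a deadlock. From that point on, no thread invocation ever outputs anything other than A. [It's worth noting that I'm using a thread-safe stdlib, and that the issue remains when IOStreams are not used.] Stack traces indicate that these threads are blocked on the mutex, but why isn't the lock eventually released by the first thread for the second, then by the second for the third, and so on?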
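Am I missing something fundamental about how scoped_lock works? Is there anything obvious here that could lead to a deadlock despite (or even because of?) the use of a mutex lock?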
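For reference, the scoped-lock behaviour being asked about can be sketched with the standard library. This is a minimal illustration, assuming std::mutex and std::lock_guard as stand-ins for boost::mutex and boost::mutex::scoped_lock; Counter, bump and run_contending_threads are hypothetical names that do not appear in the code below. It shows the expected case: each thread releases the mutex when its guard leaves scope, so contending threads get through one after another.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the state guarded by threads_lock.
struct Counter {
    std::mutex lock;  // plays the role of threads_lock
    int value = 0;

    void bump() {
        // Acquired here; released automatically when `guard` goes out of
        // scope, including on an early return or an exception.
        std::lock_guard<std::mutex> guard(lock);
        ++value;
    }
};

// Starts `n` threads that all contend for the same mutex, waits for them,
// and returns the final count.
int run_contending_threads(int n) {
    Counter c;
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i)
        workers.emplace_back([&c] { c.bump(); });
    for (std::thread& t : workers)
        t.join();
    assert(c.value == n);  // every thread got the lock exactly once
    return c.value;
}
```

If threads pile up on the mutex forever, as described above, the more likely explanation is that one holder never leaves its locked scope, rather than that the guard fails to release.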
Sorry for the poor question, but as I'm sure you're aware, it's nigh-on impossible to present a real test case for bugs like this.
#include <signal.h>
#include <pthread.h>
#include <iostream>
#include <set>
#include <boost/bind.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>

class has_threads {
protected:
    template <typename Callable>
    void createThread(Callable f, bool allowSignals)
    {
        boost::mutex::scoped_lock l(threads_lock);
        // Create an empty thread object (nothing is running yet)
        boost::shared_ptr<boost::thread> t(new boost::thread());
        // Track thread
        threads.insert(t);
        // Run thread (do this after inserting the thread for tracking so that we're ready for the on-exit handler)
        *t = boost::thread(&has_threads::runThread<Callable>, this, f, allowSignals);
    }
private:
    /**
     * Entrypoint function for a thread.
     * Sets up the on-end handler then invokes the user-provided worker function.
     */
    template <typename Callable>
    void runThread(Callable f, bool allowSignals)
    {
        boost::this_thread::at_thread_exit(
            boost::bind(
                &has_threads::releaseThread,
                this,
                boost::this_thread::get_id()
            )
        );
        if (!allowSignals)
            blockSignalsInThisThread();
        try {
            f();
        }
        catch (boost::thread_interrupted& e) {
            // Yes, we should catch this exception!
            // Letting it bubble over is _potentially_ dangerous:
            // http://stackoverflow.com/questions/6375121
            std::cout << "Thread " << boost::this_thread::get_id() << " interrupted (and ended)." << std::endl;
        }
        catch (std::exception& e) {
            std::cout << "Exception caught from thread " << boost::this_thread::get_id() << ": " << e.what() << std::endl;
        }
        catch (...) {
            std::cout << "Unknown exception caught from thread " << boost::this_thread::get_id() << std::endl;
        }
    }
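    // Invoked via at_thread_exit: removes the finished thread from the tracking set.
    // (Declared without the has_threads:: qualifier, which is not valid inside the class body.)
    void releaseThread(boost::thread::id thread_id)
    {
        std::cout << "A";   // reached the handler
        boost::mutex::scoped_lock l(threads_lock);
        std::cout << "B";   // acquired the lock
        for (threads_t::iterator it = threads.begin(), end = threads.end(); it != end; ++it) {
            if ((*it)->get_id() != thread_id)
                continue;
            threads.erase(it);
            break;
        }
        std::cout << "C";   // finished the removal
    }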
    void blockSignalsInThisThread()
    {
        sigset_t signal_set;
        sigemptyset(&signal_set);
        sigaddset(&signal_set, SIGINT);
        sigaddset(&signal_set, SIGTERM);
        sigaddset(&signal_set, SIGHUP);
        sigaddset(&signal_set, SIGPIPE); // http://www.unixguide.net/network/socketfaq/2.19.shtml
        pthread_sigmask(SIG_BLOCK, &signal_set, NULL);
    }

    typedef std::set<boost::shared_ptr<boost::thread> > threads_t;
    threads_t threads;
    boost::mutex threads_lock;
};
struct some_component : has_threads {
    some_component() {
        // set a scheduler to invoke createThread(bind(&some_work, this)) every 10s
    }
    void some_work() {
        // usually pretty quick, but I guess sometimes it could take >= 10s
    }
};
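The question mentions waitAll but does not show it, so whether it plays any part here is unknown. One pattern that would produce exactly this symptom is a function that joins threads while still holding threads_lock: a worker blocked in its exit handler can never finish, and the join never returns. The following is a purely hypothetical sketch, using std::thread and std::mutex rather than Boost, of a wait function that copies the set under the lock and joins only after releasing it. The names tracked_threads, create, wait_all, note_done and run_and_wait are assumptions for illustration and are not part of the posted code.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical, simplified stand-in for has_threads.
class tracked_threads {
public:
    template <typename Callable>
    void create(Callable f) {
        std::lock_guard<std::mutex> l(lock_);
        threads_.insert(std::make_shared<std::thread>(f));
    }

    // Joins every tracked thread. The lock protects only the snapshot;
    // it is released before join() so that workers which need the lock
    // in order to finish are not blocked by the waiter.
    void wait_all() {
        std::vector<std::shared_ptr<std::thread>> snapshot;
        {
            std::lock_guard<std::mutex> l(lock_);
            snapshot.assign(threads_.begin(), threads_.end());
            threads_.clear();
        }
        for (const auto& t : snapshot)
            if (t->joinable())
                t->join();
    }

    // Called by each worker; takes the same lock as create() and wait_all().
    void note_done() {
        std::lock_guard<std::mutex> l(lock_);
        ++done_;
    }

    int done() {
        std::lock_guard<std::mutex> l(lock_);
        return done_;
    }

private:
    std::set<std::shared_ptr<std::thread>> threads_;
    std::mutex lock_;
    int done_ = 0;
};

// Starts `n` workers that each take the lock once, waits for all of them,
// and returns how many reported completion.
int run_and_wait(int n) {
    tracked_threads pool;
    for (int i = 0; i < n; ++i)
        pool.create([&pool] { pool.note_done(); });
    pool.wait_all();
    int finished = pool.done();
    assert(finished == n);
    return finished;
}
```

Had wait_all kept the guard alive across the join() calls, every worker would stall inside note_done, which is the same shape as threads printing A and never B.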
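Surely a deadlock can only happen if releaseThread, directly or indirectly, ends up calling itself (or createThread)? I can't see how it would... –
@Tomalak: my guess is that the thread can stop before the create lock is released, recursively calling the release... – neuro
Oh, that's a point... I wouldn't expect it, but I guess it's possible! –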
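The scenario raised in these comments, a worker finishing while the creating thread still holds the lock, can be sketched in isolation. This is a minimal illustration under stated assumptions: std::mutex and std::thread replace the Boost types, and Observation and worker_exits_while_creator_holds_lock are hypothetical names. In this sketch the worker runs on a different thread from the lock holder, so it only waits until the creator leaves its scope; no thread re-locks a mutex it already owns.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

struct Observation {
    bool found_lock_held;         // worker saw the creator still holding the lock
    bool finished_after_release;  // worker completed once the lock was released
};

Observation worker_exits_while_creator_holds_lock() {
    std::mutex tracking_lock;          // stands in for threads_lock
    std::promise<bool> found_held;     // worker reports whether try_lock failed
    std::future<bool> report = found_held.get_future();
    bool cleaned_up = false;
    std::thread worker;
    {
        // Plays the role of the locked scope in createThread.
        std::lock_guard<std::mutex> creating(tracking_lock);
        worker = std::thread([&] {
            // The worker "finishes" at once and probes the lock without blocking.
            bool got = tracking_lock.try_lock();
            if (got)
                tracking_lock.unlock();
            found_held.set_value(!got);
            // Plays the role of releaseThread: blocks until the creator is done.
            std::lock_guard<std::mutex> releasing(tracking_lock);
            cleaned_up = true;
        });
        report.wait();  // the creator still holds the lock at this point
    }                   // lock released here; the worker can now proceed
    worker.join();
    return Observation{report.get(), cleaned_up};
}
```

A genuine self-deadlock would need the release path to run on the thread that already holds the non-recursive mutex, which this sketch deliberately does not attempt, since locking a std::mutex twice from one thread is undefined behaviour.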