2011-12-07 24 views
1

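Thread monitoring in C using heartbeat signals

I want to monitor threads. I use a condition variable to send and receive heartbeat (HB) and acknowledgement (ACK) signals.
scnMonitor_t is the monitor structure. When a new thread is added, it is registered with the monitor and appended to the scnThreadList_t list. monitorHeartbeatCheck is a thread started when the program starts, and monitorHeartbeatProcess is the API that every thread function calls.

My actual problem is that the process index is not followed correctly: the run ends with the third thread waiting on the HB condition, which creates a deadlock. What could be the problem?
Thanks in advance.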

typedef struct scnThreadList_{ 
     osiThread_t  thread; 
     struct scnThreadList_ *next; 
} scnThreadList_t; 

typedef struct scnMonitor_{ 
     bool   started; 
     osiThread_t  heartbeatThread; 
     osiMutex_t  heartbeatMutex; 
     osiMutex_t  ackMutex; 
     osiCond_t  heartbeatCond; 
     scnThreadList_t *threads; 
} scnMonitor_t; 
static scnMonitor_t *s_monitor = NULL; 

// Main heartbeat check thread 
void* monitorHeartbeatCheck(void *handle) 
{ 
     scnThreadList_t *pObj = NULL; 
     static int idx = 0; 
     static bool waitAck = false; 

     while (1) { 
         pObj = s_monitor->threads; 
         while (pObj && (pObj != s_monitor->heartbeatThread)) { //skip it-self from monitoring. 
             ++idx; 
             printf("\"HB Check No.%d\"\n",idx); 
             // send heartbeat 
             usleep(250 * 1000); 
             pthread_mutex_lock(s_monitor->heartbeatMutex, 1); 
             pthread_cond_signal(s_monitor->heartbeatCond);  
             printf("-->C %d HB sent\n",idx); 
             pthread_mutex_unlock(s_monitor->heartbeatMutex); 
             // wait for ACK 
             while(!waitAck){ 
                 pthread_mutex_lock(s_monitor->ackMutex, 1); 
                 printf("|| C %d wait Ack\n",idx); 
                 waitAck = true; 
                 pthread_cond_wait(s_monitor->heartbeatCond, s_monitor->ackMutex); 
                 waitAck = false; 
                 printf("<--C %d received Ack\n",idx); 
                 pthread_mutex_unlock(s_monitor->ackMutex); 
                 LOG_INFO(SCN_MONITOR, "ACK from thread %p \n", pObj->thread); 
             } 
             pObj = pObj->next; 
         } 
     } // while, infinite 
     return NULL; 
} 

// Waits for a heartbeat and acknowledges it 
// Call this API from every registered thread function 
int monitorHeartbeatProcess(void) 
{ 
     static int id = 0; 
     static bool waitHb = false; 
     ++ id; 
     printf("\"HB Process No.%d\"\n",id); 
     // wait for HB 
     while(!waitHb){ 
       pthread_mutex_lock(s_monitor->heartbeatMutex, 1); 
       printf("|| P %d wait for HB\n",id); 
       waitHb = true; 
       pthread_cond_wait(s_monitor->heartbeatCond, s_monitor->heartbeatMutex); 
       waitHb = false; 
       printf("<--P %d HB received \n",id); 
       pthread_mutex_unlock(s_monitor->heartbeatMutex); 
     } 
     // send ACK 
     usleep(250 * 1000); 
     pthread_mutex_lock(s_monitor->ackMutex, 1); 
     pthread_cond_signal(s_monitor->heartbeatCond); 
     printf("-->P %d ACK sent\n",id); 
     pthread_mutex_unlock(s_monitor->ackMutex); 
     return 1; 
} 

Answers

1

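You should always associate exactly one mutex with a given condition. Using two different mutexes with the same condition at the same time can cause unpredictable serialization problems in your application.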

http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=%2Fapis%2Fusers_78.htm

You are using 2 different mutexes with your condition heartbeatCond.
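The one-mutex rule can be sketched as follows. This is a minimal illustration, not the poster's code: event_t, event_post, event_wait and event_demo are hypothetical names, and plain POSIX pthread types are assumed in place of the osi* wrappers. The signaller and the waiter both take the same mutex, and the waiter checks a flag protected by that mutex rather than relying on the signal alone.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type: one condition, one mutex, one flag. */
typedef struct {
    pthread_mutex_t lock;   /* the only mutex ever used with 'cond' */
    pthread_cond_t  cond;
    bool            ready;  /* state protected by 'lock' */
} event_t;

static void event_init(event_t *e)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->ready = false;
}

/* Signaller: change the state and signal while holding the mutex. */
static void event_post(event_t *e)
{
    pthread_mutex_lock(&e->lock);
    e->ready = true;
    pthread_cond_signal(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Waiter: waits with the same mutex and consumes the event. */
static void event_wait(event_t *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->ready)                       /* re-check after every wakeup */
        pthread_cond_wait(&e->cond, &e->lock);
    e->ready = false;
    pthread_mutex_unlock(&e->lock);
}

static void *poster(void *arg)
{
    event_post(arg);
    return NULL;
}

/* Returns 1 once the calling thread has seen the event posted by 'poster'. */
static int event_demo(void)
{
    event_t e;
    pthread_t t;

    event_init(&e);
    if (pthread_create(&t, NULL, poster, &e) != 0)
        return 0;
    event_wait(&e);
    pthread_join(t, NULL);
    pthread_cond_destroy(&e.cond);
    pthread_mutex_destroy(&e.lock);
    return 1;
}
```

Calling event_demo() from main should return 1 whether the poster thread runs before or after the waiter reaches event_wait, because the flag records the event.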

+0

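That's right, but I think that if I use the same mutex for both the HB and ACK signals, the threads could also wake themselves up, which is useless too.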

1

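I think you have a deadlock here. The thread calling monitorHeartbeatProcess() takes the mutex heartbeatMutex and waits for a signal on the condition variable heartbeatCond. The thread calling monitorHeartbeatCheck() takes the mutex ackMutex and waits for a signal on the same condition variable heartbeatCond. So both threads end up waiting on heartbeatCond, which leads to a deadlock. If you are particular about using two mutexes, why not use two condition variables?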
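One way to read the two-condition-variable suggestion is sketched below. Everything here is an assumption for illustration: hbChannel_t, hbCond, ackCond, the two pending flags, worker and run_heartbeats are hypothetical names, plain pthread types stand in for the osi* wrappers, and a single mutex guards both conditions to keep the sketch short (the answer itself speaks of two mutexes). Each direction of the handshake gets its own condition variable, so a heartbeat can never wake a thread that is waiting for an ACK.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical channel between the monitor and one worker thread. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  hbCond;     /* monitor -> worker */
    pthread_cond_t  ackCond;    /* worker -> monitor */
    bool hbPending;             /* heartbeat sent, not yet consumed */
    bool ackPending;            /* ACK sent, not yet consumed */
    int  rounds;                /* heartbeats to exchange */
    int  acked;                 /* ACKs seen by the monitor */
} hbChannel_t;

/* Worker side: wait for each heartbeat, then acknowledge it. */
static void *worker(void *arg)
{
    hbChannel_t *ch = arg;

    for (int i = 0; i < ch->rounds; ++i) {
        pthread_mutex_lock(&ch->lock);
        while (!ch->hbPending)
            pthread_cond_wait(&ch->hbCond, &ch->lock);
        ch->hbPending = false;
        ch->ackPending = true;
        pthread_cond_signal(&ch->ackCond);
        pthread_mutex_unlock(&ch->lock);
    }
    return NULL;
}

/* Monitor side: send 'rounds' heartbeats; returns the number of ACKs seen. */
static int run_heartbeats(int rounds)
{
    hbChannel_t ch = { .rounds = rounds };
    pthread_t t;

    pthread_mutex_init(&ch.lock, NULL);
    pthread_cond_init(&ch.hbCond, NULL);
    pthread_cond_init(&ch.ackCond, NULL);
    if (pthread_create(&t, NULL, worker, &ch) != 0)
        return -1;

    for (int i = 0; i < rounds; ++i) {
        pthread_mutex_lock(&ch.lock);
        ch.hbPending = true;
        pthread_cond_signal(&ch.hbCond);
        while (!ch.ackPending)              /* releases 'lock' while waiting */
            pthread_cond_wait(&ch.ackCond, &ch.lock);
        ch.ackPending = false;
        ch.acked++;
        pthread_mutex_unlock(&ch.lock);
    }

    pthread_join(t, NULL);
    pthread_cond_destroy(&ch.hbCond);
    pthread_cond_destroy(&ch.ackCond);
    pthread_mutex_destroy(&ch.lock);
    return ch.acked;
}
```

With this layout, run_heartbeats(5) should return 5. Monitoring several threads would presumably need one such channel per registered thread, since a single shared flag cannot say which thread acknowledged.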

+0

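Hmm.. that's a good idea, Ashok. I tried it just now, but it did not prevent the deadlock. I think something is wrong with the logic I used: the signal is sent before the waiter is ready. It may be a timing mismatch.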
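The timing problem described in this comment matches how POSIX condition variables behave: pthread_cond_signal has no effect when no thread is currently waiting, so a signal sent too early is simply lost. The sketch below, using hypothetical names (hbPending, bare_wait_after_signal, flagged_wait_after_signal) and a timed wait so that it cannot hang, contrasts a bare wait with one guarded by a flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool hbPending = false;      /* hypothetical flag, protected by 'lock' */

/* Absolute deadline 'ms' milliseconds from now, for pthread_cond_timedwait. */
static struct timespec deadline_in_ms(long ms)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec  += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec++;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Signal first, wait afterwards, no flag: the early signal is not remembered. */
static int bare_wait_after_signal(void)
{
    struct timespec ts = deadline_in_ms(100);
    int rc;

    pthread_cond_signal(&cond);     /* nobody is waiting: no effect */
    pthread_mutex_lock(&lock);
    rc = pthread_cond_timedwait(&cond, &lock, &ts);
    pthread_mutex_unlock(&lock);
    return rc;                      /* ETIMEDOUT, barring a spurious wakeup */
}

/* Same order, but the event is recorded in a flag: the waiter does not block. */
static int flagged_wait_after_signal(void)
{
    struct timespec ts = deadline_in_ms(100);
    int rc = 0;
    bool got;

    pthread_mutex_lock(&lock);
    hbPending = true;               /* record the heartbeat */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);

    pthread_mutex_lock(&lock);
    while (!hbPending && rc == 0)   /* flag already set: loop body is skipped */
        rc = pthread_cond_timedwait(&cond, &lock, &ts);
    got = hbPending;
    hbPending = false;
    pthread_mutex_unlock(&lock);
    return got ? 0 : rc;
}
```

In the question's code the waitAck and waitHb variables are set by the waiter itself rather than by the signalling side, so they cannot play the role that hbPending plays here.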