[求解]互斥锁函数pthread_mutex_unlock什么情况会失败返回?

evfree 2013-10-30 03:47:23
在开发的项目中遇到一个非常怪异的问题,服务端运行好几天处理了大量的业务,表现算稳定。但是今天凌晨出现了不响应的情况,分析日志发现是死锁了。 可以确定的是这种死锁不是逻辑上的错误导致的死锁。

在锁的使用上我本着锁定尽可能少的代码的原则,所以加锁和解锁之间都是执行了少量的代码(里面也不会有嵌套的函数)。可以说pthread_mutex_lock和pthread_mutex_unlock是成对出现的。所以我可以确定没有逻辑上的死锁问题。

唯一的瑕疵是,我调用pthread_mutex_lock和pthread_mutex_unlock没有判断返回值。加锁pthread_mutex_lock是不需要判断返回值的,因为它阻塞直到可以锁定为止,一般不会有失败的情况。按理说,既然加锁成功了,那执行玩代码后接着解锁也一定能成功,所以解锁也无需判断返回值吧?
并且我认为对加锁解锁判断返回值是没什么意义的,即使返回了错误,那你又该如何处理?难道就放弃执行后面的代码了吗?

我项目中的这个死锁问题很可能是pthread_mutex_unlock调用失败了,导致下次加锁时死锁了。不知道大家是否遇到过这样的情况,如果遇到了又是如何处理的呢?
...全文
2414 10 打赏 收藏 转发到动态 举报
AI 作业
写回复
用AI写文章
10 条回复
切换为时间正序
请发表友善的回复…
发表回复
evfree 2013-10-30
  • 打赏
  • 举报
回复
引用 8 楼 nice_cxf 的回复:
还有一种可能是数据越界把锁的内存改了,这样你解锁的时候多半会出错
如果是这种情况,返回值会是EINVAL:The value specified by mutex does not refer to an initialized mutex object. 不过解锁返回这个值,那么下一次的加锁也会返回这个值(而不是阻塞死锁了)。程序也很可能会core dump终止。 实际情况是进程还是存活着,表现为死锁的状态。
evfree 2013-10-30
  • 打赏
  • 举报
回复
引用 1 楼 max_min_ 的回复:
我表示没有检查过pthread_mutex_lock和unlock的返回值, 也没遇到这些调用失败的情况!可能我还是编码比较少吧!
一般来说,对系统调用的返回值检查比较少。并且捕捉到了失败你也很难去处理,最多打个日志。
nice_cxf 2013-10-30
  • 打赏
  • 举报
回复
还有一种可能是数据越界把锁的内存改了,这样你解锁的时候多半会出错
赵4老师 2013-10-30
  • 打赏
  • 举报
回复
PTHREAD_MUTEX_LOCK Section: POSIX Programmer's Manual (P) Updated: 2003 -------------------------------------------------------------------------------- NAME pthread_mutex_lock, pthread_mutex_trylock, pthread_mutex_unlock - lock and unlock a mutex SYNOPSIS #include <pthread.h> int pthread_mutex_lock(pthread_mutex_t *mutex); int pthread_mutex_trylock(pthread_mutex_t *mutex); int pthread_mutex_unlock(pthread_mutex_t *mutex); DESCRIPTION The mutex object referenced by mutex shall be locked by calling pthread_mutex_lock(). If the mutex is already locked, the calling thread shall block until the mutex becomes available. This operation shall return with the mutex object referenced by mutex in the locked state with the calling thread as its owner. If the mutex type is PTHREAD_MUTEX_NORMAL, deadlock detection shall not be provided. Attempting to relock the mutex causes deadlock. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behavior results. If the mutex type is PTHREAD_MUTEX_ERRORCHECK, then error checking shall be provided. If a thread attempts to relock a mutex that it has already locked, an error shall be returned. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall be returned. If the mutex type is PTHREAD_MUTEX_RECURSIVE, then the mutex shall maintain the concept of a lock count. When a thread successfully acquires a mutex for the first time, the lock count shall be set to one. Every time a thread relocks this mutex, the lock count shall be incremented by one. Each time the thread unlocks the mutex, the lock count shall be decremented by one. When the lock count reaches zero, the mutex shall become available for other threads to acquire. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall be returned. If the mutex type is PTHREAD_MUTEX_DEFAULT, attempting to recursively lock the mutex results in undefined behavior. Attempting to unlock the mutex if it was not locked by the calling thread results in undefined behavior. Attempting to unlock the mutex if it is not locked results in undefined behavior. The pthread_mutex_trylock() function shall be equivalent to pthread_mutex_lock(), except that if the mutex object referenced by mutex is currently locked (by any thread, including the current thread), the call shall return immediately. If the mutex type is PTHREAD_MUTEX_RECURSIVE and the mutex is currently owned by the calling thread, the mutex lock count shall be incremented by one and the pthread_mutex_trylock() function shall immediately return success. The pthread_mutex_unlock() function shall release the mutex object referenced by mutex. The manner in which a mutex is released is dependent upon the mutex's type attribute. If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock() is called, resulting in the mutex becoming available, the scheduling policy shall determine which thread shall acquire the mutex. (In the case of PTHREAD_MUTEX_RECURSIVE mutexes, the mutex shall become available when the count reaches zero and the calling thread no longer has any locks on this mutex.) If a signal is delivered to a thread waiting for a mutex, upon return from the signal handler the thread shall resume waiting for the mutex as if it was not interrupted. RETURN VALUE If successful, the pthread_mutex_lock() and pthread_mutex_unlock() functions shall return zero; otherwise, an error number shall be returned to indicate the error. The pthread_mutex_trylock() function shall return zero if a lock on the mutex object referenced by mutex is acquired. Otherwise, an error number is returned to indicate the error. ERRORS The pthread_mutex_lock() and pthread_mutex_trylock() functions shall fail if: EINVAL The mutex was created with the protocol attribute having the value PTHREAD_PRIO_PROTECT and the calling thread's priority is higher than the mutex's current priority ceiling. The pthread_mutex_trylock() function shall fail if: EBUSY The mutex could not be acquired because it was already locked. The pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock() functions may fail if: EINVAL The value specified by mutex does not refer to an initialized mutex object. EAGAIN The mutex could not be acquired because the maximum number of recursive locks for mutex has been exceeded. The pthread_mutex_lock() function may fail if: EDEADLK The current thread already owns the mutex. The pthread_mutex_unlock() function may fail if: EPERM The current thread does not own the mutex. These functions shall not return an error code of [EINTR]. The following sections are informative. EXAMPLES None. APPLICATION USAGE None. RATIONALE Mutex objects are intended to serve as a low-level primitive from which other thread synchronization functions can be built. As such, the implementation of mutexes should be as efficient as possible, and this has ramifications on the features available at the interface. The mutex functions and the particular default settings of the mutex attributes have been motivated by the desire to not preclude fast, inlined implementations of mutex locking and unlocking. For example, deadlocking on a double-lock is explicitly allowed behavior in order to avoid requiring more overhead in the basic mechanism than is absolutely necessary. (More "friendly" mutexes that detect deadlock or that allow multiple locking by the same thread are easily constructed by the user via the other mechanisms provided. For example, pthread_self() can be used to record mutex ownership.) Implementations might also choose to provide such extended features as options via special mutex attributes. Since most attributes only need to be checked when a thread is going to be blocked, the use of attributes does not slow the (common) mutex-locking case. Likewise, while being able to extract the thread ID of the owner of a mutex might be desirable, it would require storing the current thread ID when each mutex is locked, and this could incur unacceptable levels of overhead. Similar arguments apply to a mutex_tryunlock operation. FUTURE DIRECTIONS None. SEE ALSO pthread_mutex_destroy() , pthread_mutex_timedlock() , the Base Definitions volume of IEEE Std 1003.1-2001, <pthread.h> COPYRIGHT Portions of this text are reprinted and reproduced in electronic form from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology -- Portable Operating System Interface (POSIX), The Open Group Base Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of Electrical and Electronics Engineers, Inc and The Open Group. In the event of any discrepancy between this version and the original IEEE and The Open Group Standard, the original IEEE and The Open Group Standard is the referee document. The original Standard can be obtained online at http://www.opengroup.org/unix/online.html . --------------------------------------------------------------------------------
赵4老师 2013-10-30
  • 打赏
  • 举报
回复
仅供参考
//循环向a函数每次发送200个字节长度(这个是固定的)的buffer,
//a函数中需要将循环传进来的buffer,组成240字节(也是固定的)的新buffer进行处理,
//在处理的时候每次从新buffer中取两个字节打印
#ifdef WIN32
    #pragma warning(disable:4996)
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef WIN32
    #include <windows.h>
    #include <process.h>
    #include <io.h>
    #define  MYVOID             void
    #define  vsnprintf          _vsnprintf
#else
    #include <unistd.h>
    #include <sys/time.h>
    #include <pthread.h>
    #define  CRITICAL_SECTION   pthread_mutex_t
    #define  MYVOID             void *
#endif
//Log{
#define MAXLOGSIZE 20000000
#define MAXLINSIZE 16000
#include <time.h>
#include <sys/timeb.h>
#include <stdarg.h>
char logfilename1[]="MyLog1.log";
char logfilename2[]="MyLog2.log";
static char logstr[MAXLINSIZE+1];
char datestr[16];
char timestr[16];
char mss[4];
CRITICAL_SECTION cs_log;
FILE *flog;
#ifdef WIN32
void Lock(CRITICAL_SECTION *l) {
    EnterCriticalSection(l);
}
void Unlock(CRITICAL_SECTION *l) {
    LeaveCriticalSection(l);
}
void sleep_ms(int ms) {
    Sleep(ms);
}
#else
void Lock(CRITICAL_SECTION *l) {
    pthread_mutex_lock(l);
}
void Unlock(CRITICAL_SECTION *l) {
    pthread_mutex_unlock(l);
}
void sleep_ms(int ms) {
    usleep(ms*1000);
}
#endif
void LogV(const char *pszFmt,va_list argp) {
    struct tm *now;
    struct timeb tb;

    if (NULL==pszFmt||0==pszFmt[0]) return;
    vsnprintf(logstr,MAXLINSIZE,pszFmt,argp);
    ftime(&tb);
    now=localtime(&tb.time);
    sprintf(datestr,"%04d-%02d-%02d",now->tm_year+1900,now->tm_mon+1,now->tm_mday);
    sprintf(timestr,"%02d:%02d:%02d",now->tm_hour     ,now->tm_min  ,now->tm_sec );
    sprintf(mss,"%03d",tb.millitm);
    printf("%s %s.%s %s",datestr,timestr,mss,logstr);
    flog=fopen(logfilename1,"a");
    if (NULL!=flog) {
        fprintf(flog,"%s %s.%s %s",datestr,timestr,mss,logstr);
        if (ftell(flog)>MAXLOGSIZE) {
            fclose(flog);
            if (rename(logfilename1,logfilename2)) {
                remove(logfilename2);
                rename(logfilename1,logfilename2);
            }
        } else {
            fclose(flog);
        }
    }
}
void Log(const char *pszFmt,...) {
    va_list argp;

    Lock(&cs_log);
    va_start(argp,pszFmt);
    LogV(pszFmt,argp);
    va_end(argp);
    Unlock(&cs_log);
}
//Log}
#define ASIZE    200
#define BSIZE    240
#define CSIZE      2
char Abuf[ASIZE];
char Cbuf[CSIZE];
CRITICAL_SECTION cs_HEX ;
CRITICAL_SECTION cs_BBB ;
struct FIFO_BUFFER {
    int  head;
    int  tail;
    int  size;
    char data[BSIZE];
} BBB;
int No_Loop=0;
void HexDump(int cn,char *buf,int len) {
    int i,j,k;
    char binstr[80];

    Lock(&cs_HEX);
    for (i=0;i<len;i++) {
        if (0==(i%16)) {
            sprintf(binstr,"%03d %04x -",cn,i);
            sprintf(binstr,"%s %02x",binstr,(unsigned char)buf[i]);
        } else if (15==(i%16)) {
            sprintf(binstr,"%s %02x",binstr,(unsigned char)buf[i]);
            sprintf(binstr,"%s  ",binstr);
            for (j=i-15;j<=i;j++) {
                sprintf(binstr,"%s%c",binstr,('!'<buf[j]&&buf[j]<='~')?buf[j]:'.');
            }
            Log("%s\n",binstr);
        } else {
            sprintf(binstr,"%s %02x",binstr,(unsigned char)buf[i]);
        }
    }
    if (0!=(i%16)) {
        k=16-(i%16);
        for (j=0;j<k;j++) {
            sprintf(binstr,"%s   ",binstr);
        }
        sprintf(binstr,"%s  ",binstr);
        k=16-k;
        for (j=i-k;j<i;j++) {
            sprintf(binstr,"%s%c",binstr,('!'<buf[j]&&buf[j]<='~')?buf[j]:'.');
        }
        Log("%s\n",binstr);
    }
    Unlock(&cs_HEX);
}
int GetFromRBuf(int cn,CRITICAL_SECTION *cs,FIFO_BUFFER *fbuf,char *buf,int len) {
    int lent,len1,len2;

    lent=0;
    Lock(cs);
    if (fbuf->size>=len) {
        lent=len;
        if (fbuf->head+lent>BSIZE) {
            len1=BSIZE-fbuf->head;
            memcpy(buf     ,fbuf->data+fbuf->head,len1);
            len2=lent-len1;
            memcpy(buf+len1,fbuf->data           ,len2);
            fbuf->head=len2;
        } else {
            memcpy(buf     ,fbuf->data+fbuf->head,lent);
            fbuf->head+=lent;
        }
        fbuf->size-=lent;
    }
    Unlock(cs);
    return lent;
}
MYVOID thdB(void *pcn) {
    char        *recv_buf;
    int          recv_nbytes;
    int          cn;
    int          wc;
    int          pb;

    cn=(int)pcn;
    Log("%03d thdB              thread begin...\n",cn);
    while (1) {
        sleep_ms(10);
        recv_buf=(char *)Cbuf;
        recv_nbytes=CSIZE;
        wc=0;
        while (1) {
            pb=GetFromRBuf(cn,&cs_BBB,&BBB,recv_buf,recv_nbytes);
            if (pb) {
                Log("%03d recv %d bytes\n",cn,pb);
                HexDump(cn,recv_buf,pb);
                sleep_ms(1);
            } else {
                sleep_ms(1000);
            }
            if (No_Loop) break;//
            wc++;
            if (wc>3600) Log("%03d %d==wc>3600!\n",cn,wc);
        }
        if (No_Loop) break;//
    }
#ifndef WIN32
    pthread_exit(NULL);
#endif
}
int PutToRBuf(int cn,CRITICAL_SECTION *cs,FIFO_BUFFER *fbuf,char *buf,int len) {
    int lent,len1,len2;

    Lock(cs);
    lent=len;
    if (fbuf->size+lent>BSIZE) {
        lent=BSIZE-fbuf->size;
    }
    if (fbuf->tail+lent>BSIZE) {
        len1=BSIZE-fbuf->tail;
        memcpy(fbuf->data+fbuf->tail,buf     ,len1);
        len2=lent-len1;
        memcpy(fbuf->data           ,buf+len1,len2);
        fbuf->tail=len2;
    } else {
        memcpy(fbuf->data+fbuf->tail,buf     ,lent);
        fbuf->tail+=lent;
    }
    fbuf->size+=lent;
    Unlock(cs);
    return lent;
}
MYVOID thdA(void *pcn) {
    char        *send_buf;
    int          send_nbytes;
    int          cn;
    int          wc;
    int           a;
    int          pa;

    cn=(int)pcn;
    Log("%03d thdA              thread begin...\n",cn);
    a=0;
    while (1) {
        sleep_ms(100);
        memset(Abuf,a,ASIZE);
        a=(a+1)%256;
        if (16==a) {No_Loop=1;break;}//去掉这句可以让程序一直循环直到按Ctrl+C或Ctrl+Break或当前目录下存在文件No_Loop
        send_buf=(char *)Abuf;
        send_nbytes=ASIZE;
        Log("%03d sending %d bytes\n",cn,send_nbytes);
        HexDump(cn,send_buf,send_nbytes);
        wc=0;
        while (1) {
            pa=PutToRBuf(cn,&cs_BBB,&BBB,send_buf,send_nbytes);
            Log("%03d sent %d bytes\n",cn,pa);
            HexDump(cn,send_buf,pa);
            send_buf+=pa;
            send_nbytes-=pa;
            if (send_nbytes<=0) break;//
            sleep_ms(1000);
            if (No_Loop) break;//
            wc++;
            if (wc>3600) Log("%03d %d==wc>3600!\n",cn,wc);
        }
        if (No_Loop) break;//
    }
#ifndef WIN32
    pthread_exit(NULL);
#endif
}
int main() {
#ifdef WIN32
    InitializeCriticalSection(&cs_log);
    InitializeCriticalSection(&cs_HEX );
    InitializeCriticalSection(&cs_BBB );
#else
    pthread_t threads[2];
    int threadsN;
    int rc;
    pthread_mutex_init(&cs_log,NULL);
    pthread_mutex_init(&cs_HEX,NULL);
    pthread_mutex_init(&cs_BBB,NULL);
#endif
    Log("Start===========================================================\n");

    BBB.head=0;
    BBB.tail=0;
    BBB.size=0;

#ifdef WIN32
    _beginthread((void(__cdecl *)(void *))thdA,0,(void *)1);
    _beginthread((void(__cdecl *)(void *))thdB,0,(void *)2);
#else
    threadsN=0;
    rc=pthread_create(&(threads[threadsN++]),NULL,thdA,(void *)1);if (rc) Log("%d=pthread_create %d error!\n",rc,threadsN-1);
    rc=pthread_create(&(threads[threadsN++]),NULL,thdB,(void *)2);if (rc) Log("%d=pthread_create %d error!\n",rc,threadsN-1);
#endif

    if (!access("No_Loop",0)) {
        remove("No_Loop");
        if (!access("No_Loop",0)) {
            No_Loop=1;
        }
    }
    while (1) {
        sleep_ms(1000);
        if (No_Loop) break;//
        if (!access("No_Loop",0)) {
            No_Loop=1;
        }
    }
    sleep_ms(3000);
    Log("End=============================================================\n");
#ifdef WIN32
    DeleteCriticalSection(&cs_BBB );
    DeleteCriticalSection(&cs_HEX );
    DeleteCriticalSection(&cs_log);
#else
    pthread_mutex_destroy(&cs_BBB);
    pthread_mutex_destroy(&cs_HEX);
    pthread_mutex_destroy(&cs_log);
#endif
    return 0;
}
evfree 2013-10-30
  • 打赏
  • 举报
回复
引用 3 楼 buyong 的回复:
看看用死锁前后的数据,能不能再现问题
重现不太可能。分析了客户端的请求报文,也没什么异常,这样的报文一天要处理上万个。
evfree 2013-10-30
  • 打赏
  • 举报
回复
引用 2 楼 nice_cxf 的回复:
首先你要确定是因为解锁失败导致的,加上返回值判断写到日志里面等出错了查日志看是不是因为这个,如果是对应的errno是什么
嗯,现在只能用日志的形式,等待下一次的发生。
buyong 2013-10-30
  • 打赏
  • 举报
回复
看看用死锁前后的数据,能不能再现问题
nice_cxf 2013-10-30
  • 打赏
  • 举报
回复
首先你要确定是因为解锁失败导致的,加上返回值判断写到日志里面等出错了查日志看是不是因为这个,如果是对应的errno是什么
max_min_ 2013-10-30
  • 打赏
  • 举报
回复
我表示没有检查过pthread_mutex_lock和unlock的返回值, 也没遇到这些调用失败的情况!可能我还是编码比较少吧!

24,860

社区成员

发帖
与我相关
我的任务
社区描述
C/C++ 工具平台和程序库
社区管理员
  • 工具平台和程序库社区
加入社区
  • 近7日
  • 近30日
  • 至今
社区公告
暂无公告

试试用AI创作助手写篇文章吧