Thread hangs when reading a file with fgets in a multithreaded program

Deutschester 2011-08-29 07:14:26

/* Main parameter:
sock: the connected socket
*/

char *dQ(const int sock, const char *entry, int *flag, char *ss)
{
    char *temp, buf[2000];
    FILE *fi;

    temp = (char *)malloc(strlen(entry) + 2 + 1); // room for entry, "\r\n" and the terminating NUL
    strcpy(temp, entry);
    strcat(temp, "\r\n");

    fi = fdopen(sock, "r"); // wrap sock in a stdio stream

    if (write(sock, temp, strlen(temp)) < 0)
    {
        printf("my write: Bad file descriptor!\n");
        *flag = 4;
        free(temp);
        return NULL;
    }
    free(temp);

    while (fgets(buf, sizeof(buf), fi)) // read fi line by line (at most sizeof(buf) - 1 bytes per call)
    {
        strcat(ss, buf);
        strcat(ss, "\n");
    }
    return ss; // assumed: the post does not show the final return
}

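The code above requests data from a network server over a socket. connect is called before this function to establish the TCP connection, and sock is the connected socket.
The code runs fine in a single thread. With multiple threads (each doing the same work and calling this code concurrently), some threads often hang (block) in the kernel call entered through fgets. Here is the call stack of a hung thread: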

#0 0x00007fabf8d7e4bd in read () from /lib/libc.so.6
#1 0x00007fabf8d19348 in _IO_file_underflow () from /lib/libc.so.6
#2 0x00007fabf8d1aeee in _IO_default_uflow () from /lib/libc.so.6
#3 0x00007fabf8d0f43e in _IO_getline_info () from /lib/libc.so.6
#4 0x00007fabf8d0e329 in fgets () from /lib/libc.so.6
#5 0x0000000000405c13 in dQ (sock=24, entry=0x25d2160 "split.at", flag=0x7fabf3a0adcc, ss=0x7fabf3a0ae40 "")
at ./src/whoisfunc/dQ.cpp:727
#6 ……

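As shown, the thread blocks in fgets: it does not continue, reports no error, and does not exit. This usually happens after the while loop has run a few times, that is, after part of the text has already been read from the stream fi; the rest of the content is never read and the call blocks. I don't know what is going on and have been stuck on this for a long time. Could anyone help? Many thanks!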

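Also: the function is wrapped in a class, so in principle each thread uses its own copy of the code and there should be no shared memory. Why does a problem appear that only shows up with multiple threads?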

Here are a few related pages I found, but none of them offers a solution:
http://www.cplusplus.com/forum/unices/7531/
http://www.gidforums.com/t-20523.html
http://www.programmersheaven.com/mb/ConLunix/385311/385311/underflow-with-fgets-and-popen/


9 replies


勤奋的沉沦 2011-08-30




The gets subroutine reads bytes from the standard input stream, stdin, into the array pointed to by the String parameter. It reads data until it reaches a new-line character or an end-of-file condition. If a new-line character stops the reading process, the gets subroutine discards the new-line character and terminates the string with a null character.

The fgets subroutine reads bytes from the data pointed to by the Stream parameter into the array pointed to by the String parameter. The fgets subroutine reads data up to the number of bytes specified by the Number parameter minus 1, or until it reads a new-line character and transfers that character to the String parameter, or until it encounters an end-of-file condition. The fgets subroutine then terminates the data string with a null character.

The first successful run of the fgetc (getc, getchar, fgetc, or getw Subroutine), fgets, fgetwc (getwc, fgetwc, or getwchar Subroutine), fgetws (getws or fgetws Subroutine), fread (fread or fwrite Subroutine), fscanf, getc (getc, getchar, fgetc, or getw Subroutine), getchar (getc, getchar, fgetc, or getw Subroutine), gets or scanf subroutine using a stream that returns data not supplied by a prior call to the ungetc or ungetwc subroutine marks the st_atime field for update.

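The above is the description of fgets.
I suggest not using a function like fgets to read network data. Read all the data with read, then hand it off to a consumer thread for processing.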
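A minimal sketch of the read-based approach suggested here, assuming the server closes the connection once it has sent its whole response. `read_all` is a hypothetical helper, not part of the original code. On its own it still blocks while the peer stays silent, so it only removes the stdio layer; it does not add a timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until EOF (or until the buffer is full)
 * into out, which has room for cap bytes. The result is always NUL-terminated.
 * Returns the number of bytes stored, or -1 on a read error. */
static ssize_t read_all(int fd, char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    size_t used = 0;

    while (used < cap - 1) {
        ssize_t n = read(fd, out + used, cap - 1 - used);
        if (n > 0) {
            used += (size_t)n;
        } else if (n == 0) {
            break;                    /* peer closed the connection */
        } else if (errno != EINTR) {  /* retry only when interrupted by a signal */
            out[used] = '\0';
            return -1;
        }
    }
    out[used] = '\0';
    return (ssize_t)used;
}
```

The caller passes the capacity explicitly, so the result cannot overrun the destination the way an unbounded strcat into ss could. The filled buffer can then be handed to a consumer thread.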

louyong0571 2011-08-30


In the while loop, wouldn't it help to wait a little when fgets fails?

Deutschester 2011-08-30


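I now use select to watch the sock descriptor (set to non-blocking mode beforehand) for readability, and call fgets only once it becomes readable within the timeout. On timeout the function returns, so the thread no longer hangs (which had given the false impression of a deadlock). With this change the software basically meets its design requirements. Thanks for everyone's help!

The function now looks like this: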

char *dQ(const int sock, const char *entry, int *flag, char *ss)
{
    char *temp, buf[2000];
    FILE *fi;

    temp = (char *)malloc(strlen(entry) + 2 + 1); // room for entry, "\r\n" and the terminating NUL
    strcpy(temp, entry);
    strcat(temp, "\r\n");

    /* put the sock descriptor into non-blocking mode */
    int flags = fcntl(sock, F_GETFL, 0);
    fcntl(sock, F_SETFL, flags | O_NONBLOCK);

    fi = fdopen(sock, "r"); // wrap sock in a stdio stream

    if (write(sock, temp, strlen(temp)) < 0)
    {
        printf("my write: Bad file descriptor!\n");
        *flag = 4;
        free(temp);
        return NULL;
    }
    free(temp);
    fd_set rfd;
    struct timeval timeout;

    while (1)
    // was: while (fgets(buf, sizeof(buf), fi))
    {
        FD_ZERO(&rfd);
        FD_SET(sock, &rfd);
        // reset on every pass: select() may modify the timeout (it does on Linux)
        timeout.tv_sec = FGETSTIMEOUT; // fgets timeout in seconds (FGETSTIMEOUT is a user-defined macro)
        timeout.tv_usec = 0;
        int res = select(sock + 1, &rfd, NULL, NULL, &timeout);
        if (res < 0) {
            printf("select error!\n");
            *flag = 5; // used for other purposes, not relevant here
            return NULL;
        }
        else if (res == 0) {
            printf("select timeout!\n");
            *flag = 5; // used for other purposes, not relevant here
            return NULL;
        }
        else {
            if (FD_ISSET(sock, &rfd)) {
                if (NULL == fgets(buf, sizeof(buf) - 1, fi))
                    break;
            }
        }

        strcat(ss, buf);
        strcat(ss, "\n");
    }
    return ss; // assumed: the post does not show the final return
}
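One caveat with pairing select and fgets: select only sees the socket, while fgets reads from the stream's own buffer, so lines already buffered in fi can be waiting while select reports nothing readable. The sketch below is a variant, not the author's code, that keeps the select timeout but reads with read directly. `read_all_timeout` and its return codes are hypothetical, and it assumes fd is below FD_SETSIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until EOF, waiting at most timeout_sec
 * seconds for each chunk. out always holds the NUL-terminated data received
 * so far. Returns bytes stored on EOF or full buffer, -1 on error,
 * -2 on timeout. */
static ssize_t read_all_timeout(int fd, char *out, size_t cap, int timeout_sec)
{
    assert(out != NULL && cap > 0);
    size_t used = 0;
    out[0] = '\0';

    while (used < cap - 1) {
        fd_set rfd;
        struct timeval tv;

        FD_ZERO(&rfd);
        FD_SET(fd, &rfd);
        tv.tv_sec = timeout_sec;   /* re-armed on every pass */
        tv.tv_usec = 0;

        int res = select(fd + 1, &rfd, NULL, NULL, &tv);
        if (res < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (res == 0)
            return -2;             /* nothing arrived within the timeout */

        ssize_t n = read(fd, out + used, cap - 1 - used);
        if (n == 0)
            break;                 /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK)
                continue;          /* spurious wakeup: wait again */
            return -1;
        }
        used += (size_t)n;
        out[used] = '\0';
    }
    return (ssize_t)used;
}
```

On timeout the partial response is still in out, so the caller can decide whether to use it or discard it.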

念茜 2011-08-29


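OP, you could try I/O multiplexing: one reader thread calls select on several descriptors, and when one becomes readable it calls fgets to fetch the data and then distributes it to multiple worker threads.
Library functions have to deal with more thread-safety concerns, and the I/O buffers with more synchronization; these may be conflicting here and causing the blocking.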
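A rough sketch of the single-reader idea, assuming read is used instead of fgets so that no stdio buffer sits between select and the data. `read_ready`, `chunk_cb` and `count_bytes` are hypothetical names; in a real program the callback would push the chunk onto a queue for the worker threads, which is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Called once per readable descriptor; len == 0 signals EOF on that fd. */
typedef void (*chunk_cb)(int fd, const char *data, size_t len, void *ctx);

/* Example consumer: adds each chunk length to a size_t counter passed via ctx. */
static void count_bytes(int fd, const char *data, size_t len, void *ctx)
{
    (void)fd;
    (void)data;
    *(size_t *)ctx += len;
}

/* Wait up to timeout_sec for any of fds[0..n) to become readable, then read
 * each readable descriptor once and pass the chunk to cb. Negative entries are
 * skipped. Returns the number of descriptors handled, 0 on timeout, -1 on a
 * select error. */
static int read_ready(const int *fds, size_t n, int timeout_sec,
                      chunk_cb cb, void *ctx)
{
    fd_set rfd;
    struct timeval tv;
    int maxfd = -1;

    assert(fds != NULL && cb != NULL);
    FD_ZERO(&rfd);
    for (size_t i = 0; i < n; i++) {
        if (fds[i] < 0)
            continue;
        assert(fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &rfd);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    if (maxfd < 0)
        return 0;

    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    int res = select(maxfd + 1, &rfd, NULL, NULL, &tv);
    if (res <= 0)
        return res;

    int handled = 0;
    for (size_t i = 0; i < n; i++) {
        if (fds[i] < 0 || !FD_ISSET(fds[i], &rfd))
            continue;
        char buf[2000];
        ssize_t len = read(fds[i], buf, sizeof buf);
        if (len < 0)
            continue;              /* e.g. EINTR or EAGAIN: try again next call */
        cb(fds[i], buf, (size_t)len, ctx);
        handled++;
    }
    return handled;
}
```

The reader thread would call `read_ready` in a loop and drop a descriptor from the array (by setting it to -1) once the callback has seen EOF for it.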

justkk 2011-08-29


Could this be related to stdio buffering?
Try reading directly with read instead of going through a FILE *.
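One concrete way stdio buffering shows up, sketched with a pipe standing in for the socket: after the first fgets, the stream has usually pulled every available byte into its own buffer, so the descriptor looks empty to select even though another line is ready. This illustrates the buffering effect only and is not a diagnosis of the hang in this thread. The function names are hypothetical, and the behavior assumes a typical fully buffered stdio implementation such as glibc.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdopen() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if select() reports fd readable right now, else 0. */
static int fd_readable_now(int fd)
{
    fd_set rfd;
    struct timeval tv = { 0, 0 };   /* zero timeout: check without waiting */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfd);
    FD_SET(fd, &rfd);
    return select(fd + 1, &rfd, NULL, NULL, &tv) > 0;
}

/* Writes two lines into a pipe, reads the first with fgets(), then asks
 * select() whether the descriptor is readable. Returns that answer (0 or 1)
 * if the second line could still be read afterwards, or -1 on failure. */
static int select_after_first_fgets(void)
{
    int p[2];
    char buf[64];
    int result = -1;

    if (pipe(p) != 0)
        return -1;
    FILE *fi = fdopen(p[0], "r");
    if (fi == NULL) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (write(p[1], "one\ntwo\n", 8) == 8 &&
        fgets(buf, sizeof buf, fi) != NULL) {      /* yields "one\n" */
        int readable = fd_readable_now(p[0]);
        if (fgets(buf, sizeof buf, fi) != NULL)    /* "two\n" comes from the stream */
            result = readable;
    }
    fclose(fi);      /* also closes p[0] */
    close(p[1]);
    return result;
}
```

With a buffered stream the function typically returns 0: select says "not readable" while the second line is sitting in the FILE buffer. Reading with read avoids this mismatch because there is only one buffer, the kernel's.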

Deutschester 2011-08-29


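Additional information: normally all the threads work fine, but after some time a few of them occasionally hang as described above (the thread has already received part of the data and blocks waiting for the rest). Sometimes the hang does not occur at all.

Since fi is a stream mapped from a network socket, I suspect that network connection errors (timeouts, resets, and so on), shared system I/O buffers, or blocking on the network request could all be causing this. I don't know enough about these topics, so any advice is appreciated...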