A question about the wcslen function while reading the gcc libc (glibc) source code

JohnPhan 2013-04-26 04:24:33

Why are three identical if statements used in a row? I really don't understand it.
The source code is as follows:

size_t
wcslen (s)
     const wchar_t *s;
{
  size_t len = 0;

  while (s[len] != L'\0')
    {
      if (s[++len] == L'\0')
	return len;
      if (s[++len] == L'\0')
	return len;
      if (s[++len] == L'\0')
	return len;
      ++len;
    }

  return len;
}

Version info:
glibc 2.0

fuabcck 2015-11-30

The check is executed four times for the sake of byte alignment.

z2357 2014-03-07

Did you notice the multibyte part? Multibyte characters are the reason.

mLee79 2013-05-07

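Quoting JohnPhan's reply (#17):

[quote=Quoting DelphiGuy's reply (#14):] It is simply loop unrolling: it reduces the number of branch instructions executed, and therefore the number of times the processor's branch prediction can miss. It is a rather old optimization trick; the Watcom C++ compilers of the 1990s were very fond of it. On newer processors the effect is not noticeable.

gcc's libc probably makes an assumption like this: the compiler that builds my code may be a "bumpkin" and may be quite old. You can also see it in the syntax of the libc code, which is very old-fashioned and does not dare to use even minor updates to the C language. [/quote] glibc has to be compiled with the first-stage, statically built gcc, and at that point many language features simply cannot be used. The gcc we actually use is produced by compiling the gcc code a second time against the freshly built glibc; only then are all language features supported. A base library like glibc is restricted in which language features it can use.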

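[deleted account] 2013-05-06

Regarding this code and how fussy it is about optimization quality, my view is this: if you are writing system functions, like this one, you really do need to be this meticulous and optimize as far as you can. But if you are writing ordinary user programs, leave it all to the compiler; your focus when writing code should not be on helping the compiler optimize. While learning the language you may weigh every detail, even study the assembly output, but......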

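[deleted account] 2013-05-06

Quoting DelphiGuy's reply (#14):

It is simply loop unrolling: it reduces the number of branch instructions executed, and therefore the number of times the processor's branch prediction can miss. It is a rather old optimization trick; the Watcom C++ compilers of the 1990s were very fond of it. On newer processors the effect is not noticeable.

gcc's libc probably makes an assumption like this: the compiler that builds my code may be a "bumpkin" and may be quite old. You can also see it in the syntax of the libc code, which is very old-fashioned and does not dare to use even minor updates to the C language.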

日立奔腾浪潮微软松下联想 2013-05-03

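It is simply loop unrolling: it reduces the number of branch instructions executed, and therefore the number of times the processor's branch prediction can miss. It is a rather old optimization trick; the Watcom C++ compilers of the 1990s were very fond of it. On newer processors the effect is not noticeable.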
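
To make the loop-unrolling point concrete, here is a minimal sketch in C that puts a plain one-character-per-iteration loop next to the four-way unrolled loop from the original post, rewritten with ANSI prototypes. The names wcslen_plain, wcslen_unrolled and check_same_length are made up for this illustration and are not part of glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Plain version: one load, one compare and one loop branch per character. */
size_t wcslen_plain(const wchar_t *s)
{
    size_t len = 0;

    while (s[len] != L'\0')
        ++len;
    return len;
}

/* Unrolled by four, same logic as the glibc code quoted in this thread:
   up to four characters are tested per trip around the loop, so the
   backward loop branch is taken roughly once per four characters. */
size_t wcslen_unrolled(const wchar_t *s)
{
    size_t len = 0;

    while (s[len] != L'\0') {
        if (s[++len] == L'\0')
            return len;
        if (s[++len] == L'\0')
            return len;
        if (s[++len] == L'\0')
            return len;
        ++len;
    }
    return len;
}

/* Hypothetical helper: both versions must agree on every input. */
void check_same_length(const wchar_t *s)
{
    assert(wcslen_plain(s) == wcslen_unrolled(s));
}
```

Both functions return the same result for every string; only the number of loop-back branches differs. Whether the unrolled form is actually faster depends on the compiler and the CPU, so it is worth comparing the generated assembly (for example from gcc -O3 -S) before assuming a gain.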

日立奔腾浪潮微软松下联想 2013-05-03

Oops, I posted #15 in the wrong thread. Moderators, please delete it.

日立奔腾浪潮微软松下联想 2013-05-03

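It is fixed relative to the program's load base address, but there is no guarantee that a static variable's address (its logical address) is the same on every load or on different systems, because the operating system's loader may relocate the program to a different base address. As for its address in physical memory, that can change even at run time.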

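[deleted account] 2013-05-02

Quoting flyound's reply (#2):

This is probably an optimization for how the CPU executes instructions, i.e. parallel processing; if the code went back into the while every time, there would be no parallel execution.

Following your line of thought, here is another idea: placing identical statements back to back may be meant to exploit the CPU's pipeline. ARM and DSP chips have this design nowadays as well. If one instruction takes 4 cycles to execute, 4 identical instructions executed in a row need not 4*4 = 16 cycles but possibly 4 + 3*1 = 7.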
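
The 4 + 3*1 = 7 arithmetic above can be written as a tiny model. This is only a sketch of an idealized pipeline, not a description of any real CPU: it assumes the instructions are independent, that a new one can start every `interval` cycles, and that no branch is mispredicted. The function names are invented for this example.

```c
#include <assert.h>

/* Idealized pipeline (an assumption): the first instruction needs
   `latency` cycles, and every following independent instruction
   finishes `interval` cycles after the previous one. */
unsigned pipelined_cycles(unsigned count, unsigned latency, unsigned interval)
{
    assert(latency >= 1 && interval >= 1);
    if (count == 0)
        return 0;
    return latency + (count - 1) * interval;
}

/* No overlap at all: every instruction pays the full latency. */
unsigned serial_cycles(unsigned count, unsigned latency)
{
    return count * latency;
}
```

With count = 4, latency = 4 and interval = 1 the two functions give 7 and 16, matching the figures in the post. In the wcslen loop each check ends in a conditional branch, so how much of this overlap is actually achieved depends on branch prediction.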

[deleted account] 2013-05-02

Pipelining; loop unrolling; and if it brings no gain, it does no harm either. That is all there is to it.

CyberLogix 2013-05-02

It's loop unrolling.

mLee79 2013-05-02

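I looked at the object code for x86, x64, ARM and PPC. Written this way or written the plain way, the generated object code is essentially the same (-O3 -fomit-frame-pointer). It seems gcc's optimization is not yet powerful enough; perhaps the author was simply in the habit of unrolling every loop he saw, since it will not make things worse anyway. I had hoped that on ARM it would generate instructions along these lines ... ldr .. cmp .. beq .. ldrne .. cmpne .. beq ..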

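[deleted account] 2013-04-27

That is one idea. When I have time I will definitely look at the machine code the compiler produces. I suspect every optimization option has to be switched on before anything shows up.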

AndyStevens 2013-04-27

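An imprecise guess: it may be there to improve execution efficiency. Once a C loop has been compiled to assembly, the result is often not very efficient because of the compiler, and reducing the complexity of the loop is one workable way to improve efficiency.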

[deleted account] 2013-04-27

Here is another version, wcslen.c from glibc 2.7. It looks intentional.

/* Copyright (C) 1995, 1996, 1997, 1998, 2011  Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Contributed by Ulrich Drepper <drepper@gnu.ai.mit.edu>, 1995.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <http://www.gnu.org/licenses/>.  */

#include <wchar.h>

/* Return length of string S.  */
#ifdef WCSLEN
# define __wcslen WCSLEN
#endif

size_t
__wcslen (s)
     const wchar_t *s;
{
  size_t len = 0;

  while (s[len] != L'\0')
    {
      if (s[++len] == L'\0')
	return len;
      if (s[++len] == L'\0')
	return len;
      if (s[++len] == L'\0')
	return len;
      ++len;
    }

  return len;
}
#ifndef WCSLEN
weak_alias (__wcslen, wcslen)
#endif

The source of strlen.c can be used for comparison:

/* Copyright (C) 1991,1993,1997,2000,2003,2009 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Written by Torbjorn Granlund (tege@sics.se),
   with help from Dan Sahlin (dan@sics.se);
   commentary by Jim Blandy (jimb@ai.mit.edu).

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <http://www.gnu.org/licenses/>.  */

#include <string.h>
#include <stdlib.h>

#undef strlen

/* Return the length of the null-terminated string STR.  Scan for
   the null terminator quickly by testing four bytes at a time.  */
size_t
strlen (str)
     const char *str;
{
  const char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, himagic, lomagic;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */
  for (char_ptr = str; ((unsigned long int) char_ptr
			& (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

  /* All these elucidatory comments refer to 4-byte longwords,
     but the theory applies equally well to 8-byte longwords.  */

  longword_ptr = (unsigned long int *) char_ptr;

  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
     the "holes."  Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits:  01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into.  */
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();

  /* Instead of the traditional loop which tests each character,
     we will test a longword at a time.  The tricky part is testing
     if *any of the four* bytes in the longword in question are zero.  */
  for (;;)
    {
      longword = *longword_ptr++;

      if (((longword - lomagic) & ~longword & himagic) != 0)
	{
	  /* Which of the bytes was the zero?  If none of them were, it was
	     a misfire; continue the search.  */

	  const char *cp = (const char *) (longword_ptr - 1);

	  if (cp[0] == 0)
	    return cp - str;
	  if (cp[1] == 0)
	    return cp - str + 1;
	  if (cp[2] == 0)
	    return cp - str + 2;
	  if (cp[3] == 0)
	    return cp - str + 3;
	  if (sizeof (longword) > 4)
	    {
	      if (cp[4] == 0)
		return cp - str + 4;
	      if (cp[5] == 0)
		return cp - str + 5;
	      if (cp[6] == 0)
		return cp - str + 6;
	      if (cp[7] == 0)
		return cp - str + 7;
	    }
	}
    }
}
libc_hidden_builtin_def (strlen)
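
The word-at-a-time test in the loop above can be tried on its own. The following is a minimal sketch, fixed to 32-bit words, using the same expression `(longword - lomagic) & ~longword & himagic`; the function names has_zero_byte and check_word are invented for this illustration. For this particular expression the result is nonzero exactly when at least one byte of the word is zero, which the hypothetical check_word helper cross-checks against a plain byte-by-byte test.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if and only if at least one of the four bytes of w is 0x00.
   Subtracting 0x01 from a zero byte borrows and sets its top bit,
   while `& ~w` discards bytes whose top bit was already set. */
uint32_t has_zero_byte(uint32_t w)
{
    const uint32_t lomagic = 0x01010101u;
    const uint32_t himagic = 0x80808080u;

    return (uint32_t)(w - lomagic) & ~w & himagic;
}

/* Hypothetical helper: compare the trick with a plain per-byte test. */
void check_word(uint32_t w)
{
    int plain = ((w & 0x000000ffu) == 0) || ((w & 0x0000ff00u) == 0)
             || ((w & 0x00ff0000u) == 0) || ((w & 0xff000000u) == 0);

    assert((has_zero_byte(w) != 0) == plain);
}
```

The byte checks on cp[0] through cp[7] in the listing are still needed after a hit, because the test only says that some byte in the word is zero, not which one comes first in memory.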

赵4老师 2013-04-27

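Advice for anyone learning to program: looking at something a thousand times is not as good as doing it once by hand! Reading a thousand lines in a book is not as good as typing one line yourself! Typing a thousand lines is not as good as single-stepping through one line! Single-stepping through a thousand lines of source is not as good as single-stepping through one line of the corresponding assembly!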

赵4老师 2013-04-27

First go to http://www.microsoft.com/visualstudio/chs/downloads#d-2010-express, open the language option under Visual C++ 2010 Express and choose "Simplified Chinese", then click Install Now. After that, refer to C:\Program Files\Microsoft Visual Studio 10.0\VC\crt\src\intel\strlen.asm:

        page    ,132
        title   strlen - return the length of a null-terminated string
;***
;strlen.asm - contains strlen() routine
;
;       Copyright (c) Microsoft Corporation. All rights reserved.
;
;Purpose:
;       strlen returns the length of a null-terminated string,
;       not including the null byte itself.
;
;*******************************************************************************

        .xlist
        include cruntime.inc
        .list

page
;***
;strlen - return the length of a null-terminated string
;
;Purpose:
;       Finds the length in bytes of the given string, not including
;       the final null character.
;
;       Algorithm:
;       int strlen (const char * str)
;       {
;           int length = 0;
;
;           while( *str++ )
;                   ++length;
;
;           return( length );
;       }
;
;Entry:
;       const char * str - string whose length is to be computed
;
;Exit:
;       EAX = length of the string "str", exclusive of the final null byte
;
;Uses:
;       EAX, ECX, EDX
;
;Exceptions:
;
;*******************************************************************************

        CODESEG

        public  strlen

strlen  proc \
        buf:ptr byte

        OPTION PROLOGUE:NONE, EPILOGUE:NONE

        .FPO    ( 0, 1, 0, 0, 0, 0 )

string  equ     [esp + 4]

        mov     ecx,string              ; ecx -> string
        test    ecx,3                   ; test if string is aligned on 32 bits
        je      short main_loop

str_misaligned:
        ; simple byte loop until string is aligned
        mov     al,byte ptr [ecx]
        add     ecx,1
        test    al,al
        je      short byte_3
        test    ecx,3
        jne     short str_misaligned

        add     eax,dword ptr 0         ; 5 byte nop to align label below

        align   16                      ; should be redundant

main_loop:
        mov     eax,dword ptr [ecx]     ; read 4 bytes
        mov     edx,7efefeffh
        add     edx,eax
        xor     eax,-1
        xor     eax,edx
        add     ecx,4
        test    eax,81010100h
        je      short main_loop
        ; found zero byte in the loop
        mov     eax,[ecx - 4]
        test    al,al                   ; is it byte 0
        je      short byte_0
        test    ah,ah                   ; is it byte 1
        je      short byte_1
        test    eax,00ff0000h           ; is it byte 2
        je      short byte_2
        test    eax,0ff000000h          ; is it byte 3
        je      short byte_3
        jmp     short main_loop         ; taken if bits 24-30 are clear and bit
                                        ; 31 is set

byte_3:
        lea     eax,[ecx - 1]
        mov     ecx,string
        sub     eax,ecx
        ret
byte_2:
        lea     eax,[ecx - 2]
        mov     ecx,string
        sub     eax,ecx
        ret
byte_1:
        lea     eax,[ecx - 3]
        mov     ecx,string
        sub     eax,ecx
        ret
byte_0:
        lea     eax,[ecx - 4]
        mov     ecx,string
        sub     eax,ecx
        ret

strlen  endp

        end
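
For readers who do not read MASM, the test performed in main_loop above can be rendered in C roughly as follows. This is a sketch derived only from the instructions in the listing (add 7efefeffh, xor with -1, xor, test against 81010100h); the function names are invented here. Unlike a plain byte test, this filter can report a hit when no byte is zero, which is why the listing ends with a jmp back to main_loop "if bits 24-30 are clear and bit 31 is set".

```c
#include <assert.h>
#include <stdint.h>

/* C rendering of the main_loop test:
   edx = 7efefeffh + w;  eax = (w xor -1) xor edx;  test eax, 81010100h.
   A nonzero result means "this word may contain a zero byte". */
uint32_t msvc_maybe_zero(uint32_t w)
{
    uint32_t sum = (uint32_t)(w + 0x7efefeffu);

    return (sum ^ ~w) & 0x81010100u;
}

/* Plain per-byte test, used here only as a reference. */
int really_has_zero(uint32_t w)
{
    return ((w & 0x000000ffu) == 0) || ((w & 0x0000ff00u) == 0)
        || ((w & 0x00ff0000u) == 0) || ((w & 0xff000000u) == 0);
}

/* Hypothetical helper: a real zero byte must never be missed,
   although a hit without a zero byte (a misfire) is allowed. */
void check_no_missed_zero(uint32_t w)
{
    if (really_has_zero(w))
        assert(msvc_maybe_zero(w) != 0);
}
```

For example, the word 0x80414141 contains no zero byte, yet msvc_maybe_zero returns a nonzero value for it; that is the misfire case the assembly handles by falling through the four byte tests and jumping back to the loop.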