Hash codes revisited: an unexpected result!

微创社(MCC) 2009-09-04 09:08:12
class version
using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication9
{
    public class B // class version
    {
        public int i;
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<int> list = new List<int>();  // hash codes
            List<B> listClass = new List<B>(); // keeps every object alive
            for (int i = 0; i < 10000; i++)
            {
                B b = new B();
                b.i = i;
                list.Add(b.GetHashCode());
                listClass.Add(b);
            }
            Console.WriteLine(list.Distinct().Count()); // 9999 in the author's run
            Console.ReadKey();
        }
    }
}


struct version
using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication9
{
    public struct B // struct version
    {
        public int i;
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<int> list = new List<int>();  // hash codes
            List<B> listClass = new List<B>(); // copies of every value
            for (int i = 0; i < 10000; i++)
            {
                B b = new B();
                b.i = i;
                list.Add(b.GetHashCode());
                listClass.Add(b);
            }
            Console.WriteLine(list.Distinct().Count()); // 10000 in the author's run
            Console.ReadKey();
        }
    }
}


Only one thing was changed: class vs. struct.
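The two programs differ in which GetHashCode the type B inherits. A class that does not override it uses Object.GetHashCode, which identifies the instance and ignores the field i. A struct uses ValueType.GetHashCode, which is derived from the field values. The following is a minimal sketch of that difference. The names PointClass, PointStruct, and HashDemo are hypothetical and do not appear in the original post.

```csharp
using System;

// Hypothetical types for illustration; not from the original post.
public class PointClass { public int i; }
public struct PointStruct { public int i; }

public static class HashDemo
{
    // Two struct values with equal fields compare equal,
    // and equal values must return the same hash code.
    public static bool StructHashesMatch(int value)
    {
        PointStruct a = new PointStruct { i = value };
        PointStruct b = new PointStruct { i = value };
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    // Two distinct class instances are not Equal by default, even with equal fields.
    // Their hash codes usually differ, but nothing guarantees that they do.
    public static bool ClassInstancesEqual(int value)
    {
        PointClass a = new PointClass { i = value };
        PointClass b = new PointClass { i = value };
        return a.Equals(b);
    }

    public static void Main()
    {
        Console.WriteLine(StructHashesMatch(42));   // True
        Console.WriteLine(ClassInstancesEqual(42)); // False
    }
}
```

Because the struct hash depends only on i, 10000 different values of i can give 10000 different hash codes, while the class hash is assigned per instance and may occasionally repeat.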
ruanwei1987 2009-09-18
Damn, I spent half a day on this and still don't get it. So frustrating!!!!
微创社(MCC) 2009-09-08
Let's keep observing:

using System;
using System.Linq;
using System.Collections.Generic;

class B
{
    //public int i;
}

class MyClassA
{
    static bool Compare<T>(T obj1, T obj2)
        where T : class
    {
        return obj1 == obj2;
    }

    static void Main1(string[] args)
    {
        int count = 1 * 10000;
        List<int> listHashcode = new List<int>();
        List<B> listObject = new List<B>();
        for (int i = 0; i < count; i++)
        {
            B b = new B();
            listHashcode.Add(b.GetHashCode());
            listObject.Add(b);
        }

        Console.WriteLine("listHashcode:{0}", listHashcode.Distinct().Count());
        Console.WriteLine("listObject:{0}", listObject.Distinct().Count());

        Console.ReadKey();
    }
}


yc513485587 2009-09-08
sdfsdfsa fsd sdfdf d sdf
yc513485587 2009-09-08
sdfsdfsd fsdf sdf saf sf d
a63445 2009-09-08
Learning.
yewuyu 2009-09-08
very good
fblgzdq 2009-09-08
d
jinshuang 2009-09-07
Learning from this.
微创社(MCC) 2009-09-07
[Quote=Quoting reply #88 by veiny:]
No difference here.
Both give 10000.
dotnet 3.5
[/Quote]

Change it to 100,000 and try again.
lsd123 2009-09-07
.
微创社(MCC) 2009-09-06
This only illustrates the process; it will not compile.

using System;

class MultiUseWord
{
    internal UIntPtr value;
    internal const uint INFLATED_TAG = 3;

    internal static int GetHashCode(Object obj)
    {
        int result;

        //!!!-1
        MultiUseWord muw = GetForObject(obj);
        uint tag = muw.GetTag();

        result = (int)(muw.GetPayload());
        return result;
    }

    [ManualRefCounts]
    internal static MultiUseWord GetForObject(Object obj)
    {
        //!!!-2
        MultiUseWord result = ((ObjectPreMUW)obj).muw;

        return result;
    }

    internal uint GetTag()
    {
        uint tag = (uint)(this.value & INFLATED_TAG);
        return tag;
    }

    internal UIntPtr GetPayload()
    {
        UIntPtr result = (this.value & PAYLOAD_MASK);
        return result;
    }

    internal static UIntPtr PAYLOAD_MASK
    {
        get
        {
            return (UIntPtr.Size == 8) ?
                (UIntPtr)0xfffffffffffffff8 :
                (UIntPtr)0xfffffff8;
        }
    }
}

// This class does not add any fields, nor should it ever be
// referenced by any other code. Its sole purpose is to add
// implementations of the get_muw, set_muw, and compareExchangeMUW
// methods to all objects when they have a MultiUseWord field.
[MixinConditional("ObjectHeaderDefault")]
[MixinConditional("ObjectHeaderPostRC")]
[Mixin(typeof(ObjectMUW))]
internal class ObjectPreMUW : ObjectMUW
{
    internal new PreHeaderDefault preHeader;

    internal new MultiUseWord muw
    {
        //!!!-3
        [Inline]
        get { return this.preHeader.muw; }
        [Inline]
        set { this.preHeader.muw = value; }
    }

    [Inline]
    internal new UIntPtr CompareExchangeMUW(UIntPtr newValue,
                                            UIntPtr oldValue)
    {
        return Interlocked.CompareExchange(ref this.preHeader.muw.value,
                                           newValue, oldValue);
    }
}

[MixinConditional("ObjectHeaderDefault")]
[MixinConditional("ObjectHeaderPostRC")]
[Mixin(typeof(PreHeader))]
[RequiredByBartok]
internal struct PreHeaderDefault /* : PreHeader */
{
    //!!!-4
    internal MultiUseWord muw;
}

// This class does not add any fields, nor should it ever be
// referenced by any other code. Its sole purpose is to declare
// that all objects have get_muw, set_muw, and compareExchangeMUW
// methods when the object header includes a MultiUseWord field.
[MixinConditional("ObjectHeaderDefault")]
[MixinConditional("ObjectHeaderPostRC")]
[Mixin(typeof(System.Object))]
internal class ObjectMUW
{
    internal extern MultiUseWord muw
    {
        [Inline]
        get;
        [Inline]
        set;
    }

    [Inline]
    internal extern UIntPtr CompareExchangeMUW(UIntPtr newValue,
                                               UIntPtr oldValue);
}

class Test
{
    static void Main(string[] args)
    {
        object obj = new object();
        MultiUseWord.GetHashCode(obj);
    }
}
微创社(MCC) 2009-09-06
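The MUW structure is tied to the hash code value.

From the discussion above we can see that:
[1] Addresses are aligned on a boundary: 8-byte alignment.
[2] This can guarantee a one-to-one mapping between hash code and address.
This indirectly addresses point [3] in reply #61; that is, alignment is not what causes the hash code collisions.

To summarize:
[1] For a reference-type object, the value of GetHashCode has a one-to-one relationship with the object's address.
[2] A generation 1 or 2 GC changes object addresses, and because of [1] the object's hash code changes along with them.
[3] GC activity cannot be controlled. If a new allocation takes over an old address, the hash codes will collide.
Time to go study the GC, heh (this needs further verification).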
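The summary ends by asking for further verification. One way to test point [2] is to read the identity hash code of a single live object before and after a forced collection. The following is a minimal sketch of that check. The name GcHashCheck is hypothetical; RuntimeHelpers.GetHashCode is used so the result does not depend on any GetHashCode override.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper for illustration; not from the original thread.
public static class GcHashCheck
{
    // Returns true when the identity hash code of one live object is the same
    // before and after forced, blocking garbage collections.
    public static bool HashSurvivesCollection()
    {
        object probe = new object();
        int before = RuntimeHelpers.GetHashCode(probe);

        // Allocate short-lived garbage so the collector has something to reclaim.
        for (int n = 0; n < 100000; n++)
        {
            GC.KeepAlive(new byte[64]);
        }
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        int after = RuntimeHelpers.GetHashCode(probe);
        GC.KeepAlive(probe); // make sure the probe is not collected early
        return before == after;
    }

    public static void Main()
    {
        Console.WriteLine(HashSurvivesCollection());
    }
}
```

If point [2] holds, this would print False whenever the collection moves the probe object. Printing True would suggest the hash code is stored with the object instead of being recomputed from its current address.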
微创社(MCC) 2009-09-06
About: MUW structure

MUW structure
-------------
The structure of the MUW on a 32-bit system is as follows:
| 31 ... 3 |  2   | 1  0 |
| payload  | mark | tag  |

And on a 64-bit system:
| 63 ... 3 |  2   | 1  0 |
| payload  | mark | tag  |

This class provides fast-path code for a number of operations which
potentially require additional information to be associated with each
object in the heap:

- Object ownership and versioning information needed by the STM
implementation.

- Location-insensitive hash codes that have been allocated to objects.

- Monitor objects that have been associated with objects.

- StringState bits associated with String objects

Overview
--------
Each object has a value of type MultiUseWord (MUW) held as a header
word. This can be used directly for any *one* of the three purposes
listed above. If more than one kind of usage is required on the same
object then the MUW is "inflated" -- i.e. replaced by a value that
indicates an external multi-use object (EMU) which contains space for
all three purposes.

NB: the StringState and HashCode words share the same storage locations,
distinguished by value. We rely on ChooseHashCode to not use small
integer values corresponding to the StringState enumeration.

In addition, the MUW provides a single per-object 'mark' bit. This
is used in the MemoryAccounting module when counting the volume of
different kinds of object in the heap. Placing this bit in the MUW
(rather than using spare vtable-bits as the GC does) allows
memory accounting to be invoked at any time (e.g. after a crash during
a GC).

On 32-bit systems, this allows us to support pointers through the entire
4 GB address range. However, it means that pointers must be constrained
to fit in the limited space of the payload. For instance, the Monitor
is 8-byte aligned so the low 3 bits are available for encoding the tag
and mark.

By convention, all modules using the mark bit must (a) work with the
world stopped and (b) leave all object's mark bits 0 after their operation.
(They may assume this is true when they start). This restriction avoids
needing to mask off the mark bit in common code paths in this file.

The payload field forms the majority of the MUW. The contents of the
payload are taken from bits 3..31 in the MUW, padded with three 0-bits
at the least significant end. This means that the payload can hold a
pointer to an 8-byte aligned address.

The tag values distinguish between four states that the MUW can be in:

00 => The MUW is either unused (if the payload is 0), or holds the
STM word for the object (if the payload is non-0).

01 => The payload holds the object's hash code or StringState bits.

10 => The payload refers to the object's Monitor.

11 => The payload refers to an external multi-use object (EMU).
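The bit layout described above can be decoded with three masks: the low two bits are the tag, bit 2 is the mark, and everything from bit 3 upward is the payload. The following is a minimal sketch of that decoding. MuwBits and its members are hypothetical names chosen for illustration; the quoted source has its own GetTag and GetPayload.

```csharp
using System;

// Hypothetical helper for illustration; the names are not from the quoted source.
public static class MuwBits
{
    public const ulong TagMask = 0x3;     // bits 0..1
    public const ulong MarkMask = 0x4;    // bit 2
    public const ulong LowBitsMask = 0x7; // tag and mark together

    public static uint GetTag(ulong word) { return (uint)(word & TagMask); }

    public static bool GetMark(ulong word) { return (word & MarkMask) != 0; }

    // Clears the three low bits, leaving a value that is a multiple of 8.
    public static ulong GetPayload(ulong word) { return word & ~LowBitsMask; }

    public static void Main()
    {
        ulong word = 0x12345679UL;                      // payload 0x12345678, mark 0, tag 01
        Console.WriteLine(GetTag(word));                // 1
        Console.WriteLine(GetMark(word));               // False
        Console.WriteLine("0x{0:X}", GetPayload(word)); // 0x12345678
    }
}
```

Tag 01 is the state in which the payload holds a hash code or StringState bits, so a hash code stored this way always has its three low bits cleared once extracted.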
veiny 2009-09-06
No difference here.
Both give 10000.
dotnet 3.5
jimmey2000 2009-09-06
Learning from this.
qinw2003 2009-09-06
mark!!!!!
微创社(MCC) 2009-09-06
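We have an answer now.

Before any GC runs, everyone sees the same result: 10000.
After a GC, results will differ from person to person, though usually not by much,
unless the object count is large. The larger it is, the more likely a generation 1 or 2 collection (usually generation 1) will run.

Once the GC finishes, addresses have changed, and the hash code tied to each address changes with it.

So this is what Microsoft means when it says uniqueness is not guaranteed.
When some books say the value does not change, they mean the ordinary case: it does not change as long as no GC occurs.

What remains is to work out how to compute this hash code directly.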

微创社(MCC) 2009-09-06
[Quote=Quoting reply #75 by goga21cn:]
OP, my result is different from yours.
Environment: .net2.0  win2003 vs2005

With your first program, several F5 runs all give 10000.

C# code
using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication9
{
    public class B // class version
    {
        public int i;
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<int> list = new List<int>();
            List<B> listClass = new List<B>();
            for (int i = 0; i < 10000; i++)
            {
                B b = new B();
                b.i = i;
                if (!list.Contains(i)) // check for membership first; slightly different from your flow
                {
                    list.Add(b.GetHashCode());
                }
                //list.Add(b.GetHashCode());
                listClass.Add(b);
            }
            //Console.WriteLine(list.Distinct().Count()); // 9999; 2.0 has no Distinct method
            Console.WriteLine(list.Count.ToString()); // prints 10000
            Console.ReadKey();
        }
    }
}
[/Quote]

if (!list.Contains(i))
This check is very expensive.
Move to VS2008 soon.
Are you using a licensed copy?

Strange, I also get 10000 every time, heh. Maybe there were too few GCs.
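On the cost of that check: List<T>.Contains is a linear scan, so calling it once per iteration makes the loop quadratic. Note also that the quoted check tests the loop index i against a list of hash codes, so it does not filter out duplicate hash codes. The following is a minimal sketch of counting distinct values in one pass without LINQ, using a Dictionary<int, bool> as a set so that it also works on .NET 2.0 (HashSet<T> arrived with .NET 3.5). DistinctCounter is a hypothetical name, not something from the thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not from the original thread.
public static class DistinctCounter
{
    // Counts distinct values in one pass, using the dictionary keys as a set.
    public static int CountDistinct(IEnumerable<int> values)
    {
        Dictionary<int, bool> seen = new Dictionary<int, bool>();
        foreach (int value in values)
        {
            seen[value] = true; // a repeated value overwrites the same key
        }
        return seen.Count;
    }

    public static void Main()
    {
        List<int> hashes = new List<int>();
        List<object> keepAlive = new List<object>(); // keeps every object reachable
        for (int i = 0; i < 10000; i++)
        {
            object o = new object();
            keepAlive.Add(o);
            hashes.Add(o.GetHashCode());
        }
        // 10000 means no collisions; anything lower means some hash codes repeated.
        Console.WriteLine(CountDistinct(hashes));
    }
}
```

Each dictionary lookup is constant time on average, so the whole count is linear in the number of hash codes.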

kmaui0523 2009-09-06
Learned something...
jiachenhe 2009-09-06
Heh heh.