
Reflection makes HashCode unstable

2019-10-26 05:05:29 Author: from the Internet

In the following code, accessing the custom attributes of SomeClass causes the hash function of SomeAttribute to become unstable.
What is going on?

static void Main(string[] args)
{
    typeof(SomeClass).GetCustomAttributes(false);//without this line, GetHashCode behaves as expected

    SomeAttribute tt = new SomeAttribute();
    Console.WriteLine(tt.GetHashCode());//Prints 1234567
    Console.WriteLine(tt.GetHashCode());//Prints 0
    Console.WriteLine(tt.GetHashCode());//Prints 0
}


[SomeAttribute(field2 = 1)]
class SomeClass
{
}

class SomeAttribute : System.Attribute
{
    uint field1=1234567;
    public uint field2;            
}

Update:

This has now been reported to Microsoft as a bug:
https://connect.microsoft.com/VisualStudio/feedback/details/3130763/attibute-gethashcode-unstable-if-reflection-has-been-used

Update 2:

This issue has been fixed in .NET Core:
https://github.com/dotnet/coreclr/pull/13892

Answer:

This one is really tricky. First, let's look at the source code of the Attribute.GetHashCode method:

public override int GetHashCode()
{
    Type type = GetType();

    FieldInfo[] fields = type.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
    Object vThis = null;

    for (int i = 0; i < fields.Length; i++)
    {
        // Visibility check and consistency check are not necessary.
        Object fieldValue = ((RtFieldInfo)fields[i]).UnsafeGetValue(this);

        // The hashcode of an array ignores the contents of the array, so it can produce 
        // different hashcodes for arrays with the same contents.
        // Since we do deep comparisons of arrays in Equals(), this means Equals and GetHashCode will
        // be inconsistent for arrays. Therefore, we ignore hashes of arrays.
        if (fieldValue != null && !fieldValue.GetType().IsArray)
            vThis = fieldValue;

        if (vThis != null)
            break;
    }

    if (vThis != null)
        return vThis.GetHashCode();

    return type.GetHashCode();
}

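In short, here is what it does:

>Enumerate the fields of the attribute
>Find the first field that is not an array and does not have a null value
>Return the hash code of that field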

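At this point, we can draw two conclusions:

>Only one field is taken into account when computing the hash code of an attribute
>The algorithm relies heavily on the order of the fields returned by Type.GetFields (since we take the first field that matches the conditions)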
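
To make the second conclusion concrete, here is a minimal sketch that models only the selection loop, not the real reflection code. The names FirstFieldHashDemo and FirstUsableHash, and the fallback parameter (standing in for type.GetHashCode()), are made up for this illustration. It feeds the two field values from the question in both orders:

```csharp
using System;

static class FirstFieldHashDemo
{
    // Simplified model of the loop in Attribute.GetHashCode: return the hash of the
    // first value that is neither null nor an array, otherwise return the fallback.
    public static int FirstUsableHash(object[] fieldValues, int fallback)
    {
        foreach (object value in fieldValues)
        {
            if (value != null && !value.GetType().IsArray)
                return value.GetHashCode();
        }
        return fallback;
    }

    static void Main()
    {
        object field1 = 1234567u; // value of field1 in the question
        object field2 = 0u;       // default value of field2 on new SomeAttribute()

        Console.WriteLine(FirstUsableHash(new[] { field1, field2 }, -1)); // 1234567
        Console.WriteLine(FirstUsableHash(new[] { field2, field1 }, -1)); // 0
    }
}
```

The values are identical in both calls; only their order differs, and that alone is enough to change the result.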

Testing further, we can see that the order of the fields returned by Type.GetFields changes between the two versions of the code:

typeof(SomeClass).GetCustomAttributes(false);//without this line, GetHashCode behaves as expected
SomeAttribute tt = new SomeAttribute();
Console.WriteLine(tt.GetHashCode());//Prints 1234567
Console.WriteLine(tt.GetHashCode());//Prints 0
Console.WriteLine(tt.GetHashCode());//Prints 0

foreach (var field in new SomeAttribute().GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
{
    Console.WriteLine(field.Name);
}

If the first line is left uncommented, the code displays:

field2

field1

If that line is commented out, the code displays:

field1

field2

This confirms that something is changing the order of the fields, which in turn makes GetHashCode produce a different result.
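
The documentation for Type.GetFields states that fields are not returned in any particular order, so code that needs a stable order has to impose one itself. A minimal sketch, assuming that sorting by name with an ordinal comparer is acceptable (OrderedFieldsDemo, OrderedFields and SampleAttribute are hypothetical names that mirror the question's attribute):

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute with the same fields as SomeAttribute in the question.
class SampleAttribute : Attribute
{
    uint field1 = 1234567;
    public uint field2;
}

static class OrderedFieldsDemo
{
    // Returns the instance fields declared on the type, sorted by name so the
    // result does not depend on the order in which reflection hands them back.
    public static FieldInfo[] OrderedFields(Type type)
    {
        return type
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .OrderBy(field => field.Name, StringComparer.Ordinal)
            .ToArray();
    }

    static void Main()
    {
        // Reflection calls made before this point do not affect the printed order.
        typeof(SampleAttribute).GetCustomAttributes(false);

        foreach (FieldInfo field in OrderedFields(typeof(SampleAttribute)))
            Console.WriteLine(field.Name); // field1, then field2
    }
}
```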

Even more interesting:

typeof(SomeClass).GetCustomAttributes(false);//without this line, GetHashCode behaves as expected
SomeAttribute tt = new SomeAttribute();
foreach (var field in new SomeAttribute().GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
{
    Console.WriteLine(field.Name);
}

Console.WriteLine(tt.GetHashCode());//Prints 0
Console.WriteLine(tt.GetHashCode());//Prints 0
Console.WriteLine(tt.GetHashCode());//Prints 0

foreach (var field in new SomeAttribute().GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
{
    Console.WriteLine(field.Name);
}

This code displays:

field1

field2

0

0

0

field2

field1

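The only remaining question is: why does the order of the fields change after the first call to GetFields? I believe it is related to an internal cache inside the Type instance.

We can inspect the cached value by running this in the Quick Watch window: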

System.Runtime.InteropServices.GCHandle.InternalGet(((System.RuntimeType)typeof(SomeAttribute)).m_cache) as RuntimeType.RuntimeTypeCache

At the very beginning of execution, the cache is empty (obviously). Then we execute:

typeof(SomeClass).GetCustomAttributes(false)

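After this line, if we inspect the cache, it contains a single field: field2. Now that is interesting. Why this field? Because it is the one you set when applying the attribute to SomeClass: [SomeAttribute(field2 = 1)]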

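Then we execute the first GetHashCode and inspect the cache: it now contains field2 and field1, in that order (remember that the order matters). Because of this ordering, subsequent calls to GetHashCode return 0.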

Now, if we remove the line typeof(SomeClass).GetCustomAttributes(false) and inspect the cache after the first GetHashCode, we find field1 followed by field2.

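To sum up:

The hash code algorithm for attributes uses the value of the first field it finds. It therefore relies heavily on the order of the fields returned by the Type.GetFields method. To improve performance, that method uses a cache internally.

There are two scenarios: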

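>The scenario without typeof(SomeClass).GetCustomAttributes(false);

Here, the cache is empty when GetFields is called. It gets populated with the fields of the attribute in the order field1, field2. GetHashCode then finds field1 as the first field and displays 1234567.

>The scenario with typeof(SomeClass).GetCustomAttributes(false);

When that line executes, the attribute constructor is executed: [SomeAttribute(field2 = 1)]. At that point, the metadata for field2 is pushed into the cache. Then you call GetHashCode, and the cache gets completed. field2 is already there, so it is not added again; field1 is added next. The order in the cache is therefore field2, field1, so GetHashCode finds field2 as the first field and displays 0.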
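
The two scenarios can be mimicked with a toy model. This is not the runtime's actual cache code, only an illustration of the "append whatever is missing" behavior described above; FieldCacheModel and Complete are invented names:

```csharp
using System;
using System.Collections.Generic;

static class FieldCacheModel
{
    // Appends every declared field name that the cache does not contain yet,
    // leaving entries that are already present in their existing position.
    public static List<string> Complete(List<string> cache, string[] declaredOrder)
    {
        foreach (string name in declaredOrder)
        {
            if (!cache.Contains(name))
                cache.Add(name);
        }
        return cache;
    }

    static void Main()
    {
        string[] declared = { "field1", "field2" };

        // Scenario 1: the cache starts empty.
        List<string> fromEmpty = Complete(new List<string>(), declared);
        Console.WriteLine(string.Join(",", fromEmpty)); // field1,field2

        // Scenario 2: field2 was cached earlier, when the attribute was applied.
        List<string> preSeeded = Complete(new List<string> { "field2" }, declared);
        Console.WriteLine(string.Join(",", preSeeded)); // field2,field1
    }
}
```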

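The only surprising point left is: why does the first call to GetHashCode behave differently from the following ones? I have not checked, but I believe it detects that the cache is incomplete and reads the fields in a different way. For subsequent calls the cache is complete, and the behavior stays consistent.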

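Honestly, I think this is a bug. The result of GetHashCode should stay consistent over time. The implementation of Attribute.GetHashCode should therefore not rely on the order of the fields returned by Type.GetFields, since we have seen that it can change. This should be reported to Microsoft.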
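
Until a fixed runtime is available, one possible workaround is to override GetHashCode (and Equals) in the attribute itself, so that the result never goes through Type.GetFields. The sketch below is an assumption about how that could look, not something taken from the bug report; StableAttribute, StableClass and StableHashDemo are hypothetical names, and the 17/31 mixing constants are just a common convention:

```csharp
using System;

class StableAttribute : Attribute
{
    uint field1 = 1234567;
    public uint field2;

    public override bool Equals(object obj)
    {
        StableAttribute other = obj as StableAttribute;
        return other != null && other.field1 == field1 && other.field2 == field2;
    }

    // Combines every field explicitly, in a fixed order chosen in code,
    // so the order in which reflection reports the fields is irrelevant.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + field1.GetHashCode();
            hash = hash * 31 + field2.GetHashCode();
            return hash;
        }
    }
}

[Stable(field2 = 1)]
class StableClass
{
}

static class StableHashDemo
{
    // Repeats the experiment from the question against the overriding attribute.
    public static bool IsStable()
    {
        typeof(StableClass).GetCustomAttributes(false);

        StableAttribute attribute = new StableAttribute();
        int first = attribute.GetHashCode();
        return first == attribute.GetHashCode() && first == attribute.GetHashCode();
    }

    static void Main()
    {
        Console.WriteLine(IsStable()); // True
    }
}
```

Overriding Equals alongside GetHashCode keeps the two consistent with each other, which matters as soon as the attribute is used as a dictionary key or stored in a hash set.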

Tags: reflection, custom-attributes, hash, c
Source: https://codeday.me/bug/20191026/1934076.html