Thursday, October 14, 2010

Is .NET Type-Safe?

A type error is erroneous or undesirable program behaviour caused by a discrepancy between differing data types. […] The behaviors classified as type errors by a given programming language are usually those that result from attempts to perform operations on values that are not of the appropriate data type.

That’s Wikipedia.

I always thought that .NET was type-safe. I always thought that the only legitimate way to access an object's data was through its own interface or through a dedicated external interface (reflection).

I was wrong.

You see, C/C++ has the notion of a “union”, where different data share the same memory location. It seems that, to preserve compatibility with legacy code, the same feature was added to .NET. What shocks me is that it applies not only to value types but also to reference types.
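For comparison, here is a minimal sketch of the value-type case, which is the uncontroversial use of explicit layout. The IntBytes struct and the UnionDemo class are hypothetical names made up for this illustration, and the byte order mentioned in the comment assumes a little-endian machine.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type for illustration: an int overlapped with its four bytes,
// much like a C union { int value; unsigned char bytes[4]; }.
[StructLayout( LayoutKind.Explicit )]
public struct IntBytes
{
    [FieldOffset( 0 )] public int  Value;
    [FieldOffset( 0 )] public byte B0;
    [FieldOffset( 1 )] public byte B1;
    [FieldOffset( 2 )] public byte B2;
    [FieldOffset( 3 )] public byte B3;
}

public static class UnionDemo
{
    public static void Main()
    {
        IntBytes u = new IntBytes();
        u.Value = 12345678; // 0x00BC614E

        // On a little-endian machine the lowest byte comes first: 78 97 188 0
        Console.WriteLine( "{0} {1} {2} {3}", u.B0, u.B1, u.B2, u.B3 );
    }
}
```

Here only plain data is reinterpreted; no object reference is ever viewed as a reference of another type.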

Consider the following example:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
 
namespace ConsoleApplication50
{
    class Program
    {
        static void Main( string[] args )
        {
            Foo f = new Foo();
 
            f.A = new A() { ab = 123, ai = 12345678 };
 
            Console.WriteLine( "{0} {1} {2} {3} {4}", f.B.b1, f.B.b2, f.B.b3, f.B.b4, f.B.bi );
            Console.ReadLine();
        }
    }
 
    [StructLayout( LayoutKind.Explicit )]
    public class Foo
    {
        [FieldOffset( 0 )]
        public A A;
        [FieldOffset( 0 )]
        public B B;
    }
 
    public class A
    {
        public int  ai;
        public byte ab;
    }
    public class B
    {
        public B( string Param ) { }
 
        public byte bi;
        public byte b1;
        public byte b2;
        public byte b3;
        public byte b4;
    }
}

Note that Foo is defined so that A and B overlap. Both A and B are reference types, yet the two fields occupy the same memory location, so they always hold the same reference. The result is somewhat surprising: although only A is explicitly initialized, B is accessed without any issues and, because A and B overlap, the internal content of the A instance is read through consecutive members of B.

I must say that I am really and deeply confused.

The ECMA standard (ECMA-335), Partition II, section 10.7 says:

Offset values shall be non-negative. It is possible to overlap fields in this way, though offsets occupied by an
object reference shall not overlap with offsets occupied by a built-in value type or a part of another object
reference. While one object reference can completely overlap another, this is unverifiable.

If the code is unverifiable, how can it run with no issues?

I always thought that code which cannot be statically verified to be type-safe (and is thus considered unsafe) can be executed only if the SecurityPermission( SecurityAction.RequestMinimum, SkipVerification=true ) attribute is present in the assembly. This is how you let code that uses pointers be executed by the runtime environment: it is unsafe, so it is your responsibility to allow it to run.
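To illustrate the pointer case just mentioned, here is a minimal sketch; PointerDemo and OverwriteByte are hypothetical names made up for this illustration. The C# compiler accepts it only inside an unsafe context and with the /unsafe switch, and the resulting IL is not verifiable.

```csharp
using System;

public static class PointerDemo
{
    // Overwrites a single byte of an int through a raw pointer.
    // The caller must keep index within 0..3; nothing checks it here.
    public static unsafe int OverwriteByte( int value, int index, byte replacement )
    {
        byte* p = (byte*)&value;   // view the int's storage as raw bytes
        p[index] = replacement;    // write straight into that storage
        return value;
    }

    public static void Main()
    {
        // Which byte of the integer gets replaced depends on the machine's byte order.
        Console.WriteLine( OverwriteByte( 12345678, 2, 44 ) );
    }
}
```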

This is not the case, however, for the Foo example above. The code runs perfectly under .NET 2.0 and .NET 4.0. What is also confusing is that two different versions of PEVerify.exe produce two different outputs:

Microsoft (R) .NET Framework PE Verifier.  Version  3.5.30729.1
Copyright (c) Microsoft Corporation.  All rights reserved.
All Classes and Methods in ConsoleApplication50.exe Verified.

and

Microsoft (R) .NET Framework PE Verifier.  Version  4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.
[IL]: Error: [C:\Users\wzychla\Documents\Poligony\C#3.0\ConsoleApplication50\ConsoleApplication50\bin\Debug\ConsoleApplication50.exe : ConsoleApplication50.Program::Main] Type load failed.
[token  0x02000003] Type load failed.
2 Error(s) Verifying ConsoleApplication50.exe

You would at least expect the latter to match what the .NET 4.0 runtime does. Well, it does not, as the code runs under .NET 4.0. The former is invoked from

c:\Program Files\Microsoft SDKs\Windows\v7.0A\bin

the latter from

c:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools

The case is getting interesting! It seems that even the .NET Framework engineers are unsure whether such code is verifiable (or maybe the notion of “type-safety” changes as time passes). Why does the runtime not prevent this code from executing without the proper SecurityPermission, even though the latest PEVerify refuses to verify it?

Nevertheless, I believe that the example above reveals something not quite intended. To me, the .NET Framework is not type-safe anymore. Most people take type safety to mean that code accesses only the memory locations it is authorized to access (directly or through a dedicated API). However, consider this (I just modified the Main method from the example above):

static void Main( string[] args )
{
     Foo f = new Foo();
 
     f.A = new A() { ab = 123, ai = 12345678 };
 
     Console.WriteLine( f.A.ai );
 
     f.B.b3 = 44;
 
     Console.WriteLine( f.A.ai );
 
     Console.ReadLine();
}

How am I allowed to modify the internal contents of an object (A in this case) by modifying the contents of another object? Do you still believe that the runtime cares for the consistency of your objects? Consider this:

public class A
{
    public A( int i ) { this.i = i; }
 
    private int i;
 
    public void WriteI()
    {
        Console.WriteLine( i );
    }
}

This is your class. It does not expose any public data, it can only print the internal value. I can easily compromise your object by wrapping it together with another object having different structure:

[StructLayout( LayoutKind.Explicit )]
public class HackShell
{
    [FieldOffset( 0 )]
    public A A;
    [FieldOffset( 0 )]
    public HackA B;
}
 
public class HackA
{
    public byte i1;
    public byte i2;
    public byte i3;
    public byte i4;
}

and now the integrity of your object does not hold anymore:

static void Main( string[] args )
{
     A a = new A( 5 );
 
     // do you believe in the integrity of your object?
 
     HackShell shell = new HackShell();
     shell.A = a;
 
     // well, here you are:
 
     shell.B.i3 = 17;
 
     a.WriteI();
 
     Console.ReadLine();
}

Unbelievable.

Although the internal data of an object can be modified using reflection, reflection cannot be used to alter the integrity of an object. And while pointers can be used to alter the integrity of objects, using them means that your code is not verifiable, so it will not be executed without explicit security permissions.
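To make the contrast concrete, here is a minimal sketch of the reflection route, reusing the shape of class A from above; ReflectionDemo is a hypothetical name made up for this illustration. The private field is reached through a dedicated API (System.Reflection) and is addressed by name and type, rather than by reinterpreting memory.

```csharp
using System;
using System.Reflection;

public class A
{
    public A( int i ) { this.i = i; }

    private int i;

    public void WriteI()
    {
        Console.WriteLine( i );
    }
}

public static class ReflectionDemo
{
    public static void Main()
    {
        A a = new A( 5 );

        // Look up the private instance field by name and overwrite its value.
        FieldInfo field = typeof( A ).GetField( "i", BindingFlags.NonPublic | BindingFlags.Instance );
        field.SetValue( a, 17 );

        a.WriteI(); // prints 17
    }
}
```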

The conclusion is as follows: it seems that it is valid to write code that is completely type-unsafe and verifiable at the same time.

4 comments:

Mojo said...

Well, there goes the neighborhood.

Anonymous said...

I'm impressed. Really nice "paper".

Paweł Łukasik said...

Really interesting info. I've used StructLayout many times for P/Invoke but did not think it could be used that way.

The code could be even "simplified". It would be easier just to use byte[] in the HackA class instead of four bytes. Then we are covering all sizes of data and we have unified access to all bytes. I'm thinking how this could be used to gain more privileges than we are allowed.

Regards,
Pawel

daqruel said...

Very interesting post!!!
In .NET 4.0 it is possible to do:

List<IA> ha = new List<A>();// O(1) complexity
where A : IA

but using this post's trick, it's also possible in older .NET versions
(with O(1) complexity, too)

List<IA> tu = new Overlap { Source = new List<A>() }.Target;

[StructLayout(LayoutKind.Explicit)]
public class Overlap
{
    [FieldOffset(0)]
    public List<A> Source;

    [FieldOffset(0)]
    public List<IA> Target;
}