Aug 11 2015

If there is one place where developers feel least guilty about allocating large amounts of memory, it’s in the local variables of a method. After all, local variables are short-lived: once the method finishes executing, its stack frame is unwound and the return value is handed back to the caller. At that point, any objects referenced only by those locals become eligible for garbage collection. This assumption holds true for most methods, but not for all.

Let’s consider a simple method first. It allocates an integer array of 1 million elements and returns the length of the array.

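    using System.Linq;

    class Program {
        static void Main(string[] args) {
            var a = new Foo();
            a.Bar(); // invoke the method we want to inspect
        }
    }
    class Foo {
        public int Bar() {
            var arr = Enumerable.Repeat(1, 1000000).ToArray(); // referenced only by a local
            return arr.Length;
        }
    }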

Let’s compile the program and open it in WinDbg. The commands for doing that are:

  • .symfix (sets the symbol path to the Microsoft symbol server)
  • sxe ld:clrjit.dll (tells the debugger to break when clrjit.dll is loaded)
  • g (continues execution until clrjit.dll is loaded)
  • .loadby sos clr (loads the SOS managed debugging extension)
  • !bpmd <module> Program.Main (sets a breakpoint that hits when Program.Main is executed; <module> stands for the name of the compiled assembly)

Looking at the IL code of the Foo.Bar method, it’s pretty straightforward. A 1 million element array is created, and then the ldloc.0 instruction loads the local variable onto the evaluation stack. After the method returns, the local variable no longer exists, so nothing references the array and the garbage collector is free to reclaim its memory for other objects.


This works quite well, but imagine a scenario where you might need access to the local variable even after the method execution is over. One such scenario is when the method returns a Func delegate instead of an integer.

class Foo {
    public Func<int> BarFunc() {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();
        return () => arr.Length; // the lambda captures arr, keeping the array alive
    }
}

Though invoking the returned delegate will always produce the same result as the previous method, the CLR cannot make that assumption and mark the integer array for collection. Because there is no guarantee that the returned delegate will be executed immediately, or even just once, the array must stay reachable after the method execution has completed. The compiler resolves this dilemma by hoisting the local variable onto the heap as a field of an autogenerated type. Let’s see the IL generated for this new method.



This IL is considerably different. The newobj instruction creates an object of a new type, c__DisplayClass1, which we never wrote. That is the type the compiler autogenerated to store the local variable. Since an instance of the new type lives on the heap, its lifetime is guaranteed for as long as the calling method holds on to the returned delegate. We can verify this by examining the managed heap



…and the object of the autogenerated type shows our local variable now as a field.
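To make the transformation concrete, here is a hand-written approximation of what the compiler does to BarFunc. The names DisplayClass, GetLength and FooByHand are hypothetical stand-ins chosen for readability, and the real generated code differs in its details, so treat this as a sketch rather than the compiler's actual output.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the autogenerated closure type.
class DisplayClass {
    public int[] arr;                                // the hoisted local variable
    public int GetLength() { return arr.Length; }    // the body of the lambda
}

class FooByHand {
    public Func<int> BarFunc() {
        var closure = new DisplayClass();            // corresponds to the newobj instruction
        closure.arr = Enumerable.Repeat(1, 1000000).ToArray();
        // The delegate's Target is the closure object, so the array stays
        // reachable for as long as someone holds on to the delegate.
        return closure.GetLength;
    }
}
```

Calling new FooByHand().BarFunc() gives back a delegate whose Target property is the DisplayClass instance, which is the same shape of object that shows up on the managed heap.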



If we modify our Main method a bit and store the resulting delegate in a static field, we can see that the GC maintains an explicit root to the object. In essence, the object lives until the application exits unless the field is cleared. This is unnecessary memory usage by the application.

class Program {
    private static Func<int> classLevelVariable; // static field: acts as a GC root

    static void Main(string[] args) {
        var a = new Foo();
        classLevelVariable = a.BarFunc();
    }
}

Finding the GC roots for the object (with the SOS !gcroot command), we see that the garbage collector cannot collect this object for as long as the static field references it.


This particular scenario might seem trivial, but in a LINQ-heavy production application it is very easy to lose track of the methods that are creating closures. Awareness about the promotion of local variables can help prevent memory leaks and improve application performance.
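One way to act on this, sketched below, is to copy the value the lambda actually needs into its own local so that the large array is never captured. FooLean is a hypothetical variant of the Foo class above, and this is just one possible approach under the assumption that the delegate only needs the length.

```csharp
using System;
using System.Linq;

// Hypothetical variant of Foo that avoids capturing the array.
class FooLean {
    public Func<int> BarFunc() {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();
        int length = arr.Length;   // copy out the one value the lambda needs
        return () => length;       // only the int is hoisted; arr stays an ordinary local
    }
}
```

Here the compiler hoists only length into the closure object, so the array is referenced solely by a true local and becomes collectible once BarFunc returns, even if the delegate is stored in a static field.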


Aug 07 2015

Microsoft recently open sourced the CLR and the framework libraries and published them on GitHub. Though a non-production version has been available for a long time under the name Rotor or SSCLI, this time there were no half measures. It gives the community the opportunity to raise issues and also fix them by creating pull requests.

The journey from source to executable code has two phases: first the compiler compiles the source code into Intermediate Language (MSIL), and then the execution engine (CLR) converts the IL into machine-specific instructions. This allows .NET code to be executable across platforms and also to be language agnostic, as the runtime only understands MSIL.

When the program is executed, the CLR reads the type information from the assembly and creates in-memory structures to represent the types. The main structures that represent a type at runtime are the MethodTable and the EEClass. The MethodTable contains “hot” data which is frequently accessed by the runtime to resolve method calls and for garbage collection. The EEClass, on the other hand, is a “cold” structure which holds detailed structural information about the type, including its fields and methods. This is used in Reflection. The main reason for splitting these structures is to optimize performance by keeping the frequently accessed fields in as small a data structure as possible. Every non-generic type has its own MethodTable and EEClass, and a pointer to the MethodTable is stored in the first pointer-sized slot of each object. We can observe this by loading the SOS managed debugging extension in WinDbg.



The DumpHeap command gives us the information about our type along with the addresses of all the objects of that type. Using the WinDbg command dq to read the memory at an object’s address, we see that the first pointer-sized value is the address of its MethodTable. There is another structure called the SyncBlock, which is referenced from the object header at a negative offset from the MethodTable pointer. This structure handles the thread synchronization information for the object.
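The idea that every object of a type points at the same per-type structure can also be glimpsed from managed code, without a debugger. The sketch below compares the runtime type handles of objects; Widget and TypeHandleDemo are hypothetical names, and treating RuntimeTypeHandle.Value as the MethodTable address is an assumption about the CLR implementation rather than something the API promises.

```csharp
using System;

class Widget { }   // hypothetical type, used only for illustration

static class TypeHandleDemo {
    // True when both objects refer to the same runtime type structure.
    public static bool ShareTypeHandle(object x, object y) {
        IntPtr hx = Type.GetTypeHandle(x).Value;   // opaque handle to x's runtime type
        IntPtr hy = Type.GetTypeHandle(y).Value;   // opaque handle to y's runtime type
        return hx == hy;
    }
}
```

Two Widget instances share one handle, while a Widget and a plain object do not, which mirrors what dq shows when you compare the first slot of different objects.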

This diagram from the SSCLI Essentials Book explains the relationship between various data structures very clearly.


As you can see, the object header points to the MethodTable, which in turn points to the EEClass. Since the EEClass is not frequently used at runtime, this extra level of indirection doesn’t hurt performance. The MethodTable itself is followed by a call table, a table which contains the addresses of the virtual and non-virtual methods to be executed for the type. Since this call table is laid out at a fixed offset from the MethodTable, there is no extra pointer indirection to find the right method to call. One more thing to note about the CLR is that everything is loaded only when it needs to be executed. This holds true for both types and methods. When the CLR executes a method which creates an instance of another type, it creates the memory structures for the new type. However, even then the methods themselves are not compiled until the last moment, when they actually need to be executed.

In the above diagram, you can see the MethodTable vtable pointing to a thunk, which is called a prestub in .NET. When the method is first called, the prestub calls the JIT compiler. The JIT compiler is responsible for reading the MSIL opcodes and generating the processor-specific assembly code. Once the JIT compilation is done, the address at which the compiled code resides is backpatched onto the call table. Subsequent calls to the method are executed directly, without having to go through the compilation phase.
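The prestub and backpatching mechanism can be modelled in a few lines of ordinary C#. This is only an analogy: LazySlot and its members are hypothetical names, a delegate field plays the role of a call table slot, and assigning a lambda stands in for JIT compilation. None of it is CLR code.

```csharp
using System;

// Toy model of a single call table slot; an analogy, not CLR code.
class LazySlot {
    private Func<int, int, int> target;            // what the slot currently points to
    public int CompileCount { get; private set; }  // how often the "JIT" ran

    public LazySlot() { target = Prestub; }        // every slot starts out at the prestub

    private int Prestub(int a, int b) {
        CompileCount++;                            // stands in for invoking the JIT compiler
        target = (x, y) => x + y;                  // backpatch the slot with the "compiled" code
        return target(a, b);                       // complete the call that triggered compilation
    }

    public int Invoke(int a, int b) { return target(a, b); }
}
```

The first Invoke goes through Prestub and bumps CompileCount to 1; every later Invoke jumps straight to the patched target and leaves the counter untouched.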

We can dump the MethodTable for our Calculator type using the !DumpMT command with the -MD switch, which also lists the MethodDescriptors (MethodDescs).


At this stage in the application execution, the object of the Calculator class has been created but the AddTwoNumbers method hasn’t been executed yet. So the MethodDesc table shows that only the constructor has been jitted, not the AddTwoNumbers method. We can inspect the MethodDescriptors for both methods using the !DumpMD command.



The constructor now has a code address, but AddTwoNumbers doesn’t have code yet. Let’s step forward and see what happens after the method is jitted. Now the code address holds an actual memory address, which contains our machine-specific assembly code. The next time this method is called, this assembly code will be executed directly.
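If paying the JIT cost on the first call is a concern, the framework lets you ask for compilation ahead of time through RuntimeHelpers.PrepareMethod. The sketch below assumes a hypothetical EagerCalculator type shaped like the Calculator described here; it makes no claim about which address the runtime reports before or after preparation.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

class EagerCalculator {                  // hypothetical stand-in type
    public int AddTwoNumbers(int a, int b) { return a + b; }
}

static class EagerJitDemo {
    // Asks the runtime to prepare AddTwoNumbers now instead of on its first call.
    public static IntPtr PrepareAdd() {
        MethodInfo method = typeof(EagerCalculator).GetMethod("AddTwoNumbers");
        RuntimeMethodHandle handle = method.MethodHandle;
        RuntimeHelpers.PrepareMethod(handle);
        return handle.GetFunctionPointer();   // entry point the runtime reports for the method
    }
}
```

Breaking in WinDbg after PrepareAdd has run and inspecting the method with !DumpMD is one way to check whether the method already shows up as jitted.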


To view the assembly, use the !u command followed by the code address. As in most x86 code, two registers, ebp and esp, keep track of each stack frame. During a method call a new stack frame is created and ebp holds a pointer to the base of that frame. As code executes, the esp register tracks the top of the stack as it grows, and once execution completes, the frame is torn down and the saved ebp value is popped.



Now let’s look at this at the code level. Detailed building and debugging instructions are given in the coreclr repo. The MethodTableBuilder class contains the method which loads the types. You could put a breakpoint there, but it will keep breaking while system types are loading. To avoid this, put a breakpoint in the RunMain method in assembly.cpp, and once it breaks, put a breakpoint in the CreateTypeHandle method. From then on it will break on the creation of your custom types.


Below is the simple Calculator class code that we are calling. I just used the name of the executable as a command argument to run CoreRun.exe in the coreclr solution (detailed instructions are given in the GitHub repo).
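For reference while following the walkthrough, a class matching the description in this post (a single AddTwoNumbers method, no explicit constructor, no interfaces, no explicit base class) might look like the sketch below. The parameter and return types are assumptions made for illustration.

```csharp
// Hypothetical sketch; the int parameters and return type are assumptions.
class Calculator {
    // No explicit constructor, so the type only has the default one.
    public int AddTwoNumbers(int a, int b) { return a + b; }
}
```

A Main method that creates a Calculator and calls AddTwoNumbers once is enough to trigger both the type load and the JIT compilation discussed here.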



Now for the fun part: we start debugging the solution. The first step (after loading allocators) is to make sure all parent types are loaded. Since our type doesn’t explicitly inherit from any class, its parent is System.Object. Once the parent type is found (it can’t be an interface, only a class), its method table is returned to the MethodTableBuilder.



Then there are some additional checks to handle cases like enums, generic methods, explicit layouts etc. I’ll skip over them for brevity. At this point we have started to build the MethodTable but not the EEClass. That is done in the next step.



At this stage, the CLR checks whether the type implements any interfaces. Interface calls are a bit more complex, because there needs to be a relationship between the interface’s vtable and the implementing type. The calls are mapped using a slot map maintained on the implementing type’s MethodTable, which maps its methods to the vtable slots on the interface. Since our Calculator class doesn’t implement any interfaces, it skips this block entirely.
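For a type that does implement an interface, reflection offers a managed view of this kind of mapping through Type.GetInterfaceMap. The example below uses a hypothetical IAdder interface and AddingCalculator class; it shows the reflection-level pairing of interface methods with their implementations, not the runtime's internal slot map itself.

```csharp
using System;
using System.Reflection;

interface IAdder { int Add(int a, int b); }   // hypothetical interface

class AddingCalculator : IAdder {             // hypothetical implementing type
    public int Add(int a, int b) { return a + b; }
}

static class InterfaceMapDemo {
    // Returns the method on the implementing type that backs IAdder.Add.
    public static MethodInfo TargetOfAdd() {
        InterfaceMapping map = typeof(AddingCalculator).GetInterfaceMap(typeof(IAdder));
        // IAdder declares a single method, so index 0 is the pair for Add.
        return map.TargetMethods[0];
    }
}
```

InterfaceMapping.InterfaceMethods and TargetMethods are parallel arrays, so entry i of the first is implemented by entry i of the second.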


Now we go into the final and most crucial method which will finally return the TypeHandle. If this method succeeds, then our type has been successfully loaded into memory.


The first thing the BuildMethodTableThrowing method does is walk up the inheritance hierarchy and load the parent type. This holds for all types except interfaces. An interface’s vtable will not contain System.Object’s methods, so the MethodTableBuilder simply sets the parent type to null if the type being loaded is an interface.


After this, the method makes sure the type in question is not a value type, an enum, a remoting type, or being called via COM Interop. All of these are loaded differently than simple reference types deriving directly from System.Object. Then the MethodImpl attributes are checked, since they impact how a type is loaded. Our Calculator class just skips over these checks. The next method is EnumerateClassMethods, which iterates through all the methods and adds them to the MethodTable.

Now that the implemented methods are added to the MethodTable, we also need to add the parent type’s methods to the current vtable. This is done by the methods ImportParentMethods, AllocateWorkingSlotTables and CopyParentVtable in the MethodTableBuilder class. Here virtual methods have to be handled differently, since they can be overridden by the current type. For non-virtual methods, a direct entry to the methods implemented by the parent type should suffice.

First the maximum possible vtable size is computed. Next, a temporary table of that maximum size is allocated.


Then the parent vtable methods are loaded into the Calculator type.


After the parent methods are added, the current type’s methods are added. We just have two methods: the constructor and the AddTwoNumbers method. Here the virtual methods are added first and then the non-virtual ones. Since we didn’t define a custom constructor, the type just gets the default constructor, which is added to the method table. Once all virtual methods are added, the remaining methods get the non-vtable slots.
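The ordering described above (parent slots first, then the type's own virtual methods, then its non-virtual ones) can be illustrated with a toy model. SlotTableSketch is a hypothetical helper that works on method names only; it is not how MethodTableBuilder is implemented, just a sketch of the slot ordering.

```csharp
using System;
using System.Collections.Generic;

// Toy model of slot assignment; not the MethodTableBuilder implementation.
static class SlotTableSketch {
    public static List<string> Build(IEnumerable<string> parentVirtuals,
                                     IEnumerable<string> ownVirtuals,
                                     IEnumerable<string> ownNonVirtuals) {
        var slots = new List<string>(parentVirtuals);    // 1. copy the parent's vtable
        foreach (var m in ownVirtuals) {
            int i = slots.IndexOf(m);
            if (i >= 0) slots[i] = m + " (override)";     // 2a. an override reuses the parent's slot
            else slots.Add(m);                           // 2b. a new virtual gets a fresh slot
        }
        slots.AddRange(ownNonVirtuals);                  // 3. non-virtual methods come last
        return slots;
    }
}
```

For the Calculator case, passing System.Object's four virtual methods (ToString, Equals, GetHashCode, Finalize), no virtuals of its own, and the two non-virtual entries (.ctor and AddTwoNumbers) yields six slots with the inherited ones at the front.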


Now that the type’s methods have been completely loaded, the MethodDescriptors are created. However, since none of the methods has been called even once, each entry simply points to a stub, waiting to be JIT compiled on first execution. After this stage the remaining fields are placed in the MethodTable and some additional integrity checks are done. Finally the type is loaded and is ready to be published.