Aug 11, 2015
 

If there is one place where developers feel least guilty about allocating large amounts of memory, it’s the local variables of a method. After all, local variables are short-lived: once the method finishes executing, its stack frame is unwound and the return value is handed back to the caller. Anything that only those locals referenced then becomes eligible for garbage collection. This assumption holds true for most methods, but not for all.

Let’s consider a simple method first. It allocates a large integer array of 1 million elements and returns the length of the array.

    using System;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            var a = new Foo();
            Console.WriteLine(a.Bar());
        }
    }
    class Foo
    {
        public int Bar()
        {
            var arr = Enumerable.Repeat(1, 1000000).ToArray();
            return arr.Length;
        }
    }

Let’s compile the program and open it in WinDbg. The commands for doing that are:

  • .symfix (fixes the symbol path)
  • sxe ld:clrjit.dll (tells the debugger to break when clrjit.dll is loaded)
  • g (continues execution until clrjit.dll is loaded)
  • .loadby sos clr (loads the SOS managed debugging extension)
  • !bpmd, followed by the module name and Program.Main (breaks when Program.Main is executed)

Looking at the IL code of the Foo.Bar method, it’s pretty straightforward. A 1 million element array is created and stored in a local variable, and the ldloc.0 instruction then loads that local onto the evaluation stack so its length can be read. After the method returns, the local variable no longer exists, nothing references the array any more, and the garbage collector is free to reclaim its memory for other objects.

[Screenshot: IL of Foo.Bar]

This works quite well, but imagine a scenario where you need access to the local variable even after the method has finished executing. One such scenario is when the method returns a Func delegate instead of an integer.

class Foo
{
    public Func<int> BarFunc()
    {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();
        return () => arr.Length;
    }
}

Although invoking the returned delegate will always produce the same result as the previous method, the array can no longer be released when BarFunc returns. Because there is no guarantee that the returned delegate will be executed immediately, or only once, a reference to the local variable has to be kept alive even after the method has completed. The compiler resolves this dilemma by promoting the local variable to the heap as a field of an autogenerated type. Let’s see the IL generated when this new method is called.
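
As a mental model before reading the IL, the promotion amounts to roughly the following hand-written equivalent. This is only a sketch: the class name DisplayClassSketch and the method name Lambda are made up for illustration, and the real autogenerated type has a compiler-chosen name that cannot be written in C# source.

```csharp
using System;
using System.Linq;

// Hand-written approximation of the autogenerated closure type.
// The names here are hypothetical; the compiler picks its own.
sealed class DisplayClassSketch
{
    public int[] arr;                    // the former local variable, now a heap field

    public int Lambda()                  // the body of () => arr.Length
    {
        return arr.Length;
    }
}

class Foo
{
    public Func<int> BarFunc()
    {
        // Instead of a stack local, the array hangs off a heap object.
        var closure = new DisplayClassSketch();
        closure.arr = Enumerable.Repeat(1, 1000000).ToArray();

        // The delegate's target is the closure object, which keeps
        // the array reachable for as long as the delegate is reachable.
        return closure.Lambda;
    }
}

class Program
{
    static void Main()
    {
        Func<int> f = new Foo().BarFunc();
        Console.WriteLine(f.Target.GetType().Name); // DisplayClassSketch
        Console.WriteLine(f());                     // 1000000
    }
}
```

The point of the sketch is the reference chain: delegate, then closure object, then array. Whoever holds the delegate indirectly holds the array.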

[Screenshot: IL of Foo.BarFunc]

 

This IL is considerably different. The newobj instruction creates an object of a new type, c__DisplayClass1, which we never wrote. That is the type the compiler autogenerated to store the local variable. Since the instance of this type lives on the heap, its lifetime lasts for as long as a reference to the returned delegate is held, for example by the calling method. We can verify this by examining the managed heap.

[Screenshot: managed heap showing the new autogenerated type]

 

…and the object of the autogenerated type shows our local variable now as a field.

[Screenshot: object dump of the autogenerated type instance]
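
The same observation can be made without a debugger, as a rough cross-check. The sketch below assumes that the delegate's Target property is the instance of the autogenerated type (which is the case when the lambda captures a local) and uses reflection to list that instance's fields. The exact type name printed depends on the compiler version.

```csharp
using System;
using System.Linq;
using System.Reflection;

class Foo
{
    public Func<int> BarFunc()
    {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();
        return () => arr.Length;
    }
}

class Program
{
    static void Main()
    {
        Func<int> f = new Foo().BarFunc();

        // The delegate's target is the instance of the autogenerated type.
        object closure = f.Target;
        Console.WriteLine(closure.GetType().Name); // name varies by compiler version

        // List the instance fields; the captured local should be among them.
        var flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo field in closure.GetType().GetFields(flags))
        {
            Console.WriteLine(field.Name + " : " + field.FieldType.Name);
        }
    }
}
```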

 

If we modify our Main method a bit and store the resulting delegate in a class-level (static) field, we can see that the GC maintains an explicit root to the object. In essence, the object, and with it the 1 million element array, lives until the field is cleared or the application exits. If the delegate is rarely or never invoked again, this is unnecessary memory usage by the application.

class Program
{
    private static Func<int> classLevelVariable;

    static void Main(string[] args)
    {
        var a = new Foo();
        classLevelVariable = a.BarFunc();
        Console.ReadLine();
        classLevelVariable();
    }
}

Looking up the GC roots for the object, we see that it is still rooted, so the garbage collector cannot collect it for as long as the static field holds the delegate.

[Screenshot: GC roots of the object]
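
The effect of the root can also be observed from code. The sketch below is an illustration, not part of the original program: the static LastArray field and the Capture helper are hypothetical additions that let us watch the array through a WeakReference. Forcing collections with GC.Collect is for demonstration only, and the result of the second check can vary with runtime and build settings.

```csharp
using System;
using System.Linq;
using System.Runtime.CompilerServices;

class Foo
{
    // Hypothetical observation hook: a weak reference does not keep the array alive.
    public static WeakReference LastArray;

    public Func<int> BarFunc()
    {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();
        LastArray = new WeakReference(arr);
        return () => arr.Length;
    }
}

class Program
{
    private static Func<int> classLevelVariable;

    // Kept in its own non-inlined method so no stray reference
    // to the array lingers in Main's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Capture()
    {
        classLevelVariable = new Foo().BarFunc();
    }

    public static void Release()
    {
        classLevelVariable = null;
    }

    static void Main()
    {
        Capture();
        GC.Collect();
        // The static field roots the delegate, the closure object and the array.
        Console.WriteLine(Foo.LastArray.IsAlive); // True

        Release();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        // With the root gone, the array is typically reclaimed.
        Console.WriteLine(Foo.LastArray.IsAlive);
    }
}
```

Clearing the field once the delegate is no longer needed is what removes the root in this sketch.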

This particular scenario might seem trivial, but in a LINQ-heavy production application it is very easy to lose track of the methods that create closures. Awareness of how local variables are promoted can help prevent memory leaks and improve application performance.
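
One possible way to act on that awareness, sketched here under the assumption that the lambda only needs a value derived from the large object, is to copy that value into its own local and capture the copy. The variable name length is illustrative.

```csharp
using System;
using System.Linq;

class Foo
{
    public Func<int> BarFunc()
    {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();

        // Copy out the one value the lambda needs.
        int length = arr.Length;

        // Only the int is captured, so the array stays an ordinary local
        // and becomes unreachable once BarFunc returns.
        return () => length;
    }
}

class Program
{
    static void Main()
    {
        Func<int> f = new Foo().BarFunc();
        Console.WriteLine(f()); // 1000000
    }
}
```

One caveat: lambdas declared in the same scope generally share a single closure object, so if any other lambda in that scope captured arr, the array would still be retained.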