Aug 11 2015
 

If there is one place where developers feel least guilty about allocating large amounts of memory, it’s in the local variables of a method. After all, local variables are short lived: once the method finishes executing, the call stack is unwound and the return value is popped, freeing all of them for garbage collection. This assumption holds true for most methods, but not for all.

Let’s consider a simple method first. It allocates a large one-million-element integer array and returns the length of the array.

    class Program
    {
        static void Main(string[] args)
        {
            var a = new Foo();
            Console.WriteLine(a.Bar());
        }
    }
    class Foo
    {
        public int Bar()
        {
            var arr = Enumerable.Repeat(1, 1000000).ToArray();
            return arr.Length;
        }
    }

Let’s compile the program and open it in WinDbg. The commands for doing that are

  • .symfix (fixes the symbol path)
  • sxe ld:clrjit.dll (break when clrjit.dll is loaded)
  • g (continue execution until clrjit.dll is loaded)
  • .loadby sos clr (load the SOS managed debugging extension)
  • !bpmd (break when Program.Main is executed)

Looking at the IL code of the Foo.Bar method, it’s pretty straightforward. A 1 million element array is created and then the ldloc.0 instruction loads the local variable on the stack. After the method returns, the local variable pointer no longer exists and the garbage collector is free to reclaim the memory for other objects.

Foo.Bar

This works quite well, but imagine a scenario where you might need access to the local variable even after the method execution is over. One such scenario is when the method returns a Func delegate instead of an integer.

class Foo
{
    public Func<int> BarFunc()
    {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();
        return () => arr.Length;
    }
}

Though this method will always return the same result as the previous method, the CLR cannot make that assumption and mark the integer array for collection. Because there is no guarantee that the returned delegate will be executed immediately, or even just once, the CLR has to maintain a reference to the local variable even after the method execution is completed. The compiler resolves this dilemma by promoting the local variable onto the heap as a field of an autogenerated type. Let’s see the IL generated when this new method is called.

Foo.BarFunc

 

This IL is considerably different. The newobj instruction creates an instance of a new type, c__DisplayClass1, which we never wrote. That is the type the compiler autogenerated to store the local variable. Since the new object lives on the heap, its lifetime is guaranteed for as long as the returned delegate’s reference is held by the calling method. We can verify this by examining the managed heap.
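Decompiled back into ordinary C#, the generated code looks roughly like the sketch below. The real class name is compiler-mangled (something like c__DisplayClass1 above); the names DisplayClass and BarLambda here are illustrative, not what the compiler actually emits:

```csharp
using System;
using System.Linq;

class Foo
{
    // A sketch of the compiler-generated closure class; the actual
    // name is mangled and not expressible in valid C#.
    private sealed class DisplayClass
    {
        public int[] arr;                      // the former local, now a heap field
        public int BarLambda() => arr.Length;  // the lambda body, now a method
    }

    public Func<int> BarFunc()
    {
        var closure = new DisplayClass();
        closure.arr = Enumerable.Repeat(1, 1000000).ToArray();
        return new Func<int>(closure.BarLambda);
    }
}
```

The delegate holds a reference to the DisplayClass instance, which in turn holds the array — exactly the chain we see on the managed heap.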

NewAutogeneratedType

 

…and the object of the autogenerated type shows our local variable now as a field.

DumpObject

 

If we modify our main method a bit and store the resulting delegate in a class-level field, we can see that the GC maintains an explicit root to the object. In essence, the object lives until the application execution is completed. This is unnecessary memory usage by the application.

class Program
{
    private static Func<int> classLevelVariable;

    static void Main(string[] args)
    {
        var a = new Foo();
        classLevelVariable = a.BarFunc();
        Console.ReadLine();
        classLevelVariable();
    }
}

Finding the GC roots for the object, we see that the garbage collector can never collect it.

gcroots

This particular scenario might seem trivial, but in a LINQ-heavy production application it is very easy to lose track of the methods that are creating closures. Awareness of how local variables are promoted to the heap can help prevent memory leaks and improve application performance.
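One common mitigation is to capture only the value the delegate actually needs rather than the whole object; a minimal sketch:

```csharp
using System;
using System.Linq;

class Foo
{
    public Func<int> BarFunc()
    {
        var arr = Enumerable.Repeat(1, 1000000).ToArray();
        int length = arr.Length;  // copy out the scalar we need

        // Only 'length' is hoisted into the closure; the array itself
        // is not referenced by the delegate and remains collectible.
        return () => length;
    }
}
```

The delegate still returns the same result, but the million-element array is eligible for collection as soon as BarFunc returns.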

 

Aug 07 2015
 

Microsoft recently open sourced the CLR and the framework libraries and published them on Github. Though a non-production version had been open source for a long time under the name Rotor (SSCLI), this time there were no half measures. It gives the community the opportunity to raise issues and also fix them by creating pull requests.

The journey from source to executable code has two phases – first the compiler compiles the source code into the Intermediate Language (MSIL) and then the execution engine (CLR) converts the IL to machine specific assembly instructions. This allows .NET code to be executable across platforms and also be language agnostic as the runtime only understands MSIL.

When the program is executed, the CLR reads the type information from the assembly and creates in-memory structures to represent them. The main structures that represent a type at runtime are the MethodTable and the EEClass. The MethodTable contains “hot” data which is frequently accessed by the runtime to resolve method calls and for garbage collection. The EEClass on the other hand is a cold structure which has detailed structural information about the type including its fields and methods. This is used in Reflection. The main reason for splitting these structures is to optimize performance and keep the frequently accessed fields in as small a data structure as possible. Every non-generic type has its own copy of the MethodTable and the EEClass, and the pointer to the MethodTable is stored in the first memory address location of each object. We can observe this by loading the SOS managed debugging extension in WinDbg.
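We can also observe the MethodTable pointer from managed code: RuntimeTypeHandle.Value exposes the same address that SOS reports. A small sketch, using the Program type itself as the example:

```csharp
using System;

class Program
{
    static void Main()
    {
        // RuntimeTypeHandle wraps the address of the type's MethodTable;
        // every instance of a non-generic type shares the same one.
        IntPtr fromType = typeof(Program).TypeHandle.Value;
        IntPtr fromObject = Type.GetTypeHandle(new Program()).Value;

        Console.WriteLine(fromType == fromObject);  // True: same MethodTable
    }
}
```

The address printed by this handle matches the MethodTable address that !DumpHeap reports for objects of the type.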

methodtablelayout

 

The DumpHeap command gives us the information of our type along with the addresses of all the objects of the type. Using the WinDbg command dq to read the memory at an object’s address, we see that the first word points to its MethodTable. There is another structure, called the SyncBlock, which exists at a negative offset from the object pointer. This structure handles the thread synchronization information for the object.

This diagram from the SSCLI Essentials Book explains the relationship between various data structures very clearly.

objectlayout

As you can see, the object header points to the MethodTable, which in turn points to the EEClass. Since the EEClass is not frequently used at runtime, this extra level of indirection doesn’t hurt performance. The MethodTable itself is followed by a call table – a table which contains the addresses of the virtual and non-virtual methods to be executed for the type. Since the dispatch table is laid out at a fixed offset from the MethodTable, there is no pointer indirection to access the right method to call. One more thing to note about the CLR is that everything is loaded only when it needs to be executed. This holds true for both types and methods. When the CLR executes a method which creates another type, it creates the memory structures for the new type. However, even then the methods themselves are not compiled until the last moment, when they are about to be executed.

In the above diagram, you can see the MethodTable vtable pointing to a thunk, which is called a prestub in .NET. When the method is first called, the prestub calls the JIT compiler. The JIT compiler is responsible for reading the MSIL opcodes and generating the processor-specific assembly code. Once the JIT compilation is done, the address of the compiled code is backpatched onto the call table. Subsequent calls to the method execute directly without having to go through the compilation phase.
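This lazy compilation can be triggered eagerly from managed code with RuntimeHelpers.PrepareMethod, which JIT-compiles a method ahead of its first call. A hedged sketch (the Calculator type here mirrors the sample class used below):

```csharp
using System;
using System.Runtime.CompilerServices;

class Calculator
{
    public int AddTwoNumbers(int a, int b) => a + b;
}

class Program
{
    static void Main()
    {
        // Force the JIT to compile AddTwoNumbers now, instead of lazily on
        // first call; the prestub is backpatched with the code address.
        var method = typeof(Calculator).GetMethod("AddTwoNumbers");
        RuntimeHelpers.PrepareMethod(method.MethodHandle);

        // This call executes the already-compiled machine code.
        Console.WriteLine(new Calculator().AddTwoNumbers(2, 3));  // prints 5
    }
}
```

Stepping over the PrepareMethod call in WinDbg, !DumpMD would already show a code address for the method before it has ever been invoked.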

Let’s load the MethodTable for our Calculator type using the command !DumpMT with the -MD switch, which also lists the MethodDescriptors.

methodtable

At this stage in the application execution, the object for the Calculator class has been created, but the AddTwoNumbers method hasn’t been executed yet. So the MethodDesc table shows that only the constructor method has been jitted, not the AddTwoNumbers method. Let’s view the MethodDescriptors for both methods using the command !DumpMD.

MethodDescriptors

 

The Constructor method now contains a code address, but the AddTwoNumbers doesn’t have code yet. Let’s step forward and see what happens after the method is jitted. Now the code address is replaced by an actual memory address which contains our machine specific assembly code. The next time this method is called, this assembly code will be directly executed.

afterjitting

To view the assembly, use the !u command followed by the code address. As in most compiled languages, two registers, ebp and esp, keep track of each stack frame. During a method call a new stack frame is created, and ebp maintains a pointer to the base of the stack. As code executes, the esp register tracks how the stack grows, and once execution completes, the stack is cleared and the ebp value is popped.

assemblyccode

 

Now let’s look at this at the code level. Detailed building and debugging instructions are given at the coreclr repo. The MethodTableBuilder class contains the method which loads the types. You could put a breakpoint here, but it will keep breaking while system types are loading. To avoid this, put a breakpoint in the RunMain method in assembly.cpp, and once it breaks, put the breakpoint in the CreateTypeHandle method. This will start breaking on your custom type creation.

createtypehandle

Below is the simple Calculator class code that we are calling. I just used the name of the executable as a command argument to run CoreRun.exe in the coreclr solution (detailed instructions are given in the GitHub repo).
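For reference, a minimal reconstruction of that code might look like this (only the members visible in the MethodDesc dumps – the default constructor and AddTwoNumbers – are assumed):

```csharp
using System;

class Calculator
{
    // The only non-inherited method visible in the MethodDesc dump.
    public int AddTwoNumbers(int a, int b)
    {
        return a + b;
    }
}

class Program
{
    static void Main(string[] args)
    {
        var calc = new Calculator();                  // type handle created here
        Console.WriteLine(calc.AddTwoNumbers(2, 3));  // method JIT-compiled here
    }
}
```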

DebugCode

 

Now for the fun part – we start debugging the solution. The first step (after loading allocators) is to make sure all parent types are loaded. Since our type doesn’t inherit from any class, its parent is System.Object. Once the parent type is found (it can’t be an interface, only a concrete type), its MethodTable is returned to the MethodTableBuilder.

loadparenttype

 

Then there are some additional checks to handle cases like enums, generic methods, explicit layouts, etc. I’ll skip over them for brevity. At this point we have started building the MethodTable but not the EEClass. That is done in the next step.

eeclass

 

At this stage, the CLR checks if the type implements any interfaces. Interface calls are a bit more complex: since there needs to be a relationship from the interface vtable to the implementing type, the calls are mapped using a slot map maintained on the implementing type’s MethodTable, which maps it to the vtable slot on the interface. Since our Calculator class doesn’t implement any interfaces, it skips this block entirely.

interfaces

Now we come to the final and most crucial method, which will return the TypeHandle. If this method succeeds, our type has been successfully loaded into memory.

bmtfinalmethod

The first thing the BuildMethodTableThrowing method does is walk up the inheritance hierarchy and load the parent type. This holds for all types except interfaces: an interface’s vtable does not contain System.Object’s method calls, so the method builder simply sets the parent type to null if the type being loaded is an interface.

interfaceInBuilder

After this, the method makes sure the type in question is not a value type, enum, remoting type, or a type called via COM interop. All of these are loaded differently from simple reference types deriving directly from System.Object. Then the MethodImpl attributes are checked, since they impact how a type is loaded. Our Calculator class just skips over these checks. The next method is EnumerateClassMethods, which iterates through all the methods and adds them to the MethodTable.

Now that the implemented methods are added to the MethodTable, we also need to add the parent type’s methods to the current vtable. This is done by the methods ImportParentMethods, AllocateWorkingSlotTables and CopyParentVtable in the MethodTableBuilder class. Here virtual methods have to be handled differently, since they can be overridden by the current type. For non-virtual methods, a direct entry pointing to the parent type’s implementation suffices.

First the maximum possible vtable size is computed, and a temporary slot table of that size is allocated.

maxVTableSize

Then the parent vTable methods are loaded to the Calculator type.

CopyParentVtable

After the parent methods are added, the current type’s methods are added. We have just two methods – the constructor and the AddTwoNumbers method. First the virtual methods are added and then the non-virtual ones. Since we didn’t define a custom constructor, the type just inherits the default constructor and adds it to the vtable. Once all virtual methods are added, the remaining methods get the non-vtable slots.

constructorCurrentType

Now that the type’s methods have been completely loaded, the MethodDescriptors are created. However, the methods’ code hasn’t been called even once, so each entry simply points to a stub waiting to be JIT-compiled on execution. After this stage, the remaining fields are placed in the MethodTable and some additional integrity checks are done. Finally, the type is loaded and ready to be published.

finalMethodTable

 

Jan 24 2012
 

Reusability is one of the most underrated aspects of software development. The focus on meeting functional requirements as quickly as possible often overshadows reusability and maintainability. Code duplication and poor adherence to design are common symptoms of such development, trading long-term costs for short-term benefits.

WPF encourages the creation of reusable controls, primarily by its support for “lookless controls”, where the client of the control is not only able to reuse the control, but also completely alter its appearance. The control in essence is just a bundle of logic with a default template. It’s up to the client whether to accept this default skin or override it with their own. Though this concept looks similar to themes and skins in other technologies, it’s extremely powerful in that you can alter the visual appearance of the control at a granular level. If the default template defines a button and a textbox, that can easily be changed to a filled rectangle with a linear gradient and a TextBlock to display the data.

Difference between User Controls and Custom Controls

Custom controls aren’t the only type of control library that WPF offers; there is also the user control library. The difference between the two lies in the end purpose. If we are simply looking to bundle a few controls together and provide basic customization to the end user, then user controls are the way to go. They are also much easier to build than custom controls. However, if your aim is to provide full customization capability to the developer who consumes your control, then custom controls are the better choice.

Dependency Properties

Dependency properties are quite different from the conventional properties we use in C#, and are exclusive to WPF. They do not belong to any particular class instance, and their value can be set from sources other than the class itself: defaults, styles, themes, callbacks, etc. They also support data binding, so UI elements can bind directly to them and update whenever the value changes, much like properties on classes implementing INotifyPropertyChanged.

Custom Control Example

The custom control example I built is a filled rectangle with a slider. As you increase the slider, the fill percentage of the rectangle increases as well. Here is an illustration:

The control is quite simple. There are two main properties exposed to the outside world – FillColor and EmptyColor, denoting the colors of the rectangle. The third property is the Value of the slider, which is used within the control, but that too could be exposed. Let’s see the code for the control (notice the absence of any UI code).

public class FilledBarControl : Control
    {
        public FilledBarControl()
        {
            DataContext = this;
        }
        
        static FilledBarControl()
        {
            DefaultStyleKeyProperty.OverrideMetadata(typeof(FilledBarControl), new FrameworkPropertyMetadata(typeof(FilledBarControl)));
          
        }

        public static readonly DependencyProperty EmptyColorProperty = DependencyProperty.Register("EmptyColor", typeof(Color), typeof(FilledBarControl), new UIPropertyMetadata((Color)Colors.Transparent));

        public Color EmptyColor
        {
            // IMPORTANT: To maintain parity between setting a property in XAML and procedural code, do not touch the getter and setter inside this dependency property!
            get
            {
                return (Color)GetValue(EmptyColorProperty);
            }
            set
            {
                SetValue(EmptyColorProperty, value);
            }
        }
        

        public static readonly DependencyProperty FillColorProperty = DependencyProperty.Register("FillColor", typeof(Color), typeof(FilledBarControl), new UIPropertyMetadata((Color)Colors.Red));

        public Color FillColor
        {
            // IMPORTANT: To maintain parity between setting a property in XAML and procedural code, do not touch the getter and setter inside this dependency property!
            get
            {
                return (Color)GetValue(FillColorProperty);
            }
            set
            {
                SetValue(FillColorProperty, value);
            }
        }
    }

As you can see, there are two steps to declaring a dependency property – first, registering the property using the DependencyProperty.Register method, and second, defining the getter and setter. The naming convention for the DependencyProperty field is to append “Property” to the name of the dependency property; hence the field becomes FillColorProperty. You can also define the default value for the property in the Register call – it is passed to the constructor of the PropertyMetadata object. I used red here, so if the developer doesn’t pass any value for FillColor, the control automatically uses red.

As mentioned before, a custom control is defined purely in code; it doesn’t need a UI to exist – the UI can be supplied by the developer using the control. The default look and feel for the control is defined in a separate file, Generic.xaml, inside the Themes folder. Here is the XAML for the FilledBar control.

<Style TargetType="{x:Type local:FilledBarControl}">
        <Setter Property="Template">
            <Setter.Value>
                <ControlTemplate TargetType="{x:Type local:FilledBarControl}">
                    <UniformGrid Columns="1">
                        <Border BorderThickness="{TemplateBinding BorderThickness}" BorderBrush="{TemplateBinding BorderBrush}" >                           
                            <Rectangle Height="{TemplateBinding Height}"
                            Width="{TemplateBinding Width}">
                            <Rectangle.Fill>
                                <LinearGradientBrush StartPoint="0,0" EndPoint="1,0">
                                    <GradientStop Color="{Binding Path=FillColor}"
                        Offset="0"/>
                                    <GradientStop Color="{Binding Path=FillColor}"
                        Offset="{Binding ElementName=slider, Path=Value}"/>
                                    <GradientStop Color="{Binding Path=EmptyColor}" 
                        Offset="{Binding ElementName=slider, Path=Value}"/>
                                </LinearGradientBrush>
                            </Rectangle.Fill>
                        </Rectangle>
                        </Border>

                        <Slider x:Name="slider" Width="200" Height="50" 
            Minimum="0" Maximum="1" Value="0.2"/>
                    </UniformGrid>
                </ControlTemplate>
            </Setter.Value>
        </Setter>
    </Style>

The XAML is just a control template containing a UniformGrid. One of the rows holds the Rectangle, and the other a Slider for the value. The rectangle’s Fill is a LinearGradientBrush with two phases: one from the start point (0,0) to the value (bound to the slider’s value), and another from the value to the end point. This gradient gives the impression of a filled rectangle whose fill percentage changes as the slider is dragged.

Now, how can a host application change the appearance of the control? There are two ways – one using the dependency properties, the other by completely overriding the control template itself. As you may recall, there were two dependency properties defined in FilledBar.cs – FillColor and EmptyColor. Both properties appear in IntelliSense while declaring the control in XAML. An example of such customization:

         <FilledBar:FilledBarControl HorizontalAlignment="Center" 
                                    VerticalAlignment="Center"  
                                    Height="100" Width="200" 
                                    BorderThickness="2" BorderBrush="#003300" 
                                    FillColor="Maroon" EmptyColor="LightGreen" />

This is how the control looks now. Note that both colors have changed as per our definition.

The second form of customization is what makes custom controls so much more powerful than ordinary user controls. Let’s assume that a rectangular fill bar doesn’t suit my requirement and my application would look better with a filled circle rather than a filled bar. Rather than rewriting the entire control for one change, I can just swap out the Rectangle and substitute an Ellipse whose fill is done by a RadialGradientBrush. These changes require no changes to the original control itself – a style can be created in the resource dictionary and referenced from the markup.

            <Style TargetType="{x:Type FilledBar:FilledBarControl}" x:Key="FilledCircle">
                <Setter Property="Template">
                    <Setter.Value>
                        <ControlTemplate TargetType="{x:Type FilledBar:FilledBarControl}">
                            <UniformGrid Columns="1" Width="{TemplateBinding Width}" Height="{TemplateBinding Height}" >

                                <Ellipse Height="100" Width="100" Stroke="{TemplateBinding BorderBrush}" StrokeThickness="{TemplateBinding BorderThickness}"
                            >
                                        <Ellipse.Fill>
                                            <RadialGradientBrush >
                                                <GradientStop Color="{Binding Path=FillColor}"
                        Offset="0"/>
                                                <GradientStop Color="{Binding Path=FillColor}"
                        Offset="{Binding ElementName=slider, Path=Value}"/>
                                                <GradientStop Color="{Binding Path=EmptyColor}" 
                        Offset="{Binding ElementName=slider, Path=Value}"/>
                                            </RadialGradientBrush>
                                        </Ellipse.Fill>
                                    </Ellipse>
                              

                            <Slider x:Name="slider" Width="200" Height="50" 
            Minimum="0" Maximum="1" Value="0.2"/>
                        </UniformGrid>
                        </ControlTemplate>

                    </Setter.Value>
                </Setter>
                
            </Style>

The only changes in the above style are swapping the Rectangle for an Ellipse and the LinearGradientBrush for a RadialGradientBrush. Note that instead of using the Border, we bind the BorderBrush and BorderThickness properties to the Stroke and StrokeThickness properties of the Ellipse. This allows setting the properties just as we did with the unchanged control and is much easier to read. The control is declared in XAML with a Style attribute referring to our resource dictionary, which overrides the default style in Generic.xaml.

        <FilledBar:FilledBarControl HorizontalAlignment="Center" VerticalAlignment="Center" 
                                    Grid.Row="3" Grid.Column="1"  
                                    Style="{StaticResource ResourceKey=FilledCircle}"  
                                    Height="200" Width="200" 
                                    BorderThickness="2" BorderBrush="Brown" 
                                    FillColor="Blue" EmptyColor="Transparent" />

Here is how the control looks now.

We can have multiple instances of the same control using different styles and dependency property values. An example of the host application:

Dec 27 2010
 

Since the last post, I changed the library/API wrapper a bit. I removed all the ugly reflection stuff used to retrieve the specific API URLs and substituted static variables in a separate class. This has the disadvantage that the URLs are exposed to the client, but at least it won’t break any client code if Socialcast decides to change the API in the future. Also, in the previous example, the username, password and subdomain were variables in the wrapper itself. In the absence of OAuth, every call needs to be authenticated with the user credentials. To avoid having to store user information myself, I created a class to encapsulate it (SocialCastAuthDetails), which is passed to the API accessor for every call. I also added data objects to return strongly typed responses from the API accessor instead of an XmlDocument, but I haven’t gotten around to incorporating them yet.
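For context, SocialCastAuthDetails is just a data holder. Based on the client code further down, a minimal sketch would be (the real class may carry more members):

```csharp
// A minimal sketch of the auth data holder; property names are taken
// from the client code below.
public class SocialCastAuthDetails
{
    public string DomainName { get; set; }  // e.g. "demo" for demo.socialcast.com
    public string Username { get; set; }
    public string Password { get; set; }
}
```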

Here is the code to post a message and get the company stream. Accessing the company stream requires two calls – first to get the stream ID, and then to get the messages for that stream.

        public XmlDocument GetCompanyStream(SocialCastAuthDetails auth)
        {
            XmlDocument streams = new XmlDocument();
            if (companyStreamID == 0)
            {
                streams.LoadXml(base.MakeServiceCalls(helper.GetSocialcastURL(ObjectType.Streams,auth.DomainName,null), GetCredentials(auth.Username,auth.Password)));

                foreach (XmlNode node in streams.GetElementsByTagName("stream"))
                {
                    if (node.SelectSingleNode("name").InnerText.ToLower() == "company stream")
                    {
                        companyStreamID = int.Parse(node.SelectSingleNode("id").InnerText);
                        break;
                    }
                }
            }
            streams = new XmlDocument();
            streams.LoadXml(base.MakeServiceCalls(
                                 helper.GetSocialcastURL(ObjectType.StreamMessages,auth.DomainName,companyStreamID.ToString()),
                                 GetCredentials(auth.Username,auth.Password)));
            return streams;
        }

        public XmlDocument PostMessage(string title,string body,SocialCastAuthDetails auth)
        {
            string data = String.Format("message[title]={0}&message[body]={1}", HttpUtility.UrlEncode(title), HttpUtility.UrlEncode(body));
            XmlDocument update = new XmlDocument();
            update.LoadXml(base.MakeServiceCallsPOST(
                                helper.GetSocialcastURL(ObjectType.Messages, auth.DomainName, null),
                                GetCredentials(auth.Username, auth.Password), data));
            return update;
        }

Since any call which manipulates data requires a POST instead of a GET, the WebServiceHelper class needs a new method to make the service call using POST. The data to be posted is URL-encoded before being sent to this method.

        protected string MakeServiceCallsPOST(string _requestURL, NetworkCredential credentials, string data)
        {
            // Create the web request
            HttpWebRequest request = WebRequest.Create(_requestURL) as HttpWebRequest;

            request.Credentials = credentials;
            request.ContentType = "application/x-www-form-urlencoded";
            request.Method = "POST";

            // Write the URL-encoded payload to the request stream
            byte[] bytes = Encoding.UTF8.GetBytes(data);
            request.ContentLength = bytes.Length;

            using (Stream requestStream = request.GetRequestStream())
            {
                requestStream.Write(bytes, 0, bytes.Length);

                using (WebResponse response = request.GetResponse())
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    return reader.ReadToEnd();
                }
            }
        }

This is the client code to post the message. The SocialCastAuthDetails object is initialized by the client and passed in, so it’s their responsibility to store passwords and other sensitive information.

    class Program
    {
        static SocialCastAuthDetails auth = new SocialCastAuthDetails()
        {
            DomainName = "demo",
            Username = "emily@socialcast.com",
            Password = "demo"
        };
        static void Main(string[] args)
        {
            int _messageCounter=1;
            APIAccessor api = new APIAccessor();
            api.PostMessage("Posting from API", "this is a test message posted through C#", auth);
            var xdoc = api.GetCompanyStream(auth);
            Console.WriteLine("Company Steam of demo.socialcast.com");
            Console.WriteLine("******************************************************");
            foreach(XmlNode node in xdoc.GetElementsByTagName("message"))
            {
                Console.WriteLine("Message {0} posted by {1}", _messageCounter++, node.SelectSingleNode("user/name").InnerText);
                Console.WriteLine("Message: {0} {1}", node.SelectSingleNode("title").InnerText, node.SelectSingleNode("body").InnerText);
                Console.WriteLine("====================================================");
            }
        }
    }

It works!!

Oct 20 2010
 

For the last few months, I have been almost exclusively using Ubuntu as my primary operating system. Because of this, my .NET development suffered quite a bit and was limited to just work. To find a way around this, I installed Mono and played around with it.

Though a lot of people believe that .NET is completely tied to the Windows API, that is not exactly true. If we look at the source code of the .NET runtime (released by Microsoft under the name Rotor/SSCLI), there is a separate layer called the Platform Adaptation Layer (PAL) which acts as an abstraction between the OS and the runtime. This layer allows people to write their own implementations of the CLR that translate CIL into native code.

The Mono project is one such effort, which has ported .NET to both the Mac and Linux environments. Even though Mono works on Linux, I wouldn’t really recommend it for a production application, because you might end up with bugs and small irritants that you can ill afford in a commercial application. So if you want to write once, run anywhere, Java is still your best bet.

To install Mono, the repository first needs to be added to the package manager. The Mono packages are available here, and a guide to adding them is given here. After the source is added, click Reload to update the list. Then use the familiar apt-get command to install Mono. If that doesn’t work, enable the multiverse repositories or use the Synaptic GUI package manager.

Now, mono is the application which executes a .NET exe file. It cannot, however, compile the code. For that there is another application called mcs, the Mono C# compiler. To install it, use the following command.

Now for the obligatory hello world program. Just write a simple program and save it as a text file.
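Any minimal C# program will do. A sketch, saved here as hello.cs (the filename is my choice, not something mcs mandates):

```csharp
using System;

class Hello
{
    static void Main()
    {
        // The same source compiles unchanged under Mono and .NET
        Console.WriteLine("Hello World");
    }
}
```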

Now let's compile it using mcs. The -v switch gives more detailed information in case a parsing error occurs.

If you observe the folder, you will see an exe file created with the same name as the source code file. Now, Linux has no idea how to run an exe without a compatibility layer (cough: emulator :cough) like WINE. So we can think of mono as a very specific emulator, only for .NET applications. Even inside Windows, .NET assemblies are loaded differently than other Windows DLLs. The PE header contains an additional entry about the CLR version, and execution is transferred to the runtime rather than being handled by the operating system. I think a similar process happens when we run the exe using mono. Here is the command.
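Assuming the source file is called hello.cs (a hypothetical name), compiling and running look like this:

```shell
mcs -v hello.cs   # emits hello.exe next to the source file
mono hello.exe    # hands the assembly to the Mono runtime
```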

The output of the program is displayed in the console.

May 082010
 

When I looked at the first post in this series, I realized I had jumped the gun a bit by going straight to generics and didn't do enough justice to the fundamentals. So in this post, I have made an effort to go back to the basics.

The CLR (Common Language Runtime) is the heart of .NET. It’s the virtual execution system responsible for converting the platform neutral CIL (Common Intermediate Language) into platform specific and optimized code. The CLR provides services like memory management, garbage collection, exception handling and type verification. Thus it allows language designers to concentrate solely on outputting good CIL and provides a uniform API to allow language interoperability.

The only truth in .NET is the assembly code which is the final product. All the rest are virtual constraints enforced by the execution system in a very robust manner. For example, a memory address declared as int cannot take a string value, not because the memory is unable to hold the value, but because the CLR makes sure the values conform to the types declared – a feature called type verification. Usually this happens at compilation, but it is still rechecked at runtime.
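A tiny illustration (my own sketch, not from the original series): the commented-out line below would never compile, while the cast compiles but is rejected by the CLR's runtime check:

```csharp
using System;

class TypeSafetyDemo
{
    static void Main()
    {
        // int i = "hello";        // rejected at compile time — static type checking
        object boxed = "hello";    // the static type is object, the runtime type is string
        try
        {
            int i = (int)boxed;    // compiles, but the CLR verifies the runtime type
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("CLR rejected the cast at runtime");
        }
    }
}
```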

Microsoft released the code of the CLI (Common Language Infrastructure) under the name SSCLI (Shared Source CLI). It can be downloaded here. Joel Pobar and others wrote a great book about it. Unfortunately the 2.0 version is still a draft.

Type safety is the most important aspect of .NET programming and a lot of thought went into it. A type can be thought of as a contract which objects need to conform to. For example, in the following code the Person class is a type – it is supposed to have five public fields and one method. Any object that claims to be of the Person type must necessarily fulfill this contract – or the CLR will reject it at runtime. When working with the more mature compilers like those for C# and VC++, these checks are already done while converting the code to CIL.

class Person
{
    public string _name;
    public int _ssn;
    public char _middleName;
    public decimal _phoneNumber;
    public char _bloodGroup;
 
    public Person(string name, int ssn, char middleName, decimal phoneNumber, char bloodGroup)
    {
        this._name = name;
        this._ssn = ssn;
        this._middleName = middleName;
        this._phoneNumber = phoneNumber;
        this._bloodGroup = bloodGroup;
    }
 
    public string GetSomeDetails()
    {
        return String.Empty;
    }
}
 
static void Main(string[] args)
{
    Person _p = new Person("John Doe", 4454353, 'B', 324242432, 'O');
    _p.GetSomeDetails();
}

The code for the object class can be found at sscli20\clr\src\vm\object.h in the SSCLI code and the TypeHandle class at \sscli\clr\src\vm\typehandle.h. The Type class, which is extensively used in Reflection for reading type metadata, is a wrapper for this TypeHandle class. Let's look at the underlying code for some familiar methods of TypeHandle, some of which you also see in the Type class. Every object that declares itself of a type indirectly points to this data structure to define itself.

    BOOL IsEnum() const;
    BOOL IsFnPtrType() const;
    inline MethodTable* AsMethodTable() const;
    inline TypeDesc* AsTypeDesc() const;
    BOOL IsValueType() const;
    BOOL IsInterface() const;
    BOOL IsAbstract() const;
    WORD GetNumVirtuals() const;

The MethodTable that you see is the data structure which contains the frequently used fields needed to call the methods. Along with another structure called EEClass, it defines the type identity of an object in .NET. The difference is that the MethodTable contains data that is frequently accessed by the runtime, while the EEClass is a larger store of type metadata. This metadata supports querying type information and dynamically invoking methods through the Reflection API. Using the SOS dll's DumpHeap command, we can get the addresses of all objects of a type and use them to see the EEClass and MethodTable. Let's examine the Person type in the above example.

.load SOS
extension C:\Windows\Microsoft.NET\Framework\v2.0.50727\SOS.dll loaded
 
!DumpHeap -type Person
PDB symbol for mscorwks.dll not loaded
 Address       MT     Size
020a34d4 001530f0       36
total 1 objects
Statistics:
      MT    Count    TotalSize Class Name
001530f0        1           36 DebugApp.Person
Total 1 objects
 
//Getting the address and using the DumpObj command
 
!DumpObj 020a34d4
Name: DebugApp.Person
MethodTable: 001530f0
EEClass: 001513d0
Size: 36(0x24) bytes
 (D:\Ganesh Ranganathan\Documents\Visual Studio 2005\Projects\DebugApp\DebugApp\bin\Debug\DebugApp.exe)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
70d00b68  4000001        4        System.String  0 instance 020a34b0 _name
70d02db4  4000002        8         System.Int32  1 instance  4454353 _ssn
70d01848  4000003        c          System.Char  1 instance       42 _middleName
70cd7f00  4000004       10       System.Decimal  1 instance 020a34e4 _phoneNumber
70d01848  4000005        e          System.Char  1 instance       4f _bloodGroup

Let's dissect this output. First the DumpObj command lists both the MethodTable and the EEClass address and then proceeds to list the fields. See how the value column lists the direct value for the int and char fields, while an address is listed for the reference type string. However, the bigger decimal type, which is actually a struct and hence a value type, also displays an address. Though SOS displays an address, we can observe that it is actually an offset from the object header, which means the decimal is still stored by value and not by reference. Looking at the memory window for the string and decimal fields' addresses gives their original values.


Viewing the object in the memory window shows a pattern of how the runtime stores the values in memory. The object starts with a reference to the MethodTable, then the fields are lined up. It can be observed that there is a difference between how the runtime stores the fields and how we declared them. For example, the two character fields are pushed together in spite of not being declared sequentially. This is done to save memory, and the runtime can manage this situation because all it stores is the memory offset of each field from the header. To avoid this behavior, types can be decorated with the [StructLayout(LayoutKind.Sequential)] attribute, often used while marshalling data out of managed code, because unmanaged code can't deal with such vagaries. You should also pin your objects, especially references, while passing them to unmanaged code, because the runtime keeps moving memory blocks around.
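As a side note, the unmanaged field layout can be inspected from managed code itself. This is my own sketch using Marshal.OffsetOf on a sequentially laid out struct (the PersonRecord type is hypothetical, not the Person class from the post):

```csharp
using System;
using System.Runtime.InteropServices;

// LayoutKind.Sequential forces the runtime to keep the declared field order,
// which is what marshalling to unmanaged code relies on
[StructLayout(LayoutKind.Sequential)]
struct PersonRecord
{
    public int Ssn;
    public char MiddleName;
    public char BloodGroup;
}

class LayoutDemo
{
    static void Main()
    {
        // Prints each field's byte offset within the unmanaged layout
        Console.WriteLine(Marshal.OffsetOf(typeof(PersonRecord), "Ssn"));
        Console.WriteLine(Marshal.OffsetOf(typeof(PersonRecord), "MiddleName"));
        Console.WriteLine(Marshal.OffsetOf(typeof(PersonRecord), "BloodGroup"));
    }
}
```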

Now let's look at the MethodTable through SOS. As you can see, every type also inherits the methods of its parent, in this case System.Object. The MethodTable also contains a pointer to the EEClass. When it is laid out during type creation, each method entry points to a temporary piece of code called a thunk. The thunk in turn calls the JIT compiler and asks it to compile the method. This lazy compilation works wonders for performance and the memory footprint. Once the method is compiled, the JIT updates the entry to point to the compiled code instead of the thunk.

!DumpMT -MD 001230f0
EEClass: 001213d0
Module: 00122c5c
Name: DebugApp.Person
mdToken: 02000005  (D:\Ganesh Ranganathan\Documents\Visual Studio 2005\Projects\DebugApp\DebugApp\bin\Debug\DebugApp.exe)
BaseSize: 0x24
ComponentSize: 0x0
Number of IFaces in IFaceMap: 0
Slots in VTable: 6
--------------------------------------
MethodDesc Table
   Entry MethodDesc      JIT Name
70c56aa0   70ad4a34   PreJIT System.Object.ToString()
70c56ac0   70ad4a3c   PreJIT System.Object.Equals(System.Object)
70c56b30   70ad4a6c   PreJIT System.Object.GetHashCode()
70cc7550   70ad4a90   PreJIT System.Object.Finalize()
0012c030   001230c4      JIT DebugApp.Person..ctor(System.String, Int32, Char, System.Decimal, Char)
0012c038   001230d4     NONE DebugApp.Person.GetSomeDetails()

You can see the JIT column says NONE for the GetSomeDetails method, and that's because it hasn't been called yet. After it's called for the first time, the method is JIT compiled and the MethodDesc shows the code address where the compiled code can be found. Note, however, that the MethodDesc is not the usual route for the runtime to execute methods; that is done directly. Only when the method is invoked by its name is the MethodDesc required.

!DumpMT -MD 001230f0
EEClass: 001213d0
Module: 00122c5c
Name: DebugApp.Person
mdToken: 02000005  (D:\Ganesh Ranganathan\Documents\Visual Studio 2005\Projects\DebugApp\DebugApp\bin\Debug\DebugApp.exe)
BaseSize: 0x24
ComponentSize: 0x0
Number of IFaces in IFaceMap: 0
Slots in VTable: 6
--------------------------------------
MethodDesc Table
   Entry MethodDesc      JIT Name
70c56aa0   70ad4a34   PreJIT System.Object.ToString()
70c56ac0   70ad4a3c   PreJIT System.Object.Equals(System.Object)
70c56b30   70ad4a6c   PreJIT System.Object.GetHashCode()
70cc7550   70ad4a90   PreJIT System.Object.Finalize()
0012c030   001230c4      JIT DebugApp.Person..ctor(System.String, Int32, Char, System.Decimal, Char)
0012c038   001230d4      JIT DebugApp.Person.GetSomeDetails()
 
!DumpMD 001230d4
Method Name: DebugApp.Person.GetSomeDetails()
Class: 001213d0
MethodTable: 001230f0
mdToken: 06000007
Module: 00122c5c
IsJitted: yes
CodeAddr: 009f01c8

In this post we saw the basic functioning of the CLR and how it creates and stores internal data structures to facilitate code execution, while abstracting away all the gory details from developers and allowing them to concentrate solely on their applications. Under the hood everything is simply memory addresses pointing to each other and a bunch of assembly code. To give it such a high degree of structure and definition is by no means an easy task. Hats off to the developers on the .NET and Java teams!! Hope I am able to reach their skill levels one day. 🙂

May 042010
 

You spend a lot of time creating an application: starting with abstractions and slowly moving to the specifics, hours of heated debate to painstakingly design each and every intricate detail, working on all possibilities for extensibility, even going through every line of code other developers write to make sure they stick to the design. Yes, a lot of effort goes into creating and delivering a production quality application which satisfies all the stakeholders.

Ideally, that's when a developer should say goodbye to the application, vowing never to see the code again. Because if he does, he'd end up mighty disappointed. Future developers would have literally pillaged the application to systematically eliminate any traces of the original design, the code would be littered with dirty bug fixes and so-called enhancements that don't "enhance" anything. The customer doesn't care because the application still works, the managers don't care because they meet their revenue targets and timelines, and the developers never cared in the first place.

Keeping that in mind, I have listed a few ways which might help save your application from meeting the same fate.

  • Keep it Simple: The oft repeated rule in software design is "Keep everything as simple as possible, but not simpler". If the design is too complex, then in all probability no developer will take the pains to go through it and understand the whole thing. Many software engineers are guilty of trying to force fit their code to use a certain design pattern. Instead, look for the pattern which best fits your requirement.
  • Keep it Modular: Dividing your application into logically separated projects makes it easier to identify the right places to make changes. Proper namespaces help too. For example, an assembly with the namespace ABCApplication.HelperObjects.FileIOOperations leaves no room for doubt about what it does.
  • Keep it Visual: Sure, you might find the 1000 page SMTD an interesting read, but not everyone else will. Remember, a bored developer is the most likely to introduce a dirty fix in the code. So, to keep things interesting, go for easy to understand graphics and images. MS Visio is a great tool to model your application, and it supports automatic code generation as well (Enterprise Architect version). And after creating the model, don't leave it in a random location that even you would forget in 2 weeks. My advice is to check it into source control along with the code and add it to the Visual Studio solution.
  • Keep it Commented: Good commenting is the most potent weapon in your arsenal to guide/warn/prohibit future developers from making changes they shouldn't be making. XML comments are a great way to maintain uniform commenting standards across the application and can also be used to generate comment documentation automatically (using the /doc switch while compiling). A good way to convey how you intended the code to be used is to include a sample code block in the XML comments themselves. Keep in mind, though, that the comments are stripped out by the compiler, so if you want them to appear in IntelliSense while using the assemblies, the generated documentation file must be packaged along with the assembly.
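As an illustration of that last point, a sketch of an XML comment that embeds a usage sample (the MessageReader type and its method are hypothetical):

```csharp
using System.Collections.Generic;

public class MessageReader
{
    /// <summary>Reads all messages posted by the given user.</summary>
    /// <param name="userName">The login name; must not be null.</param>
    /// <example>
    /// This shows how the method is intended to be called:
    /// <code>
    /// var messages = new MessageReader().GetMessages("jdoe");
    /// </code>
    /// </example>
    public IList<string> GetMessages(string userName)
    {
        return new List<string>(); // placeholder body — the comments are the point here
    }
}
```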
Apr 112010
 

Anyone who has written a fairly complex web application will have experienced the quirks of JavaScript. Though immensely powerful and probably the only way to write good client side code, JavaScript can get difficult, especially while doing complex DOM navigation and handling cross browser support (many browsers see the DOM differently).

The best possible solution for harnessing the power of JavaScript while keeping the code fun to write is to use client side libraries which do the dirty work for you. Arguably the most popular one available today is jQuery. It makes code both easy to read and write. Here are some cool things you can do with jQuery. To get started, download jQuery at jquery.com and reference the script file before any of your custom scripts.

<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
<script type="text/javascript" src="Scripts/jquery.js" language="javascript" ></script>
<script type="text/javascript" src="Scripts/myscript.js" language="javascript" ></script>
</head>
  • Code which runs on page load: The classic way to do this was using the window.onload event. However, the onload event waits for all of the page to load, including images. This can make the user wait a long time before the events fire. jQuery has an alternative – $(document).ready – which fires as soon as the DOM is loaded. On heavy pages, this can dramatically improve the user experience.
$(document).ready(function() {
    alert("Hello jquery");
});
  • Hooking up dynamic event handlers: Hooking up JavaScript events used to be a pain. Now it's surprisingly easy. Note how document.getElementById has been replaced by $. In this example we hook up a dynamic event handler for a button click event which gets the value from a textbox.
$(document).ready(function() {
    //Getting value of button and assigning an event
    //handler using an anonymous method
    $('#btnSayHello').click(function() {
        alert('Hello ' + $('#txtName').val());
    });
 
});
  • Animation made easy: Animation has always been difficult in JavaScript and pushed people towards Flash/Silverlight. jQuery's animate API makes things almost too easy. In the example below, a div's font size is increased, it is moved to the right, and it is made slightly transparent.
<input type="button" id="btnAnimate" value="Animate" />
<div id="containerDiv" >
This is the text which would be animated
</div>
$(document).ready(function() {
    //The animation is hooked on to
    //the Animate Button's click handler,
    //A callback function is specified to alert
    //when the animation is over
    $('#btnAnimate').click(function() {
        $('#containerDiv').animate({
            opacity: 0.6,
            marginLeft: '+=2in',
            fontSize: '3em'
        }, 1000, function() {
            alert('animation complete');
        });
    });
});
  • AJAXify your application almost instantly: Writing raw AJAX code was just too much of a hassle, considering the time you spent ironing out browser differences rather than working on your core application logic. jQuery makes AJAX extremely simple and takes care of the background work of creating the xmlhttp object, making the call and giving you back the result. In the example below, I created a simple autocomplete textbox that makes suggestions dynamically based on what you type. This would have taken at least 150 lines if written in plain ol' JavaScript.

HTML Code

<span>Enter Country Name</span><br />
<input type="text" id="txtCountry" style="width:250px" /><br />
<span id="autoList" style="width:250px;display:block"></span>

Javascript code

$(document).ready(function() {
    //Hooking the event handler to the keyup function
    $('#txtCountry').keyup(function() {
        //Only call AJAX function if atleast 3 characters are there
        $('#autoList').empty();
        if ($('#txtCountry').val().length > 2) {
            $.ajax( //Jquery AJAX api
        {
        url: "GetCountriesSuggestion.aspx",
        //Send in data as a query string
        data: "country=" + $('#txtCountry').val(),
        cache: false,
        async: true, //Make an asynchronous req
        datatype: "xml",
        //This is a callback function
        success: function(xml) {
            $('#autoList').empty();
            if ($(xml).find('Country').length > 0) {
                var ul = $('<ul></ul>');
                //Creating a border and removing the bullets
                //that appear in an unordered list by default
                ul.css('border', 'solid 1px black').css('list-style', 'none');
                ul.css('left', '0px');
                //Find the country element and iterate
                //through each element
                $(xml).find('Country').each(function() {
                    //Creating a list item element and hooking up
                    //its click handler
                    var li = $('<li>' + $(this).text() + '</li>').css('cursor', 'pointer')
                    li.css('left', '0px');
                    li.hover(function() {
                        //This changes the color and background color
                        //of the suggestion box when the mouse is taken over it
                        //The hover method takes in two functions- one invoked on
                        //mouseover and one on mouseout
                        $(this).css('color', '#FFFFFF');
                        $(this).css('background-color', '#0000FF')
                    }, function() {
                        $(this).css('color', '#000000');
                        $(this).css('background-color', '#FFFFFF')
                    });
                    li.click(function() {
                        //Set the clicked item to textbox text and empty the
                        //Collecttion list
                        $('#txtCountry').val($(this).text());
                        $('#autoList').empty();
                    });
                    ul.append(li);
                });
                $('#autoList').append(ul);
            }
        }
    });
        }
    });
});

ASP.NET Code Behind in C#

protected void Page_Load(object sender, EventArgs e)
        {
            if (!String.IsNullOrEmpty(Request.QueryString["country"]))
            {
                Response.Write(GetCountriesList(Request.QueryString["country"]));
            }
        }
        private string GetCountriesList(string countryName)
        {
            //creating a string builder object
            StringBuilder _response = new StringBuilder();
            XmlDocument _xDoc = new XmlDocument();
            //Load the xml file with all the countries
            _xDoc.Load(Server.MapPath("Countries.xml"));
            //Get all country names in the xml file
            XmlNodeList _allCountries = _xDoc.GetElementsByTagName("Entry");
            //Writing out xml declaration and root tag
            _response.Append("<?xml version=\"1.0\" encoding=\"ISO-8859-1\" standalone=\"yes\"?><Countries>");
            //Iterating through each country
            foreach (XmlNode _country in _allCountries)
            {
                //If what user entered matches the country name,
                //create a dynamic xml node and append
                if ((countryName.Length <= _country.FirstChild.InnerText.Length) &&( countryName.ToUpper().Equals(_country.FirstChild.InnerText.Substring(0, countryName.Length))))
                    _response.Append("<Country>" + _country.FirstChild.InnerText + "</Country>");
            }
            //Append closing root tag
            return _response.Append("</Countries>").ToString();
        }
    }
Mar 122010
 

I have always believed that strong typing is the holy grail of .NET, not to be messed with, and my primary grouse with VB.NET has been that it uses sneaky workarounds to circumvent the typing rules of the CLR. C# for most of its initial existence followed static typing religiously, with slight changes coming in 3.0 with the var keyword. But in 4.0, everything changed with the introduction of the Dynamic Language Runtime (DLR).

The DLR, according to Wikipedia, is "an ongoing effort to bring a set of services that run on top of the Common Language Runtime (CLR) and provides language services for several different dynamic languages." As the definition says, it sits on top of the CLR and adds no new opcodes to the IL. Languages like C# use the DLR to introduce dynamic typing while maintaining the existing mechanism of statically determining types.

The dynamic keyword is the C# construct introduced for this. In short, it tells the compiler that calls on this object are to be resolved at runtime and that it should not bother throwing compiler errors. Let's see a simple example of the usage of the dynamic keyword. Suppose you have a Book class which has four main properties – Author, Publisher, Price and Number of Pages. However, each book may have a lot of other properties as well, which you won't know at design time. So the question arises: how do you store the additional information? The first answer that comes to mind is storing it in a collection class and later retrieving it. The DLR provides you with a neat way of doing this, shown in the code below:

class Program
    {
        static void Main(string[] args)
        {
            // The dynamic keyword bypasses any compile time checking for this object
            dynamic _daVinciCode = new Book("Sample Author", "SomePublisher", 250, 450);
 
            /*** EXTRA PROPERTIES - NOT PRESENT IN THE OBJECT***/
            _daVinciCode.BookStores = new string[] { "Landmark", "Oddyssey", "Crosswords" };
            _daVinciCode.CitiesAvailable = new string[] { "Delhi", "Bangalore", "Chennai" };
            _daVinciCode.ExtraVat = 45;
 
            /*** PRINTING OUT EXTRA PROPERTIES VALUE ***/
            Console.WriteLine(_daVinciCode.ExtraVat);
        }
 
    }
 
    /// <summary>
    /// This is our dynamic class. It defines 4 concrete properties
    /// and a dictionary class for storing any other property values
    /// as well. The abstract class DynamicObject is implemented
    /// </summary>
    public class Book:DynamicObject
    {
        //Our four defined properties
        public string Author { get; private set; }
        public string Publisher { get; private set; }
        public double Price { get; private set; }
        public int NumberOfPages { get; private set; }
 
        //Constructor - Parametrized
        public Book(string _author, string _publisher, double _price, int _numberOfPages)
        {
            this.Author = _author;
            this.Publisher = _publisher;
            this.Price = _price;
            this.NumberOfPages = _numberOfPages;
        }
 
        //This collection object stores all the extra properties
        public Dictionary<string, object> _extraProperties = new Dictionary<string, object>();
 
        //At runtime this method is called in order to bind the propertyname to a Getter
        public override bool TryGetMember(GetMemberBinder binder, out object result)
        {
            return _extraProperties.TryGetValue(binder.Name.ToLower(), out result);
        }
 
        //At runtime this method is called in order to bind the propertyname to a setter
        public override bool TrySetMember(SetMemberBinder binder, object value)
        {
            _extraProperties[binder.Name.ToLower()] = value; //indexer, so setting the same property twice won't throw
            return true;
        }
    }

The Book class is a dynamic type and derives from the abstract class DynamicObject. This base class helps determine how binding should occur at runtime. In our example, the dictionary object stores all the additional properties of the Book class. The two overridden methods, TryGetMember and TrySetMember, are responsible for binding the new properties to the dictionary object. The first time a property on a dynamic type is encountered, the DLR binds the property and then caches the binding, so any subsequent calls are faster.
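As an aside (not covered in this post), the framework's System.Dynamic.ExpandoObject gives you this property-bag behavior out of the box, without writing TryGetMember/TrySetMember yourself:

```csharp
using System;
using System.Dynamic;

class ExpandoDemo
{
    static void Main()
    {
        dynamic book = new ExpandoObject();
        // Members are created on first assignment, much like Book's TrySetMember
        book.ExtraVat = 45;
        book.BookStores = new[] { "Landmark", "Oddyssey", "Crosswords" };
        Console.WriteLine(book.ExtraVat); // the DLR resolves this member at runtime
    }
}
```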

More on the DLR in future posts.

Mar 092010
 

I have always believed that when it comes to blogging, brevity is the soul, and it's very important to convey your idea in as few words as possible. Of course, with technical blogs it's an altogether different ball game, with most visitors coming through search engines looking for specific topics, where you can afford to be verbose. But with a mixed audience and a variety of blog subjects, it is extremely important to keep posts as short and simple as possible to ensure maximum reach.

To prove this point, I set out to find whether there is a relation between the length of each post and the comments it receives. Though the number of comments is not the best yardstick for measuring the popularity of a post, it certainly is the most concrete one. The best place to try this out seemed to be my organization's internal blogosphere, which had a lot of sample data (3500+ posts). Since it's impossible to manually collect data for that many posts, I wrote a program to write it to a text file from where any correlation could be identified. It's a mixture of retrieving the data through HTML and RSS and parsing it. Though it should be compatible with any WordPress version, the comment counting function might need some tweaking to make it work on later versions. Once the code finishes executing, you will have a text file with the name of each post, its length and its number of comments. The data in this file can be imported into Excel for further manipulation.

using System;
using System.Xml;
using System.Net;
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;
 
public class TestRSS
{
    StreamWriter _swObj = new StreamWriter("Results.txt", true);
    public const string _feedURL = "http://blog.ganeshran.com";
 
    public static void Main()
    {
        new  TestRSS().RunTests();
    }
 
    public void RunTests()
    {
        for (int i = 1; ; i++) //This is an infinite loop only to be broken when there are no more posts
        {
            XmlDocument _xdoc = new XmlDocument(); //a New XmlDocument object
            //Lets load the XML from the url
            _xdoc.LoadXml(GetXmlData(_feedURL + (i>1?"?paged=" + i+"&":"?" )+"feed=rss2"));
            XmlNodeList _xList = _xdoc.GetElementsByTagName("item");
            if (_xList.Count == 0)
                break; //This means there are no more blog posts
            foreach (XmlNode _tempNode in _xList)
                //This method writes the name of post, length and the number of comments delimited by a colon
                //So we need to remove any colon already present in the name.
                //Example Data: - ASP.NET Evolution WebForms v/s MVC: 6060:0
                WriteToFile(_tempNode.FirstChild.InnerText.Replace(":", "") + ": " + _tempNode.SelectSingleNode("description").NextSibling.InnerText.Length + ":" + GetComments(_tempNode.FirstChild.NextSibling.InnerText));
        }
    }
 
    public string GetXmlData(string _url)
    {
        //An ordinary retrieval of data from the url using the HttpWebRequest
        //This is a workaround for sites with badly formed RSS feeds due to script tags.
        //Once we get the data, we can load it into an XmlDocument class
        HttpWebRequest _blogReq = (HttpWebRequest)WebRequest.Create(_url);
        HttpWebResponse _blogResp = (HttpWebResponse)_blogReq.GetResponse();
        StreamReader _respStream = new StreamReader(_blogResp.GetResponseStream());
        //Replace Script tags
        return Regex.Replace(_respStream.ReadToEnd(), @"<script[^>]*?>[\s\S]*?<\/script>", "");
    }
 
    public int GetComments(string _url)
    {
        //Retrieving the comments from the HTML and not the RSS feed. I wasn't able
        //to find a workaround for the 10 comment limit in the RSS feed in WordPress 2.3,
        //hence retrieving the HTML and matching the comment divs. Keep in mind this won't work
        //for higher WordPress versions. Just take the comment tag and substitute accordingly
        Match _match = Regex.Match(GetXmlData(_url), "<div class=\"mycomment\"[^>]*?>[\\s\\S]*?<\\/div>");
        int _counter=0;
        for (; _match.Success; _counter++)
            _match = _match.NextMatch();
        return _counter;
    }
 
    public void WriteToFile(string _tobeWritten)
    {
        //Just write it to the file
        _swObj.WriteLine(_tobeWritten);
        _swObj.Flush();
    }
}