Jul 122012

Oh how I wish there was a out of the box solution to do this where you could just enter the source and destination urls and voila! the entire blog was migrated. Unfortunately there isnt 🙁 (or atleast I am not aware of one).

So when I set out to migrate around a 1000 posts from 16 different Sharepoint blogs to WordPress, there were quite a few challenges on the way. The first step is to read the Sharepoint data.This would have been far easier with the proper database access to sharepoint db but unfortunately there was way too much red tape wrapped around it to get the access quick enough to start immediately, hence I took the dirty way out – to scrape the HTML pages and retrieving the data out of it. Scraping HTML is the usually the worst possible way to extract data from any source since the code is rarely ever reusable, bloated to handle nested tags and often filled with branching statements to handle special cases. Only go for the scraping approach if you don’t have any other way to access the data. What makes scraping bearable is the wonderful HTML Agility pack which I had blogged about earlier. Its XMLish approach to traverse HTML makes this activity quite easy.

This is the object that I used to represent each blog to be imported. It contains the source URL, destination blog, author Id. Each object represents the source site as well as the destination information needed to write to WordPress.

   class BlogsToBeMigrated
        public string Url { get; set; }
        public int SiteID { get; set; }
        public int AuthorId { get; set; }
        public string DestinationBlogName { get; set; }
        public BlogsToBeMigrated(string url, int siteId, int authorID,string destination)
            Url = url;
            SiteID = siteId;
            AuthorId = authorID;
            DestinationBlogName = destination;

This is the code to read each blog entry. Note that it takes in each blog object and extracts information out of it. The concatenation step would depend on how your sharepoint site is structured. Just make sure it points to the list view of the blog where all posts are listed in a tabular form. This helps us to take out each link and get the content to add in WordPress. To get the various xpath to navigate to the nodes, I used the chrome extension Xpath helper. This could be different for every sharepoint site. Just play around with the xpath till you get the required information

private static void MigrateSingleBlog(BlogsToBeMigrated blog)
    var siteUrl = String.Concat("YOUR SITE URL HERE", blog.Url, "/Lists/Posts/AllPosts.aspx");
    string siteList = ReadWebPage(siteUrl);
    var listDoc = new HtmlDocument();
    var siteListNodes = listDoc.DocumentNode.SelectNodes("//td[@class='ms-vb-title']/table/tr/td/a");

    foreach (var site in siteListNodes)
        var postUrl = String.Concat("YOUR SITE URL HERE", site.GetAttributeValue("href", "href"));
        var PageDump = ReadWebPage(postUrl);
        var postDoc = new HtmlDocument();
        var siteContent = postDoc.DocumentNode.SelectSingleNode("//div[@class='ms-PostWrapper']");

        string postDate = siteContent.ChildNodes[0].InnerText;
        string postTitle = siteContent.ChildNodes[1].ChildNodes[0].ChildNodes[0].InnerText.Trim();
        var postContent = postDoc.DocumentNode.SelectSingleNode("//div[@class='ms-PostWrapper']/div[@class='ms-PostBody']/div");

        var postHtml = postContent.InnerHtml;

        MoveToWordPress(postDate, postTitle, postHtml, blog.Url,blog.SiteID,blog.AuthorId,blog.DestinationBlogName);
        Console.WriteLine(postDate + postTitle);


Also remember to put this line in the constructor of your class before any of the HTML agility pack code is executed. This is needed because forms can be tricky elements in HTML due to their overlapping between tags which makes it difficult to parse the markup. This makes HTML Agility pack parse form tags as empty elements and the below line allows you to look inside them.


Now that I had access to all the sharepoint data the challenge was to enter this is in the right WordPress blogs. I took a look at CSV importer which allows bulk import of posts from CSV files. This step didnt work properly at all since the post content was way too large for a CSV file to be parsed properly. After numerous attempts to sanitize the CSV and escape each linebreak and comma, this step still ignored many valid posts and also filled gibberish in others. Then I thought of directly inserting the data in the WordPress database. Initially I was skeptical since wordpress might fill some related tables when a post was published, but found that there were no such problems. Directly inserting the data worked like a charm. It also allowed me to migrate the data multiple times each time I noticed an issue with improperly rendered markup

This is the method that enters in the WordPress blog table directly. Note that the multisite installation means different post tables which have a number in the table name. e.g. WP_2_posts, WP_3_posts etc. The index for the table name was in the BlogToBeMigrated object as I had manually created each wordpress blog which corresponded to a sharepoint blog. This step is fairly simple. It just creates a connection to the MySql database using the connector dlls, gets the maximum ID, increments it and uses that to insert a new post. The code isnt production standard but this isnt really something that I am looking to maintain for a long time. Till the migration is done right, we can just keep repeating with the required fixes and once its finished – you have a functioning site with no need to migrate anymore. Pragmatism wins.

private static void MoveToWordPress(string postDate,string postTitle,string postContent,string postUrl, int blogID, int authorID,string destinationBlogName)
   //Remember to include ConvertZeroDateTime=true in the connection string
    MySqlConnection wordpressConn = new MySqlConnection("DATABASE_CONNECTION STRING; ConvertZeroDateTime=true");

    using (wordpressConn)
        int maxID = 0;
        var id = new MySqlCommand(String.Format("Select Max(ID) from WP_{0}_Posts", blogID), wordpressConn).ExecuteScalar();
        if (id.GetType() == typeof(System.DBNull))
            maxID = 1;
            maxID = Convert.ToInt32(id);

        string SQLCommandText = "Insert into wp_{0}_posts (id,post_author,post_date,post_content,post_title,post_excerpt,post_status,comment_status,ping_status,post_name,to_ping,pinged,post_modified,post_modified_gmt,post_content_filtered,post_parent,guid,menu_order,post_type,comment_count)";
        SQLCommandText += " values (?ID,{1},?postDate,?postBody,?postTitle,'','publish','open','open',?postName,'','',?postDate,?postDate,'',0,?postGuid,0,'post',0)";
        HtmlDocument span = new HtmlDocument();

        MySqlCommand insertPost = new MySqlCommand(String.Format(SQLCommandText, blogID, authorID), wordpressConn);

        var siteName = "YOUR_BLOG_URL_HERE/{0}/files/{1}";
        var imagenodes = span.DocumentNode.SelectNodes("//a/img");
        if (imagenodes != null)
            foreach (var image in imagenodes)
                var imageUrl = image.ParentNode.GetAttributeValue("href", "href");
                var imageThumbUrl = image.GetAttributeValue("src", "src");
                if (imageUrl.Contains("/sites"))
                    var migratedFileName = imageUrl.Replace(String.Concat(postUrl, "/Lists/Posts/Attachments/"), string.Empty);
                    postContent = postContent.Replace(string.Format("href=\"{0}", imageUrl), String.Format("href=\"{0}", String.Format(siteName, destinationBlogName, migratedFileName)));
                    postContent = postContent.Replace(string.Format("src=\"{0}", imageThumbUrl), String.Format("src=\"{0}", String.Format(siteName, destinationBlogName, migratedFileName)));
                    postContent = postContent.Replace(string.Format("src=\"{0}", imageThumbUrl), String.Format("src=\"{0}", imageUrl));
            span = new HtmlDocument();
            var htmlNode = span.DocumentNode.SelectSingleNode("//span[@class='erte_embed']");
            if (htmlNode != null)
                string urlValue = htmlNode.GetAttributeValue("id", "id");
                urlValue = HttpUtility.UrlDecode(urlValue);
                //   urlValue = urlValue;
                postContent = postContent.Replace(htmlNode.OuterHtml, urlValue);

            insertPost.Parameters.AddWithValue("?ID", maxID);
            insertPost.Parameters.AddWithValue("?postDate", DateTime.Parse(postDate));
            insertPost.Parameters.AddWithValue("?postBody", postContent);
            insertPost.Parameters.AddWithValue("?postTitle", postTitle);
            insertPost.Parameters.AddWithValue("?postName", postTitle.Replace('#', '-').Replace(' ', '-'));
            insertPost.Parameters.AddWithValue("?postGuid", "YOUR_BLOG_URL_HERE/?p=" + maxID);


Note the additional processing around the image tags. This was because I had migrated the image files separately and wanted to update the image tag’s src attributes to reflect to the new path. If you plan on keeping your previous sharepoint installations up, then this step is optional since the attachments will be loaded from the sharepoint site anyway. But I would recommend migrating the images as well just for easier maintenance of the content.

The code to migrate a single image is below. The way to get all the image tags is very similar to how each blog content was retrieved. The only difference is that instead of entering in the database, we just use Agility pack to extract all image tags in the post contennt and call the below method to download it. The files are then saved in the wp-content directory and the path is updated in the migration logic.

  private static void DownloadImage(string url,string postUrl,string destination)
            string saveDir = String.Format("C:\\Wordpress_Images\\{0}\\", destination);
            string filename = string.Concat(saveDir, url.Replace(string.Concat("YOUR_SHAREPOINT_URL", postUrl, "/Lists/Posts/Attachments/"), string.Empty)).Replace("/","\\");
            FileInfo fi = new FileInfo(filename);
            if (!fi.Directory.Exists)
            //DirectoryInfo info = new DirectoryInfo(
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.UseDefaultCredentials = true;
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();

            if ((response.StatusCode == HttpStatusCode.OK ||
                response.StatusCode == HttpStatusCode.Moved ||
                response.StatusCode == HttpStatusCode.Redirect) &&
                response.ContentType.StartsWith("image", StringComparison.OrdinalIgnoreCase))

                using (Stream inputStream = response.GetResponseStream())
                using (Stream outputStream = File.OpenWrite(filename))
                    byte[] buffer = new byte[4096];
                    int bytesRead;
                        bytesRead = inputStream.Read(buffer, 0, buffer.Length);
                        outputStream.Write(buffer, 0, bytesRead);
                    } while (bytesRead != 0);
Jul 082012

I did some changes to the plugin in the last post – including an option in the admin dashboard to enable/disable shortcodes and the ability to directly add buttons to the post without having to modify the theme php file. Future changes planned are to allow admins to add the widgets through the options screen as well.

The plugin is available here and the code is below. Caution: Some more testing is required

 * Returns major.minor WordPress version.
function sc_reach_get_wp_version() {
  return (float) substr(get_bloginfo('version'), 0, 3);

function sc_add_author_stream($email, $style='width:300px;height:400px') { 
  return get_div_email($email, 'profile_container_id', $style, get_option('sc_profile_token'));

function get_div_email($email, $id, $style, $token) {
	$socialcast_url = get_option('sc_host');
	if ($id != '' && $token != '') {
		return '<div id="' . $id . '" style="' . $style .
		'"></div><script type="text/javascript">_reach.push({container: "' . $id . '", domain: "https://' 
		. $socialcast_url . '", token: "' . $token . '", email:"'. $email . '"});</script>';
	} else {
		return '';

function sc_reach_content_handle($content, $sidebar = false) {
		case 'Dont Show':
		case 'Top':
			$content =  sc_add_button() . $content;
		case 'Bottom':
			$content = $content . sc_add_button();
	if (get_option('sc_use_microdata') == 'true') {
		$purl = get_permalink();
   		$content = "<div itemscope=itemscope itemtype=\"http://ogp.me/ns#Blog\"><a itemprop='url' href='" . $purl . "' /></a>" . $content . "</div>";

	return $content;

function get_div($id, $style, $token) {
	$socialcast_url = get_option('sc_host');
	if ($id != '' && $token != '') {
		return '<div id="' . $id . '" style="' . $style .
		'"></div><script type="text/javascript">_reach.push({container: "' . $id . '", domain: "https://' 
		. $socialcast_url . '", token: "' . $token . '"});</script>';
	} else {
		return '';

/* Name: Ganesh Ranganathan 
   Date: 7th July 2012
This function has been added to generate the div from the shortcut
which can be added to any text widget or in the post itself. It includes
an additional parameter - display which lets users insert shortcodes without
knowing the token as long as they are specified in the Plugin options screen.
Please make sure they are do_shortcode is called in your theme for widget text if 
you want to include this in a widget */
function get_shortcode_div($id,$style,$token,$display){
 $tokenInOptions ='';
  if($display != '')
			case 'button':
			case 'discussion':
				$tokenInOptions= get_option('sc_discussion_token');
			case 'profile':
			    $tokenInOptions = get_option('sc_profile_token');
			case 'trends':
				$tokenInOptions = get_option('sc_trends_token');
		if($tokenInOptions != '')
		   return get_div($id,$style,$tokenInOptions);
  return get_div($id,$style,$token);

function add_reach($atts) {
	extract( shortcode_atts( array(
			'id' => 'reach_container_id',
			'style' => '',
			'token' => '',
			'display' => ''
		), $atts ) );

  return get_shortcode_div($id, $style, $token,$display);

function reach_init_method() {

  if (sc_reach_get_wp_version() >= 2.7) {
    if (is_admin ()) {
      add_action('admin_init', 'sc_reach_register_settings');
  add_filter('the_content', 'sc_reach_content_handle');
  add_filter('admin_menu', 'sc_reach_admin_menu');
  add_option('sc_host', '');
  add_option('sc_button_token', '');
  add_option('sc_discussion_token', '');
  add_option('sc_profile_token', '');
  add_option('sc_use_microdata', 'true');
  add_option('sc_trends_token', '');
  add_option('sc_show_button','Dont Show');
  add_action('wp_head', 'sc_reach_header_meta');
  add_action('wp_footer', 'sc_reach_add_js');
  add_shortcode( 'reach', 'add_reach' );
//echo "Get option" . get_option('sc_show_button');


function sc_reach_header_meta() {
  echo '<script type="text/javascript">var _reach = _reach || [];</script>';

function sc_reach_register_settings() {
  register_setting('sc_reach', 'sc_host');
  register_setting('sc_reach', 'sc_button_token');
  register_setting('sc_reach', 'sc_discussion_token');
  register_setting('sc_reach', 'sc_trends_token');
  register_setting('sc_reach', 'sc_profile_token');
  register_setting('sc_reach', 'sc_use_microdata');
  register_setting('sc_reach', 'sc_enableShortcode');

function sc_add_button($style='width:300px;height:30px') {
	return get_div('like_container_id', $style, get_option('sc_button_token'));

function sc_add_discussion($style='width:300px;height:400px') {
	return get_div('discussion_container_id', $style, get_option('sc_discussion_token'));

function sc_add_token($style='width:300px;height:400px'){
	return get_div('trends_container_id',$style,get_option('sc_trends_token'));

function sc_reach_admin_menu() {
  add_options_page('REACH Plugin Options', 'Socialcast REACH',  'activate_plugins', __FILE__, 'sc_reach_options');

function sc_reach_options() {

  <div class="wrap">
    <h2>Reach Extensions by <a href="http://www.socialcast.com" target="_blank">Socialcast</a></h2>

    <form method="post" action="options.php">
    if (sc_reach_get_wp_version() < 2.7) {
    } else {

      <p>If you are not logged in to Socialcast. Please do so with a user that has administrative credentials.
        Once there either Create an HTML Extension or select one from the list. For more information please visit the <a href="http://integrate.socialcast.com/business-systems/wordpress/">plugin page</a>.
      <h3>Socialcast Community</h3>
			<tr><td>https://</td><td><input style="width:400px" type="text" name="sc_host" value="<?php echo get_option('sc_host'); ?>" /></td></tr>
			<tr><td>Add HTML Microdata ?</td><td><input style="width:400px" type="text" name="sc_use_microdata" value="<?php echo get_option('sc_use_microdata'); ?>" />Type 'true' to use</td></tr>
			<tr><td>Button Token:</td><td><input style="width:400px" type="text" name="sc_button_token" value="<?php echo get_option('sc_button_token'); ?>" />Function: sc_add_button()</td></tr>
			<tr><td>Discussion Token:</td><td><input style="width:400px" type="text" name="sc_discussion_token" value="<?php echo get_option('sc_discussion_token'); ?>" />Function sc_add_discussion()</td></tr>
            <tr><td>Profile Token:</td><td><input style="width:400px" type="text" name="sc_profile_token" value="<?php echo get_option('sc_profile_token'); ?>" />Function sc_add_author_stream()</td></tr>
			<tr><td>Trends Token:</td><td><input style="width:400px" type="text" name="sc_trends_token" value="<?php echo get_option('sc_trends_token'); ?>" />Function sc_add_trends()</td></tr>
			<tr><td>Enable Shortcode: </td><td><input name="sc_enableShortcode" type="checkbox" value="1" <?php checked( '1', get_option( 'sc_enableShortcode' ) ); ?> /></td></tr>
			<tr><td>Show Button:</td> <td>
			<select style="width:400px" name="sc_show_button" id="sc_show_button">
<option value="Top" <?php if (get_option('sc_show_button')=='Top') echo 'selected="selected"';?>>Top</options>
<option value="Bottom" <?php if (get_option('sc_show_button')=='Bottom') echo 'selected="selected"';?>>Bottom</options>
<option value="Dont Show" <?php if (get_option('sc_show_button')=='Dont Show') echo 'selected="selected"';?>>Dont Show</options>
</select>Make sure the button token is set for this to work</td></tr>
    <?php if (sc_reach_get_wp_version() < 2.7) : ?>
      <input type="hidden" name="action" value="update" />
      <input type="hidden" name="page_options" value="sc_host" />
    <?php endif; ?>
      <p class="submit">
        <input type="submit" name="Submit" value="<?php _e('Save Changes') ?>" />
    <iframe width="100%" height="600px" src="http://developers.socialcast.com/reach/setup-a-reach-extension/">


    function sc_reach_add_js() {
      <script type="text/javascript">
          var e=document.createElement('script');
          e.async = true;
          e.src= document.location.protocol + '//<?php echo get_option('sc_host') ?>/services/reach/extension.js';
          var s = document.getElementsByTagName('script')[0];
          s.parentNode.insertBefore(e, s);

Jul 072012

I installed WordPress at work and have been trying to make a multisite installation work as an enterprise blogging platform. During one of the discussions around it, a colleague asked me if it was possible to integrate the blog comments system with Socialcast (the microblogging tool which is quite popular at work). I initially thought this would not be straightforward and would require development of a custom plugin from scratch.

However later, I did some searching and found that Socialcast already provides an infrastructure called Reach which can be used to integrate the comments, posts and trends with a variety of 3rd party sites. For an organization, this integration is extremely valuable as it introduces a common store for all the social interactions – be it Sharepoint, blogs, intranet pages or anything else. Since Reach is written in Javascript, it doesnt pose any restrictions on server side technology used for the sites.

So the primary goal was to make Reach work with WordPress. Initially I looked at options like HTML Javascript Adder which lets you add the reach code directly into a widget on the site. However, this posed too many issues given the lack of control one had on when the scripts were getting loaded and the difficulty to configure it. Since all Reach scripts look exactly the same except a token which is generated by Socialcast when the script is created in the admin panel, it is useless to keep replicating the same code everywhere.

Then I came across a plugin written by Monica Wilkinson of Socialcast. However this was last updated an year ago and both WordPress and Socialcast required some changes. So I forked the branch and made a few minor tweaks to suit my requirement. The plugin gives an option page to configure the tokens and the URL of your socialcast community. So I added the php file to my plugins directory and Network Activated it (This was a multisite installation). Once this is done you would get an options page on the dashboard

Now the options page has the tokens that need to be entered along with the url of your socialcast community.Remember to be careful while sharing the tokens as they have the options of allowing access to the community without the proper login credentials. I am not sure if Socialcast provides the option of revoking these tokens on a periodic basis and providing fresh ones, but this should be present to protect the company data.

There are four main kinds of reach extensions

  • Discussion – A comments system which would be shared with the Socialcast community
  • Stream – Any group or company stream
  • Trends – Trending topics, people etc.
  • Button – Like, Recommend, Share buttons. The exact verb can be configured on the admin screen.

All of these are rendered in the exact same way, by calling a script asynchronously (services/reach/extension.js) and then pushing the reach object with the javascript token. In the plugin there is a get_div function which generates the html tag

function get_div($id, $style, $token) {
	$socialcast_url = get_option('sc_host');
	if ($id != '' && $token != '') {
		return '<div id="' . $id . '" style="' . $style .
		'"></div><script type="text/javascript">_reach.push({container: "' . $id . '", domain: "https://'
		. $socialcast_url . '", token: "' . $token . '"});</script>';
	} else {
		return '';

There are two main ways of rendering the appropriate Reach control on your page

  • call the function in PHP code
  • Use the shortcode [reach]

Lets see the first option. The appropriate function that needs to be called in PHP code is given on the option screen. Let’s say I want to display the button just below the title. So I go to the theme’s postheader.php file and call the sc_add_button function in PHP code. Note: the call to sc_add_button function will only work if you have the button token configured in the plugin options. This step may differ from theme to theme.

<header class='post-header title-container fix'>
	<div class="title">
		<<?php echo $header_tag;?> class="posttitle"><?php echo suffusion_get_post_title_and_link(); ?></<?php echo $header_tag;?>>
	 echo sc_add_button('width:300px;height:30px'); 
 			if ($post_meta_position == 'corners') {
	<div class="postdata fix">

Or If you want the comments system to be replaced by the Socialcast discussion system, then go to the comment-template.php file in the wp_include directory and replace the comment markup with the call to sc_add_discussion(). Remember that you can pass the style to this method so it overrides the default styles in the plugin.

		<?php if ( comments_open( $post_id ) ) : ?>
			<?php do_action( 'comment_form_before' ); ?>
			<?php echo sc_add_discussion(''); ?>
			<?php do_action( 'comment_form_after' ); ?>
		<?php else : ?>
			<?php do_action( 'comment_form_comments_closed' ); ?>
		<?php endif; ?>

The resulting page looks like this

Now for the shortcode way. This plugin initially required the user to enter the token in the shortcode functions but I wasnt too happy with that way as revealing the tokens to non admin users seems risky. So I wrote a new function in the plugin which would allow the user to give a text on what should be displayed and the token would be read from the options. The previous way of specifying a token still exists as well.

function get_shortcode_div($id,$style,$token,$display){
 $tokenInOptions ='';
  if($display != '')
			case 'button':
			case 'discussion':
				$tokenInOptions= get_option('sc_discussion_token');
			case 'profile':
			    $tokenInOptions = get_option('sc_profile_token');
		if($tokenInOptions != '')
		   return get_div($id,$style,$tokenInOptions);
  return get_div($id,$style,$token);

This system makes it really easy for users to add the button to their blogs. Just insert a text based widget in the sidebar with the shortcode reach and it will render the widget when the page is run. Just make sure the theme calls the do_shortcode function on the widget_text parameter. If not a single line addition should do it.

Once the widget is saved, the reach extension is rendered on the page. The modified plugin can be downloaded here and the original version written by Monica can be downloaded here. I will make changes for the trends extension soon once I get the token to test it out.

Dec 302010

I recently came across a very useful wordpress plugin which can be used to embed other web pages inside WordPress posts. Embed Iframe is a simple plugin which allows you to embed a web page simply by writing the below paragraph in the post.

Below is an example usage. The center alignment doesn’t seem to work

Dec 212010

Two of the plugins I use on my blog seem to have an unfortunate conflict. ShareThis is a cool plugin which provides a share bar on each post allowing viewers to share it on more than 85 social networks including Twitter and Facebook. This plugin not only provides a aesthetically pleasing share bar, it is also configurable and allows to chose between various bar layouts.

SmartYoutube also makes life a lot easier for the prolific blogger. Instead of going to youtube.com and copying the embed code in your post’s HTML view, SmartYoutube allows you to embed youtube videos merely by copying the link to the video and replacing the http with a httpv.

I started using both of them this week but was disappointed to see that there is conflict between the two. Two of the Sharebar buttons create a new window on hovering over. This new windows seems to blank out the video embedded by the SmartYoutube. I checked the Javascript code in Firebug and it seems the culprit is the buttons.js code which handles the drawing bar and attaching various events to the buttons in them. Since this is an issue with ShareThis, it might not be limited to only SmartYoutube videos and happen with all embedded flash video content in the post.

Before hovering on the Share Bar

After hovering on the bar

Mar 092010

I have always believed that when it comes to blogging, brevity is soul and its very important to convey your idea in as less words as possible. Of course with technical blogs its an altogether different ball game with most of the visitors coming through search engines and looking for specific topics, where you can be afford to be verbose. But with a mixed audience and a variety of blog subjects, it is extremely important to keep it as short and simple as possible to ensure maximum reach.

To prove this point, I set out to find if there is a relation between the length of each post and the comments it receives. Though, the number of comments is not the best yardstick to measure the popularity of the post, it certainly is the most concrete one. The best place to try this out seemed to be my organization’s internal blogosphere, which had a lot of sample data (3500+ posts). Since its impossible to manually collect data for that many posts, I wrote a program to write it to a text file from where any correlation could be identified. Its a mixture of retrieving the data through HTML and RSS and parsing it. Though it should be compatible with any WordPress version, the comment counting function might need some tweaking to make it work on later versions. Once the code finishes executing, you would have a text file with the name of each post, the length and the number of comments. The data in this file can be imported into Excel for further manipulation.

using System;
using System.Xml;
using System.Net;
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;
public class TestRSS
    StreamWriter _swObj = new StreamWriter("Results.txt", true);
    public const string _feedURL = "http://blog.ganeshran.com";
    public static void Main()
        new  TestRSS().RunTests();
    public void RunTests()
        for (int i = 1; ; i++) //This is an infinite loop only to be broken when there are no more posts
            XmlDocument _xdoc = new XmlDocument(); //a New XmlDocument object
            //Lets load the XML from the url
            _xdoc.LoadXml(GetXmlData(_feedURL + (i>1?"?paged=" + i+"&":"?" )+"feed=rss2"));
            XmlNodeList _xList = _xdoc.GetElementsByTagName("item");
            if (_xList.Count == 0)
                break; //This means there are no more blog posts
            foreach (XmlNode _tempNode in _xList)
                //This method writes the name of post, length and the number of comments delimited by a colon
                //So we need to remove the colon in case of any being present/
                //Example Data: - ASP.NET Evolution WebForms v/s MVC: 6060:0
                WriteToFile(_tempNode.FirstChild.InnerText.Replace(":", "") + ": " + _tempNode.SelectSingleNode("description").NextSibling.InnerText.Length + ":" + GetComments(_tempNode.FirstChild.NextSibling.InnerText));
    public string GetXmlData(string _url)
        //An ordinary retrieval of data from the url using the HttpWebRequest
        //This is a workaround for sites with badly formed RSS feeds due to script tags.
        //Once we get the data, we can load it into an XmlDocument class
        HttpWebRequest _blogReq = (HttpWebRequest)WebRequest.Create(_url);
        HttpWebResponse _blogResp = (HttpWebResponse)_blogReq.GetResponse();
        StreamReader _respStream = new StreamReader(_blogResp.GetResponseStream());
        //Replace Script tags
        return Regex.Replace(_respStream.ReadToEnd(), @"<script[^>]*?>[\s\S]*?<\/script>", "");
    public int GetComments(string _url)
        //Retrieving the Comments from the HTML and not the RSS feed. I wasnt able
        //to find a workaround for 10 comment limit in the RSS feed in wordpress 2.3,
        //Hence retriving the HTML and matching the comment divs. Keep in mind this wont work
        //for higher Wordpress versions. Just take the comment tag and substitute accordingly
        Match _match = Regex.Match(GetXmlData(_url), "<div class=\"mycomment\"[^>]*?>[\\s\\S]*?<\\/div>");
        int _counter=0;
        for (; _match.Success; _counter++)
            _match = _match.NextMatch();
        return _counter;
    public void WriteToFile(string _tobeWritten)
        //Just write it to the file
Dec 222009

I just upgraded the blog to the newest WordPress version 2.9 and it seems to be working fine. But plugins are the biggest headache while upgrading versions and that was the reason I jumped directly from 2.8.4 to 2.9 skipping the intermediates.

So, please let me know if you notice any funny behavior around here.

P.S: Here is the link to the upgrade instructions to 2.9.

Nov 202009

Here’s how. Download the wordpress plugin Twitter Tools here. Unzip and copy the folder to your wordpress blog’s wp-content/plugins folder using any FTP Client (I use FileZilla) . Once done, you should see the list of plugins in your WordPress Admin page (To see click on Plugins -> Installed ) . There Activate all the twitter tools plugins you want to use.


Once Activated , go the Twitter tools settings page and enable your blogs to posted to your twitter account. You need to give your twitter username and password. You can also click on “Test Login” to make sure your login works.


Also you can give the prefix for the tweet. By default it is New Blog Post. Also set it to On by default which is the next option. And thats it, all your blogs would be posted as tweets to your twitter page. You can configure it the other way round as well, i.e. all your tweets would appear as  blog posts. A weekly/ monthly digest can also be configured. Quite useful, I must say.

Nov 142009

I just enabled WordPress Permalinks in my blog. The links look much better than the ugly querystring URLs before.

To do this, Just click on Permalinks in the WordPress Settings in your dashboard. And choose the format you want your links in.

It might take a while for the redirection to kick in so wait for 5-10 minutes. In my case it was instant though. And in case you were wondering what happens to the URLs already indexed by the search engines, they work as well!!!

Nov 022009

Few days back, I blogged about the WP-Syntax plugin which allows bloggers to write code with proper indentation and keyword highlighting in the blog.

I had tested the syntax out on Firefox, Chrome and IE8 and it worked fine, however when I saw the blog at work, I was surprised to see it all messed up. I googled only to find out that WP-syntax is supposed to work on IE6 by default, however rendering by some custom themes may spoil the look. And thus the hunt begins for a code writer plugin compatible with IE6!!. After all, its still too widely used to be cosidered trivial