Getting Started with LINQ to XML

I recently announced the WeBlog project. WeBlog is a blogging platform that will support multiple data providers. Out of the box I plan on offering SQL Server and XML support. Most people like the XML option because it avoids paying for a hosted database, which drastically reduces web hosting costs. The only problem with XML is that it can be painful to work with. In general, XML makes me want to pull my hair out!

When building a blog you have a few basic entities to deal with. Most blogs have posts, categories, tags, users and roles, so I made an XML file to represent each of these items. For this tutorial I will focus on parsing the XML for categories. For reference, here is the XML structure that I am using to store category information:

<?xml version="1.0" encoding="utf-8"?>
<categories>
  <category id="19770e74-9ec9-4cde-b2ab-e5051aaaf348" description="Posts about my adventures with WeBlog" 
     parent="" name="WeBlog">
    <posts>
      <post id="0e05a782-7440-46e9-8fc4-e33fd51685e9" />
    </posts>
  </category>
  <category id="c223353c-1aef-4a46-afd1-cb61ab1a792d" description="" parent="" name="Tech">
    <posts>
      <post id="1aeaa3a2-6dfb-4a57-a633-0c1597e162ff" />
    </posts>
  </category>
</categories>

From the XML above you can see that each category has an ID, Name, Description and Parent, all stored as attributes. In addition, each category has a list of related posts.

Since my application supports multiple data providers I first created an interface which all providers must adhere to. The methods defined in the interface return common objects. This allows me to abstract from the underlying data store. For example, here is the CategoryModel class which is really just an object representation of the XML:

public class CategoryModel {
    public Guid ID { get; set; }
    public Guid? Parent { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public int PostCount { get; set; }
}
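
The provider interface mentioned above is not listed in this post. A minimal sketch, assuming it declares only the four category operations covered in this tutorial, could look like the following. The name ICategoryProvider is hypothetical; the method signatures are the ones used by the XML provider code below.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface name. CategoryModel is the class defined above.
public interface ICategoryProvider
{
    // Returns every category in the underlying store.
    List<CategoryModel> FindCategories();

    // Adds a new category.
    void InsertCategory(CategoryModel source);

    // Updates the category whose ID matches source.ID.
    void UpdateCategory(CategoryModel source);

    // Removes the category with the given ID.
    void DeleteCategory(Guid id);
}
```

Because the methods only exchange CategoryModel objects and Guids, a SQL Server provider and an XML provider can implement the same contract without the rest of the application knowing which one is in use.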

Selecting Data

The job of the XML data provider is to create a list of CategoryModel objects by reading the XML. Luckily, this process is relatively simple with LINQ to XML. First, we create an XDocument object and then loop through each category node. For each category we can read the attributes and elements to populate a CategoryModel object which can be added to a generic list:

 public List<CategoryModel> FindCategories()
 {
     XDocument xmlDoc = GetCategoryXML();
     var query = from category in xmlDoc.Descendants("category")
                 // "parent" is stored as an attribute; an empty string means "no parent"
                 let parent = (string)category.Attribute("parent")
                 select new CategoryModel
                 {
                     ID = Guid.Parse(category.Attribute("id").Value),
                     Name = category.Attribute("name").Value,
                     Parent = string.IsNullOrEmpty(parent) ? (Guid?)null : Guid.Parse(parent),
                     Description = category.Attribute("description").Value,
                     // Count the <post> elements nested under this category
                     PostCount = category.Descendants("post").Count()
                 };
     return query.ToList();
 }
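
The methods in this tutorial call GetCategoryXML() and GetCategoryXMLFilename(), which are not shown. A minimal sketch of what they might look like follows. The class name XmlCategoryStore and the App_Data/categories.xml location are assumptions made for illustration only; the real WeBlog code may resolve the path differently.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Hypothetical container class for the two helper methods.
public class XmlCategoryStore
{
    // Assumption: the XML file lives in an App_Data folder under the application's base directory.
    protected string GetCategoryXMLFilename()
    {
        return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "App_Data", "categories.xml");
    }

    // Loads the whole file into an in-memory XDocument.
    protected XDocument GetCategoryXML()
    {
        return XDocument.Load(GetCategoryXMLFilename());
    }
}
```

XDocument.Load reads and parses the entire file on every call, which is fine for a small category list but may be worth caching for larger files.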

Deleting Data

To delete a category we can use a similar approach. Again we create an XDocument and find the correct node by matching the id attribute of the category node. Once we have the proper node selected, we can call Remove() and then save the document:

public void DeleteCategory( Guid id )
{            
    XDocument xmlDoc = GetCategoryXML();
    var category = from x in xmlDoc.Descendants("category")
                where
                    x.Attribute("id").Value == id.ToString()
                select x;
    category.Remove();
    xmlDoc.Save(GetCategoryXMLFilename());   
}
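
To try the delete logic without touching a file, the same query can be run against a document parsed from a string. This is only a sketch: the class name DeleteSketch and the trimmed-down XML are made up for the example, and it uses the (string) cast and a case-insensitive comparison instead of .Value and ==, so a missing id attribute or an upper-case GUID does not cause a failure.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DeleteSketch
{
    public static void Main()
    {
        // In-memory stand-in for the categories file.
        XDocument xmlDoc = XDocument.Parse(
            "<categories>" +
            "<category id=\"19770e74-9ec9-4cde-b2ab-e5051aaaf348\" name=\"WeBlog\" />" +
            "<category id=\"c223353c-1aef-4a46-afd1-cb61ab1a792d\" name=\"Tech\" />" +
            "</categories>");

        Guid id = Guid.Parse("c223353c-1aef-4a46-afd1-cb61ab1a792d");

        // The (string) cast yields null for a missing attribute instead of throwing.
        xmlDoc.Descendants("category")
              .Where(x => string.Equals((string)x.Attribute("id"), id.ToString(),
                                        StringComparison.OrdinalIgnoreCase))
              .Remove();

        Console.WriteLine(xmlDoc.Descendants("category").Count()); // prints 1
    }
}
```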

Creating Data

Creating data is also a simple task. You create the category element and then assign the attributes and child elements. The syntax is a little bit different than the select and delete operations but overall it is fairly straightforward:

public void InsertCategory(CategoryModel source)
{
    XDocument xmlDoc = GetCategoryXML();
    XElement category = new XElement( "category",
                            new XAttribute( "id", source.ID.ToString() ),                                    
                            new XAttribute("description", source.Description),
                            new XAttribute( "parent", source.Parent == null ? "" : source.Parent.ToString() ),
                            new XAttribute("name", source.Name),
                        new XElement( "posts" ));
    xmlDoc.Element("categories").Add( category );            
    xmlDoc.Save(GetCategoryXMLFilename());
}

Updating Data

To update data you first need to find the correct element in the XML document. Once you have the element selected, you can update its attributes and child values by assigning to their Value properties. After making the updates, save the document:

public void UpdateCategory(CategoryModel source)
{
    XDocument xmlDoc = GetCategoryXML();
    var category = ( from x in xmlDoc.Descendants("category")
                   where
                      x.Attribute("id").Value.Equals(source.ID.ToString(), StringComparison.OrdinalIgnoreCase)
                   select
                      x ).Single();
    category.Attribute("description").Value = source.Description;
    category.Attribute("name").Value = source.Name;
    xmlDoc.Save(GetCategoryXMLFilename());
}
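
UpdateCategory leaves the parent attribute untouched, and assigning to category.Attribute("...").Value throws a NullReferenceException if the attribute does not exist. One possible alternative, sketched here on an in-memory element rather than the real file, is XElement.SetAttributeValue. The class name UpdateSketch and the sample values are made up for the example.

```csharp
using System;
using System.Xml.Linq;

public static class UpdateSketch
{
    public static void Main()
    {
        // Sample element that has no parent attribute yet.
        XElement category = XElement.Parse(
            "<category id=\"c223353c-1aef-4a46-afd1-cb61ab1a792d\" description=\"\" name=\"Tech\" />");

        Guid? parent = Guid.Parse("19770e74-9ec9-4cde-b2ab-e5051aaaf348");

        // SetAttributeValue adds the attribute when it is missing and overwrites it otherwise.
        // An empty string is written for "no parent", matching the file format above.
        category.SetAttributeValue("parent", parent.HasValue ? parent.Value.ToString() : "");
        category.SetAttributeValue("name", "Technology");

        Console.WriteLine(category);
    }
}
```

Note that passing null to SetAttributeValue removes the attribute entirely, which is why the sketch writes an empty string instead.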

As I mentioned before, XML is not my favorite format but LINQ to XML at least makes the process bearable. Hopefully this brief introduction will make you feel better about XML parsing. Maybe it will even save you from pulling your hair out!

Comments

Steve Belgium, on 3/8/2010 7:55:41 AM Said:

Always use the cast operators when working with XAttribute; they handle both null and type conversion.

string name = (string)element.Attribute("Name"); // null if there is no Name attribute, otherwise the string
int age = (int)element.Attribute("Age"); // throws ArgumentNullException if the attribute is missing, otherwise the value
int? age = (int?)element.Attribute("Age"); // null if there is no Age attribute, otherwise the value
int age = (int?)element.Attribute("Age") ?? 0; // 0 if there is no Age attribute, otherwise the value
etc.

Nathan Lim United States, on 5/3/2010 1:47:09 PM Said:

Thank you for this helpful article.

I am working with a different kind of XML string that I need to parse using LINQ.
I don't think your example above covers this type of XML layout.

Here's the XML string:

<LOCALDISK>
<DISKS Drive="C:">
      <Param Description="Local Fixed Disk" TimeStamp="4/26/2010 2:20:11 AM" />
      <Param Compressed="No" TimeStamp="4/26/2010 2:20:11 AM" />
      <Param FileSystem="NTFS" TimeStamp="4/26/2010 2:20:11 AM" />
      <Param TotalSpace="149.05GB (160039239680bytes)" TimeStamp="4/26/2010 2:20:11 AM" />
</DISKS>
</LOCALDISK>

The question is, what is the best way to read the Description, Compressed, FileSystem and TotalSpace attributes of the <Param /> sections in the above XML string?

I can do the following, for instance:

class Param
{
public string Description { get; set; }
public string Compressed { get; set; }
public string FileSystem { get; set; }
public string TotalSpace { get; set; }
};

public class DiskDrives
{
public string Drive { get; set; }
}


XDocument xdoc = XDocument.Parse(theString);

// The LINQ code below will get the Drive="C:" attribute of DISKS section

List<DiskDrives> LocalDisks =
(from drives in xdoc.Descendants("DISKS")
select new DiskDrives
{
Drive = drives.Attribute("Drive").Value,
}).ToList<DiskDrives>();

The question is, how do I get the attributes of <Param />?

The following code does not quite work (but I think I am close):

foreach (var drives in LocalDisks)
{
  Console.WriteLine("Drive: " + drives.Drive.ToString());
  var Parameters = from theparam in xdoc.Descendants("DISKS")
      where (theparam.Attribute("Drive").Value == drives.Drive)
   select new Param
   {
      Description = theparam.Element("Param").Attribute("Description").Value,

     // Note, I had to comment the next two lines out as they were causing exceptions.
     // My intent is to capture the rest of the attributes in the above XML string

     // Compressed = theparam.Element("Param").Attribute("Compressed").Value,
     //FileSystem = theparam.Element("Param").Attribute("FileSystem").Value,
   };

   foreach (var item in Parameters)
   {
    Console.WriteLine("Item: " + item.Description); // I get the correct value here "Local Fixed Disk"
    Console.WriteLine("Item: " + item.Compressed); // The value should be "No"
    Console.WriteLine("Item: " + item.FileSystem); // The value should be "NTFS"
   }
}


In the above code, the commented out lines cause exceptions.

I am able to get the Description attribute of the first <Param /> section, but do not know how I can get the other attributes programmatically (e.g., the Compressed and FileSystem attributes).

Your help/advice regarding this matter will be highly appreciated.

Michael Ceranski United States, on 5/3/2010 10:58:52 PM Said:

Hi Limnath,
   You would probably want to do a foreach loop over all the elements named Param. I think it would be something like:

foreach( var p in xdoc.Descendants("Param") ){
  Console.WriteLine( p.Attribute("Compressed").Value );
  Console.WriteLine( p.Attribute("FileSystem").Value );
  ...
}

I wrote this off the top of my head so it may need to be tweaked a little bit. In any case I hope I answered your question.

Steve Belgium, on 5/4/2010 3:20:35 AM Said:

It's not really a well-designed XML file, but if you use the code from the previous comment you'll get a NullReferenceException while looping through the Param nodes, because p.Attribute("Compressed") returns null when the attribute doesn't exist on a given node.

So...

var compressed = (string)p.Attribute("Compressed");
if (compressed != null)
  item.Compressed = compressed;
