C# – Loose merge of XML documents

c#, linq-to-xml, xml

I've got two documents – one is a custom XML file format, the other is an RSS feed with a bunch of custom extensions. I want to fill in fields in the XML file with values found in the RSS feed when one element value matches.

This is for an offline process that will be run a few times manually – it doesn't need to perform well, be all that fault tolerant, etc. Manual labor or intervention is fine.

My master XML document looks like this:

    <videos>
        <video>
            <title>First Video</title>
            <code>AAA123</code>
            <id>decaf-decaf-decaf-decaf</id>
            <description>lots of text here...</description>
        </video>
        <video>
            <title>Second Video with no code</title>
            <code></code>
            <id>badab-badab-badab-badab</id>
            <description>lots of text here...</description>
        </video>
    </videos>

The RSS feed is standard RSS with some extra fields:

    <ns:code>AAA123</ns:code>
    <ns:type>Awesome</ns:type>
    <ns:group>Wonderful</ns:group>

I'd like to pull the extra fields from the RSS document into the XML document when the <ns:code> value in the feed matches the <code> value in the XML file:

    <videos>
        <video>
            <title>First Video</title>
            <code>AAA123</code>
            <id>decaf-decaf-decaf-decaf</id>
            <description>lots of text here...</description>
            <type>Awesome</type>
            <group>Wonderful</group>
        </video>
        <video>
            <title>Second Video with no code</title>
            <code></code>
            <id>badab-badab-badab-badab</id>
            <description>lots of text here...</description>
            <type></type>
            <group></group>
        </video>
    </videos>

I'd most like to use C#, LINQ, or some kind of Excel-fu. I guess if I had to I could deal with XSLT as long as it doesn't involve me writing much XSLT myself.

I looked at this question, but it didn't seem all that helpful for what I'm trying to do:
Merge XML documents

Best Answer

Sounds like a job for LINQ to XML!

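    // Requires: using System; using System.Linq;
    //           using System.Xml.Linq; using System.Xml.XPath;
    XNamespace ns = "http://test.com"; // namespace bound to the "ns" prefix in the feed

    var vidDoc = XDocument.Parse(vidXml);
    var rssDoc = XDocument.Parse(rssXml);
    var videos = vidDoc.XPathSelectElements("/videos/video");
    var rssItems = rssDoc.XPathSelectElements("/rss/channel/item");

    // Inner join on the code value. The (string) cast yields null instead of
    // throwing when an element is missing, and Join skips null keys.
    var matches = videos.Join(
        rssItems,
        video => (string)video.Element("code"),
        rssItem => (string)rssItem.Element(ns + "code"),
        (video, item) => new { video, item })
        .ToList();

    foreach (var match in matches)
    {
        // Every extension element except the join key itself
        var children = match.item.Elements()
            .Where(child => child.Name.Namespace == ns &&
                            child.Name.LocalName != "code");

        foreach (var child in children)
        {
            // Copy into the video without the namespace, leaving the feed untouched
            match.video.Add(new XElement(child.Name.LocalName, child.Nodes()));
        }
    }

    vidDoc.Save(Console.Out);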

The above solution assumes that the RSS document looks something like this:

    <rss xmlns:ns="http://test.com" version="2.0">
      <channel>
        <item>
          <title>AAA123</title>
          <link>http://test.com/AAA123</link>
          <pubDate>Sun, 26 Jul 2009 23:59:59 -0800</pubDate>
          <ns:code>AAA123</ns:code>
          <ns:type>Awesome</ns:type>
          <ns:group>Wonderful</ns:group>
        </item>
      </channel>
    </rss>
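
The desired output in the question also shows empty <type> and <group> elements on videos that have no match, which an inner join leaves untouched. Here is a minimal sketch of a left-join variant. It assumes the extension namespace is http://test.com, as in the sample feed, and that the fields to copy (type and group) are known up front. The names VideoMerger, Merge, Ns and Fields are made up for illustration.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class VideoMerger
{
    // Assumption: the namespace URI bound to the "ns" prefix in the sample feed.
    private static readonly XNamespace Ns = "http://test.com";

    // Assumption: the extension fields to copy are known in advance.
    private static readonly string[] Fields = { "type", "group" };

    public static XDocument Merge(string vidXml, string rssXml)
    {
        var vidDoc = XDocument.Parse(vidXml);
        var rssDoc = XDocument.Parse(rssXml);

        // Index feed items by code. Items with a missing or empty code are
        // skipped, and the first item wins if a code appears more than once.
        var itemsByCode = rssDoc.Descendants("item")
            .Where(item => !string.IsNullOrEmpty((string)item.Element(Ns + "code")))
            .GroupBy(item => (string)item.Element(Ns + "code"))
            .ToDictionary(g => g.Key, g => g.First());

        foreach (var video in vidDoc.Root.Elements("video"))
        {
            var code = (string)video.Element("code");
            XElement item = null;
            if (!string.IsNullOrEmpty(code))
            {
                itemsByCode.TryGetValue(code, out item);
            }

            foreach (var field in Fields)
            {
                // Copy the value when there is a match; otherwise add an
                // element with empty text so every video has the same shape.
                var value = (string)item?.Element(Ns + field) ?? string.Empty;
                video.Add(new XElement(field, value));
            }
        }

        return vidDoc;
    }
}
```

With the two sample documents loaded into strings, something like VideoMerger.Merge(vidXml, rssXml).Save(Console.Out) should print the merged result. Videos with an empty code are never matched, even if a feed item also has an empty code.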