R – Filter XML based on another xml using XSLT 1

filterxmlxslt

How do we filter an xml document based on another xml document. I have to remove all the elements which are not there in the lookup xml. Both the input xml and lookup xml has the same root elements, we are using XSLT 1.0.

Ex Input

<Root>
    <E1 a="1">V1</E1>
    <E2>V2</E2>
    <E3>V3</E3>
    <E5>
       <SE51>SEV1</SE51>    
       <SE52>SEV2</SE52>    
    </E5>
    <E6>
       <SE61>SEV3</SE61>    
       <SE62>SEV4</SE62>    
    </E6>
</Root>

Filter Xml

<Root>
    <E1 a="1"></E1>
    <E2></E2>
    <E5>
       <SE51></SE51>    
       <SE52></SE52>    
    </E5>
</Root>

Expected Output

<Root>
    <E1 a="1">V1</E1>
    <E2>V2</E2>
    <E5>
       <SE51>SEv1</SE51>    
       <SE52>SEV2</SE52>    
    </E5>
</Root>

Best Solution

Here is the required transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:z="inline:text.xml"
 exclude-result-prefixes="z"
 >
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <z:filter>
        <Root>
            <E1 a="1"></E1>
            <E2></E2>
            <E5>
                <SE51></SE51>
                <SE52></SE52>
            </E5>
        </Root>
    </z:filter>

    <xsl:variable name="vFilter" select=
     "document('')/*/z:filter"/>

    <xsl:template match="/">
      <xsl:apply-templates select="*[name()=name($vFilter/*)]">
        <xsl:with-param name="pFiltNode" select="$vFilter/*"/>
      </xsl:apply-templates>
    </xsl:template>

    <xsl:template match="*">
      <xsl:param name="pFiltNode"/>

      <xsl:copy>
       <xsl:copy-of select="@*"/>

       <xsl:for-each select="text() | *">
         <xsl:choose>
           <xsl:when test="self::text()">
             <xsl:copy-of select="."/>
           </xsl:when>
           <xsl:otherwise>
            <xsl:variable name="vFiltNode"
                 select="$pFiltNode/*[name()=name(current())]"/>

            <xsl:apply-templates select="self::node()[$vFiltNode]">
              <xsl:with-param name="pFiltNode" select="$vFiltNode"/>
            </xsl:apply-templates>
           </xsl:otherwise>
         </xsl:choose>
       </xsl:for-each>
      </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the following XML document (the original one plus the addition of <SE511>SEV11</SE511> to demonstrate that the filtering works on any level:

<Root>
    <E1 a="1">V1</E1>
    <E2>V2</E2>
    <E3>V3</E3>
    <E5>
        <SE51>SEV1</SE51>
        <SE511>SEV11</SE511>
        <SE52>SEV2</SE52>
    </E5>
    <E6>
        <SE61>SEV3</SE61>
        <SE62>SEV4</SE62>
    </E6>
</Root>

the wanted result is produced:

<Root>
    <E1 a="1">V1</E1>
    <E2>V2</E2>
    <E3>V3</E3>
    <E5>
        <SE51>SEV1</SE51>
        <SE511>SEV11</SE511>
        <SE52>SEV2</SE52>
    </E5>
    <E6>
        <SE61>SEV3</SE61>
        <SE62>SEV4</SE62>
    </E6>
</Root>

Do notice the following details of this solution:

  1. Templates are applied only to elements that have a matching node in the filter-document and also to all text nodes of such elements.
  2. The template that matches an element is passed as parameter the corresponding node in the filter-document.
  3. When applying templates to an element-child, its corresponding node is found and passed as the expected parameter.

Do enjoy!