<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Thomas Johansen&#039;s Blog &#187; Uncategorized</title>
	<atom:link href="http://thomasjo.com/blog/archive/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://thomasjo.com/blog</link>
	<description>A monologue on the creation of highly opinionated software</description>
	<lastBuildDate>Tue, 04 Aug 2009 21:46:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>XMLDSIG in the .NET framework</title>
		<link>http://thomasjo.com/blog/archive/xmldsig-in-the-net-framework/</link>
		<comments>http://thomasjo.com/blog/archive/xmldsig-in-the-net-framework/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 21:25:26 +0000</pubDate>
		<dc:creator>thomasjo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://thomasjo.com/blog/archive/xmldsig-in-the-net-framework/</guid>
		<description><![CDATA[I was recently given the task on one of my projects at work, to implement a new version of a digital signature solution that we use to get legally binding “signatures” from users. As part of the upgrade process, I &#8230; <a href="http://thomasjo.com/blog/archive/xmldsig-in-the-net-framework/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>On one of my projects at work, I was recently given the task of implementing a new version of a digital signature solution that we use to get legally binding “signatures” from users. As part of the upgrade process, I had to implement support for <a href="http://www.w3.org/TR/2008/REC-xmldsig-core-20080610/" target="_blank">XMLDSIG</a>.</p>
<p>To my great joy, I discovered that the .NET framework has supported XMLDSIG for years. I quickly ran into problems, though, and all of the documentation I found online, including the official MSDN documentation covering the XMLDSIG support, was either lacking or incorrect.</p>
<h3>What is XMLDSIG?</h3>
<p>Before I get to the code and the problems I encountered, I’ll briefly explain the concept of XMLDSIG. <a href="http://www.w3.org/Signature/Drafts/WD-xmldsig-core-20000114/" target="_blank">XMLDSIG is an old standard</a> in Internet years, and it is seemingly accepted as the best and easiest way of digitally signing XML documents.</p>
<p>The signature can be distributed in three different variants:</p>
<ol>
<li><a href="http://www.w3.org/TR/2008/REC-xmldsig-core-20080610/#def-SignatureEnveloped" target="_blank">Enveloped signature</a> – the signature is added to the document that was signed. </li>
<li><a href="http://www.w3.org/TR/2008/REC-xmldsig-core-20080610/#def-SignatureEnveloping" target="_blank">Enveloping signature</a> – the signature contains the document that was signed. </li>
<li><a href="http://www.w3.org/TR/2008/REC-xmldsig-core-20080610/#def-SignatureDetached" target="_blank">Detached signature</a> – the signature is distributed separately from the document that was signed. </li>
</ol>
<p>The differences are rather subtle, but there are many transformations that can be applied to the document prior to signing, and only the right combinations produce valid signatures. That is one of the problems I encountered with the <a href="http://msdn.microsoft.com/en-us/library/system.security.cryptography.xml.signedxml.aspx" target="_blank">problematic MSDN documentation</a>.</p>
<h3>Enveloping != Enveloped</h3>
<p>The problem with the MSDN documentation, and virtually every other example of doing XMLDSIG in .NET, is that they are built only around the “enveloped signature” variant, even when they tell you they are showing you an example of the “enveloping signature” variant. Either the authors of the examples have misunderstood the XMLDSIG specification, or they have mistakenly used the word “enveloping” when they should have used “enveloped”.</p>
<p>Although most, if not all, of the authors tell you they are showing you an example of the enveloping variant, they are instead using some freakish hybrid variant, and the only reason the sample actually works is that they are combining the enveloped and enveloping variants. Any attempt to validate the signature without the context of the parent document will fail.</p>
<p>One approach for generating valid enveloping signatures is to use a different transform that is designed to work with the enveloping variant. The transform I ended up using was the <a href="http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/" target="_blank">Exclusive XML Canonicalization</a> transform, as it lends itself very well to extracting the enveloped document and using it in another context.</p>
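<pre class="brush: csharp;">using System;
using System.Security.Cryptography.X509Certificates;
using System.Security.Cryptography.Xml;
using System.Xml;
using System.Xml.Linq;

public static class XmlDsig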
{
    private const LoadOptions SafeLoadOptions = LoadOptions.PreserveWhitespace;
    private const SaveOptions SafeSaveOptions = SaveOptions.DisableFormatting;

    public static XDocument SignDocument(XDocument originalDocument, X509Certificate2 certificate)
    {
        if (originalDocument.Root == null) {
            throw new ArgumentException(&quot;Invalid XML document; no root element found.&quot;, &quot;originalDocument&quot;);
        }

        SignedXml signature = GetSignature(originalDocument, certificate);
        XDocument signatureDocument = GetSignedDocument(signature);

        VerifySignature(signatureDocument, certificate);

        return signatureDocument;
    }

    private static SignedXml GetSignature(XNode originalDocument, X509Certificate2 certificate)
    {
        XmlDocument document = GetXmlDocument(originalDocument);
        if (document.DocumentElement == null) {
            throw new InvalidOperationException(&quot;Invalid XML document; no root element found.&quot;);
        }

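        var signedXml = new SignedXml(document);
        // Enveloping: the signed content lives inside the signature, wrapped in an Object element with the ID &quot;message&quot;.
        var dataObject = new DataObject(&quot;message&quot;, &quot;&quot;, &quot;&quot;, document.DocumentElement);

        signedXml.AddReference(GetSignatureReference());
        signedXml.AddObject(dataObject);
        signedXml.SigningKey = certificate.PrivateKey;
        signedXml.KeyInfo = GetCertificateKeyInfo(certificate);
        // Canonicalize SignedInfo exclusively as well, so it does not pick up namespace declarations from ancestor elements.
        signedXml.SignedInfo.CanonicalizationMethod = SignedXml.XmlDsigExcC14NTransformUrl;
        signedXml.ComputeSignature();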

        return signedXml;
    }

    private static XmlDocument GetXmlDocument(XNode originalDocument)
    {
        var document = new XmlDocument { PreserveWhitespace = true };
        document.LoadXml(originalDocument.ToString(SafeSaveOptions));

        return document;
    }

    private static Reference GetSignatureReference()
    {
        var signatureReference = new Reference(&quot;#message&quot;);
        signatureReference.AddTransform(new XmlDsigExcC14NTransform());

        return signatureReference;
    }

    private static KeyInfo GetCertificateKeyInfo(X509Certificate certificate)
    {
        var certificateKeyInfo = new KeyInfo();
        certificateKeyInfo.AddClause(new KeyInfoX509Data(certificate));

        return certificateKeyInfo;
    }

    private static XDocument GetSignedDocument(SignedXml signedXml)
    {
        string signatureXml = signedXml.GetXml().OuterXml;
        XDocument signedDocument = XDocument.Parse(signatureXml, SafeLoadOptions);

        return signedDocument;
    }

    private static void VerifySignature(XNode signedDocument, X509Certificate2 certificate)
    {
        var document = new XmlDocument { PreserveWhitespace = true };
        document.LoadXml(signedDocument.ToString(SafeSaveOptions));
        if (document.DocumentElement == null) {
            throw new InvalidOperationException(&quot;Invalid XML document; no root element found.&quot;);
        }

        var signedXml = new SignedXml(document);
        signedXml.LoadXml(document.DocumentElement);
        // Passing true verifies the signature only; the certificate chain is not validated.
        if (!signedXml.CheckSignature(certificate, true)) {
            throw new InvalidOperationException(&quot;Signature is invalid.&quot;);
        }
    }
}</pre>
<p>I had to make one more small adjustment to get everything to work correctly: explicitly setting the canonicalization method. Changing the transform also solved another problem I encountered, the inability to reference the object elements by URI ID. The default behavior when using the enveloped variant is to look for elements matching the URI ID within the document being signed, instead of within the signature.</p>
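<p>Since the enveloping signature carries the signed document inside it, getting the document back out is mostly a matter of finding the right Object element. Here is a minimal sketch of how that could look with LINQ to XML. The EnvelopingSignatureReader class and its ExtractSignedContent method are names I made up for illustration, and the sketch assumes the signature was produced by the code above, so the content sits in an Object element with the ID “message”. It does not verify anything, so the result should only be trusted after the signature has been checked.</p>
<pre class="brush: csharp;">using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the XmlDsig class above.
public static class EnvelopingSignatureReader
{
    private static readonly XNamespace Ds = &quot;http://www.w3.org/2000/09/xmldsig#&quot;;

    public static XElement ExtractSignedContent(XDocument signatureDocument, string objectId)
    {
        XElement signature = signatureDocument.Root;
        if (signature == null || signature.Name != Ds + &quot;Signature&quot;) {
            throw new ArgumentException(&quot;The root element is not an XMLDSIG Signature.&quot;, &quot;signatureDocument&quot;);
        }

        // Find the Object element whose Id matches the reference, e.g. &quot;message&quot; for &quot;#message&quot;.
        XElement dataObject = signature
            .Elements(Ds + &quot;Object&quot;)
            .FirstOrDefault(o =&gt; (string) o.Attribute(&quot;Id&quot;) == objectId);
        if (dataObject == null) {
            throw new InvalidOperationException(&quot;No Object element with the requested Id was found.&quot;);
        }

        XElement content = dataObject.Elements().FirstOrDefault();
        if (content == null) {
            throw new InvalidOperationException(&quot;The Object element does not contain an element.&quot;);
        }

        // Return a detached copy so the caller cannot alter the signature document by accident.
        return new XElement(content);
    }
}</pre>
<p>With the sample above, the call would be something like ExtractSignedContent(signedDocument, &quot;message&quot;).</p>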
<h3>But what if I want to use the “enveloped signature” variant?</h3>
<p>If you don’t want the variant I needed (enveloping), changing the code sample above to produce signatures of the enveloped kind is trivial.<br />
  <br />First, remove the following two lines:</p>
<pre class="brush: csharp;">signedXml.AddObject(dataObject);
signedXml.SignedInfo.CanonicalizationMethod = SignedXml.XmlDsigExcC14NTransformUrl;</pre>
<p>The next step is to change the GetSignatureReference method; we need to replace the transform implementation with one that is suitable for the enveloped signature.</p>
<pre class="brush: csharp;">private static Reference GetSignatureReference()
{
    // An empty URI refers to the whole document that contains the signature.
    var signatureReference = new Reference(&quot;&quot;);
    signatureReference.AddTransform(new XmlDsigEnvelopedSignatureTransform());

    return signatureReference;
}</pre>
<p>We also need to add an extra argument to the GetSignedDocument method, so that we can pass in the original document. The call in SignDocument has to be updated to pass originalDocument as the first argument.</p>
<pre class="brush: csharp;">private static XDocument GetSignedDocument(XNode originalDocument, SignedXml signedXml)
{
    string signatureXml = signedXml.GetXml().OuterXml;
    XElement signatureElement = XElement.Parse(signatureXml, SafeLoadOptions);
    XDocument signedDocument = XDocument.Load(originalDocument.CreateReader(), SafeLoadOptions);
    if (signedDocument.Root == null) {
        throw new InvalidOperationException(&quot;Invalid XML document; no root element found.&quot;);
    }

    signedDocument.Root.Add(signatureElement);

    return signedDocument;
}</pre>
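<p>One thing to watch out for: the VerifySignature method from the first sample loads the root element as the signature, and that no longer holds once the signature sits inside the original document. Below is a sketch of how the verification could be adapted. The EnvelopedSignatureVerifier class and the IsSignatureValid method are hypothetical names of my own, and the sketch assumes the document contains exactly one signature.</p>
<pre class="brush: csharp;">using System;
using System.Security.Cryptography.X509Certificates;
using System.Security.Cryptography.Xml;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for illustration; assumes exactly one Signature element in the document.
public static class EnvelopedSignatureVerifier
{
    public static bool IsSignatureValid(XNode signedDocument, X509Certificate2 certificate)
    {
        var document = new XmlDocument { PreserveWhitespace = true };
        document.LoadXml(signedDocument.ToString(SaveOptions.DisableFormatting));

        // Look the signature up by namespace instead of assuming it is the root element.
        XmlNodeList signatures = document.GetElementsByTagName(&quot;Signature&quot;, SignedXml.XmlDsigNamespaceUrl);
        if (signatures.Count != 1) {
            throw new InvalidOperationException(&quot;Expected exactly one Signature element.&quot;);
        }

        var signedXml = new SignedXml(document);
        signedXml.LoadXml((XmlElement) signatures[0]);

        // Passing true verifies the signature only; the certificate chain is not validated.
        return signedXml.CheckSignature(certificate, true);
    }
}</pre>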
<p>If you spot any errors, please let me know, so that there can exist at least one correct example of using XMLDSIG in the .NET framework.</p>
]]></content:encoded>
			<wfw:commentRss>http://thomasjo.com/blog/archive/xmldsig-in-the-net-framework/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A pessimistic HTML sanitizer</title>
		<link>http://thomasjo.com/blog/archive/a-pessimistic-html-sanitizer/</link>
		<comments>http://thomasjo.com/blog/archive/a-pessimistic-html-sanitizer/#comments</comments>
		<pubDate>Wed, 20 May 2009 21:18:59 +0000</pubDate>
		<dc:creator>thomasjo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[This post is quite a few days overdue; I&#8217;ve been a little caught up with planning and implementing a few more features for the blog. Most of them are under the covers, and some not yet finished, but one feature that is &#8230; <a href="http://thomasjo.com/blog/archive/a-pessimistic-html-sanitizer/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>This post is quite a few days overdue; I&#8217;ve been a little caught up with planning and implementing a few more features for the blog. Most of them are under the covers, and some not yet finished, but one feature that is visible, at least to a certain extent, is the sanitizing that is now being performed on comments. Before the feature was implemented, I HTML-encoded the comments to protect users against XSS attacks. This works well, but completely prevents users from using tags such as <code>&lt;strong&gt;</code>, <code>&lt;em&gt;</code> and <code>&lt;blockquote&gt;</code>.</p>
<p>To solve this &#8220;problem&#8221;, I set out to find a simple and safe way of sanitizing comments before storing them in the database, and thus being relatively certain that they can be rendered safely without the need to encode them.<br />
At first I looked for code snippets, and I did find a few; the most notable ones were in a <a href="http://refactormycode.com/codes/333-sanitize-html">&#8220;thread&#8221;</a> on <a href="http://refactormycode.com/">RefactorMyCode.com</a> started by <a href="http://www.codinghorror.com/blog/">Jeff Atwood</a>. Many of the code snippets looked promising at first, but nearly all of them shared one common flaw: their use of regular expressions to &#8220;parse&#8221; the targeted HTML. Several of the snippets had flawed expressions that allowed a potential attacker to circumvent the sanitizer quite easily.</p>
<p>Continuing my search, I remembered having heard about an OSS DOM parser project a year ago, and searching through some old emails (gotta love Gmail), I found the email that mentioned it: <a href="http://www.codeplex.com/htmlagilitypack">HtmlAgilityPack</a>. At first I was a bit discouraged by the lack of activity within the project for the past two-and-a-half years or so, but after being unable to find a decent alternative, I decided to give it a go and fix any potential bugs as I discovered them. So far, I&#8217;ve been unable to find any bugs that have affected the ability to sanitize HTML.</p>
<p>The biggest advantage of using a DOM parser over mere regular expressions is that the DOM parser can&#8217;t be fooled as easily. It also has knowledge of concepts such as self-closing tags, and it can handle unclosed tags properly. Not handling unclosed tags properly is the most common flaw in regular expressions that attempt to &#8220;parse&#8221; HTML.<br />
Using <a href="http://www.codeplex.com/htmlagilitypack">HtmlAgilityPack</a> also gave me another advantage, and that is a fairly rich object model for describing blocks of HTML.</p>
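<p>To make the unclosed tag problem a bit more concrete, here is a small sketch. The pattern is a naive one I made up for illustration and is not taken from any of the snippets mentioned above; <code>NaiveTagStripper</code> is a hypothetical name as well. The pattern only matches a tag that has a closing <code>&gt;</code>, so an unclosed tag slips through untouched.</p>
<pre class="brush:csharp">using System;
using System.Text.RegularExpressions;

// Hypothetical example of the flaw; do not use this as a sanitizer.
public static class NaiveTagStripper
{
    // Matches "&lt;", anything that is not "&gt;", and then a closing "&gt;".
    private static readonly Regex TagPattern = new Regex("&lt;[^&gt;]*&gt;");

    public static string Strip(string input)
    {
        return TagPattern.Replace(input, string.Empty);
    }
}

public static class Demo
{
    public static void Main()
    {
        // Well-formed tags are removed: prints "hi".
        Console.WriteLine(NaiveTagStripper.Strip("&lt;strong&gt;hi&lt;/strong&gt;"));

        // The tag is never closed, so nothing matches and the input is printed unchanged.
        Console.WriteLine(NaiveTagStripper.Strip("&lt;img src=x onerror=alert(1) "));
    }
}</pre>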
<p>As I started developing the sanitizer, I had to choose whether to blacklist or whitelist tags and attributes. After about 10 seconds of pondering, I decided to go with whitelisting. By opting for whitelisting, the risk of being the victim of an attack vector you don&#8217;t know about is significantly reduced.</p>
<pre class="brush:csharp">using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

namespace Wayloop.Blog.Core.Markup
{
    public static class HtmlSanitizer
    {
        private static readonly IDictionary&lt;string, string[]&gt; Whitelist;

        static HtmlSanitizer()
        {
            // Allowed tags mapped to their allowed attributes; null means no attributes are allowed.
            Whitelist = new Dictionary&lt;string, string[]&gt; {
                { "a", new[] { "href" } },
                { "strong", null },
                { "em", null },
                { "blockquote", null },
            };
        }

        public static string Sanitize(string input)
        {
            var htmlDocument = new HtmlDocument();

            htmlDocument.LoadHtml(input);
            SanitizeNode(htmlDocument.DocumentNode);

            return htmlDocument.DocumentNode.WriteTo().Trim();
        }

        private static void SanitizeChildren(HtmlNode parentNode)
        {
            for (int i = parentNode.ChildNodes.Count - 1; i &gt;= 0; i--) {
                SanitizeNode(parentNode.ChildNodes[i]);
            }
        }

        private static void SanitizeNode(HtmlNode node)
        {
            if (node.NodeType == HtmlNodeType.Element) {
                if (!Whitelist.ContainsKey(node.Name)) {
                    // Pessimistic: a disallowed element is dropped together with everything inside it.
                    node.ParentNode.RemoveChild(node);
                    return;
                }

                if (node.HasAttributes) {
                    // A null entry means the tag is allowed but none of its attributes are.
                    string[] allowedAttributes = Whitelist[node.Name] ?? new string[0];
                    for (int i = node.Attributes.Count - 1; i &gt;= 0; i--) {
                        HtmlAttribute currentAttribute = node.Attributes[i];
                        if (!allowedAttributes.Contains(currentAttribute.Name)) {
                            node.Attributes.Remove(currentAttribute);
                        }
                    }
                }
            }

            if (node.HasChildNodes) {
                SanitizeChildren(node);
            }
        }
    }
}</pre>
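<p>One thing the whitelist above does not look at is the value of the <code>href</code> attribute, so a link such as <code>javascript:alert(1)</code> would pass through unchanged. A possible way of closing that gap, in the same pessimistic spirit, is to whitelist URL schemes as well. The sketch below is only an illustration: <code>HrefPolicy</code>, <code>IsSafeHref</code> and the list of allowed schemes are assumptions of mine, and relative URLs are rejected on purpose to keep things simple.</p>
<pre class="brush:csharp">using System;

// Hypothetical helper for illustration; not part of the sanitizer above.
public static class HrefPolicy
{
    // Assumed set of acceptable schemes; adjust to taste.
    private static readonly string[] AllowedSchemes = { "http", "https", "mailto" };

    public static bool IsSafeHref(string value)
    {
        if (string.IsNullOrEmpty(value)) {
            return false;
        }

        // Anything that does not parse as an absolute URI is rejected, including relative URLs.
        Uri uri;
        if (!Uri.TryCreate(value.Trim(), UriKind.Absolute, out uri)) {
            return false;
        }

        // Uri.Scheme is always lower case, so "JaVaScRiPt:" is compared as "javascript".
        return Array.IndexOf(AllowedSchemes, uri.Scheme) &gt;= 0;
    }
}</pre>
<p>The attribute loop in <code>SanitizeNode</code> could then remove an <code>href</code> attribute whenever <code>IsSafeHref</code> returns <code>false</code> for its value.</p>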
]]></content:encoded>
			<wfw:commentRss>http://thomasjo.com/blog/archive/a-pessimistic-html-sanitizer/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
