.NET Web Services Fail on Unicode Control chars
Monday, February 9, 2009 at 12:49
Did you know that any string containing a Unicode control character has a nasty effect when passed through web services?
XML serialization successfully serializes these strings but fails to deserialize them:
Bad Request
There is an error in XML document
'\v', hexadecimal value 0x0B, is an invalid character.
As a result, ASP.NET Web Services throw a Bad Request exception whenever such a string is passed to them. The exception itself carries no meaningful information.
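The failure can be reproduced outside a web service with a short round trip through XmlSerializer. This is a minimal sketch, not code from the Lokad libraries: the names RoundTripDemo and TryRoundTrip are made up for illustration, and which of the two steps throws may depend on the .NET runtime version.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class RoundTripDemo
{
    // Hypothetical helper: serializes a string with XmlSerializer and reads it back.
    // Returns true only if the round trip completes and yields the same value.
    public static bool TryRoundTrip(string value)
    {
        var serializer = new XmlSerializer(typeof(string));
        try
        {
            string xml;
            using (var writer = new StringWriter())
            {
                serializer.Serialize(writer, value);
                xml = writer.ToString();
            }

            using (var reader = new StringReader(xml))
            {
                var copy = (string)serializer.Deserialize(reader);
                return copy == value;
            }
        }
        catch (InvalidOperationException ex)
        {
            // XmlSerializer wraps the underlying exception; print both messages.
            Console.WriteLine(ex.Message + " " + ex.InnerException?.Message);
            return false;
        }
    }
}
```

Calling RoundTripDemo.TryRoundTrip with an ordinary string should succeed, while a string containing '\v' (U+000B) is expected to fail with an error similar to the one quoted above.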
To prevent this issue once and for all, a simple rule was added to the Lokad Shared Libraries:
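public static readonly Rule<string> ValidForXmlSerialization = (s, scope) =>
{
    // Note: char.IsControl also flags tab, CR and LF, which XML 1.0 permits,
    // so this rule is stricter than the XML specification requires.
    for (int i = 0; i < s.Length; i++)
    {
        if (char.IsControl(s[i]))
            scope.Error("String cannot contain unicode control characters.");
    }
};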
And here's a usage sample:
internal static void ValidName(string name, IScope scope)
{
    scope.ValidateInScope(name,
        StringIs.Without(IllegalCharacters),
        StringIs.ValidForXmlSerialization);
}
This automatically added protection to all rules and methods that reference this rule.
The StringIs.ValidForXmlSerialization rule from the .NET Business Rules and Validation Application Block in Lokad Shared Libraries will be extended with stricter checks if we discover more issues in production scenarios.
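As one possible direction for such stricter checks, a string can be tested against the character ranges that XML 1.0 allows (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF). The sketch below is an assumption about what a stricter check might look like, not the Lokad implementation; XmlCharCheck and IndexOfInvalidXmlChar are hypothetical names.

```csharp
using System;

public static class XmlCharCheck
{
    // Returns the index of the first UTF-16 code unit that cannot appear in
    // an XML 1.0 document, or -1 if the whole string is acceptable.
    public static int IndexOfInvalidXmlChar(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];

            // A well-formed surrogate pair encodes U+10000..U+10FFFF, which is allowed.
            if (char.IsHighSurrogate(c) && i + 1 < s.Length && char.IsLowSurrogate(s[i + 1]))
            {
                i++; // skip the low surrogate
                continue;
            }

            // Lone surrogates, U+FFFE, U+FFFF and most C0 controls fall outside these ranges.
            bool allowed =
                c == '\t' || c == '\n' || c == '\r' ||
                (c >= '\u0020' && c <= '\uD7FF') ||
                (c >= '\uE000' && c <= '\uFFFD');

            if (!allowed) return i;
        }
        return -1;
    }
}
```

Unlike char.IsControl, this check accepts tabs and line breaks but rejects lone surrogates and U+FFFE/U+FFFF. A validation rule could report an error whenever the returned index is not -1.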
Reader Comments (5)
I can't tell from reading, but is this a symptom of all web services, or are you referring to .asmx/ WCF specifically?
Brian,
the issue applies to the XmlSerializer.
Thus, ASP.NET web services (asmx) are affected. WCF services are not affected, as long as they do not use the XmlSerializer for passing messages around.
I'm not sure about the serialization or what data you're serializing, but the web service is acting correctly when it dies trying to deserialize XML data with control characters. I've written a post about sanitizing for illegal XML characters (which control characters are) on my blog. Control characters aren't allowed in XML per the specification, which is why the web service is dying.
http://seattlesoftware.wordpress.com/2008/09/11/hexadecimal-value-0-is-an-invalid-character/
Chris,
in this sample I'm deserializing XML that was generated by the XmlSerializer itself.
I am only experiencing this problem when deserialisation occurs in Silverlight, which fails if the XML text data contains control characters.