Horkay Blog
The postings on this site are my own and do not represent my Employer's positions, advice or strategies.
Tuesday, December 18, 2012

XML is not my favorite and removing and validating malformed XML is even worse.

I'm trying to load the xml file, but it is failing. These comments make the xml invalid. The xml comes from a vendor.

I tried removing these based on approaches from other posts, but I was not successful. Here is an example of the xml:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!--MAIN VARIABLES-->
<content type="screwed">
<!--KEEP 19-39 -- SEE HELP.TXT AND THE VIDEO TUTORIALS FOR MORE INFO -->
<!--REGULAR/NON-Regular EXAMPLE --><SomeTag somefile="test.txt3" Name="test"/>
<!-- -->
</content>

I have tried the following without success:

string xmlDocFile = "c:\server\test.xml";

XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.IgnoreComments = true;
readerSettings.ProhibitDtd = false;
readerSettings.ValidationType = ValidationType.DTD;
XmlReader reader = XmlReader.Create(xmlDocFile, readerSettings);
XmlDocument myXmlDoc = new XmlDocument();
myXmlDoc.Load(reader);
myXmlDoc.Save(xmlDocFile);
The solution is before using XmlReader, parse xml file and filter comments out using regexp.
// using System.Text.RegularExpressions;
System.IO.StreamReader file= new System.IO.StreamReader(xmlDocFile);
string validXml = Regex.Replace(file.ReadToEnd(),"<!--.*?-->","");

XmlReader reader = XmlReader.Create(validXml);
 
Tuesday, December 18, 2012 10:41:45 AM (Central Standard Time, UTC-06:00) |  |  RegEx#
Search
Popular Posts
Unpatched Vulnerabiltiy discovered ...
Spring Fornicator brewed...
SQL Agent will not start when a use...
SQL Instance will not fail back to ...
Hash Join vs. Merge Join
Recent Posts
Archive
May, 2019 (1)
April, 2019 (1)
March, 2019 (1)
February, 2019 (1)
January, 2019 (1)
December, 2018 (1)
November, 2018 (3)
October, 2018 (1)
June, 2018 (1)
May, 2018 (3)
April, 2018 (2)
February, 2018 (3)
January, 2018 (3)
November, 2017 (1)
October, 2017 (1)
August, 2017 (1)
June, 2017 (2)
May, 2017 (2)
April, 2017 (2)
March, 2017 (1)
February, 2017 (1)
December, 2016 (2)
October, 2016 (2)
September, 2016 (1)
August, 2016 (1)
July, 2016 (1)
March, 2016 (2)
February, 2016 (3)
December, 2015 (4)
November, 2015 (6)
September, 2015 (1)
August, 2015 (2)
July, 2015 (1)
March, 2015 (2)
January, 2015 (1)
December, 2014 (3)
November, 2014 (1)
July, 2014 (2)
June, 2014 (2)
May, 2014 (3)
April, 2014 (3)
March, 2014 (1)
December, 2013 (1)
October, 2013 (1)
August, 2013 (1)
July, 2013 (1)
June, 2013 (2)
May, 2013 (1)
March, 2013 (3)
February, 2013 (3)
January, 2013 (1)
December, 2012 (3)
November, 2012 (1)
October, 2012 (1)
September, 2012 (1)
August, 2012 (1)
July, 2012 (4)
June, 2012 (3)
April, 2012 (1)
March, 2012 (3)
February, 2012 (3)
January, 2012 (4)
December, 2011 (3)
October, 2011 (2)
September, 2011 (2)
August, 2011 (8)
July, 2011 (4)
June, 2011 (3)
May, 2011 (3)
April, 2011 (1)
March, 2011 (2)
February, 2011 (3)
January, 2011 (1)
September, 2010 (1)
August, 2010 (2)
May, 2010 (2)
April, 2010 (3)
March, 2010 (1)
February, 2010 (4)
January, 2010 (1)
December, 2009 (3)
November, 2009 (2)
October, 2009 (2)
September, 2009 (5)
August, 2009 (4)
July, 2009 (8)
June, 2009 (2)
May, 2009 (3)
April, 2009 (9)
March, 2009 (6)
February, 2009 (3)
January, 2009 (8)
December, 2008 (8)
November, 2008 (4)
October, 2008 (14)
September, 2008 (10)
August, 2008 (7)
July, 2008 (7)
June, 2008 (11)
May, 2008 (14)
April, 2008 (12)
March, 2008 (17)
February, 2008 (10)
January, 2008 (13)
December, 2007 (7)
November, 2007 (8)
Links
Categories
Admin Login
Sign In
Blogroll