This code snippet shows how to synchronise (replace) tags between HTML documents using a regex
-My problem: I want to synchronise tags between two HTML documents
Issue
-MySource.htm has this tag
<xml id="STATIC_CRITERIA_">
  <fields>
    <field fieldID="OPEN_FLAG" displayType="checkboxGroup" searchType="checkboxGroup" underlyingType="int" />
    <field fieldID="PARTITION" displayType="dropdown" searchType="profile" profileID="SU_PARTITIONS" underlyingType="int" />
    <field fieldID="TEMPLATE_IND" displayType="checkbox" searchType="checkbox" underlyingType="int" />
    <field fieldID="INCLUDE_DELETED" displayType="checkbox" searchType="checkbox" underlyingType="string" />
  </fields>
</xml>
-Target.htm has this tag
<xml id="STATIC_CRITERIA_">
</xml>
So the problem is: how do I fill the empty xml tag in Target.htm with the content from MySource.htm? We can do this with a regex, and it is quite simple.
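// Requires: using System.IO; using System.Text.RegularExpressions;
string originalDocument = File.ReadAllText("c:\\MySource.htm");
string syncDocument = File.ReadAllText("c:\\Target.htm");
// Group 1 = attributes of the xml tag, group 2 = its inner content.
MatchCollection mc = Regex.Matches(originalDocument, "<xml([ ]?.*?)>(.*?)</xml>", RegexOptions.Singleline | RegexOptions.IgnoreCase);
foreach (Match m in mc)
{
    string token = "<xml{0}>";
    string openTag = String.Format(token, m.Groups[1].Value);
    // Regex.Escape keeps the attributes literal in the pattern; doubling '$' keeps the replacement literal.
    syncDocument = Regex.Replace(syncDocument, Regex.Escape(openTag) + "(.*?)</xml>",
        (openTag + m.Groups[2].Value + "</xml>").Replace("$", "$$"),
        RegexOptions.Singleline | RegexOptions.IgnoreCase);
}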
Regex.Matches returns a MatchCollection containing every match of the regular expression (e.g. you might have multiple xml tags with different IDs)
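To illustrate the point about multiple tags, here is a minimal sketch that runs the same pattern over an in-memory string instead of a file. The class name XmlTagFinder, the method FindAttributes and the sample ids FIRST and SECOND are made up for this example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class XmlTagFinder
{
    // Same pattern as above: group 1 = attributes, group 2 = inner content.
    private const string Pattern = "<xml([ ]?.*?)>(.*?)</xml>";

    // Returns the attribute text (group 1) of every <xml ...> block found.
    public static List<string> FindAttributes(string html)
    {
        var result = new List<string>();
        MatchCollection mc = Regex.Matches(html, Pattern,
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
        foreach (Match m in mc)
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }
}
```

Calling XmlTagFinder.FindAttributes on a string holding two xml blocks, one with id="FIRST" and one with id="SECOND", should give a list with two entries, one per block. Each entry keeps the leading space, because group 1 starts right after "<xml".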
What does this pattern mean?
"<xml([ ]?.*?)>(.*?)</xml>"
([ ]?.*?) matches an optional space followed by whatever comes after xml up to the closing > (e.g. an ID or other attributes). This is the first capture group.
(.*?) captures whatever is inside the tag, up to the first </xml>. This is the second capture group. RegexOptions.Singleline lets . match newlines, so the content may span several lines.
You can access the first group (e.g. any attributes after xml) with m.Groups[1].Value
You can access the content inside the xml tag with m.Groups[2].Value
So what does m.Groups[0].Value contain?
It contains the whole xml tag that the regex matched
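The three groups can be inspected side by side with a small sketch like the one below. GroupDemo and Describe are hypothetical names used only for illustration; the pattern and options are the ones from the snippet above.

```csharp
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Returns groups 0, 1 and 2 of the first <xml> block in the input,
    // or an empty array when the input contains no such block.
    public static string[] Describe(string html)
    {
        Match m = Regex.Match(html, "<xml([ ]?.*?)>(.*?)</xml>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
        if (!m.Success)
        {
            return new string[0];
        }
        return new[]
        {
            m.Groups[0].Value, // the whole tag, opening to closing
            m.Groups[1].Value, // the attributes after "<xml"
            m.Groups[2].Value  // the content between the tags
        };
    }
}
```

For an input such as an xml tag with id="A" wrapping a single fields element, index 0 holds the full tag, index 1 holds the attribute text and index 2 holds only the fields element.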
Then we can use the Regex.Replace method to replace the tag in the target HTML. Regex.Replace substitutes the entire match, so the replacement has to rebuild the whole xml tag (opening tag, content and closing tag); you can't just replace the inner part of it.
foreach (Match m in mc)
{
    string token = "<xml{0}>";
    string openTag = String.Format(token, m.Groups[1].Value);
    // Regex.Escape keeps the attributes literal in the pattern; doubling '$' keeps the replacement literal.
    syncDocument = Regex.Replace(syncDocument, Regex.Escape(openTag) + "(.*?)</xml>",
        (openTag + m.Groups[2].Value + "</xml>").Replace("$", "$$"),
        RegexOptions.Singleline | RegexOptions.IgnoreCase);
}
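For experimenting without files on disk, the whole approach can be wrapped in one method that works on strings. This is only a sketch of a possible variant, not the exact code above: TagSynchroniser and Sync are hypothetical names, the opening tag is passed through Regex.Escape so attribute characters such as '.' are matched literally, and a MatchEvaluator is used for the replacement so that a '$' in the copied content is not read as a substitution.

```csharp
using System.Text.RegularExpressions;

public static class TagSynchroniser
{
    private const RegexOptions Options =
        RegexOptions.Singleline | RegexOptions.IgnoreCase;

    // Copies the content of every <xml ...> block in source into the
    // block with the same attributes in target; other blocks are untouched.
    public static string Sync(string source, string target)
    {
        foreach (Match m in Regex.Matches(source, "<xml([ ]?.*?)>(.*?)</xml>", Options))
        {
            string openTag = "<xml" + m.Groups[1].Value + ">";
            string inner = m.Groups[2].Value;
            // Escape the opening tag so it is matched as literal text.
            string pattern = Regex.Escape(openTag) + "(.*?)</xml>";
            // The evaluator's return value is inserted verbatim.
            target = Regex.Replace(target, pattern,
                t => openTag + inner + "</xml>", Options);
        }
        return target;
    }
}
```

Passing the contents of MySource.htm and Target.htm to Sync should return the target text with its empty STATIC_CRITERIA_ block filled in. Because matching is done on the exact attribute text, a target tag whose attributes differ from the source (extra attribute, different spacing) is left as it is.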