- Google Doc
- Doxygen
- LaTeX
- Sphinx
For example, I want to replace <div class="math notranslate nohighlight">\r\n\\\[(.*(?=(<\/div>)))</div>
with <p class="Equa"><MadCap:equation>$ \1 $</MadCap:equation></p>
But it takes quite some knowledge of regular expressions to work out how to do this in Find and Replace, and it is not foolproof, as it may not always catch the desired string.
E.g. if I have a <div> tag inside another <div> tag, I may catch the wrong closing tag.
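To show what I mean with the nested <div> problem, here is a small Python sketch. The sample string and the class names "outer" and "inner" are made up for illustration; they are not from my real files. As far as I can tell, a lazy match stops at the inner closing tag and a greedy match runs on to the last closing tag in the text, so neither gives the content of the outer <div>.

```python
import re

# Hypothetical sample: an outer div containing an inner div, then an unrelated div.
html = '<div class="outer"><div class="inner">x</div> tail</div><div>other</div>'

# Lazy match: stops at the FIRST </div>, which closes the inner div.
lazy = re.search(r'<div class="outer">(.*?)</div>', html).group(1)

# Greedy match: runs to the LAST </div> and swallows the unrelated div.
greedy = re.search(r'<div class="outer">(.*)</div>', html).group(1)

print(lazy)
print(greedy)
```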
I read on the internet that regular expressions are not really suitable for parsing an XML/HTML document, as they cannot easily match opening tags to their closing tags.
Then I read that a DOM HTML parser is more suitable, but that seems complicated as well.
I would also like to replace a div tag of a certain class with a paragraph tag, but I can't find a way to do this. Again, doing it with a regular expression in Find and Replace is not obvious.
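For reference, this is roughly what I imagine the parser approach would look like, as a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes the input is already well-formed XML; the function name divs_to_paragraphs and the sample string are my own placeholders. Because the parser treats each element as one unit, the nesting problem should not arise.

```python
import xml.etree.ElementTree as ET

def divs_to_paragraphs(xml_text, old_class, new_class):
    """Rename every <div> carrying old_class to <p class=new_class>."""
    root = ET.fromstring(xml_text)  # raises ParseError if the input is not well-formed
    for elem in root.iter("div"):
        # The class attribute may hold several space-separated names.
        if old_class in elem.get("class", "").split():
            elem.tag = "p"  # children and text stay attached to the element
            elem.set("class", new_class)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sample: one math div nested in another div, one at top level.
sample = ('<body><div class="note"><div class="math notranslate nohighlight">x^2</div></div>'
          '<div class="math notranslate nohighlight">y</div></body>')
result = divs_to_paragraphs(sample, "math", "Equa")
print(result)
```

I have not tried this on real topic files. Files that use namespace prefixes such as MadCap: would probably also need ET.register_namespace so the prefixes survive serialization, and input that is not well-formed would need cleaning up first or a more lenient HTML parser.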
Does anybody here have experience with cleaning up dirty XML?
Best regards,
Colinda