Need help with regular expression to remove span tags

barbs
Propeller Head
Posts: 49
Joined: Thu Oct 15, 2015 3:46 pm

Need help with regular expression to remove span tags

Post by barbs »

Good morning!

I have a project that I need to strip a specific span tag from. I'll probably need to do this every 12-18 months as I prepare an offshoot deliverable of our normal project. After a little poking around here on the forums, I found this little gem.
RegEx_example.jpg
From this example, I've created my regular expression to find (?<=<span class="glossException">.+)</span>. I've run this on my whole project (a copied version) and found that it works ALMOST perfectly. The problem is that when I do a Replace All with this regular expression, the closing span tag is replaced with nothing (perfect!), but any other closing span tags that follow it on the same line of code are also replaced with nothing (not so perfect). For instance, in this line, all of the closing span tags would be removed, and I don't want to remove the closing span tag for the second one:

<p>When you run the <b>Print</b> command or export a printer code <span class="glossException">template</span>, the contents of the print code is stored in an internal object that is called <span class="Code">PCM</span>. To access the print code in a VBScript, you must refer to this object, read each line of the print code, modify the contents of the line as necessary, and then rewrite each line to produce a final print code output. </p>

I don't know regular expressions, so I'm not sure how or even if this can be fixed. I only want to remove the single closing span tag immediately following the span that I searched for.

Thanks for taking a look and for any suggestions!
Barb
ChoccieMuffin
Senior Propellus Maximus
Posts: 2632
Joined: Wed Apr 14, 2010 8:01 am
Location: Surrey, UK

Re: Need help with regular expression to remove span tags

Post by ChoccieMuffin »

Know what you mean!

I don't do that kind of thing in Flare but in an external search and replace tool - I use FAR HTML for things like that. Here's an example of what I would use in FAR to replace "<span class="glossException">template</span>" with "template" (or whatever's inside those two span markers):

[Remove_glossException_span]
FindStart='<span class="glossException">'
FindEnd='</span>'
Replace='$A$'
$A$.FindSubStrNo=1
$A$.ContainingText=
$A$.NotContainingText=
$A$.StartText=<span class="glossException">
$A$.EndText=</span>
$A$.IncStartEndText=n

What this does is look for the starting text (the opening span thingy) and the ending text (the closing span thingy), identify what's in between as $A$, and replace Start$A$End with just $A$.
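For anyone without FAR, the same start-text/end-text idea can be sketched in a few lines of Python. This is only an illustration of the approach described above, not how FAR itself is implemented; the function name `unwrap` and its parameters are made up for the example.

```python
def unwrap(text: str, start: str, end: str) -> str:
    """Remove each start...end pair, keeping the text in between.

    Hypothetical helper: mimics the FindStart/FindEnd idea, nothing more.
    """
    out = []
    pos = 0
    while True:
        s = text.find(start, pos)
        if s == -1:
            break
        # The first end marker after the start marker closes the pair.
        e = text.find(end, s + len(start))
        if e == -1:
            break  # unmatched start marker: leave the rest untouched
        out.append(text[pos:s])               # text before the start marker
        out.append(text[s + len(start):e])    # the wrapped contents ($A$)
        pos = e + len(end)
    out.append(text[pos:])
    return "".join(out)


sample = ('x <span class="glossException">template</span> '
          'y <span class="Code">PCM</span>')
cleaned = unwrap(sample, '<span class="glossException">', '</span>')
print(cleaned)
```

Like the FAR settings, this pairs the opening tag with the first closing tag after it, so it assumes no other spans are nested inside the glossException span.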

Should be useful as a starting point, but I don't think you can do it in Flare.
Started as a newbie with Flare 6.1, now using Flare 2023.
Report bugs at http://www.madcapsoftware.com/bugs/submit.aspx.
Request features at https://www.madcapsoftware.com/feedback ... quest.aspx
jjw
Sr. Propeller Head
Posts: 133
Joined: Thu May 08, 2014 4:18 pm
Location: Melbourne

Re: Need help with regular expression to remove span tags

Post by jjw »

It's probably easier to search for the whole span and then replace it with the part between the opening and closing tags. So your search string would be:
<span class="glossException">(.+?)</span>
And you would replace it with the contents of the brackets which in Flare would be (I think):
\1
(The ? in the bracketed expression makes the match lazy, so it stops at the first closing span tag it finds and doesn't gobble up all your other closing span tags).

Actually - this isn't perfect, it will break if you have nested spans *inside* your glossException spans - I'm assuming you don't.
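If you want to try the pattern outside Flare first, here is a minimal sketch in Python, whose `re` module also uses \1 for group 1 in the replacement. The sample line is a shortened version of the one in the original post; I'm assuming Flare's search behaves the same way for this pattern, which is worth confirming on a copy of the project.

```python
import re

# Shortened version of the sample line from the original post.
line = (
    '<p>export a printer code <span class="glossException">template</span>, '
    'stored in an object called <span class="Code">PCM</span>.</p>'
)

# Lazy capture: (.+?) stops at the first </span> after the opening tag.
pattern = re.compile(r'<span class="glossException">(.+?)</span>')

# Replace the whole span with just the captured contents (group 1).
result = pattern.sub(r'\1', line)
print(result)
```

The second span (class "Code") keeps both its opening and closing tags, because the pattern only ever matches spans that start with the glossException opening tag.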
barbs
Propeller Head
Posts: 49
Joined: Thu Oct 15, 2015 3:46 pm

Re: Need help with regular expression to remove span tags

Post by barbs »

Thank you both for your replies. I've seen FAR mentioned a few times, but I was really hoping that a "simple" regEx would do the trick and I wouldn't have to buy and figure out another piece of software. FAR's website is a little confusing but alludes to the fact that there is a free trial period, so I might look into that.

I might also have to try figuring out the "lazy" search, JJW. The problem with the solution you suggest is that the contents of my glossException span tags vary. On the bright side, though, there are no other spans nested within them!

My manager also gave me the names of a few developers at the office who might be able to help me, so I'll be checking with them next week. If there is a RegEx solution found, I'll be sure and post it here.

Thanks again!
Barb
jjw
Sr. Propeller Head
Posts: 133
Joined: Thu May 08, 2014 4:18 pm
Location: Melbourne

Re: Need help with regular expression to remove span tags

Post by jjw »

Well the reference group (the stuff in the brackets) in my search string will match anything after the opening span tag until it gets to the first "</span>", so as long as you don't have nested spans it shouldn't matter what's inside the span.
The brackets capture a group that you can refer to afterwards (in this case, because I haven't explicitly named it, it's called group 1 by default). The contents of the brackets matches as follows:
. - match any single character, including a space (in most regex flavors this means anything except a line break)
+ - at least once
? - and include as many characters as you need to make the first overall match, but no more (the lazy bit). So it keeps going until the first </span> but having made the first match, it stops.

So the regex matches the entire span including the tags and the contents and replaces it with the contents (group 1). Flare uses \1 to refer to group 1.
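To see the difference the ? makes, here is a small Python sketch comparing the greedy and lazy versions on a line with two spans. The variable names are just for illustration. It also shows that, in Python at least, the dot does not cross a line break unless you ask it to; I haven't checked whether Flare's search dialog has an equivalent option.

```python
import re

opening = re.escape('<span class="glossException">')
text = '<span class="glossException">a</span> and <span class="Code">b</span>'

# Greedy: .+ grabs as much as possible, so it runs to the LAST </span>.
greedy = re.search(opening + r'(.+)</span>', text).group(1)

# Lazy: .+? grabs as little as possible, so it stops at the FIRST </span>.
lazy = re.search(opening + r'(.+?)</span>', text).group(1)

# A span whose contents wrap onto a second line.
wrapped = '<span class="glossException">two\nlines</span>'
no_match = re.search(opening + r'(.+?)</span>', wrapped)
dotall = re.search(opening + r'(.+?)</span>', wrapped, re.DOTALL).group(1)

print(repr(greedy))
print(repr(lazy))
print(no_match, repr(dotall))
```

The greedy version captures everything up to the final closing tag on the line, which is the same kind of over-reach that removed the extra closing tags in the original attempt.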

For playing with regular expressions (to create them and try them out), I like Regex Buddy. In particular, I like that it explains your regex item by item so it's easy to debug.

For more extensive file manipulation over multiple files, I like PowerGrep.

Julia
wabernat
Propeller Head
Posts: 18
Joined: Thu Aug 21, 2014 4:46 pm

Re: Need help with regular expression to remove span tags

Post by wabernat »

Download and install the Kaizen plugin from: https://www.kaizenplugin.com/. Do it now!

This plugin features a tool specifically for your use case. Open your project, select "Replace Tags in Folder", pick the offending span class and what you want to change it to (an undefined <span> for example), and turn it loose. Kaizen will find every instance and change every problematic span into an inoffensive tag. Unlike going after the thing with regular expressions, it won't break your source and isn't thrown off by empty XML tags.
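For comparison, here is a rough Python sketch of why a tag-aware approach copes with nested spans where a single regex does not. To be clear, this is not Kaizen's code and says nothing about how the plugin works internally; the function name `unbind` and the exact-match test on the opening tag are assumptions made for the example.

```python
import re

# Matches any opening (or self-closing) span tag, or a closing span tag.
TAG = re.compile(r'<span\b[^>]*>|</span>')


def unbind(text, target='<span class="glossException">'):
    """Drop the target span's tags but keep its contents (hypothetical helper)."""
    out, stack, pos = [], [], 0
    for m in TAG.finditer(text):
        tag = m.group(0)
        out.append(text[pos:m.start()])
        pos = m.end()
        if tag.endswith('/>'):
            out.append(tag)           # self-closing span: nothing to pair, keep it
        elif tag == '</span>':
            # Pair this close with the most recent open tag.
            drop = stack.pop() if stack else False
            if not drop:
                out.append(tag)
        else:
            drop = (tag == target)    # only the exact target opening tag is dropped
            stack.append(drop)
            if not drop:
                out.append(tag)
    out.append(text[pos:])
    return ''.join(out)


nested = ('<span class="glossException">a <span class="Code">b</span> c</span> d')
print(unbind(nested))
```

Because each closing tag is paired with the most recent opening tag, the inner Code span survives intact even though it sits inside the span being removed.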
amygil
Propeller Head
Posts: 11
Joined: Tue Dec 06, 2011 4:38 pm

Re: Need help with regular expression to remove span tags

Post by amygil »

jjw's RegEx is exactly what barbs is looking for. Just wanted to state that in case it wasn't clear.

I use Notepad++ to do find/replaces across multiple projects. It uses the same RegEx syntax that Flare does (they don't all) and it's a great text editor in its own right.

@wabernat, that plugin looks amazing! I'm going to get it myself as it's free and it looks like it has a lot of cool stuff in it besides the tag replacement. I noticed in the video that demos the feature, there are two other options -- you can unbind the tag which removes the tag without deleting the contents or you can delete the tag with its contents. So you don't need to replace it with an undefined tag if you don't want to.
barbs
Propeller Head
Posts: 49
Joined: Thu Oct 15, 2015 3:46 pm

Re: Need help with regular expression to remove span tags

Post by barbs »

Yes! Thanks so much Julia! Sorry I didn't get back to you right away. My 12-year-old broke his arm Saturday afternoon and I've been a little tunnel-focused on him until we got it in a cast Tuesday. I was able to run your RegEx sample on a few of the "problem" files that I'd identified after running the previous RegEx I was using, and your suggestion handled those pages perfectly. I've run it on my entire project now, and the initial numbers look good; the number of span tags replaced/removed matches what I was expecting. I need to follow up with some spot checking on the Diffs between files, but I am confident it's done exactly what I need it to do.

Thank you all for contributing your suggestions and helping me out!

Barb
jjw
Sr. Propeller Head
Posts: 133
Joined: Thu May 08, 2014 4:18 pm
Location: Melbourne

Re: Need help with regular expression to remove span tags

Post by jjw »

barbs wrote: My 12-year-old broke his arm Saturday afternoon and I've been a little tunnel-focused on him until we got it in a cast Tuesday.
Yikes, I don't think there's a regex for that.

I have also used the Kaizen plugin and it's good. I forgot to reinstall it when I moved to a different computer, so I had this vague notion that there was a nifty tool to replace elements but I couldn't remember where I'd seen it. Thanks for the reminder.

J